为什么我不应该将我所有的 PL/SQL-only VARCHAR2 设为 32767 字节?

2021-12-24 00:00:00 oracle plsql

或者我应该?

(标题的灵感来自 Gary Myers 在 为什么 Oracle varchar2 将强制大小作为定义参数?)

考虑以下变量:

声明-- 数据库表列接口变量v_a tablex.a%type;-- tablex.a 是 varchar2-- PL/SQL 唯一变量v_b varchar2(32767);——这是一个糟糕的约定吗?开始从 tablex 中选择一个到 v_a 中,其中 id = 1;v_b := '一些任意字符串:' ||v_a;-- 忽略潜在的 ORA-06502插入 tabley(id, a) values(1, v_a);-- tablex.a 和 tabley.a 类型匹配v_b := v_b ||'更多任意字符';结尾;/

变量 v_a 用于连接数据库表列,因此使用 %type 属性.但是如果我知道数据类型是 varchar2 为什么我不应该使用 varchar2(4000)varchar2(32767) 也保证字符串从数据库列读取将始终适合 PL/SQL 变量?除了 %type 属性的优越性之外,还有其他反对这个约定的论据吗?

变量 v_b 仅用于 PL/SQL 代码,通常返回到 JDBC 客户端(Java/Python 程序、Oracle SOA/OSB 等)或转储到平面文件中(带有 <代码>UTL_FILE).如果 varchar2 出现例如csv-line 为什么我应该费心计算确切的最大可能长度(除了验证该行在所有情况下都适合 32767 字节,所以我不需要 clob)并每次重新计算我的数据模型发生了变化?

有很多问题涵盖了 SQL 中的 varchar2 长度语义,并解释了为什么 varchar2(4000) 在 SQL 中是一种糟糕的做法.SQL 和 PL/SQL varchar2-type 之间的区别也有很好的介绍:

  • Oracle 中 varchar2 PL/SQL 子程序参数的大小限制是多少?
  • Oracle 中的 VARCHAR(MAX) 与 VARCHAR(n)
  • 为什么 Oracle varchar2 有一个强制大小作为定义参数?
  • 为什么 VARCHAR 需要长度规范?
  • Oracle 存储过程的 varchar2 输入的默认大小是多少,是否可以更改?
  • 为什么使用除 VARCHAR2(4000) 以外的其他方法在 Oracle 数据库中存储字符串?
  • 为什么 oracle plsql varchar2 变量需要大小而参数不需要?
  • 问汤姆 问题:我在一个建模组工作,他们想定义每个 varchar2 字段的最大长度."

我在 APC 的 answer 中看到的唯一讨论这个问题的地方是第 3 点和第 4 点:

<块引用>

数据库在为 PL/SQL 集合分配内存时使用变量的长度.由于内存来自 PGA 超大尺寸,因此变量声明可能会导致程序失败,因为服务器内存不足.

PL/SQL 程序中的单个变量的声明也存在类似的问题,只是集合往往会使问题成倍增加.

例如Oracle PL/SQL Programming, 5th Edition By Steven Feuerstein 没有提到声明太长 varchar2 变量的任何缺点,所以它不会是一个严重的错误,对吧?

更新

经过更多的谷歌搜索后,我发现 Oracle 文档在发布期间有所发展:

引自 PL/SQL 用户指南和参考 10g 第 2 版 第 3 章 PL/SQL 数据类型:

<块引用>

小的 VARCHAR2 变量针对性能进行了优化,而较大的变量针对有效的内存使用进行了优化.截止点是 2000 字节.对于 2000 字节或更长的 VARCHAR2,PL/SQL 仅动态分配足够的内存来保存实际值.对于短于 2000 字节的 VARCHAR2 变量,PL/SQL 会预分配变量的完整声明长度.例如,如果将相同的 500 字节值分配给 VARCHAR2(2000 BYTE) 变量和 VARCHAR2(1999 BYTE) 变量,则前者占用 500 个字节,后者占用 1999 个字节.

引自 PL/SQL 用户指南和参考 11g 第 1 版 第 3 章 PL/SQL 数据类型:

<块引用>

对于 CHAR 变量或最大大小小于 2,000 字节的 VARCHAR2 变量,PL/SQL 会在编译时为最大大小分配足够的内存.对于最大大小为 2,000 字节或更多的 VARCHAR2,PL/SQL 会分配足够的内存来存储运行时的实际值.通过这种方式,PL/SQL 优化了较小的 VARCHAR2 变量以提高性能,并优化较大的变量以提高内存使用效率.

例如,如果将相同的 500 字节值分配给 VARCHAR2(1999 BYTE) 和 VARCHAR2(2000 BYTE) 变量,则 PL/SQL 在编译时为前一个变量分配 1999 个字节,在编译时为后一个变量分配 500 个字节运行时间.

但是 PL/SQL 用户指南和参考 11g 第 2 版 第 3 章 PL/SQL 数据类型 不再提及内存分配,我根本找不到有关内存分配的任何其他信息.(我正在使用这个版本,所以我只检查 11.2 文档.)同样适用于 PL/SQL 用户指南和参考 12c 第 1 版 第 3 章 PL/SQL 数据类型.

我还找到了 Jeffrey Kemp 的答案,也解决了这个问题.然而,Jeffrey 的回答参考了 10.2 文档,问题根本与 PL/SQL 无关.

解决方案

当 Oracle 实施了不同的优化时,这似乎是 PL/SQL 功能在版本中不断发展的领域之一.

请注意,这也意味着 OP 中列出的一些答案也是特定于版本的,即使这些问题/答案中没有明确提及.随着时间的流逝和旧版 Oracle 的使用结束(我在做白日梦?),这些信息就会过时(可能需要几十年的时间).

以上结论得到以下引用的支持 来自 PL/SQL 语言参考 11g R1 的第 12 章 Tuning PL/SQL Applications for Performance:

<块引用>

声明 4000 个或更多字符的 VARCHAR2 变量

当您不确定表达式结果有多大时,您可能需要分配大的 VARCHAR2 变量.您可以通过声明大尺寸的 VARCHAR2 变量(例如 32000)来节省内存,而不是仅在偏高的地方进行一些估计,例如指定 256 或 1000.PL/SQL 有一个优化,可以轻松避免溢出问题和仍然节省内存.为 VARCHAR2 变量指定超过 4000 个字符的大小;PL/SQL 会一直等到您分配变量,然后才分配所需的存储空间.

11g R2 也不是 12c R1 版本的文档.这符合第 3 章 PL/SQL 数据类型的演变.

答案:

从 11gR2 开始,使用 varchar2(10)varchar2(32767) 从内存使用的角度来看没有区别.Oracle PL/SQL 编译器将以最佳方式为您处理脏细节!

对于 11gR2 之前的版本,有一个使用不同内存管理策略的截止点,这在每个版本的 PL/SQL 语言参考中都有明确记录.

当没有可以从问题域导出的自然长度限制时,以上仅适用于 PL/SQL-only 变量.如果一个 varchar2 变量代表一个 GTIN-14 那么应该将其声明为 varchar2(14).

当 PL/SQL 变量与表列接口时,使用 %type-attribute,因为这是保持 PL/SQL 代码和数据库结构同步的零努力方式.

内存测试结果:

我在 Oracle 数据库 11g 企业版 11.2.0.3.0 版中运行内存分析,结果如下:

str_size 迭代 UGA PGA————————————————————————10 100 65488 010 1000 65488 6553610 10000 65488 65536032767 100 65488 032767 1000 65488 6553632767 10000 65488 655360

因为 PGA 更改是相同的,并且仅取决于 iterations 而不是 str_size 我得出的结论是 varchar2 声明的大小无关紧要.不过这个测试可能太天真了 - 欢迎评论!

测试脚本:

-- plsql_memory 是包装 sys.v_$mystat 和-- sys.v_$statname 表由 Steven Feuerstein 编写,可在-- 他的书随附的代码压缩文件.设置验证关闭定义 str_size=&1定义迭代=&2宣布类型 str_list_t 是 varchar2(&str_size) 的表;开始plsql_memory.start_analysis;宣布v_strs str_list_t := str_list_t();开始对于 i in 1 .. &iterations环形v_strs.extend;v_strs(i) := rpad(to_char(i), 10, to_char(i));结束循环;plsql_memory.show_memory_usage;结尾;结尾;/出口

试运行示例:

$ sqlplus -SL @memory-test.sql 32767 10000UGA内存变化:65488(当前=1927304)PGA 内存变化:655360(当前 = 3572704)PL/SQL 过程成功完成.$

Or should I ?

(The title is inspired by Gary Myers' comment in Why does Oracle varchar2 have a mandatory size as a definition parameter?)

Consider the following variables:

declare
  -- database table column interfacing variable
  v_a tablex.a%type; -- tablex.a is varchar2
  -- PL/SQL only variable
  v_b varchar2(32767); -- is this a poor convention ?
begin
  select a into v_a from tablex where id = 1;
  v_b := 'Some arbitrary string: ' || v_a; -- ignore potential ORA-06502
  insert into tabley(id, a) values(1, v_a); -- tablex.a and tabley.a types match
  v_b := v_b || ' More arbitrary characters';
end;
/

Variable v_a is used to interface a database table column and therefore uses a %type attribute. But if I know the data type is varchar2 why shouldn't I use varchar2(4000) or varchar2(32767) that also guarantee the string read from database column will always fit to the PL/SQL variable ? Is there any other argument against this convention except the superiority of %type attribute ?

Variable v_b is only used in PL/SQL code and is usually returned to a JDBC client (Java/Python program, Oracle SOA/OSB etc.) or dumped into a flat file (with UTL_FILE). If the varchar2 presents e.g. csv-line why I should bother to calculate the exact maximum possible length (except to verify the line will fit into 32767 bytes in all cases so I don't need a clob) and re-calculate every time my data model changes ?

There is plenty of questions that covers varchar2 length semantics in SQL and explains why varchar2(4000) is a poor practice in SQL. Also the difference between SQL and PL/SQL varchar2-type is well covered:

  • What is the size limit for a varchar2 PL/SQL subprogram argument in Oracle?
  • VARCHAR(MAX) versus VARCHAR(n) in Oracle
  • Why does Oracle varchar2 have a mandatory size as a definition parameter?
  • Why does VARCHAR need length specification?
  • What is the default size of a varchar2 input to Oracle stored procedure, and can it be changed?
  • Why using anything else but VARCHAR2(4000) to store strings in an Oracle database?
  • Why does an oracle plsql varchar2 variable need a size but a parameter does not?
  • Ask Tom question: "I work with a modelers group, who would like to define every varchar2 field with the maximum length."

The only place where I have seen this issue discussed is the points #3 and #4 in an answer by APC:

The database uses the length of a variable when allocating memory for PL/SQL collections. As that memory comes out of the PGA supersizing the variable declaration can lead to programs failing because the server has run out of memory.

There are similar issues with the declaration of single variables in PL/SQL programs, it is just that collections tend to multiply the problem.

E.g. Oracle PL/SQL Programming, 5th Edition By Steven Feuerstein doesn't mention any drawbacks of declaring too long varchar2 variables, so it can't be a critical mistake, right ?

Update

After some more googling I found out that Oracle documentation has evolved during releases:

A quote from PL/SQL User's Guide and Reference 10g Release 2 Chapter 3 PL/SQL Datatypes:

Small VARCHAR2 variables are optimized for performance, and larger ones are optimized for efficient memory use. The cutoff point is 2000 bytes. For a VARCHAR2 that is 2000 bytes or longer, PL/SQL dynamically allocates only enough memory to hold the actual value. For a VARCHAR2 variable that is shorter than 2000 bytes, PL/SQL preallocates the full declared length of the variable. For example, if you assign the same 500-byte value to a VARCHAR2(2000 BYTE) variable and to a VARCHAR2(1999 BYTE) variable, the former takes up 500 bytes and the latter takes up 1999 bytes.

A quote from PL/SQL User's Guide and Reference 11g Release 1 Chapter 3 PL/SQL Datatypes:

For a CHAR variable, or for a VARCHAR2 variable whose maximum size is less than 2,000 bytes, PL/SQL allocates enough memory for the maximum size at compile time. For a VARCHAR2 whose maximum size is 2,000 bytes or more, PL/SQL allocates enough memory to store the actual value at run time. In this way, PL/SQL optimizes smaller VARCHAR2 variables for performance and larger ones for efficient memory use.

For example, if you assign the same 500-byte value to VARCHAR2(1999 BYTE) and VARCHAR2(2000 BYTE) variables, PL/SQL allocates 1999 bytes for the former variable at compile time and 500 bytes for the latter variable at run time.

But PL/SQL User's Guide and Reference 11g Release 2 Chapter 3 PL/SQL Datatypes doesn't mention memory allocation any more and I fail to find any other information about memory allocation at all. (I'm using this release so I check only 11.2 documentation.) The same holds also for PL/SQL User's Guide and Reference 12c Release 1 Chapter 3 PL/SQL Datatypes.

I also found an answer by Jeffrey Kemp that addresses this question too. However Jeffrey's answer refers to 10.2 documentation and the question is not about PL/SQL at all.

解决方案

It looks like this is one of the areas where the PL/SQL functionality has evolved over releases when Oracle has implemented different optimizations.

Note this also means some of the answers listed in the OP are also release specific even that is not explicitly mentioned in those questions/answers. When the time pass by and use of older Oracle releases ends (me daydreaming ?) that information will became outdated (might take decades thought).

The conclusion above is backed with the following quote from chapter 12 Tuning PL/SQL Applications for Performance of PL/SQL Language Reference 11g R1:

Declare VARCHAR2 Variables of 4000 or More Characters

You might need to allocate large VARCHAR2 variables when you are not sure how big an expression result will be. You can conserve memory by declaring VARCHAR2 variables with large sizes, such as 32000, rather than estimating just a little on the high side, such as by specifying 256 or 1000. PL/SQL has an optimization that makes it easy to avoid overflow problems and still conserve memory. Specify a size of more than 4000 characters for the VARCHAR2 variable; PL/SQL waits until you assign the variable, then only allocates as much storage as needed.

This issue is no longer mentioned in 11g R2 nor 12c R1 version of the document. This is in line with the evolution of the chapter 3 PL/SQL Datatypes.

Answer:

Since 11gR2 it makes no difference from memory use of point of view to use varchar2(10) or varchar2(32767). Oracle PL/SQL compiler will take care of the dirty details for you in an optimal fashion !

For releases prior to 11gR2 there is a cutoff-point where different memory management strategies are used and this is clearly documented in each release's PL/SQL Language Reference.

The above only applies to PL/SQL-only variables when there is no natural length restriction that can be derived from the problem domain. If a varchar2-variable represents a GTIN-14 then one should declare that as varchar2(14).

When PL/SQL-variable interfaces with a table column use %type-attribute as that is the zero-effort way to keep you PL/SQL-code and database structure in sync.

Memory test results:

I run a memory analysis in Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 with the following results:

str_size iterations UGA   PGA
-------- ---------- ----- ------
10       100        65488 0
10       1000       65488 65536
10       10000      65488 655360
32767    100        65488 0
32767    1000       65488 65536
32767    10000      65488 655360

Because the PGA changes are identical and depend only on iterations and not str_size I conclude the varchar2 declared size doesn't matter. The test might be too naïve though - comments welcome !

The test script:

-- plsql_memory is a convenience package wrapping sys.v_$mystat s and
-- sys.v_$statname tables written by Steven Feuerstein and available in the
-- code-zip file accompanying his book.

set verify off

define str_size=&1
define iterations=&2

declare
  type str_list_t is table of varchar2(&str_size);
begin
  plsql_memory.start_analysis;

  declare
    v_strs str_list_t := str_list_t();
  begin
    for i in 1 .. &iterations
    loop
      v_strs.extend;
      v_strs(i) := rpad(to_char(i), 10, to_char(i));
    end loop;
    plsql_memory.show_memory_usage;
  end;

end;
/

exit

Test run example:

$ sqlplus -SL <CONNECT_STR> @memory-test.sql 32767 10000

Change in UGA memory: 65488 (Current = 1927304)
Change in PGA memory: 655360 (Current = 3572704)

PL/SQL procedure successfully completed.

$

相关文章