查询成本:全局临时表与集合(虚拟数组)

我有一个查询,其结果存储在 GTT(全局临时表)和集合中.

I have a query whose results are stored in a GTT (Global Temporary Table) and in a Collection.

再次从GTT中选择数据,我得到一个很小的代价:103.

Selecting the data from the GTT again, I get a very small cost: 103.

SELECT
    ...
FROM my_table_gtt
JOIN table2 ...
JOIN table3 ...

但是当将其从 GTT 切换到集合(VA - 虚拟阵列)时,成本会飙升(78.000),但两者之间的执行时间差异非常小.

But when switching this from a GTT to a Collection (VA - Virtual Array), the cost skyrockets (78.000), but the difference in execution times between the two is very small.

SELECT
    ...
FROM TABLE(CAST(my_table_va as my_table_tt))
JOIN table2 ...
JOIN table3 ...

我的问题是为什么这两种方法的成本差异如此之大?据我所知,GTT 不存储表统计信息,那么为什么它返回的成本比 VA 好?

My question is why is there such a big difference in cost between the two approaches? From my knowledge, GTTs don't store table statistics, so why is it returning a better cost than the VA?

推荐答案

全局临时表可以像任何其他表一样具有统计信息.事实上,它们就像任何其他表一样,它们有数据段,只是在临时表空间中.

Global temporary tables can have statistics as any other table. In fact they are like any other table, they have data segments, just in temporary tablespace.

在 11g 中,统计信息是全局的,因此它们有时会导致执行计划出现问题.在 12c 中,它们是基于会话的,因此每个会话都会获得适当的会话(如果可用).

In 11g the statistics are global so they sometimes cause issues with execution plans. In 12c they are session based so each session gets proper ones (if available).

集合类型基数基于 DB 块大小,默认 8 kB 块为 8168.集合内容存储在 PGA 中.在复杂查询中使用集合类型来提示优化器时,提示基数是很常见的.您也可以使用扩展的优化器接口来实现自己的成本计算方式.

The collection type cardinality is based on DB block size and for default 8 kB block is 8168. Collection content is stored in PGA. It's quite common to hint the cardinality when using collection types in complex queries to hint the optimizer. You can also use extended optimizer interface for implementing own way for calculating cost.

编辑 - 添加测试:

CREATE TYPE STRINGTABLE IS TABLE OF VARCHAR2(255);
CREATE GLOBAL TEMPORARY TABLE TMP (VALUE VARCHAR2(255));

INSERT INTO TMP SELECT 'Value' || LEVEL FROM DUAL CONNECT BY LEVEL <= 1000000;

DECLARE
    x STRINGTABLE;
    cnt NUMBER;
BEGIN
    SELECT VALUE BULK COLLECT INTO x FROM TMP;

    DBMS_OUTPUT.PUT_LINE(TO_CHAR(SYSTIMESTAMP, 'MI:SS.FF3'));

    SELECT SUM(LENGTH(VALUE)) INTO cnt FROM TMP;

    DBMS_OUTPUT.PUT_LINE(TO_CHAR(SYSTIMESTAMP, 'MI:SS.FF3'));

    SELECT SUM(LENGTH(COLUMN_VALUE)) INTO cnt FROM TABLE(x);

    DBMS_OUTPUT.PUT_LINE(TO_CHAR(SYSTIMESTAMP, 'MI:SS.FF3'));
END;

在这种情况下,访问 GTT 的速度大约是收集速度的两倍,在我的测试机器上为 200 毫秒与 400 毫秒.当我将行数增加到 10 000 000 时,出现 ORA-22813:第二次查询时操作数值超出系统限制.

In this case is the access to GTT about twice as fast then to collection, cca 200 ms vs. 400 ms on my test machine. When I increased the number of rows to 10 000 000, I got ORA-22813: operand value exceeds system limits on the second query.

相关文章