为什么我似乎不能强制 Oracle 11g 为单个 SQL 查询消耗更多 CPU

2021-12-05 00:00:00 sql parallel-processing oracle

我有一些在巨大的表上运行的庞大查询.这些查询似乎是 CPU 瓶颈，并且运行了数小时.我知道 Oracle 11g 第 2 版具有许多新特性，可以在内部并行化查询的执行.然而，无论我在查询中放入什么样的提示，我似乎都不能在数据库框上使用超过 1 个 CPU.我有一台非常不错的 Solaris 机器，有 8 个 CPU，但是每次我运行这个查询时，我最终只是将一个 CPU 推到 100%，然后在那里坐了几个小时.

我尝试过的提示是:

SELECT/*+ 并行 */...选择/*+ 并行 (5) */...选择/*+ 并行 (10) */...

在查看盒子上的整体 CPU 消耗时，这些似乎都不起作用.它似乎总是将一个 CPU 固定在 100%.不幸的是，即使解释计划似乎也需要永远运行.我将尝试使用不同的提示获得不同的解释计划，看看是否有帮助.是否有可能某些查询根本无法并行，即使它们的运行时间在几小时内?！！?此查询中的主表有 3.35 亿行.

SQL 查询文本:

除外.

PARALLEL_ADAPTIVE_MULTI_USER

PARALLEL_AUTOMATIC_TUNING

PARALLEL_DEGREE_LIMIT

PARALLEL_DEGREE_POLICY

PARALLEL_FORCE_LOCAL

PARALLEL_INSTANCE_GROUP

PARALLEL_IO_CAP_ENABLED

PARALLEL_MAX_SERVERS 这是整个系统的上限.这里有一个权衡.一次运行过多的并行服务器对系统不利.但是将查询降级为串行对于某些查询来说可能是灾难性的.

PARALLEL_MIN_PERCENT

PARALLEL_MIN_SERVERS

PARALLEL_MIN_TIME_THRESHOLD

PARALLEL_SERVERS_TARGET

PARALLEL_THREADS_PER_CPU

RAC 节点数默认 DOP 的另一个乘数.

CPU_COUNT 如果使用默认 DOP.

RECOVERY_PARALLELISM

FAST_START_PARALLEL_ROLLBACK

配置文件 SESSIONS_PER_USER 还限制了并行服务器.

资源管理器

系统负载如果 parallel_adaptive_multi_user 为真.可能无法猜测 Oracle 何时开始节流.

流程

并行 DML 限制如果出现以下任何一种情况，并行 DML 将不起作用:

兼容<9.2 用于分区内
INSERT VALUES，带有触发器的表
复制
自引用完整性或删除级联或延迟完整性约束
访问对象列
带有 LOB 的非分区表
具有 LOB 的分区内并行性
分布式事务
集群表
临时表

标量子查询不能并行运行?这在手册中，我希望是，但我的测试表明并行性在 11g 中有效.

ENQUEUE_RESOURCES 10g 中的隐藏参数，这是否还有相关性?

索引组织表无法并行直接路径插入到 IOT?(这还是真的吗?)

并行流水线函数要求必须使用 CURSOR(?).待办事项.

函数必须是 PARALLEL_ENABLE

语句类型旧版本根据分区限制了 DML 的并行性.当前的一些手册中仍然包含这一点，但肯定不再是这样了.

分区数仅适用于旧版本上的分区连接.(?)

错误具体来说，我在解析方面看到了很多错误.Oracle 将分配正确数量的并行服务器，但不会发生任何事情，因为它们都在等待诸如 cursor: pin s wait on x 之类的事件.

这个列表肯定不完整，也不包括 12c 特性.它没有解决操作系统和硬件问题.它没有回答这个极其困难的问题，什么是最好的并行度?"(简短的回答:更多通常更好，但会牺牲其他流程.)希望它至少能让您了解这些问题的难度，并提供一个开始寻找的好地方.

I have some humongous queries that run on gigantic tables. These queries seem to be CPU bottlenecked, and run for hours. I understand that Oracle has a lot of new features with 11g, release 2 to internally paralellize the execution of a query. However no matter what kinds of hints I put in the query, I cant seem to use more than 1 CPU on the database box. I have a very respectable Solaris machine with 8 CPUs, however everytime I run this query, I end up just pushing one CPU to 100% and then sit there for hours.

The hints I have tried are:

SELECT /*+ PARALLEL */ ...
SELECT /*+ PARALLEL(5) */ ...
SELECT /*+ PARALLEL(10) */ ...

None of that appeared to work when looking at overall CPU consumption on the box. It always seemed to peg one CPU at 100%. Unfortunately even explain plan seems to take forever to run. I will try and get different explain plans with the different hints and see if that helps. Is it possible that some queries are simply un-paralleable, even if their runtime is in the hours?!!? The main table in this query has 335 million rows.

SQL Query Text:

http://pastie.org/8634380

System Parameters:

http://pastie.org/8634383

Edit:

Detailed Explain Plan - No Parallelism:

http://pastebin.com/HkZgbPpf

Optimizer related system parameters:

http://pastie.org/8639841

Further EDIT: We have reached out to Oracle to understand why EXPLAIN PLAN takes more than 2 hours. We are timing out trying to run the various explain plans.

解决方案

The most important thing to understand about Oracle parallelism is that it's complicated. Optimizing parallelism requires a lot of Oracle knowledge, reading the manuals, checking many parameters, testing long-running queries, and a lot of skepticism.

Ask the Right Questions

Parallel problems really involve three different questions:

How many parallel servers were requested?
How many parallel servers were allocated?
How many parallel servers were meaningfully used?

Use the Best Tools

Go straight to the best tool - SQL Monitoring with active reports. Find your SQL_ID and generate the HTML report: select dbms_sqltune.report_sql_monitor(sql_id => 'your_sql_id', type => 'active') from dual;. This is the only way to know how much time was spent on each step in the execution plan. And it will tell you how much parallelism was effectively used, and where. For example:

Another good options is type => 'text'. It doesn't have quite as much information but it's a quicker to look at and easier to share.

SQL Monitoring also includes the DOP requested and the DOP allocated:

A 100-line parallel select may run beautifully, but then everything halts at a single step because of an uncached sequence. You can stare at an explain plan, a trace, or an AWR report for hours and not see the problem. The active report makes the slow steps almost trivial to find. Do not waste time guessing where the problem lies.

However, other tools are still required. An explain plan generated with explain plan for ... and select * from table(dbms_xplan.display); will provide a few key pieces of information. Specifically the Notes section can include many reasons why the query did not request parallelism.

But WHY did I get that number of parallel servers?

The relevant information is spread over several different manuals, which are very useful but occasionally inaccurate or misleading. There are many myths and much bad advice about parallelism. And the technology changes significantly with each release.

When you put together all of the reputable sources, the list of factors influencing the number of parallel servers is astonishingly large. The list below is ordered roughly by what I think are the most important factors:

Inter-operation parallelism Any query using sorting or grouping will allocate twice as many parallel servers as the DOP. This is probably responsible for the myth "Oracle allocates as many parallel servers as possible!".
Query hint Preferably a statement-level hint like /*+ parallel */, or possibly an object-level hint like /*+ noparallel(table1) */. If a specific step of a plan is running in serial it is usually because of object-level hints on only part of the query.
Recursive SQL Some operations may run in parallel but can be effectively serialized by recursive SQL. For example, an uncached sequence on a large insert. Recursive SQL generated to parse the statement will also be serial; for example dynamic sampling queries.
Alter session alter session [force|enable] parallel [query|dml|ddl]; Note that parallel DML is disabled by default.
Table degree
Index degree
Index was cheaper Parallel hints only tell the optimizer to consider a full table scan with a certain DOP. They do not actually force parallelism. The optimizer is still free to use a serial index-access if it think it's cheaper. (The FULL hint may help solve this issue.)
Plan management SQL Plan Baselines, outlines, profiles, advanced rewrite, and SQL Translators can all change the degree of parallelism behind your back. Check the Note section of the plan.
Edition Only Enterprise and Personal Editions allow parallel operations. Except for the package DBMS_PARALLEL_EXECUTE.
PARALLEL_ADAPTIVE_MULTI_USER
PARALLEL_AUTOMATIC_TUNING
PARALLEL_DEGREE_LIMIT
PARALLEL_DEGREE_POLICY
PARALLEL_FORCE_LOCAL
PARALLEL_INSTANCE_GROUP
PARALLEL_IO_CAP_ENABLED
PARALLEL_MAX_SERVERS This is the upper limit for the whole system. There's a trade-off here. Running too many parallel servers at once is bad for the system. But downgrading a query to serial can be disastrous for some queries.
PARALLEL_MIN_PERCENT
PARALLEL_MIN_SERVERS
PARALLEL_MIN_TIME_THRESHOLD
PARALLEL_SERVERS_TARGET
PARALLEL_THREADS_PER_CPU
Number of RAC nodes Another multiplier for default DOP.
CPU_COUNT If the default DOP is used.
RECOVERY_PARALLELISM
FAST_START_PARALLEL_ROLLBACK
Profile SESSIONS_PER_USER also limits parallel servers.
Resource Manager
System load If parallel_adaptive_multi_user is true. Probably impossible to guess when Oracle will start throttling.
PROCESSES
Parallel DML restrictions Parallel DML will not work if any of these cases:
1. COMPATIBLE < 9.2 for intra-partition
2. INSERT VALUES, tables with triggers
3. replication
4. self-referential integrity or delete cascade or deferred integrity constraints
5. accessing an object column
6. non-partitioned table with LOB
7. intra-partition parallelism with a LOB
8. distributed transaction
9. clustered tables
10. temporary tables
Scalar subqueries do not run in parallel? This is in the manual, and I wish this was true, but my tests indicate that parallelism works here in 11g.
ENQUEUE_RESOURCES Hidden parameter in 10g, is this relevant any more?
Index-organized tables Cannot direct-path insert to IOTs in parallel? (Is this still true?)
Parallel pipelined function requirements Must use a CURSOR(?). TODO.
Functions must be PARALLEL_ENABLE
Type of statement Older versions restricted parallelism on DML depending on partitioning. Some of the current manuals still include this but it is certainly not true anymore.
Number of partitions Only for partition-wise joins on older versions.(?)
Bugs Specifically I've seen a lot of bugs with parsing. Oracle will allocate the right number of parallel servers but nothing will happen as they all wait for events like cursor: pin s wait on x.

This list is certainly not complete, and does not include 12c features. And it doesn't address operating system and hardware issues. And it doesn't answer the horribly difficult question, "what is the best degree of parallelism?" (Short answer: more is usually better, but at the expense of other processes.) Hopefully it at least gives you a sense of how difficult these problems can be, and a good place to start looking.

相关文章