视图需要相互依赖的逻辑:没有模型可能吗?

2021-12-30 00:00:00 sql oracle11g oracle

我正在尝试编写一些 Oracle 11g SQL，但遇到了一些先有鸡还是先有蛋的问题.我正在寻找类似电子表格的行为.我找到了一个确实使用 Oracle 的 MODEL 子句的解决方案，但性能并不好.所以我想知道非MODEL"解决方案在技术上是否可行.

I'm trying to write some Oracle 11g SQL, but I'm running into a bit of a chicken-and-egg problem. I'm looking for spreadsheet-like behavior. I've found a solution that does use Oracle's MODEL clause, but the performance isn't great. So I'm wondering if a "non-MODEL" solution is even technically feasible.

这是一个演示我正在尝试做的事情的玩具示例.鉴于此表:

Here's a toy example that demonstrates what I'm trying to do. Given this table:

CREATE TABLE t (id NUMBER PRIMARY KEY, n NUMBER); INSERT INTO t (id, n) VALUES (2, 0); INSERT INTO t (id, n) VALUES (3, 1); INSERT INTO t (id, n) VALUES (5, 1); INSERT INTO t (id, n) VALUES (7, 2); INSERT INTO t (id, n) VALUES (11, 3); INSERT INTO t (id, n) VALUES (13, 5); INSERT INTO t (id, n) VALUES (17, 8); INSERT INTO t (id, n) VALUES (19, 13);

我想计算两个额外的派生列，将它们称为 X 和 Y.

I want to compute two additional, derived columns, call them X and Y.

以下是如何计算 X 和 Y 的规则:

Here are the rules for how X and Y are to be computed:

X:对于第一行，由 ID 的最小值定义，将 X 设置为 N.对于所有后续行，X 的值应比前一个 Y 的值小 1，按 ID 排序.

X: For the very first row, as defined by the minimum value of ID, set X to N. For all subsequent rows, the value of X should be one less than the value of the previous Y, as sorted by ID.

你:两次 N 加上 X.

接下来的几个步骤展示了如果我要手动执行此操作，我将如何填写所需的视图.首先，给定数据的前几行:

These next few steps show how I'd fill out my desired view if I were to do this by hand. First, the first few rows of the given data:

ID N X Y --- --- --- --- 2 0 3 1 5 1 7 2 ....

由于我们在第一行，X 应该设置为 N 或 0.Y 应该是 2 * N + X，或者 0.

Since we're in the first row, X should be set to N, or 0. Y should be 2 * N + X, or 0.

ID N X Y --- --- --- --- 2 0 0 0 3 1 5 1 7 2 ....

现在，由于我们不再位于第一行，从现在开始，X 应该总是比前一行的 Y 小一个.在第二行，这意味着 X = (previous Y) - 1 = 0 - 1 = <代码>-1.第二行的 Y 将是 2 * N + X，或 2 * (1) + (-1) = 1.

Now, since we're not in the first row any longer, X should always be one less than the previous row's Y from here on out. Here in the second row, that means X = (previous Y) - 1 = 0 - 1 = -1. And the second row's Y will be 2 * N + X, or 2 * (1) + (-1) = 1.

ID N X Y --- --- --- --- 2 0 0 0 3 1 -1 1 5 1 7 2 ....

如果你继续数学，这里是你想要的结果:

If you continue with the math, here's the desired outcome:

ID N X Y --- --- --- --- 2 0 0 0 3 1 -1 1 5 1 0 2 7 2 1 5 11 3 4 10 13 5 9 19 17 8 18 34 19 13 33 59

给定如何计算 X 和 Y 的规则，是否可以在不必求助于 MODEL 子句的情况下获得此结果?

Given the rules for how X and Y are computed, is it possible to get this outcome without having to resort to the MODEL clause?

我不是在寻找基于此特定示例的数学简化；这只是我想出的一个玩具示例，它展示了我在实际问题中面临的那种相互依赖.

I'm not looking for a mathematical simplification based on this particular example; this is just a toy example I came up with that demonstrates the kind of interdependence I'm facing in my actual problem.

P.S.:这是一个 MODEL 示例，我可以拼凑起来生成此输出；也许有可能进行修改以提高性能?

P.S.: Here's a MODEL example I was able to cobble together that does generate this output; maybe there are modifications possible to improve performance?

SQL> WITH u AS ( 2 SELECT ROW_NUMBER() OVER (ORDER BY t.id) r 3 , t.id 4 , t.n 5 FROM t 6 ) 7 SELECT r 8 , id 9 , n 10 , x 11 , y 12 FROM u 13 MODEL 14 DIMENSION BY (r) 15 MEASURES (id 16 , n 17 , CAST(NULL AS NUMBER) x 18 , CAST(NULL AS NUMBER) y) RULES AUTOMATIC ORDER 19 ( x[1] = n[cv()] 20 , y[r] = 2 * n[cv()] + x[cv()] 21 , x[r > 1] ORDER BY r = y[cv() - 1] - 1 22 ) 23 ; R ID N X Y ---------- ---------- ---------- ---------- ---------- 1 2 0 0 0 2 3 1 -1 1 3 5 1 0 2 4 7 2 1 5 5 11 3 4 10 6 13 5 9 19 7 17 8 18 34 8 19 13 33 59 8 rows selected. SQL>

谢谢.

推荐答案

您可以使用递归子查询分解(也称为递归 CTE):

You could use recursive subquery factoring (also known as a recursive CTE):

with tmp as ( select t.*, row_number() over (order by t.id) as rn from t ), r (id, n, x, y, rn) as ( select id, n, 0, 0, rn from tmp where rn = 1 union all select tmp.id, tmp.n, r.y - 1, (tmp.n * 2) + r.y - 1, tmp.rn from r join tmp on tmp.rn = r.rn + 1 ) select id, n, x, y from r order by rn; ID N X Y ---------- ---------- ---------- ---------- 2 0 0 0 3 1 -1 1 5 1 0 2 7 2 1 5 11 3 4 10 13 5 9 19 17 8 18 34 19 13 33 59

SQL 小提琴.

它基本上是通过您的手动步骤.锚成员是您的第一个手动步骤，将第一行的 x 和 y 都设置为零.然后递归成员执行您指定的计算.(在计算该行的 y 时，您不能引用新计算的 x 值，因此您必须将其重复为 (tmp.n * 2)+ ry - 1).rn 只是按照 ID 保持行的顺序，同时更容易找到下一行 - 所以你可以寻找 rn + 1 而不是找到下一个最高的 ID直接取值.

It's basically walking through your manual steps. The anchor member is your first manual step, setting x and y both to zero for the first row. The recursive member then does the calculation you specified. (You can't refer to the new-calculated x value when calculating that row's y, so you have to repeat that as (tmp.n * 2) + r.y - 1). The rn is just to keep the rows orders by ID while making it easier to find the next row - so you can look for rn + 1 instead of find the next highest ID value directly.

您的示例数据没有显着的性能差异，但是添加了一千行后，模型子句大约需要 5 秒，递归 CTE 大约需要 1 秒；另一千行模型需要约 20 秒，CTE 需要约 3 秒；另一千行模型需要约 40 秒，CTE 需要约 6 秒；再有一千行(总共 4,008 行)模型用了大约 75 秒，CTE 用了大约 10 秒.(我厌倦了等待具有更多行的模型版本；五分钟后用 10,000 杀死了它).我真的不能说这对您的真实数据的表现如何，但在此基础上，它可能值得一试.

There's no significant performance difference with your sample data, but with a thousand rows added the model clause takes about 5 seconds and the recursive CTE takes about 1 second; with another thousand rows model takes ~20 seconds and the CTE takes ~3 seconds; with another thousand rows model took ~40 seconds and CTE took ~6 seconds; and with another thousand rows (so 4,008 in total) model took ~75 seconds and CTE took ~10 seconds. (I got bored waiting for the model version with more rows than that; killed it after a five minutes with 10,000). I can't really say how this will perform with your real data, but on that basis, it's probably worth trying.

相关文章