T-SQL 返回表的所有组合
在另一个问题中,我询问了表格的所有可能的 3 向组合,假设有 3 篇文章.在这个问题中,我想进一步扩展问题以返回具有 n 个不同文章的表的所有 n 向组合.一篇文章可以有多个供应商.我的目标是有一组组合,每个组都有每篇文章.
In a different question, I asked about all possible 3-way combinations of a table, assuming that there are 3 articles. In this question, I would like to further extend the problem to return all n-way combinations of a table with n distinct articles. One article can have multiple suppliers. My goal is to have groups of combinations with each group having each article.
下面是一个示例表,但请记住,可能不止这 3 篇文章.
Below is a sample table but keep in mind that there could be more than those 3 articles.
+---------+----------+
| Article | Supplier |
+---------+----------+
| 4711 | A |
| 4712 | B |
| 4712 | C |
| 4712 | D |
| 4713 | C |
| 4713 | E |
+---------+----------+
对于上面的示例,18 个数据集可能有 6 个组合对.下面是上面示例的解决方案应该是什么样子:
For the example above, there would be 6 combination pairs possible with 18 datasets. Below is how the solution for the example above is supposed to look like:
+----------------+---------+----------+
| combination_nr | article | supplier |
+----------------+---------+----------+
| 1 | 4711 | A |
| 1 | 4712 | B |
| 1 | 4713 | C |
| 2 | 4711 | A |
| 2 | 4712 | B |
| 2 | 4713 | E |
| 3 | 4711 | A |
| 3 | 4712 | C |
| 3 | 4713 | C |
| 4 | 4711 | A |
| 4 | 4712 | D |
| 4 | 4713 | E |
| 5 | 4711 | A |
| 5 | 4712 | D |
| 5 | 4713 | C |
| 6 | 4711 | A |
| 6 | 4712 | D |
| 6 | 4713 | E |
+----------------+---------+----------+
我在另一个问题中问这个,因为在另一个问题中我没有指定我需要这是动态的,而不仅仅是上面的 3 篇文章.在这里您可以找到旧问题.
I am asking this in a different question because in the other question I have not specified that I need this to be dynamic and not only the 3 articles from above. Here you can find the old question.
推荐答案
要创建此结果,您需要确定两件事:
To create this result you'll need to determine two things:
- 一组包含可能组合数量的 ID 的数据.
- 一种确定商品/供应商对何时对应于组合 ID 的方法.
为了找出组合的集合,您需要先弄清楚组合的数量.为此,您需要确定为每件商品设置的每个供应商的规模.使用 count 聚合函数可以轻松获取大小,但为了找出组合,我需要所有值的乘积,这并不容易.幸运的是,在另一个 SO 问题中有一个关于如何做到这一点的答案.
In order to figure out the set of combinations, you need to first figure out the number of combinations. To do that, you need to figure out the size of each supplier set for each article. Getting the size is accomplished easily with a count aggregate function, but in order to figure out the combinations, I needed a product of all the values, which is not easily done. Fortunately there was an answer on how to do this in another SO questions.
既然确定了组合的数量,就必须生成 id.在 TSQL 中没有很好的方法可以做到这一点.我最终使用了一个简单的递归 CTE.这样做的缺点是最多只能产生 32767 个组合.如果您需要更多,还有其他方法可以产生这些值.
Now that the number of combinations is determined, the ids have to be generated. There is no great way to do this in TSQL. I ended up using a simple recursive CTE. This has the drawback of only being able to produce up to 32767 combos. There are other methods to produce these values if you're going to need more.
为了确定物品/供应商对何时与组合对齐,我使用 ROW_NUMBER 窗口函数按文章分区并按供应商排序,以获得每对的序列号.然后它使用了一个老技巧,通过序列号使用组合数的模数来确定是否显示一对.
To determine when an article/supplier pair lines up with a combination I use ROW_NUMBER window function partition by Article and sorted by Supplier to get a sequence number for each pair. Then it's using an old trick of using Modulo of the combination number by the sequence number to determine if a pair is displayed.
仍然存在一个问题,即无法保证冗余对不会配对在一起.为了解决这个问题,添加了一个 CTE,用于计算出现在文章之前的可能组合的数量.目的是在显示下一个顺序之前,将后面文章中的值重复组合的次数.我称之为乘数(即使我用它除以 comboId),这将确保不同的结果.
There is still an issue in that there is no guarantee that redundant pairs won't be paired together. In order to address this a CTE was added that calculates the number of possible combinations that come before an article. The intention is so that a value in a later article is repeated the number of times for a combination before the next in sequence is displayed. I called it multiplier (even though I divide the comboId by it) and this is what will ensure distinct results.
WITH ComboId AS (
-- Product from https://stackoverflow.com/questions/3912204/why-is-there-no-product-aggregate-function-in-sql
SELECT 0 [ComboId], exp (sum (log (SequenceCount))) MaxCombos
FROM (
SELECT COUNT(*) SequenceCount
FROM src
GROUP BY Article
) x
UNION ALL
SELECT ComboId + 1, MaxCombos
FROM ComboId
WHERE ComboId + 1 < MaxCombos
)
, Multiplier AS (
SELECT s.Article
, COALESCE(exp (sum (log (SequenceCount)) OVER (ORDER BY Article ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)), 1) [Multiplier]
FROM (
SELECT Article, COUNT(*) SequenceCount
FROM src
GROUP BY Article
) s
)
, Sequenced AS (
SELECT s.Article, s.Supplier, m.Multiplier
, ROW_NUMBER() OVER (PARTITION BY s.Article ORDER BY s.Supplier) - 1 ArtSupplierSeqNum
, COUNT(*) OVER (PARTITION BY s.Article) MaxArtSupplierSeqNum
FROM src s
INNER JOIN Multiplier m ON m.Article = s.Article
)
SELECT c.ComboId + 1 [ComboId], s.Article, s.Supplier
FROM ComboId c
INNER JOIN Sequenced s ON s.ArtSupplierSeqNum = CAST(c.ComboId / Multiplier as INT) % s.MaxArtSupplierSeqNum
ORDER BY ComboId, Article
OPTION (MAXRECURSION 32767)
相关文章