MySQL GROUP_CONCAT of repeated keys and count of repetitions for multiple columns in one query (query optimization)

2022-01-15 00:00:00 join sql mariadb mysql

This question is about query optimization, to avoid multiple calls to the database via PHP.

Here is the scenario: I have two tables. One contains information (you can call it the reference table) and the other is the data table. The fields key1 and key2 are common to both tables, so the tables can be joined on them.


I don't know whether the query can be made even simpler than what I am doing right now. What I want to achieve is as follows:

I would like to find the distinct key1, key2, info1, info2 from the main_info table wherever the serial value is less than 10 and key1, key2 match in both tables, and then group them by info1, info2. While grouping, I want to count the repeated key1, key2 pairs for duplicate info1, info2 values and GROUP_CONCAT those keys.


Contents of table main_info

MariaDB [demos]> select * from main_info;
| key1 | key2 | info1 | info2 | date     |
|    1 |    1 |    15 |    90 | 20120501 |
|    1 |    2 |    14 |    92 | 20120601 |
|    1 |    3 |    15 |    82 | 20120801 |
|    1 |    4 |    15 |    82 | 20120801 |
|    1 |    5 |    15 |    82 | 20120802 |
|    2 |    1 |    17 |    90 | 20130302 |
|    2 |    2 |    17 |    90 | 20130302 |
|    2 |    3 |    17 |    90 | 20130302 |
|    2 |    4 |    16 |    88 | 20130601 |
9 rows in set (0.00 sec) 


MariaDB [demos]> select * from product1;
| key1 | key2 | serial | product_data |
|    1 |    1 |      0 | NaN          |
|    1 |    1 |      1 | NaN          |
|    1 |    1 |      2 | NaN          |
|    1 |    1 |      3 | NaN          |
|    1 |    2 |      0 | 12.556       |
|    1 |    2 |      1 | 13.335       |
|    1 |    3 |      1 | NaN          |
|    1 |    3 |      2 | 13.556       |
|    1 |    3 |      3 | 14.556       |
|    1 |    4 |      3 | NaN          |
|    1 |    5 |      3 | NaN          |
|    2 |    1 |      0 | 12.556       |
|    2 |    1 |      1 | 13.553       |
|    2 |    1 |      2 | NaN          |
|    2 |    2 |     12 | 129          |
|    2 |    3 |     22 | NaN          |
16 rows in set (0.00 sec)

Via PHP I group by the fields info1 and info2 of table main_info once for each field of table product1 that I am interested in (in the current context, serial and product_data), running the queries one after another (here I run the query twice, as you can see).

For field serial (1st query):

MariaDB [demos]> select * , count(*) as serial_count,GROUP_CONCAT(key1,' ',key2) as serial_ids from 
    -> (
    -> SELECT distinct 
    -> if(b.serial  < 10,a.key1,null) AS `key1`,
    -> if(b.serial  < 10,a.key2,null) AS `key2`,
    -> if(b.serial  < 10,a.info1,null) AS `info1`, 
    ->         if(b.serial  < 10,a.info2,null) AS `info2`
    -> FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    -> ) as sub group by info1,info2
    -> ;
| key1 | key2 | info1 | info2 | serial_count | serial_ids  |
| NULL | NULL |  NULL |  NULL |            1 | NULL        |
|    1 |    2 |    14 |    92 |            1 | 1 2         |
|    1 |    3 |    15 |    82 |            3 | 1 3,1 4,1 5 |
|    1 |    1 |    15 |    90 |            1 | 1 1         |
|    2 |    1 |    17 |    90 |            1 | 2 1         |
5 rows in set (0.00 sec)

For field product_data (2nd query):

MariaDB [demos]> select * , count(*) as product_data_count,GROUP_CONCAT(key1,' ',key2) as product_data_ids from 
    -> (
    -> SELECT distinct 
    -> if(b.product_data IS NOT NULL,a.key1,null) AS `key1`,
    -> if(b.product_data IS NOT NULL,a.key2,null) AS `key2`,
    -> if(b.product_data IS NOT NULL,a.info1,null) AS `info1`, 
    ->         if(b.product_data IS NOT NULL,a.info2,null) AS `info2`
    -> FROM main_info a inner join product1 b on  a.key1 = b.key1 AND a.key2= b.key2
    -> ) as sub group by info1,info2
    -> ;
| key1 | key2 | info1 | info2 | product_data_count | product_data_ids |
|    1 |    2 |    14 |    92 |                  1 | 1 2              |
|    1 |    3 |    15 |    82 |                  3 | 1 3,1 4,1 5      |
|    1 |    1 |    15 |    90 |                  1 | 1 1              |
|    2 |    2 |    17 |    90 |                  3 | 2 2,2 3,2 1      |
4 rows in set (0.01 sec)

I want to get output like this using a single query, grouped by info1, info2:

| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
| NULL | NULL |  NULL |  NULL |            1 | NULL        |               NULL | NULL             |
|    1 |    2 |    14 |    92 |            1 | 1 2         |                  1 | 1 2              |
|    1 |    3 |    15 |    82 |            3 | 1 3,1 4,1 5 |                  3 | 1 3,1 4,1 5      |
|    1 |    1 |    15 |    90 |            1 | 1 1         |                  1 | 1 1              |
|    2 |    1 |    17 |    90 |            1 | 2 1         |                  3 | 2 2,2 3,2 1      |


CREATE TABLE `main_info` (
  `key1` int(11) NOT NULL,
  `key2` int(11) NOT NULL,
  `info1` int(11) NOT NULL,
  `info2` int(11) NOT NULL,
  `date` int(11) NOT NULL
);

LOCK TABLES `main_info` WRITE;
INSERT INTO `main_info` VALUES (1,1,15,90,20120501),(1,2,14,92,20120601),(1,3,15,82,20120801),(1,4,15,82,20120801),(1,5,15,82,20120802),(2,1,17,90,20130302),(2,2,17,90,20130302),(2,3,17,90,20130302),(2,4,16,88,20130601);
UNLOCK TABLES;

CREATE TABLE `product1` (
  `key1` int(11) NOT NULL,
  `key2` int(11) NOT NULL,
  `serial` int(11) NOT NULL,
  `product_data` varchar(1000) DEFAULT NULL
);

INSERT INTO `product1` VALUES (1,1,0,'NaN'),(1,1,1,'NaN'),(1,1,2,'NaN'),(1,1,3,'NaN'),(1,2,0,'12.556'),(1,2,1,'13.335'),(1,3,1,'NaN'),(1,3,2,'13.556'),(1,3,3,'14.556'),(1,4,3,'NaN'),(1,5,3,'NaN'),(2,1,0,'12.556'),(2,1,1,'13.553'),(2,1,2,'NaN'),(2,2,12,'129'),(2,3,22,'NaN');


Could someone please help me get this result in one query?



SELECT
     key1, key2, info1, info2,
     SUM(Scount) AS serial_count, GROUP_CONCAT(Skey1, ' ', Skey2) AS serial_ids,
     SUM(Pcount) AS product_data_count, GROUP_CONCAT(Pkey1, ' ', Pkey2) AS product_data_ids
FROM (
   -- one row per distinct key pair, flagged per condition
   SELECT DISTINCT
     IF(b.serial < 10 OR b.product_data IS NOT NULL, a.key1, NULL) AS `key1`,
     IF(b.serial < 10 OR b.product_data IS NOT NULL, a.key2, NULL) AS `key2`,
     IF(b.serial < 10 OR b.product_data IS NOT NULL, a.info1, NULL) AS `info1`,
     IF(b.serial < 10 OR b.product_data IS NOT NULL, a.info2, NULL) AS `info2`,
     IF(b.serial < 10, a.key1, NULL) AS `Skey1`,
     IF(b.serial < 10, a.key2, NULL) AS `Skey2`,
     IF(b.product_data IS NOT NULL, a.key1, NULL) AS `Pkey1`,
     IF(b.product_data IS NOT NULL, a.key2, NULL) AS `Pkey2`,
     IF(b.serial < 10, 1, NULL) AS `Scount`,
     IF(b.product_data IS NOT NULL, 1, NULL) AS `Pcount`
   FROM main_info a INNER JOIN product1 b ON a.key1 = b.key1 AND a.key2 = b.key2

   UNION ALL

   -- the all-NULL group for rows that fail exactly one of the two conditions
   SELECT DISTINCT
     NULL AS `key1`,
     NULL AS `key2`,
     NULL AS `info1`,
     NULL AS `info2`,
     NULL AS `Skey1`,
     NULL AS `Skey2`,
     NULL AS `Pkey1`,
     NULL AS `Pkey2`,
     IF(serial > 9, 1, NULL) AS `Scount`,
     IF(product_data IS NULL, 1, NULL) AS `Pcount`
   FROM product1 WHERE serial > 9 XOR product_data IS NULL

) AS sub GROUP BY info1, info2;


| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
| NULL | NULL | NULL  | NULL  | 1            | NULL        | NULL               | NULL             |
| 1    | 2    | 14    | 92    | 1            | 1 2         | 1                  | 1 2              |
| 1    | 3    | 15    | 82    | 3            | 1 3,1 4,1 5 | 3                  | 1 3,1 4,1 5      |
| 1    | 1    | 15    | 90    | 1            | 1 1         | 1                  | 1 1              |


| key1 | key2 | info1 | info2 | serial_count | serial_ids  | product_data_count | product_data_ids |
| NULL | NULL | NULL  | NULL  | 1            | NULL        | 1                  | NULL             |
| 1    | 2    | 14    | 92    | 1            | 1 2         | 1                  | 1 2              |
| 1    | 3    | 15    | 82    | 3            | 1 3,1 4,1 5 | 3                  | 1 3,1 4,1 5      |
| 1    | 1    | 15    | 90    | 1            | 1 1         | 1                  | 1 1              |
| 2    | 4    | 16    | 88    | 1            | 2 4         | 1                  | 2 4              |
| 2    | 1    | 17    | 90    | NULL         | NULL        | 3                  | 2 1,2 2,2 3      |
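
The query above leans on how MySQL/MariaDB aggregates treat NULL: SUM ignores NULL flags, CONCAT-style arguments turn NULL as soon as one argument is NULL, and GROUP_CONCAT skips NULL values. Here is a minimal sketch of that behaviour on its own; the inline rows and the names k1, k2 and flag are made up for illustration and are not part of the question's tables.

```sql
-- Hypothetical inline rows: two flagged key pairs and one all-NULL row.
SELECT
  SUM(flag) AS flag_count,                         -- NULL flags are ignored by SUM
  GROUP_CONCAT(k1, ' ', k2 ORDER BY k1, k2) AS ids -- the all-NULL row concatenates to NULL and is skipped
FROM (
  SELECT 1 AS k1, 3 AS k2, 1 AS flag
  UNION ALL
  SELECT 1, 4, 1
  UNION ALL
  SELECT NULL, NULL, NULL
) AS t;
-- flag_count = 2, ids = '1 3,1 4'
```

This is why a row that fails a condition can stay in the derived table without inflating that condition's count or its id list.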


There are parts of the basic logic behind the question that I can't really understand, so this answer is based mainly on the expected result. For example, if the grouping fields (info1 and info2) are NULL, every other column will always be NULL except serial_count and product_data_count, which can be 1 or NULL. Did you really mean to get that? Note that this answer uses a second subquery with UNION ALL to produce that row.
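
If the all-NULL row turns out not to be needed, a different technique may be enough: conditional aggregation with COUNT(DISTINCT ...) and GROUP_CONCAT(DISTINCT ...) directly over the join, with no derived table. This is only a sketch under that assumption, not the query from the answer above. It also drops the bare key1 and key2 columns, since they are not part of the GROUP BY.

```sql
-- Sketch: one pass over the join, no UNION ALL, no all-NULL group.
SELECT
  a.info1,
  a.info2,
  -- a key pair counts once per group if any of its rows has serial < 10
  COUNT(DISTINCT IF(b.serial < 10, CONCAT(a.key1, ' ', a.key2), NULL)) AS serial_count,
  GROUP_CONCAT(DISTINCT IF(b.serial < 10, CONCAT(a.key1, ' ', a.key2), NULL)) AS serial_ids,
  -- same idea for rows whose product_data is not NULL
  COUNT(DISTINCT IF(b.product_data IS NOT NULL, CONCAT(a.key1, ' ', a.key2), NULL)) AS product_data_count,
  GROUP_CONCAT(DISTINCT IF(b.product_data IS NOT NULL, CONCAT(a.key1, ' ', a.key2), NULL)) AS product_data_ids
FROM main_info a
INNER JOIN product1 b ON a.key1 = b.key1 AND a.key2 = b.key2
GROUP BY a.info1, a.info2;
```

With the question's sample data this should give serial_count 1 and product_data_count 3 for the group info1 = 17, info2 = 90. Note that a group in which no row satisfies a condition gets a count of 0 here rather than NULL.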
