我可以用纯 mysql 解决这个问题吗?(加入一列中的 ';' 分隔值)

2021-11-20 00:00:00 join mysql

长话短说:我需要将多个表格中的数据汇总在一起,为了不必绘制大表格,我简化了它们.

To make a long story short: I have data in several tables that I need to grab together, I have simplified them for the purpose of not having to draw a huge table.

我需要在一个查询中执行此操作,但我无法使用 PHP 或任何其他语言来处理结果.(如果我可以简单地解决这个问题,我会使用 PHP)

I need to do so in one query, and I can't use PHP or any other langauge to work with the results. (I would have used PHP if I could to simply work my way around this)

如果我有一个将 t1 行连接到 t2 的链接表,这不会是一个问题,但不幸的是我没有也不能引入一个.

This wouldn't have been an issue if I were to have a link table that connects the t1 rows to t2 but unfortunately I don't and can't introduce one either.

User table: (alias t1)
user(varchar 150),resources(varchar 250)
+-------+-------+
| user1 | 1;2;4 |
+-------+-------+
| user2 | 2     |
+-------+-------+
| user3 | 3;4   |
+-------+-------+

Resources table: (alias t2)
id(int 11 AI), data(text)
+---+-------+
| 1 | data1 |
+---+-------+
| 2 | data2 |
+---+-------+
| 3 | data3 |
+---+-------+
| 4 | data4 |
+---+-------+
| 5 | data5 |
+---+-------+

多个用户可以连接到同一个资源,用户可以访问一个或多个资源.

Multiple users can be connected to the same resources, and users can access one or more resources.

我希望结果接近:

user,data
+-------+-------+
| user1 | data1 |
+-------+-------+
| user1 | data2 |
+-------+-------+
| user1 | data4 |
+-------+-------+
| user2 | data2 |
+-------+-------+

....等等.

我有基本的 mysql 知识,但这超出了我的知识范围.有什么办法可以内部加入 t2 吗?

I have basic mysql knowledge but this one is out of my knowledge scope. Is there any way I can inner join t2 ?

在发这篇文章之前我读过的主题:如何加入两个在连接字段中使用逗号分隔列表的表

Threads I've read before making this post: How to join two tables using a comma-separated-list in the join field

mysql用逗号分隔的id连接两个表

推荐答案

如果 user_resources (t1) 是一个规范化表",每个 user =>资源 组合,那么获得答案的查询就像将表joining 一样简单.

If the user_resources (t1) was a 'normalized table' with one row for each user => resource combination then the query to get the answer would be as simple as just joining the tables together.

唉,它是非规范化,将resources列设为:资源ID列表",以;"分隔字符.

Alas, it is denormalized by having the resources column as a: 'list of resource id' separated by a ';' character.

如果我们可以将资源"列转换为行,那么随着表连接变得简单,很多困难就会消失.

If we could convert the 'resources' column into rows then a lot of the difficulties go away as the table joins become simple.

生成要求的输出的查询:

SELECT user_resource.user, 
       resource.data

FROM user_resource 
     JOIN integerseries AS isequence 
       ON isequence.id <= COUNT_IN_SET(user_resource.resources, ';') /* normalize */

     JOIN resource 
       ON resource.id = VALUE_IN_SET(user_resource.resources, ';', isequence.id)      
ORDER BY
       user_resource.user,  resource.data

输出:

user        data    
----------  --------
sampleuser  abcde   
sampleuser  azerty  
sampleuser  qwerty  
stacky      qwerty  
testuser    abcde   
testuser    azerty  

如何:

技巧"是有一个包含从 1 到某个限制的数字的表格.我称之为integerseries.它可用于转换水平"的东西,例如:';'分隔的字符串rows.

The 'trick' is to have a table that contains the numbers from 1 to some limit. I call it integerseries. It can be used to convert 'horizontal' things such as: ';' delimited strings into rows.

这种工作方式是,当您使用 integerseries 'join' 时,您正在执行 cross join,这就是使用 'inner joins' '自然地' 发生的事情.

The way this works is that when you 'join' with integerseries, you are doing a cross join, which is what happens 'naturally' with 'inner joins'.

每一行都使用来自 integerseries 表的不同序列号"进行复制,我们将其用作列表中要用于该 .

Each row gets duplicated with a different 'sequence number' from the integerseries table which we use as an 'index' of the 'resource' in the list that we want to use for that row.

我们的想法是:

  • 计算列表中的项目数.
  • 根据其在列表中的位置提取每个项目.
  • 使用 integerseries 将一行转换为一组行,从 user.resources 中提取单独的资源 ID".
  • count the number of items in the list.
  • extract each item based on its the position in the list.
  • Use integerseries to convert one row into a set of rows extracting the individual 'resource id' from user.resources as we go along.

我决定使用两个函数:

  • 给定分隔字符串列表"和索引"的函数将返回列表中该位置的值.我称之为:VALUE_IN_SET.即给定 'A;B;C' 和 'index' 为 2 然后它返回 'B'.

  • function that given a 'delimited string list' and an 'index' will return the value at the position in the list. I call it: VALUE_IN_SET. i.e. given 'A;B;C' and an 'index' of 2 then it returns 'B'.

给定分隔字符串列表"的函数将返回列表中项目数的计数.我称之为:COUNT_IN_SET.即给定 'A;B;C' 将返回 3

function that given a 'delimited string list' will return the count of the number of items in the list. I call it: COUNT_IN_SET. i.e. given 'A;B;C' will return 3

事实证明,这两个函数和 integerseries 应该为列中的分隔项列表提供通用解决方案.

It turns aout that those two functions and integerseries should provide a general solution to delimited items list in a column.

有用吗?

从<代码>'创建'规范化'表的查询;'列中的分隔字符串.它显示了所有列,包括由于cross_join"(isequence.id as resources_index)而生成的值:

The query to create a 'normalized' table from a ';' delimited string in column. It shows all the columns, including the generated values due to the 'cross_join' (isequence.id as resources_index):

SELECT user_resource.user, 
       user_resource.resources,
       COUNT_IN_SET(user_resource.resources, ';')                AS resources_count, 
       isequence.id                                              AS resources_index,
       VALUE_IN_SET(user_resource.resources, ';', isequence.id)  AS resources_value
FROM 
     user_resource 
     JOIN  integerseries AS isequence 
       ON  isequence.id <= COUNT_IN_SET(user_resource.resources, ';')
ORDER BY
       user_resource.user, isequence.id

标准化"表输出:

user        resources  resources_count  resources_index  resources_value  
----------  ---------  ---------------  ---------------  -----------------
sampleuser  1;2;3                    3                1  1                
sampleuser  1;2;3                    3                2  2                
sampleuser  1;2;3                    3                3  3                
stacky      2                        1                1  2                
testuser    1;3                      2                1  1                
testuser    1;3                      2                2  3                

使用上面的标准化"user_resources 表,这是一个简单的连接来提供所需的输出:

Using the above 'normalized' user_resources table, it is a simple join to provide the output required:

需要的功能(这些是通用功能,可以在任何地方使用)

注意:这些函数的名称与mysql有关 FIND_IN_SET 函数.即他们在字符串列表方面做类似的事情?

note: The names of these functions are related to the mysql FIND_IN_SET function. i.e. they do similar things as regards string lists?

COUNT_IN_SET 函数:返回列中字符分隔项的计数.

DELIMITER $$

DROP FUNCTION IF EXISTS `COUNT_IN_SET`$$

CREATE FUNCTION `COUNT_IN_SET`(haystack VARCHAR(1024), 
                               delim CHAR(1)
                               ) RETURNS INTEGER
BEGIN
      RETURN CHAR_LENGTH(haystack) - CHAR_LENGTH( REPLACE(haystack, delim, '')) + 1;
END$$

DELIMITER ;

VALUE_IN_SET 函数:将分隔列表视为一个one based array并返回值在给定索引".

The VALUE_IN_SET function: treats the delimited list as a one based array and returns the value at the given 'index'.

DELIMITER $$

DROP FUNCTION IF EXISTS `VALUE_IN_SET`$$

CREATE FUNCTION `VALUE_IN_SET`(haystack VARCHAR(1024), 
                               delim CHAR(1), 
                               which INTEGER
                               ) RETURNS VARCHAR(255) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
      RETURN  SUBSTRING_INDEX(SUBSTRING_INDEX(haystack, delim, which),
                     delim,
                     -1);
END$$

DELIMITER ;

相关信息:

  • 终于找到了如何编译SQLFiddle - 工作代码功能.

有一个版本适用于 SQLite 数据库以及 SQLite- 规范化连接的字段并加入它?

There is a version of this that works for SQLite databases as well SQLite- Normalizing a concatenated field and joining with it?

表格(包含数据):

CREATE TABLE `integerseries` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=500 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `integerseries` */

insert  into `integerseries`(`id`) values (1);
insert  into `integerseries`(`id`) values (2);
insert  into `integerseries`(`id`) values (3);
insert  into `integerseries`(`id`) values (4);
insert  into `integerseries`(`id`) values (5);
insert  into `integerseries`(`id`) values (6);
insert  into `integerseries`(`id`) values (7);
insert  into `integerseries`(`id`) values (8);
insert  into `integerseries`(`id`) values (9);
insert  into `integerseries`(`id`) values (10);

资源:

CREATE TABLE `resource` (
  `id` int(11) NOT NULL,
  `data` varchar(250) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `resource` */

insert  into `resource`(`id`,`data`) values (1,'abcde');
insert  into `resource`(`id`,`data`) values (2,'qwerty');
insert  into `resource`(`id`,`data`) values (3,'azerty');

用户资源:

CREATE TABLE `user_resource` (
  `user` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  `resources` varchar(250) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`user`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `user_resource` */

insert  into `user_resource`(`user`,`resources`) values ('sampleuser','1;2;3');
insert  into `user_resource`(`user`,`resources`) values ('stacky','3');
insert  into `user_resource`(`user`,`resources`) values ('testuser','1;3');

相关文章