MySQL: join tables based on MAX(date) in the main table and MAX(id) in the joined table, with a LIMIT

2022-04-10 00:00:00 join limit mysql

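In case the title makes no sense, here is what I need to do in a nutshell. I need to select the X most recent records from the main table by date, and then join the data that belongs to those records by picking the most recent row, by id, from the joined table.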

Here is some sample data and output.

Table: LEAD_UNIQUE (this table contains only unique SSNs)

+-----------+--------------+
| ssn       | created_date |
+-----------+--------------+
| 111111111 | 2015-03-01   |
| 999999999 | 2015-03-03   |
| 555555555 | 2015-02-08   |
+-----------+--------------+

Table: Lead_Data

+----+-----------+-------+----------------+-------------+-------+-------+
| id | ssn       | name  | address        | city        | state | zip   |
+----+-----------+-------+----------------+-------------+-------+-------+
|  1 | 111111111 | Bob1  | 1234 Test Ln   | Mound       | CA    | 55555 |
|  2 | 111111111 | Bob2  | 1234 Test Ln   | Mound       | CA    | 55555 |
|  3 | 999999999 | Jane1 | 5432 Lola Blvd | Patton      | NJ    | 33333 |
|  4 | 999999999 | Jane2 | 5432 Lola Blvd | Patton      | NJ    | 33333 |
|  5 | 555555555 | Jack1 | 832 92nd Ave N | Bright View | AL    | 88888 |
|  6 | 999999999 | Jane3 | 5432 Lola Blvd | Patton      | NJ    | 33333 |
+----+-----------+-------+----------------+-------------+-------+-------+

Desired output (the date column can be sorted asc or desc, it doesn't matter)

+--------------+-----------+-------+
| created_date | ssn       | name  |
+--------------+-----------+-------+
| 2015-03-03   | 999999999 | Jane3 |
| 2015-03-01   | 111111111 | Bob2  |
| 2015-02-08   | 555555555 | Jack1 |
+--------------+-----------+-------+

Desired output (LIMIT 2)

+--------------+-----------+-------+
| created_date | ssn       | name  |
+--------------+-----------+-------+
| 2015-03-03   | 999999999 | Jane3 |
| 2015-03-01   | 111111111 | Bob2  |
+--------------+-----------+-------+

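The query might look something like the following, but I may well be on the wrong track; I haven't had any luck so far, which is why I'm asking for help here.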

SELECT   
    lead_unique.created_date, 
    lead_unique.ssn,
    lead_data.name
FROM      
    lead_unique
JOIN      
    (
        SELECT    
            ...
        FROM      
            lead_data
        ...
    ) lead_data 
        ...
...
LIMIT 2

I've only used Stack Overflow once before, so if there's anything else I can add, please let me know! Thanks!!


Solution

I tend to use correlated subqueries for individual pieces of data, and your question only mentions one column:

select u.created_date, u.ssn,
       (select d.name
        from lead_data d
        where d.ssn = u.ssn
        order by d.id desc
        limit 1
       ) as name
from lead_unique u
order by u.created_date desc
limit 2;
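
To try the query above against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. That substitution is an assumption made only so the example runs without a server; the helper names build_sample_db and latest_names are hypothetical, and the LIMIT is passed as a bound parameter.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE lead_unique (ssn TEXT PRIMARY KEY, created_date TEXT);
        CREATE TABLE lead_data (
            id INTEGER PRIMARY KEY, ssn TEXT, name TEXT,
            address TEXT, city TEXT, state TEXT, zip TEXT
        );
        INSERT INTO lead_unique VALUES
            ('111111111', '2015-03-01'),
            ('999999999', '2015-03-03'),
            ('555555555', '2015-02-08');
        INSERT INTO lead_data VALUES
            (1, '111111111', 'Bob1',  '1234 Test Ln',   'Mound',       'CA', '55555'),
            (2, '111111111', 'Bob2',  '1234 Test Ln',   'Mound',       'CA', '55555'),
            (3, '999999999', 'Jane1', '5432 Lola Blvd', 'Patton',      'NJ', '33333'),
            (4, '999999999', 'Jane2', '5432 Lola Blvd', 'Patton',      'NJ', '33333'),
            (5, '555555555', 'Jack1', '832 92nd Ave N', 'Bright View', 'AL', '88888'),
            (6, '999999999', 'Jane3', '5432 Lola Blvd', 'Patton',      'NJ', '33333');
    """)
    return conn


# Same shape as the query above; only the literal limit is replaced by "?".
LATEST_NAME_SQL = """
    select u.created_date, u.ssn,
           (select d.name
            from lead_data d
            where d.ssn = u.ssn
            order by d.id desc
            limit 1
           ) as name
    from lead_unique u
    order by u.created_date desc
    limit ?
"""


def latest_names(conn, limit):
    """Return (created_date, ssn, name) for the `limit` most recent leads."""
    return conn.execute(LATEST_NAME_SQL, (limit,)).fetchall()


if __name__ == "__main__":
    for row in latest_names(build_sample_db(), 2):
        print(row)
```

With a limit of 2 this should return the Jane3 and Bob2 rows from the "Desired output (LIMIT 2)" table; ISO-formatted date strings sort correctly as text, which is why TEXT columns are enough for this sketch.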

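Actually, for performance reasons, I would put the lead_unique part into a subquery, so that the limit is applied to lead_unique first: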

select u.created_date, u.ssn,
       (select d.name
        from lead_data d
        where d.ssn = u.ssn
        order by d.id desc
        limit 1
       ) as name
from (select u.*
      from lead_unique u
      order by u.created_date desc
      limit 2
     ) u;

EDIT:

Even with multiple columns, the best-performing approach is probably still to use subqueries:

-- ssn exists in both u and d, so an unqualified ssn is ambiguous; d.* already includes it
select u.created_date, d.*
from (select u.created_date, u.ssn,
             (select d.id
              from lead_data d
              where d.ssn = u.ssn
              order by d.id desc
              limit 1
             ) as id
      from (select u.*
            from lead_unique u
            order by u.created_date desc
            limit 2
           ) u
     ) u join
     lead_data d
     on u.id = d.id;
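
The multi-column variant can be checked the same way. The sketch below again assumes SQLite (via Python's sqlite3) as a stand-in for MySQL. It uses distinct, hypothetical aliases (recent, picked, latest_id) instead of reusing u and id, and adds an outer ORDER BY so the row order is deterministic; the structure otherwise follows the query above: limit lead_unique first, look up the highest id per ssn, then join back to lead_data for the full row.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE lead_unique (ssn TEXT PRIMARY KEY, created_date TEXT);
        CREATE TABLE lead_data (
            id INTEGER PRIMARY KEY, ssn TEXT, name TEXT,
            address TEXT, city TEXT, state TEXT, zip TEXT
        );
        INSERT INTO lead_unique VALUES
            ('111111111', '2015-03-01'),
            ('999999999', '2015-03-03'),
            ('555555555', '2015-02-08');
        INSERT INTO lead_data VALUES
            (1, '111111111', 'Bob1',  '1234 Test Ln',   'Mound',       'CA', '55555'),
            (2, '111111111', 'Bob2',  '1234 Test Ln',   'Mound',       'CA', '55555'),
            (3, '999999999', 'Jane1', '5432 Lola Blvd', 'Patton',      'NJ', '33333'),
            (4, '999999999', 'Jane2', '5432 Lola Blvd', 'Patton',      'NJ', '33333'),
            (5, '555555555', 'Jack1', '832 92nd Ave N', 'Bright View', 'AL', '88888'),
            (6, '999999999', 'Jane3', '5432 Lola Blvd', 'Patton',      'NJ', '33333');
    """)
    return conn


LATEST_ROW_SQL = """
    select picked.created_date, d.*
    from (select recent.created_date, recent.ssn,
                 (select d2.id
                  from lead_data d2
                  where d2.ssn = recent.ssn
                  order by d2.id desc
                  limit 1
                 ) as latest_id
          from (select lu.*
                from lead_unique lu
                order by lu.created_date desc
                limit ?
               ) recent
         ) picked
    join lead_data d on picked.latest_id = d.id
    order by picked.created_date desc
"""


def latest_rows(conn, limit):
    """Return created_date plus every lead_data column of the newest row per lead."""
    return conn.execute(LATEST_ROW_SQL, (limit,)).fetchall()


if __name__ == "__main__":
    for row in latest_rows(build_sample_db(), 2):
        print(row)
```

Selecting d.* rather than listing ssn separately avoids the ambiguous-column error that an unqualified ssn would raise, since both sides of the join carry that column.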

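By the way, if performance is an issue, you want the following indexes: lead_unique(created_date) and lead_data(ssn, id). The correlated subquery filters on ssn and sorts by id, so the composite index serves it directly; an index on lead_data(id) alone would not help that lookup. With these two indexes, the query should be quite fast.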
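
As a rough illustration of the indexing advice, the sketch below creates one index on lead_unique(created_date) and one composite index on lead_data(ssn, id), then asks the engine for its query plan. It assumes SQLite (via Python's sqlite3) rather than MySQL, trims lead_data to three columns, and uses hypothetical index and helper names. The composite (ssn, id) index is chosen here because the correlated subquery filters on ssn and orders by id. Plan text differs between engines and versions, so the sketch only collects it and does not rely on its wording.

```python
import sqlite3


def build_indexed_db():
    """Sample tables (lead_data trimmed to three columns) plus the two indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE lead_unique (ssn TEXT, created_date TEXT);
        CREATE TABLE lead_data (id INTEGER PRIMARY KEY, ssn TEXT, name TEXT);
        INSERT INTO lead_unique VALUES
            ('111111111', '2015-03-01'),
            ('999999999', '2015-03-03'),
            ('555555555', '2015-02-08');
        INSERT INTO lead_data VALUES
            (1, '111111111', 'Bob1'),
            (2, '111111111', 'Bob2'),
            (3, '999999999', 'Jane1'),
            (4, '999999999', 'Jane2'),
            (5, '555555555', 'Jack1'),
            (6, '999999999', 'Jane3');
        -- Index names are hypothetical; the same CREATE INDEX syntax works in MySQL.
        CREATE INDEX idx_lead_unique_created_date ON lead_unique (created_date);
        CREATE INDEX idx_lead_data_ssn_id ON lead_data (ssn, id);
    """)
    return conn


QUERY = """
    select u.created_date, u.ssn,
           (select d.name
            from lead_data d
            where d.ssn = u.ssn
            order by d.id desc
            limit 1
           ) as name
    from lead_unique u
    order by u.created_date desc
    limit ?
"""


def run_with_plan(conn, limit):
    """Return the query result and the engine's plan description lines."""
    rows = conn.execute(QUERY, (limit,)).fetchall()
    # The fourth column of EXPLAIN QUERY PLAN output is the readable detail text.
    plan = [r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (limit,))]
    return rows, plan


if __name__ == "__main__":
    rows, plan = run_with_plan(build_indexed_db(), 2)
    print(rows)
    for line in plan:
        print(line)
```

On MySQL, running EXPLAIN on the original query would be the analogous check for whether the indexes are actually picked up.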
