SQL:加入表后 SUM() 函数返回错误值
我目前正在处理一个数据库项目,并且在连接表时遇到了一些问题.初始情况是:
I am currently working on a database project and have some issues with joining tables. The initial situation is:
四个表:
- 任务 t ~ 50000 条记录
- 项目 p ~ 1000 条记录
- workson w ~ 30000 条记录
- employees e ~ 10000 条记录
表 w 有一个类似于WORKLOAD"的属性,不幸的是 SUM(w.WORKLOAD) 的结果不是预期的:
Table w has an attribute called something like "WORKLOAD", unfortunately the result of SUM(w.WORKLOAD) is not the one expected:
SELECT
p.NAME,
SUM(w.WORKLOAD) AS "Total Workload",
COUNT(DISTINCT w.ESSN) AS "Total Employees",
COUNT(DISTINCT t.NAME) AS "Finished Tasks" --t.NAME is unique
from p
JOIN w ON(p.PNUMBER = w.PNO)
JOIN t ON(p.PNUMBER = t.PNO)
WHERE t.END_DATE is NOT NULL
GROUP BY p.PNUMBER, p.NAME
加入这些表后,SUM() 函数返回一个太大的值.我猜这是因为 SUM() 函数多次计算每个 w.WORKLOAD 值.
After joining these tables, the SUM() function returns a far too big value. I guess that's because the SUM() function counts each w.WORKLOAD value multiple times.
那么有没有像内连接这样的连接操作可以在不使用子查询的情况下解决这个问题?
提前致谢:-)
So is there any join operation like inner join that can fix the issue without using subqueries?
Thanks in Advance :-)
推荐答案
问题是笛卡尔积(其中一个表中的行与其他表中的行相乘).以下方法所做的假设是,每个项目都有一个工作负载,其中分配了员工(所有这些都占所有员工,因为您的查询未显示与员工表的连接)和任务.如果不是这种情况,则考虑使用外连接与内连接.
At issue is a Cartesian product (where rows in one table are being multiplied by the rows in the other tables). The assumption the following approach is making is that every project has a workload with employees assigned (all of which account for all employees since your query doesn't show the join to the employee table) and tasks. If this isn't the case, then consider doing outer joins versus the inner join.
这个想法是根据项目编号在其自己的派生表中执行每个聚合.然后我们可以通过项目编号连接每个派生表以获得有意义的结果.
The idea is to perform each aggregation in its own derived table based on project number. We can then join each derived table by project number to obtain meaningful results.
SELECT
p.NAME,
w.workload_sum AS "Total Workload",
e.employee_count AS "Total Employees",
t.task_count AS "Finished Tasks"
from p
JOIN (select pno, sum(workload) as workload_sum
from w
group by pno) w ON (w.pno=p.pnumber)
JOIN (select pno, count(distinct w.essn) as employee_count
from w
group by pno) e ON (e.pno=p.pnumber)
JOIN (select pno, count(distinct t.name) as task_count
from t
group by pno) t ON (t.pno=p.pnumber)
WHERE t.END_DATE is NOT NULL;
相关文章