SQL join following foreign key:静态检查 LHS 是否保留键
通常您会在两个表的外键之后连接它们,以便始终可以找到 RHS 表中的行.添加连接不会影响受查询影响的行数.例如
Often you join two tables following their foreign key, so that the row in the RHS table will always be found. Adding the join does not affect the number of rows affected by the query. For example
create table a (x int not null primary key)
create table b (x int not null primary key, y int not null)
alter table a add foreign key (x) references b (x)
现在,假设你在这两个表中设置了一些数据,你可以从a中获取一定数量的行:
Now, assuming you set up some data in these two tables, you can get a certain number of rows from a:
select x from a
在外键之后向 b 添加连接不会改变这一点:
Adding a join to b following the foreign key does not change this:
select a.x from a join b on a.x = b.x
但是,一般的连接并非如此,它可能会过滤掉一些行或(通过笛卡尔积)添加更多:
However, that is not true of joins in general, which may filter out some rows or (by Cartesian product) add more:
select a.x from a join b on a.x = b.x and b.y != 42 -- probably gives fewer rows
select a.x from a join b on a.x != b.y -- probably gives more rows
在阅读 SQL 代码时,没有明显的方法来判断 join
是否是保留键的类型,它可能会添加额外的列但不会改变返回的行数,或者它是否具有其他效果.随着时间的推移,我制定了一个我主要坚持的编码约定:
When reading SQL code there is no obvious way to tell whether a join
is the key-preserving kind, which may add extra columns but does not change the number of rows returned, or whether it has other effects. Over time I have developed a coding convention which I mostly stick to:
- 如果是保留键的连接,请使用
join
- 如果要过滤行,请将过滤条件放在
where
子句中 - 如果想要更多的行,有时笛卡尔积的
cross join
是最清晰的方法
- if a key-preserving join, use
join
- if wanting to filter rows, put the filter condition in the
where
clause - if wanting more rows, sometimes
cross join
for Cartesian product is the clearest way
这些通常只是样式问题,因为您通常可以将谓词放入 join
子句或 where
子句中.
These are usually just style issues, since you can often put a predicate into either the join
clause or the where
clause, for example.
我的问题
有没有办法在编译查询时让数据库服务器静态检查这些保留键的连接?我知道查询优化器已经知道外键上的连接总是会在外键指向的表中准确找到一行.但为了人类读者的利益,我想在我的 SQL 代码中标记它.例如,假设新语法 fkjoin
用于外键后的连接.那么下面的SQL片段会不会报错:
Is there some way to have these key-preserving joins statically checked by the database server when the query is compiled? I understand that the query optimizer already knows that a join on a foreign key will always find exactly one row in the table pointed to by the foreign key. But I would like to tag it in my SQL code for the benefit of human readers. For example, suppose the new syntax fkjoin
is used for a join following a foreign key. Then the following SQL fragments will give errors or not:
a fkjoin b on a.x = b.x -- OK
a fkjoin b on a.x = b.x and b.y = 42 -- "Error, join can fail due to extra predicate"
a fkjoin b on a.x = b.y -- "Error, no foreign key from a.x to b.y"
这对我在编写 SQL 时以及在稍后返回阅读它时是一个有用的检查.我理解并接受更改数据库中的外键会更改在此方案下合法的 SQL - 对我来说,这是一个理想的结果,因为如果必要的 FK 不再存在,那么查询的键保留语义是不存在的保证时间更长,我想了解一下.
This would be a useful check for me when writing the SQL, and also when returning to read it later. I understand and accept that changing the foreign keys in the database would change what SQL is legal under this scheme - to me, that is a desired outcome, since if a necessary FK ceases to exist then the key-preserving semantics of the query are no longer guaranteed, and I'd like to find out about it.
可能会有一些外部 SQL 静态检查器工具来完成这项工作,并且可以使用特殊的注释语法而不是新的关键字.检查器工具需要访问数据库模式才能查看存在哪些外键,但它不需要实际执行查询.
Potentially, there could be some external SQL static checker tool that does the work, and special comment syntax could be used rather than a new keyword. The checker tool would need access to the database schema to see what foreign keys exist, but it would not need to actually execute the query.
有什么东西可以满足我的需求吗?我正在使用 MSSQL 2008 R2.(微软 SQL Server 为迂腐)
Is there something that does what I want? I am using MSSQL 2008 R2. (Microsoft SQL Server for the pedantic)
推荐答案
我意识到您有兴趣指出特定列上的特定连接是否在 FK 上,或者是限制,或者可能是其他情况,或以上都不是.(并且不清楚您所说的连接的成功"或失败"或它的相关性是什么意思.)而关注该信息,如下所述,就是错过关注更重要和基本的事情.
I realize that you are interested in indicating whether a particular join on particular columns is on a FK, or is a restriction, or perhaps is of some other case, or none of the preceding. (And it's not clear what you mean by "success" or "failure" of a join, or its relevance.) Whereas focusing on that information, as explained below, is to miss focusing on more important and fundamental things.
基表具有含义"或谓词(表达式)",它是 DBA 给出的填空(命名)语句.语句的空白名称是表的列.填空以提出关于世界的真实命题的行放在表中.填补空白以提出关于世界的虚假命题的行被排除在外.即一个表包含满足其语句的行.在不知道它的语句、观察世界并将适当的行放入表中的情况下,您无法将基表设置为某个值.你无法从基表中了解世界,除非知道它的陈述并将当前行命题视为真而将缺席行命题视为假.即你需要它的语句来使用数据库.
A base table has a "meaning" or "predicate (expression)" that is a fill-in-the-(named-)blanks statement given by the DBA. The names of the blanks of the statement are the columns of the table. Rows that fill in the blanks to make a true proposition about the world go in the table. Rows that fill in the blanks to make a false proposition about the world are left out. Ie a table holds the rows that satisfy its statement. You cannot set a base table to a certain value without knowing its statement, observing the world and putting the appropriate rows into the table. You cannot know about the world from base tables except by knowing its statement and taking present-row propositions to be true and absent-row propositions to be false. Ie you need its statement to use the database.
请注意,表声明的典型语法看起来像是其语句的简写:
Notice that the typical syntax for a table declaration looks like a shorthand for its statement:
-- employee [eid] is named [name] and lives at [address] in ...
EMPLOYEE(eid,name,address,...)
您可以通过在其他语句之间/周围放置逻辑运算符 AND、OR、AND NOT、EXISTS name、AND condition 等来生成更大的语句.如果您通过转换将语句转换为关系/SQL 表达式
You can make bigger statements by putting logic operators AND, OR, AND NOT, EXISTS name, AND condition, etc between/around other statements. If you translate a statement to a relation/SQL expression by converting
- 一个表对其名称的声明
- AND 到
JOIN
- OR 到
UNION
- 而不是
EXCEPT
/MINUS
- 存在 C,... [...] 到
SELECT
除 C,.. 之外的所有列.
FROM
...
- AND 条件到
ON/WHERE
条件
- 暗示
SUBSETOF
- IFF 到
=
- a table's statement to its name
- AND to
JOIN
- OR to
UNION
- AND NOT to
EXCEPT
/MINUS
- EXISTS C,... [...] to
SELECT
all columns but C,...
FROM
...
- AND condition to
ON/WHERE
condition
- IMPLIES to
SUBSETOF
- IFF to
=
然后您会得到一个关系表达式,该表达式计算使语句为真的行.(UNION
和 EXCEPT
/MINUS
的参数需要相同的列.)因此,正如每个表都包含满足其语句的行一样,查询表达式包含满足其语句的行.您无法从查询结果中了解世界,除非知道它的陈述并将其当前行命题为真,而将缺席行命题为假.即你需要它的语句来组成或解释一个查询.(请注意,无论有什么约束条件都是如此.)
then you get a relation expression that calculates the rows that make the statement true. (Arguments of UNION
& EXCEPT
/MINUS
need the same columns.) So just as every table holds the rows satisfying its statement, a query expression holds the rows that satisfy its statement. You cannot know about the world from a query result except by knowing its statement and taking its present-row propositions to be true and absent-row propositions to be false. Ie you need its statement to compose or interpret a query. (Observe that this is true regardless of what constraints hold.)
这是关系模型的基础:表表达式计算满足相应语句的行.(SQL有所不同,从字面上看是不合逻辑的.)
This is the foundation of the relational model: table expressions calculate rows satisfying corresponding statements. (To the extent that SQL differs, it is literally illogical.)
例如:如果表 T
包含使语句 T(...,T.Ci,...) 为真的行,而表 U
包含使语句 T(...,T.Ci,...) 为真的行使语句 U(...,U.Cj,...) 为真,然后表 T JOIN U
包含使语句 T(...,T.Ci,...) 和U(...,U.Cj,...) 为真.这就是 JOIN
的语义,它对使用数据库很重要.你总是可以加入,并且一个联接总是有一个意义,它总是它的操作数意义的与.任何表是否碰巧对其他表有 FK 对推理更新或查询并不是特别有帮助.(当您犯错误时,DBMS 会使用约束.)
Eg: If table T
holds the rows that make statement T(...,T.Ci,...) true and table U
holds the rows that make statement U(...,U.Cj,...) true then table T JOIN U
holds the rows that make statement T(...,T.Ci,...) AND U(...,U.Cj,...) true. That is the semantics of JOIN
that is important to using a database. You can always join, and a join always has a meaning, and it is always the AND of its operands' meanings. Whether any tables happen to have FKs to others just isn't particularly helpful for reasoning about updates or queries. (The DBMS uses constraints for when you make mistakes.)
约束表达式只对应于一个命题,也就是关于世界的永远正确的陈述,同时也对应于一个关于基表的命题.例如对于C
UNIQUE NOT NULL
in U
,下面三个表达式是等价的:
A constraint expression just corresponds to a proposition aka always-true statement about the world and simultaneusly to one about base tables. Eg for C
UNIQUE NOT NULL
in U
, the following three expressions are equivalent to each other:
外键 T (C) REFERENCES U (C)
-
EXISTS
C 以外的列
T(...,C,...)
IMPLIES EXISTS
C 以外的列
U(...,C,...)
(SELECT C FROM T) SUBSETOF (SELECT C FROM U)
这确实意味着 SELECT C FROM T JOIN U ON TC = UC
= SELECT C FROM U
,即 FK 上的连接返回相同的数字的行.但那又怎样?连接的含义仍然是其参数的相同功能.
It is true that this implies that SELECT C FROM T JOIN U ON T.C = U.C
= SELECT C FROM U
, ie a join on a FK returns the same number of rows. But so what? The join's meaning is still the same function of its arguments'.
特定列集上的特定联接是否涉及外键与理解查询的含义无关.
Whether a particular join on a particular column set involves a foreign key is just not germane to understanding the meaning of a query.
相关文章