Composite primary key or surrogate primary key for the following scenario?
I know it's been asked many times, but I wanted to explain my scenario and see if there are any benefits to using an identity column as the primary key instead of a composite primary key.
I'm currently reading two text files: File1 has the Make & Model of each car, while File2 has the Make, Model, and Year. Every Make, Model combination in File2 will always be in File1.
So I created table [Car] composed of the columns MakeId (identity), Make and Model. Data for table [Car] looks like this. The data in [Car] is an exact replica of File1:
[MakeId] [Make] [Model]
1 HONDA ACCORD
2 HONDA CIVIC
3 FORD FOCUS
4 FORD ESCORT
For File2, I created table [CarYear] with columns CarYearId (identity), Make, Model, Year. The data in [CarYear] is an exact replica of File2:
[CarYearId] [Make] [Model] [Year]
1 HONDA ACCORD 2002
2 HONDA ACCORD 2001
3 HONDA ACCORD 2004
4 HONDA CIVIC 1998
5 FORD FOCUS 1998
6 FORD ESCORT 2001
7 FORD ESCORT 2002
Is there any reason why I shouldn't use Make, Model as a composite primary key? In my case, since I have Make & Model in both tables, I can search the second table directly instead of having to do inner joins.
Solution

File1 has Make & Model of car . . .
So the data of interest looks like this.
make   model
--
HONDA  ACCORD
HONDA  CIVIC
FORD   FOCUS
FORD   ESCORT
The column "make" is clearly not a candidate key. As far as you can tell from this sample of data, "model" looks like a candidate key. I actually had to research this issue several years ago, and I found only a couple of models that were built by more than one manufacturer, and none of those were current. But that doesn't really matter.
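To make the candidate-key reasoning concrete, here is a minimal sketch in Python that checks which column sets are unique in the File1 sample. The helper name `is_unique` is made up for illustration. Keep in mind that sample data can only rule a key out; it can never prove that a column set is a key for all future data.

```python
# Sample rows copied from File1 in the question.
COLUMNS = ("make", "model")
rows = [
    ("HONDA", "ACCORD"),
    ("HONDA", "CIVIC"),
    ("FORD", "FOCUS"),
    ("FORD", "ESCORT"),
]

def is_unique(rows, columns, key):
    """Return True if no two rows share the same values for the columns in `key`."""
    positions = [columns.index(name) for name in key]
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(projected) == len(set(projected))

make_unique = is_unique(rows, COLUMNS, ("make",))
model_unique = is_unique(rows, COLUMNS, ("model",))
pair_unique = is_unique(rows, COLUMNS, ("make", "model"))

print(make_unique, model_unique, pair_unique)  # False True True
```

Both {model} and {make, model} are unique in this sample, which is why the choice between them has to come from knowledge of the business domain rather than from the data alone.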
Whether the candidate key here is {make, model} or {model}, this table is in 6NF [1]. If we assume that the only candidate key is {make, model}, I might implement it like this in standard SQL.
create table car_models (
make varchar(15) not null,
model varchar(15) not null,
primary key (make, model)
);
File2 has Make, Model, Year of car.
So the data of interest looks like this.
make   model   year
--
HONDA  ACCORD  2002
HONDA  ACCORD  2001
HONDA  ACCORD  2004
HONDA  CIVIC   1998
FORD   FOCUS   1998
FORD   ESCORT  2001
FORD   ESCORT  2002
Following the assumptions about the key in the previous table, this table has only one candidate key, and it has only one additional attribute. It, too, is in 6NF. A SQL version might look like this.
create table car_model_years (
make varchar(15) not null,
model varchar(15) not null,
model_year integer not null
check (model_year between 1886 and 2099),
primary key (make, model, model_year),
foreign key (make, model) references car_models (make, model)
);
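As a rough check of what those declarations buy you, the sketch below loads the two tables into an in-memory SQLite database through Python's `sqlite3` module and tries three bad inserts. SQLite is only an assumption for illustration (the question doesn't depend on it), and the helper `rejected` is a made-up name. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
create table car_models (
  make varchar(15) not null,
  model varchar(15) not null,
  primary key (make, model)
);
create table car_model_years (
  make varchar(15) not null,
  model varchar(15) not null,
  model_year integer not null
    check (model_year between 1886 and 2099),
  primary key (make, model, model_year),
  foreign key (make, model) references car_models (make, model)
);
""")

conn.execute("insert into car_models values ('HONDA', 'ACCORD')")
conn.execute("insert into car_model_years values ('HONDA', 'ACCORD', 2002)")

def rejected(sql):
    """Return True if the statement violates a declared constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# (FORD, ACCORD) is not in car_models, so the foreign key should refuse it.
unknown_model = rejected("insert into car_model_years values ('FORD', 'ACCORD', 2002)")
# 1700 falls outside the range allowed by the check constraint.
bad_year = rejected("insert into car_model_years values ('HONDA', 'ACCORD', 1700)")
# The same (make, model, model_year) twice violates the primary key.
duplicate = rejected("insert into car_model_years values ('HONDA', 'ACCORD', 2002)")

print(unknown_model, bad_year, duplicate)  # True True True
```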
These tables have no redundant data. You can't remove any columns without breaking the semantics or compromising the integrity of the data. Foreign keys are repeated down the rows of "car_model_years", but that's not redundant--that's exactly what foreign keys are for.
Is there any reason why I shouldn't use Make, Model a composite primary key?
As a theoretical (relational) matter, no, there isn't. If you start in 6NF, adding a surrogate ID number denormalizes that table. (6NF requires a single candidate key.) Even if you do add a surrogate ID number, you still have to declare {make, model} as not null unique. Failure to declare that constraint makes a table liable to end up looking like this.
model_id  make   model
--
1         Honda  Accord
2         Honda  Accord
3         Honda  Accord
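Here is a small sketch of that failure mode, again using an in-memory SQLite database from Python as an assumed test bed. The table names `models_id_only` and `models_id_and_key` and the helper `load` are made up for illustration. One table has only the surrogate ID, the other also declares the natural key as unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table models_id_only (
  model_id integer primary key,
  make varchar(15) not null,
  model varchar(15) not null
);
create table models_id_and_key (
  model_id integer primary key,
  make varchar(15) not null,
  model varchar(15) not null,
  unique (make, model)
);
""")

def load(table):
    """Try to insert the same make and model three times; return the rows stored."""
    for _ in range(3):
        try:
            conn.execute(
                f"insert into {table} (make, model) values ('Honda', 'Accord')"
            )
        except sqlite3.IntegrityError:
            pass  # the unique constraint refused a duplicate
    return conn.execute(f"select count(*) from {table}").fetchone()[0]

without_constraint = load("models_id_only")
with_constraint = load("models_id_and_key")

print(without_constraint, with_constraint)  # 3 1
```

The surrogate ID happily numbers three identical rows; only the declared unique constraint keeps the table down to one.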
As a practical matter, not a theoretical (relational) matter, these 6NF tables will probably perform better than denormalizations of them using surrogate ID numbers. For example, queries on "car_model_years" that are based on make and model will generally use an index-only scan--they won't have to read the base table at all.
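You can see something like this for yourself by asking the query planner. The sketch below uses SQLite through Python's `sqlite3` module, which is an assumption made for illustration; other DBMSs report plans differently, and planner choices can vary by version and statistics. The foreign key clause is left out only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table car_model_years (
  make varchar(15) not null,
  model varchar(15) not null,
  model_year integer not null
    check (model_year between 1886 and 2099),
  primary key (make, model, model_year)
);
""")

# Ask SQLite how it would find the model years for one make and model.
plan = conn.execute(
    "explain query plan "
    "select model_year from car_model_years where make = ? and model = ?",
    ("HONDA", "ACCORD"),
).fetchall()

# The human-readable description is the last column of each plan row.
details = [row[3] for row in plan]
uses_covering_index = any("COVERING INDEX" in detail for detail in details)

for detail in details:
    print(detail)
```

SQLite's phrase "COVERING INDEX" in the plan means the primary key index alone answers the query. With surrogate IDs, the same question would need a join from the year table back to the model table to recover make and model.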
As another practical matter, some application frameworks deal poorly with any key besides an id number. IMHO, this justifies using a better framework, though, not compromising the structure of your database.
1. "... a 'regular' relvar is in 6NF if and only if it consists of a single key, plus at most one additional attribute." Date, CJ, Database in Depth: Relational Theory for Practitioners, p 147. A regular relvar is a nontemporal relvar.