如何在一个表中存储多个选项?

2022-01-30 00:00:00 mysql database-design

我想设计一个用于结果计算的应用程序.

I want to design an application for result computation.

首先,我需要知道如何在 MySQL 数据库中存储记录,以便学生可以拥有尽可能多的课程,例如学生 A 可以附加 6 个科目,而学生 B 可以附加 12 个科目.

First, I need to know how to store record in a MySQL database in such a way that students can have as many courses attached to them, e.g. student A can have 6 subjects attached to him, while student B can have 12 subject attached to him.

在这种情况下,我需要知道如何设计一个数据库结构,以允许字段以 数组 的形式存储尽可能多的主题.

In this case, I need to know how I could possibly design a database structure that will allow a field to store as many subject as possible in form of an array.

我们将不胜感激任何建议或更好的处理方法.

Any suggestion or a better way to handle this will highly be appreciated.

推荐答案

请阅读 数据规范化、一般索引 概念和 外键 约束以保持数据干净和参照完整性.这会让你继续前进.

Please read up on Data Normalization, General Indexing concepts, and Foreign Key constraints to keep data clean with referential integrity. This will get you going.

在纸面上将数据存储在数组中对您来说似乎很自然,但对于 db 引擎而言,其性能主要是不使用索引.此外,您会在第 2 天发现,获取和维护您的数据将是一场噩梦.

Storing data in arrays may seem natural to you on paper, but to the db engine the performance with mostly be without index use. Moreover, you will find on Day 2 that getting to and maintaining your data will be a nightmare.

以下内容应该能让您在修补时有一个良好的开端.加入.

The following should get you going with a good start as you tinker. Joins too.

create table student
(   studentId int auto_increment primary key,
    fullName varchar(100) not null
    -- etc
);

create table dept
(   deptId int auto_increment primary key,
    deptName varchar(100) not null -- Economics
    -- etc
);

create table course
(   courseId int auto_increment primary key,
    deptId int not null,
    courseName varchar(100) not null,
    -- etc
    CONSTRAINT fk_crs_dept FOREIGN KEY (deptId) REFERENCES dept(deptId)
);

create table SCJunction
(   -- Student/Course Junction table (a.k.a Student is taking the course)
    -- also holds the attendance and grade
    id int auto_increment primary key,
    studentId int not null,
    courseId int not null,
    term int not null, -- term (I am using 100 in below examples for this term)
    attendance int not null, -- whatever you want, 100=always there, 0=he must have been partying,
    grade int not null, -- just an idea   
    -- See (Note Composite Index) at bottom concerning next two lines.
    unique key(studentId,courseId,term), -- no duplicates allowed for the combo (note student can re-take it next term)
    key (courseId,studentId),
    CONSTRAINT fk_sc_student FOREIGN KEY (studentId) REFERENCES student(studentId),
    CONSTRAINT fk_sc_courses FOREIGN KEY (courseId) REFERENCES course(courseId)
);

创建测试数据

insert student(fullName) values ('Henry Carthage'),('Kim Billings'),('Shy Guy'); -- id's 1,2,3
insert student(fullName) values ('Shy Guy');

insert dept(deptName) values ('History'),('Math'),('English'); -- id's 1,2,3

insert course(deptId,courseName) values (1,'Early Roman Empire'),(1,'Italian Nation States'); -- id's 1 and 2 (History dept)
insert course(deptId,courseName) values (2,'Calculus 1'),(2,'Linear Algebra A'); -- id's 3 and 4 (Math dept)
insert course(deptId,courseName) values (3,'World of Chaucer'); -- id 5 (English dept)

-- show why FK constraints are important based on data at the moment
insert course(deptId,courseName) values (66,'Fly Fishing 101'); -- will generate error 1452. That dept 66 does not exist
-- That error is a good error to have. Better than faulty data

-- Have Kim (studentId=2) enrolled in a few courses
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,1,100,-1,-1); -- Early Roman Empire, term 100 (made up), unknown attendance/grade
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,4,100,-1,-1); -- Linear Algebra A
insert SCJunction(studentId,courseId,term,attendance,grade) values (2,5,100,-1,-1); -- World of Chaucer

-- Have Shy Guy (studentId=3) enrolled in one course only. He is shy
insert SCJunction(studentId,courseId,term,attendance,grade) values (3,5,100,-1,-1); -- Early Roman Empire, term 100 (made up), unknow attendance/grade
-- note if you run that line again, the Error 1062 Duplicate entry happens. Can't take same course more than once per term

一些简单的问题.

全部显示,使用表格别名(缩写)来减少打字次数,提高可读性(有时)

show all, uses table aliases (abbreviations) to make typing less, readability (sometimes) better

select c.courseId,c.courseName,d.deptId,d.deptName
from course c
join dept d
on c.deptId=d.deptId
order by d.deptName,c.courseName -- note the order
+----------+-----------------------+--------+----------+
| courseId | courseName            | deptId | deptName |
+----------+-----------------------+--------+----------+
|        5 | World of Chaucer      |      3 | English  |
|        1 | Early Roman Empire    |      1 | History  |
|        2 | Italian Nation States |      1 | History  |
|        3 | Calculus 1            |      2 | Math     |
|        4 | Linear Algebra A      |      2 | Math     |
+----------+-----------------------+--------+----------+

谁在这个学期参加乔叟世界课程?

(知道 courseId=5)

Who is taking the World of Chaucer course this term?

(knowing the courseId=5)

以下内容受益于我们在 SCJunction 中的综合索引之一.复合是多列上的索引.

The below benefits from one of our composite indexes in SCJunction. A composite is an index on more than one column.

select s.StudentId,s.FullName
from SCJunction j
join student s
on j.studentId=s.studentId
where j.courseId=5 and j.term=100
+-----------+--------------+
| StudentId | FullName     |
+-----------+--------------+
|         2 | Kim Billings |
|         3 | Shy Guy      |
+-----------+--------------+

Kim Billings 就读于这个学期的哪个学期?

select s.StudentId,s.FullName,c.courseId,c.courseName
from SCJunction j
join student s
on j.studentId=s.studentId
join course c
on j.courseId=c.courseId
where s.studentId=2 and j.term=100
order by c.courseId DESC -- descending, just for the fun of it
+-----------+--------------+----------+--------------------+
| StudentId | FullName     | courseId | courseName         |
+-----------+--------------+----------+--------------------+
|         2 | Kim Billings |        5 | World of Chaucer   |
|         2 | Kim Billings |        4 | Linear Algebra A   |
|         2 | Kim Billings |        1 | Early Roman Empire |
+-----------+--------------+----------+--------------------+

Kim 不知所措,所以放弃数学课

Kim is overwhelmed, so drop drop the math class

delete from SCJunction
where studentId=2 and courseId=4 and term=100

运行上面的选择语句,显示 Kim 正在服用什么:

run that above select statement showing what Kim is taking:

+-----------+--------------+----------+--------------------+
| StudentId | FullName     | courseId | courseName         |
+-----------+--------------+----------+--------------------+
|         2 | Kim Billings |        5 | World of Chaucer   |
|         2 | Kim Billings |        1 | Early Roman Empire |
+-----------+--------------+----------+--------------------+

啊,更简单的术语.不过爸爸不会高兴的.

Ah, much easier term. Dad won't be happy though.

注意诸如 SCJunction.term 之类的东西.关于这一点可以写很多,我现在主要跳过它,除了说它也应该在某个 FK 中.您可能希望您的术语看起来更像 SPRING2015 而不是 int.

Note such things as SCJunction.term. Much can written about that, I will skip over it at the moment mostly, other than to say it should also be in an FK somewhere. You may want your term to look more like SPRING2015 and not an int.

就 id 而言.这就是我会做的方式.这是个人喜好.这将需要知道 id #'s,查找它们.其他人可以选择使用类似 HIST101 而不是 17 的 courseId.它们的可读性更高(但索引速度较慢(几乎没有).所以做最适合你的吧.

And as far as id's go. This is the way I would do it. It is personal preference. It would require knowing id #'s, looking them up. Others could choose to have a courseId something like HIST101 and not 17. Those are highly more readable (but slower in the index (barely). So do what is best for you.

复合索引(INDEX 表示 KEY,反之亦然)是一种组合多个列以实现快速数据检索的索引.SCJunction 表中两个组合的顺序被翻转,因此,根据查询数据的范围,数据库引擎可以根据您要处理的最左侧列选择用于最快检索的索引.

A Composite Index (INDEX means KEY, and vice-versa) is one that combines multiple columns for fast data retrieval. The orders are flipped for the two composites in the SCJunction table so that, depending on the universe of queries that go after your data, the db engine can choose which index to use for fastest retrieval based on the left-most column you are going after.

至于唯一键#1,它旁边的注释说明不强制重复(意思是垃圾数据)是不言自明的.例如,学生 1 course 1 term 1 不能在该表中存在两次.

As for the unique key, #1, the comment next to it stating enforcing no duplicates (meaning junk data) is rather self-explanatory. For instance, student 1 course 1 term 1 cannot exist twice in that table.

要理解的一个关键概念是索引中列名的left-most排序概念.

A crucial concept to understand is the concept of left-most ordering of column names in an index.

对于 studentId only 之后的查询,则首先列出 studentId 的键(left-most) 被使用.在 courseId only 之后的查询中,使用最左侧具有 courseId 的键.在同时关注 studentId 和 courseId 的查询中,数据库引擎可以决定使用哪个组合键.

For queries that go after studentId only, then the key that has studentId listed first (left-most) is used. In queries that go after courseId only, then the key that has courseId left-most is used. In queries that go after both studentId and courseId, the db engine can decide which composite key to use.

当我说继续"时,我的意思是在 on 子句where 子句 条件中.

When I say "go after", I mean in the on clause or where clause condition.

如果没有这两个复合键(其中的第 1 列和第 2 列被翻转),那么在所查找的列没有 left-most 索引的查询中,您将不会受益于键的使用,并遭受缓慢的 tablescan 数据返回.

Were one not to have those two composite keys (with the column 1 and 2 in them flipped), then in queries where the column sought is not left-most indexed, you would not benefit with key usage, and suffer a slow tablescan for data to return.

所以,这两个索引结合了以下两个概念

So, those two indexes combine the following 2 concepts

  • 基于最左侧或两者(studentId 和 courseId 列)的快速数据检索
  • 根据 studentId、courseId 和学期值强制该表中的数据不重复

重要的要点是,Junction 表可以实现快速索引检索,以及对数据的合理管理与以逗号分隔的数据(数组思维方式)塞进一列,以及使用这种构造.

The important takeaway is that Junction tables make for quick index retrieval, and sane management of data versus comma-delimited data (array mindset) crammed into a column, and all the misery of using such a construct.

相关文章