MySQL: Multiple tables or one table with many columns?

2021-11-20 00:00:00 mysql database-design

So this is more of a design question.

I have one primary key (say the user's ID), and I have tons of information associated with that user.

Should I have multiple tables broken down into categories according to the information, or should I have just one table with many columns?

The way I used to do it was to have multiple tables, so say, one table for application usage data, one table for profile info, one table for back end tokens etc. to keep things looking organized.

Recently someone told me that it's better not to do it that way, and that having one table with lots of columns is fine. The thing is, all those columns have the same primary key.

I'm pretty new to database design so which approach is better and what are the pros and cons?

What's the conventional way of doing it?

Solution

Any time information is one-to-one (each user has one name and password), it's probably better to keep it in one table, since that reduces the number of joins the database needs to do to retrieve results. Databases do limit the number of columns per table (MySQL's hard limit is 4096, and InnoDB caps it at 1017), but I wouldn't worry about that in normal cases, and you can always split the table later if you need to.
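To make the join cost concrete, here is a minimal sketch comparing the two layouts for one-to-one data. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so it runs without a server; the table and column names (`users`, `user_profiles`, `user_tokens`, `api_token`) are made up for illustration, and the DDL would need minor type changes for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option A: every one-to-one attribute lives in one row per user.
    CREATE TABLE users (
        user_id       INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        password_hash TEXT NOT NULL,
        api_token     TEXT
    );
    INSERT INTO users VALUES (1, 'alice', 'hash-a', 'tok-a');

    -- Option B: the same attributes split across tables sharing the key.
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE user_tokens (
        user_id   INTEGER PRIMARY KEY REFERENCES user_profiles(user_id),
        api_token TEXT
    );
    INSERT INTO user_profiles VALUES (1, 'alice');
    INSERT INTO user_tokens VALUES (1, 'tok-a');
""")

# Option A: a single primary-key lookup returns everything.
single = conn.execute(
    "SELECT name, api_token FROM users WHERE user_id = ?", (1,)
).fetchone()

# Option B: the same result needs a join across the split tables.
joined = conn.execute(
    """
    SELECT p.name, t.api_token
    FROM user_profiles AS p
    JOIN user_tokens AS t ON t.user_id = p.user_id
    WHERE p.user_id = ?
    """,
    (1,),
).fetchone()
```

Both queries return the same row; the split layout just makes the database (and whoever writes the queries) do more work to get it.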

If the data is one-to-many (each user has thousands of rows of usage info), then it should be split into separate tables to reduce duplicate data (duplicate data wastes storage space, cache space, and makes the database harder to maintain).
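A rough sketch of that one-to-many split, again using Python's `sqlite3` in place of MySQL. The `usage_events` table, its columns, and the index name are assumptions for the example, not something from the question.

```python
import sqlite3

# SQLite stands in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    -- One row per usage event; the user's name is NOT repeated here.
    CREATE TABLE usage_events (
        event_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        action   TEXT NOT NULL
    );
    CREATE INDEX idx_usage_events_user ON usage_events(user_id);
""")

conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO usage_events (user_id, action) VALUES (?, ?)",
    [(1, "login"), (1, "upload"), (1, "logout")],
)

# The name is stored once, no matter how many events accumulate.
name_copies = conn.execute(
    "SELECT COUNT(*) FROM users WHERE name = 'alice'"
).fetchone()[0]
event_count = conn.execute(
    "SELECT COUNT(*) FROM usage_events WHERE user_id = 1"
).fetchone()[0]
```

Renaming the user is then a one-row update in `users`, instead of an update across thousands of usage rows.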

You might find the Wikipedia article on database normalization interesting, since it discusses the reasons for this in depth:

"Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships."

Denormalization is also something to be aware of, because there are cases where repeating data is better (since it reduces the amount of work the database needs to do when reading data). I'd highly recommend making your data as normalized as possible to start out, and only denormalize if you're aware of performance problems in specific queries.
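As one possible illustration of denormalization, the sketch below caches a per-user event count in the `users` table so reads don't have to count rows every time. The `event_count` column and the `record_event` helper are hypothetical, and SQLite (via Python's `sqlite3`) again stands in for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id     INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        event_count INTEGER NOT NULL DEFAULT 0  -- denormalized: derivable from usage_events
    );
    CREATE TABLE usage_events (
        event_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        action   TEXT NOT NULL
    );
    INSERT INTO users (user_id, name) VALUES (1, 'alice');
""")


def record_event(conn, user_id, action):
    """Insert an event and bump the cached counter in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO usage_events (user_id, action) VALUES (?, ?)",
            (user_id, action),
        )
        conn.execute(
            "UPDATE users SET event_count = event_count + 1 WHERE user_id = ?",
            (user_id,),
        )


for action in ("login", "upload"):
    record_event(conn, 1, action)

# Cheap read from the cached column versus recomputing from the event rows.
cached = conn.execute(
    "SELECT event_count FROM users WHERE user_id = 1"
).fetchone()[0]
recomputed = conn.execute(
    "SELECT COUNT(*) FROM usage_events WHERE user_id = 1"
).fetchone()[0]
```

The trade-off is visible in `record_event`: every write now has to keep two places in sync, which is why it's usually worth waiting for a measured performance problem before doing this.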
