When to use NULL in MySQL tables
I appreciate the semantic meaning of a NULL value in a database table, different from both false and the empty string ''. However, I have often read about performance problems when fields are nullable and been advised to use an empty string in cases where NULL is actually semantically correct.
Under what circumstances is it appropriate to use nullable fields and NULL values? What are the trade-offs? Is it sensible to avoid NULLs altogether and simply use empty strings, false, or 0 to indicate the absence of a value?
Update
OK - I understand the semantic difference between '' and NULL, as well as the (performance-agnostic) circumstances in which NULL is the appropriate field value. However, let me expand on the hinted performance issue. This is from the excellent "High Performance MySQL" by Schwartz, Zaitsev et al. (http://www.borders.co.uk/book/high-performance-mysql-optimization-backups-replication-and-more/857673/):
It's harder for MySQL to optimize queries that refer to nullable columns, because they make indexes, index statistics, and value comparisons more complicated. A nullable column uses more storage space and requires special processing inside MySQL. When a nullable column is indexed, it requires an extra byte per entry and can even cause a fixed-size index (such as an index on a single integer column) to be converted to a variable-sized one in MyISAM.
More info: Google Books preview
This is quite possibly the definitive answer - I was just looking for second opinions and experience from the front line.
Recommended answer
However, I have often read about performance problems when fields are nullable and been advised to use an empty string in cases where NULL is actually semantically correct.
I'm going to be nit-picky about word choice for a moment:
- Even if it were a significant performance factor, that doesn't make it semantically correct to use a value instead of NULL. In SQL, NULL has a semantic role, to denote a missing or inapplicable value. The performance characteristics of NULL in a given RDBMS implementation are independent of this. The performance may vary from brand to brand or from version to version, but the purpose of NULL in the language is consistent.
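To make the semantic distinction concrete, here is a minimal sketch of how NULL behaves differently from '' and 0 in comparisons. The column aliases are only illustrative; the results in the comments are what MySQL returns for these expressions.

```sql
-- Comparing anything with NULL using = yields NULL (unknown), not true or false.
SELECT NULL = ''     AS null_eq_empty,    -- NULL
       NULL = 0      AS null_eq_zero,     -- NULL
       NULL = NULL   AS null_eq_null,     -- NULL
       NULL IS NULL  AS null_is_null,     -- 1
       '' IS NULL    AS empty_is_null,    -- 0
       NULL <=> NULL AS null_safe_equal;  -- 1 (MySQL's NULL-safe equality operator)
```

An empty string or a zero is an ordinary value that equals itself; NULL is not, which is exactly what lets it stand for "missing or inapplicable".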
In any case, I have not heard of any evidence that NULL performs poorly. I'd be interested in any references to performance measurements that show nullable columns perform worse than non-nullable columns.
I'm not saying I can't be wrong, or that it can't be true in some cases, just that it's not meaningful to make idle suppositions. Science is not made up of conjecture; one has to show evidence with repeatable measurements.
Metrics also tell you by how much the performance differs, so you can make a judgment about whether it's something worth worrying about. That is, the impact could be measurable and nonzero, but still insignificant compared to greater performance factors, such as properly indexing tables or sizing your database cache.
In MySQL, searches for NULL can benefit from an index:
mysql> CREATE TABLE foo (
         i INT NOT NULL,
         j INT DEFAULT NULL,
         PRIMARY KEY (i),
         UNIQUE KEY j_index (j)
       );

mysql> INSERT INTO foo (i, j) VALUES
       (1, 1), (2, 2), (3, NULL), (4, NULL), (5, 5);

mysql> EXPLAIN SELECT * FROM foo WHERE i = 3;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | foo   | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+

mysql> EXPLAIN SELECT * FROM foo WHERE j IS NULL;
+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | foo   | ref  | j_index       | j_index | 5       | const |    2 | Using where |
+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
Note that's still not a measurement of performance. I've only shown that you can use an index while searching for NULL. I'm going to assert (admittedly without having measured, but hey this is just StackOverflow) that the benefit of an index overshadows any possible penalty when searching for NULL versus a blank string.
It's not a correct design decision to choose zero or blank or any other value to substitute for NULL. You may need to use those values as significant in the column. That's why NULL exists, as a value that is by definition outside the domain of values of any data type, so you can use the full range of values of integers or strings or whatever and still have something to signify "none of the above values."
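As a small illustration of why a stand-in value is a poor substitute, consider the sketch below. The results table and its columns are hypothetical, chosen only to show how aggregates treat NULL versus a placeholder 0.

```sql
-- Hypothetical table: score is nullable so that "no score yet" can be recorded.
CREATE TABLE results (
  id    INT NOT NULL PRIMARY KEY,
  score INT NULL
);
INSERT INTO results (id, score) VALUES (1, 10), (2, 20), (3, NULL);

-- Aggregates skip NULL: the average is taken over the two known scores (15),
-- COUNT(score) is 2 and COUNT(*) is 3.
SELECT AVG(score), COUNT(score), COUNT(*) FROM results;

-- Using 0 as a stand-in for "no score" changes the answer: the average
-- becomes 10, and the placeholder can no longer be told apart from a real 0.
UPDATE results SET score = 0 WHERE score IS NULL;
SELECT AVG(score), COUNT(score), COUNT(*) FROM results;
```

With NULL, the full range of integers stays available as real data, and the missing row is still countable as missing.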