How does InnoDB store character columns?
This Question addresses only how 'short' CHAR and VARCHAR columns are stored in an InnoDB table.
- Does a CHAR(10) column occupy exactly 10 bytes?
- What happens with trailing blanks?
- What about character sets that need more than 1 byte per character?
- How does VARCHAR(10) differ from CHAR(10)?
- EXPLAIN implies that all indexed varchars contain a 2-byte length field. Is it really 2 bytes? Or might it be 1 byte? (cf. key_len)
- What about different ROW_FORMATs?
Not covered in this Question (to keep it from being too broad):
- What about TEXT?
- What about 255, 191, off-page storage, etc.?
- What happens in an index starting with a char/varchar? (Think: removal of common prefix.)
- What happens with char/varchar when involved in a MEMORY temp table? Also, what changes happen in version 8.0?
ROW_FORMAT has a significant impact on longer string columns, primarily in deciding when off-page storage is used.
Recommended answer
From the MySQL documentation:
The difference between CHAR and VARCHAR values is the way they are stored. In a single-byte character set, CHAR(10) requires 10 bytes of storage no matter how many characters you use, because the data is right-padded with spaces. VARCHAR(10) takes only 1 byte per character actually stored (again in a single-byte character set) plus a length prefix: 1 byte when the column's maximum length is 255 bytes or less, 2 bytes otherwise. (I don't know why key_len in EXPLAIN adds 2 bytes.)
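The arithmetic in that paragraph can be sketched roughly as follows. This is only a model for a single-byte character set, not InnoDB's actual code: the function names are hypothetical, and the record header and NULL bitmap are ignored.

```python
def char_storage_bytes(n: int) -> int:
    """Bytes for CHAR(n) in a single-byte character set: always n (space-padded)."""
    return n


def varchar_storage_bytes(value: str, n: int) -> int:
    """Bytes for one VARCHAR(n) value in a single-byte character set.

    Assumption: 1 length byte when the column can hold at most 255 bytes,
    2 length bytes otherwise.
    """
    if len(value) > n:
        raise ValueError("value is longer than the column definition")
    prefix = 1 if n <= 255 else 2
    return prefix + len(value)


print(char_storage_bytes(10))             # CHAR(10) holding anything
print(varchar_storage_bytes("abc", 10))   # 3 data bytes + 1 length byte
print(varchar_storage_bytes("abc", 300))  # 3 data bytes + 2 length bytes
```

The point of the comparison is that the CHAR cost depends only on the declared length, while the VARCHAR cost depends on the value actually stored.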
I don't understand what you mean by trailing blanks, although I imagine you are referring to excess trailing spaces. With VARCHAR these are truncated with a warning, whereas in CHAR columns they are truncated silently. That makes some sense, because CHAR values are stored with trailing blanks at the end anyway.
Regarding character sets, the number of characters a CHAR or VARCHAR column can hold is the same whatever the character set, although storage will require from 1 to 4 bytes per character. The MySQL documentation has the list of supported character sets and the maximum bytes per character for each.
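To make the "1 to 4 bytes per character" point concrete, here is a small illustration. It uses Python's "utf-8" codec as a stand-in for MySQL's utf8mb4 (both encode a character in 1 to 4 bytes); the sample characters and the helper name max_column_bytes are just illustrative choices.

```python
samples = ["a", "é", "€", "😀"]

for ch in samples:
    # Each sample is one character, but its encoded size differs.
    print(repr(ch), "characters:", len(ch), "bytes:", len(ch.encode("utf-8")))


def max_column_bytes(n_chars: int, max_bytes_per_char: int) -> int:
    """Upper bound on the byte length of a column declared with n_chars characters."""
    return n_chars * max_bytes_per_char


# A 10-character column in a 4-byte-per-character set can need up to 40 bytes.
print(max_column_bytes(10, 4))
```

So the declared length counts characters, and the byte requirement is whatever those characters encode to.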
From what I read about InnoDB:
REDUNDANT row format characteristics:
Internally, InnoDB stores fixed-length character columns such as CHAR(10) in a fixed-length format. InnoDB does not truncate trailing spaces from VARCHAR columns.
InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
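The 768-byte rule quoted above is just a multiplication and a comparison. A minimal sketch, assuming the threshold applies to the column's maximum byte length (declared characters times maximum bytes per character); the function name is hypothetical:

```python
OFF_PAGE_THRESHOLD = 768  # bytes, as stated in the quoted documentation


def treated_as_variable_length(n_chars: int, max_bytes_per_char: int) -> bool:
    """True if a CHAR(n_chars) column is wide enough to be encoded as variable-length."""
    return n_chars * max_bytes_per_char >= OFF_PAGE_THRESHOLD


print(treated_as_variable_length(255, 4))  # utf8mb4: 255 * 4 = 1020 bytes
print(treated_as_variable_length(255, 3))  # utf8mb3: 255 * 3 = 765 bytes
print(treated_as_variable_length(255, 1))  # latin1:  255 * 1 = 255 bytes
```

This is why the documentation singles out utf8mb4: with at most 3 bytes per character, CHAR(255) tops out at 765 bytes and stays just under the threshold.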
COMPACT row format characteristics:
Internally, for nonvariable-length character sets, InnoDB stores fixed-length character columns such as CHAR(10) in a fixed-length format.
InnoDB does not truncate trailing spaces from VARCHAR columns.
Internally, for variable-length character sets such as utf8mb3 and utf8mb4, InnoDB attempts to store CHAR(N) in N bytes by trimming trailing spaces. If the byte length of a CHAR(N) column value exceeds N bytes, InnoDB trims trailing spaces to a minimum of the column value byte length. The maximum length of a CHAR(N) column is the maximum character byte length × N.
InnoDB reserves a minimum of N bytes for CHAR(N). Reserving the minimum space N in many cases enables column updates to be done in place without causing fragmentation of the index page. By comparison, for ROW_FORMAT=REDUNDANT, CHAR(N) columns occupy the maximum character byte length × N.
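The COMPACT versus REDUNDANT behaviour for variable-length character sets can be approximated like this. It is a sketch of the rule as quoted, not InnoDB's implementation: the function names are hypothetical, Python's "utf-8" codec stands in for utf8mb4, and length bytes and record headers are left out.

```python
def compact_char_bytes(value: str, n: int, encoding: str = "utf-8") -> int:
    """Approximate data bytes for a CHAR(n) value under ROW_FORMAT=COMPACT.

    Trailing spaces are trimmed, but at least n bytes are always reserved.
    """
    if len(value) > n:
        raise ValueError("value is longer than the column definition")
    trimmed = value.rstrip(" ")
    return max(n, len(trimmed.encode(encoding)))


def redundant_char_bytes(n: int, max_bytes_per_char: int) -> int:
    """Data bytes for CHAR(n) under ROW_FORMAT=REDUNDANT: always the maximum."""
    return n * max_bytes_per_char


print(compact_char_bytes("abc", 10))      # short ASCII value: the reserved minimum
print(compact_char_bytes("€" * 10, 10))   # ten 3-byte characters
print(redundant_char_bytes(10, 4))        # utf8mb4 under REDUNDANT
```

Under this model a CHAR(10) utf8mb4 column holding plain ASCII costs 10 bytes in COMPACT but 40 in REDUNDANT, which is the saving the documentation describes.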
InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED handle CHAR storage in the same way as ROW_FORMAT=COMPACT.
...
DYNAMIC and COMPRESSED row formats are variations of the COMPACT row format and therefore handle CHAR storage in the same way as the COMPACT row format