为什么(以及如何)使用 master..spt_values 拆分列?

2021-12-10 00:00:00 sql tsql sql-server database-design

质疑问题的答案将一列拆分为多行" 我在这里重写为 [ 1 ].

Subquestioning the answer to question "Split one column into multiple rows" which I re-wrote here as [ 1 ].

Type = 'P' 的(含义)是什么,为什么要使用未记录的 master..spt_values 来拆分列?它有什么好处?

What is the (meaning of) Type = 'P' and why to use undocumented master..spt_values for splitting a column? What is the benefit of it?

[ 1 ]

CREATE TABLE dbo.Table1 
(
    Col1        CHAR(1),
    Col2        CHAR(1),
    Col3        CHAR(1),
    Col4        VARCHAR(50)
)
GO

INSERT INTO dbo.Table1 VALUES ('A','B','C','1,2,3')
GO
INSERT INTO dbo.Table1 VALUES ('D','E','F','6,7,8,9')
GO


SELECT
    T.col1, RIGHT(LEFT(T.col4,Number-1),
    CHARINDEX(',',REVERSE(LEFT(','+T.col4,Number-1))))
FROM
    master..spt_values,
    table1 T
WHERE
    Type = 'P' AND Number BETWEEN 1 AND LEN(T.col4)+1 AND
    (SUBSTRING(T.col4,Number,1) = ','
    -- OR SUBSTRING(T.col4,Number,1)  = '') --this does not work correctly anyway

相关问题:

  • system table table master..spt_values 的用途是什么,其值的含义是什么?

推荐答案

目的

为什么使用未记录的master..spt-values

Sybase 及其私生子 MS SQL 为产品提供了各种特性和功能,这些特性和功能在系统过程中实现(与作为服务启动的 sqlserver 等二进制文件相反).这些系统过程程序是用 SQL 代码编写的,并命名为 sp_%. 除了一些秘密的内部结构外,它们与任何其他 SQL 代码具有相同的限制和需求.它们是 Sybase ASE 或 SQL Server 产品的一部分.因此,他们不需要记录它;并且内部位不能被合理地标记为未记录".

Sybase, and therefore its bastard son MS SQL, provide various features and functions for the product, that is implemented in system procedures (as opposed to the binaries like sqlserver, which are started as a service). These system procedures procedures are written in SQL code and named sp_%. Except for some secret internals, they have the same limitations and needs as any other SQL code. They are part of the Sybase ASE or SQL Server product. As such, they are not required to document it; and the internal bits cannot be reasonably labelled as "undocumented".

master..spt_values 在 SQL 表中包含上述系统过程所需的所有各种零碎内容,以生成各种报告.sp 表示系统程序;spt 表示系统过程表;当然 values 是内容.

master..spt_values contains all the various bits and pieces that the said system procedures need, in an SQL table, to produce the various reports. The sp means system procedure; spt means tables for system procedures; and of course values is the content.

Type = 'P' 的(含义)是什么

人们经常将 spt_values 描述为非规范化",但这是不正确的术语.正确的术语是 folded 或 packed.它是 26 个左右的逻辑查找表,每个表都经过精美的标准化,折叠成一个物理表,并带有 Type 列来区分逻辑表.

People often describe spt_values as "de-normalised", but that is the incorrect term. The correct term is folded, or packed. It is 26 or so logical Lookup tables, each beautifully Normalised, folded into one physical table, with a Type column to differentiate the logical tables.

现在在普通数据库中,这将是一个严重错误(只需查看一个或多个查找表"的答案即可).但是在一个服务器目录中,最好是替换26个物理表.

Now in a normal database, that would be a gross error (just look at the answers for "one lookup table or many"). But in a server catalogue, it is desirable, it replaces 26 physical tables.

  • L"代表LockType Lookup;V"代表DeviceType Lookup(V是整个服务器设备的缩写);等等.类型P2"包含按位序数,用于打包到 INT 中的位的扩展.

  • "L" stands for for LockType Lookup; "V" stands for DeviceType Lookup (V is short for Device throughout the server); etc. Type "P2" contains bitwise ordinals, for the expansion of bits that are packed into an INT.

需要一组已知范围内的连续数字,以 SQL 表的形式提供,以便执行许多系统过程必须执行的投影.类型P"是 0 到 2047 之间的连续数字列表.

A set of consecutive numbers within known bounds, that is available in the form of an SQL table is required, in order to perform a Projection, which many of the system procedures have to do. Type "P" is a list of consecutive numbers between 0 and 2047.

术语投影在此处用作技术上精确的含义、自然的逻辑含义,而不是不自然的关系代数含义.

The term Projection is used here as the technically precise meaning, the natural logical sense, not the relational algebra meaning, which is unnatural.

因此,spt_values, 包含 26 个折叠的、单独的参考表和一个投影表只有一个目的.

There is therefore just one purpose for spt_values, to contain 26 folded, otherwise separate, Reference tables, and one Projection table.

spt_values 的普通用途是作为普通的查找或引用或 ENUM 表.首先,Lookup 值:

The ordinary use of spt_values then, is as an ordinary Lookup or Reference or ENUM table. First, the Lookup values:

    SELECT *                    -- list Genders
        FROM Gender 

它的使用方式与 Person 需要扩展的 GenderCode 相同(非常扩展,这些怪异的日子):

It is used in the same way that Person has a GenderCode that needs to be expanded (very expanded, these freaky days):

    SELECT  P.*,                -- list Person
            G.Name              -- expand GenderCode to Name
        FROM Person P
        JOIN Gender G
            ON P.GenderCode = G.GenderCode

例如.sp_lock 生成活动锁的报告,将锁类型显示为字符串名称.但是master..syslocks 包含锁类型为数字,它不包含那些名称;如果确实如此,这将是一个严重非规范化的表!如果您执行查询(Sybase ASE 代码,您将不得不进行转换):

Eg. sp_lock produces a report of active locks, displaying lock types as string names. But master..syslocks contains lock types as numbers, it does not contain those names; and if it did, it would be a badly denormalised table ! If you execute the query (Sybase ASE code, you will have to convert):

    SELECT *                    -- list LockTypes
        FROM master..spt_values 
        WHERE type = "L"

您会注意到查找表中有 66 个 LockType 数字 和 名称.这允许 sp_lock 执行像上面的 Person::Gender 这样的简单代码:

you will notice 66 LockType numbers and names in the Lookup table. That allows sp_lock to execute simple code like Person::Gender above:

    SELECT  spid,               -- list Active Locks
            DB_NAME(dbid),
            OBJECT_NAME(id, dbid),
            v.name,             -- expand lock name
            page,
            row
    FROM master..syslocks   L,
         master..spt_values LT
    WHERE L.type = LT.number    -- 
    AND   type = "L"            -- LockType Lookup table
    ORDER by 1, 2, 3, 4, 5, 6   -- such that perusal is easy

投影

Type = 'P' 的(含义)是什么?

什么是投影以及它是如何使用的?

What is Projection and how is it used ?

例如,您想要一个 all 66 个 LockTypes 的列表,而不是由上面的查询生成的活动锁,显示活动锁(或 Null)的数量.您不需要游标或 WHILE 循环.我们可以投射 LockType Lookup 表,通过活动锁的数量:

Say, for example, instead of the active locks produced by the query above, you wanted a list of all 66 LockTypes, showing the number of active locks (or Null). You don't need a cursor, or a WHILE loop. We could Project the LockType Lookup table, through the count of active locks:

    SELECT  LT.name,            -- list LockTypes
            [Count] = (         -- with count
        SELECT COUNT(*)
            FROM master..syslocks
            WHERE type = LT.number
                )
        FROM master..spt_values LT
        WHERE type = "L"

有多种方法,仅此一种.另一种方法是使用派生表而不是子查询.但是您仍然需要投影.

There are several methods, that is just one. Another method is to use a Derived Table instead of the Subquery. But you still need the Projection.

这通常是 spt_values 用于扩展或投影的用途.现在您知道它在那里,您也可以使用它.它是安全的(在 master 数据库中)并且几乎被所有系统过程使用,这意味着系统过程没有它就无法运行.

That is typically what spt_values is used for, either Expansion or Projection. Now that you know it is there, you can use it too. It is safe (in the master database) and used by virtually all the system procedures, which means the system procedures cannot run without it.

用于拆分列?

啊,您不懂将一个 CSV 列拆分为多行"的代码.

Ah, you do not understand the "Split one CSV column into multiple rows" code.

  • 暂时忘记 spt_values,然后再次检查该代码.它只需要一个连续数字的列表,以便可以逐字节遍历 CSV 列中的值列表.代码仅针对逗号或字符串结尾的每个字节激活.

  • Forget about spt_values for a moment, and examine that code again. It just needs a list of consecutive numbers, so that in can step through the list of values in the CSV column, byte by byte. The code is activated only for each byte that is a comma, or end-of-string.

哪里可以以SQL表的形式获取一组连续的数字,而不是从头开始创建一个并插入其中?为什么,master..spt_values 当然.如果你知道它在那里.

Where to get a set of consecutive numbers in the form of an SQL table, rather than CREATing one from scratch and INSERTing into it? Why, master..spt_values of course. If you know it is there.

(你可以通过阅读系统存储过程的代码来了解一些关于 ASE 或 SQL Server 的内部结构.)

(You can learn a bit about the internals of ASE or SQL Server, just by reading the code of the system stored procedures.)

请注意,一列中的任何 CSV 字段都是一个严重的标准化错误,它破坏了 2NF(包含重复值)和 1NF(非原子).请注意,这不是打包或折叠的,它是一个重复的组,它是非标准化的.这种严重错误的众多负面后果之一是,不是使用简单的 SQL 将重复组作为行导航,而是必须使用复杂的代码来确定和提取未规范化的 CSV 字段的内容.这里 spt_values P 为复杂代码提供了一个向量,使其更容易.

Note that any CSV field in one column is a gross Normalisation error, it breaks 2NF (contains repeating values) and 1NF (not atomic). Note, that is not packed or folded, it is a repeating group, it is un-normalised. One of the many negative consequences of such a gross error is, instead of using simple SQL to navigate the repeating group as rows, one has to use complex code to determine and extract the content of the un-normalised CSV field. Here spt_values P provides a vector for that complex code, making it easier.

它有什么好处?

我想我已经回答了那个.如果没有它,每个需要数字列表的系统过程都必须创建一个临时表;并将行插入其中;在运行其代码之前.当然,不必执行这些步骤,使系统程序更快.

I think I have answered that. If you did not have it, every system procedure that requires a list of Numbers would have to CREATE a temp table; and INSERT the rows into it; before running its code. Of course, not having to perform those steps, makes the system procedures much faster.

现在,当您需要执行投影时,例如.未来的日历日期或其他什么,您可以使用 spt_values,而不必每次都创建自己的临时表(或创建自己的私有永久表并维护它).

Now, when you need to perform a Projection, eg. calendar dates in the future, or whatever, you can use spt_values, instead of having to create your own temp table each time (or create your own private permanent table and maintain it).

相关文章