Diagnosing Deadlocks in SQL Server 2005

2022-01-01 00:00:00 deadlock sql-server-2005 sql-server

We're seeing some pernicious, but rare, deadlock conditions in the Stack Overflow SQL Server 2005 database.

I attached the profiler, set up a trace profile using this excellent article on troubleshooting deadlocks, and captured a bunch of examples. The weird thing is that the deadlocking write is always the same:

UPDATE [dbo].[Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0

The other deadlocking statement varies, but it's usually some kind of trivial, simple read of the posts table. This one always gets killed in the deadlock. Here's an example:

SELECT
[t0].[Id], [t0].[PostTypeId], [t0].[Score], [t0].[Views], [t0].[AnswerCount], 
[t0].[AcceptedAnswerId], [t0].[IsLocked], [t0].[IsLockedEdit], [t0].[ParentId], 
[t0].[CurrentRevisionId], [t0].[FirstRevisionId], [t0].[LockedReason],
[t0].[LastActivityDate], [t0].[LastActivityUserId]
FROM [dbo].[Posts] AS [t0]
WHERE [t0].[ParentId] = @p0

To be perfectly clear, we are not seeing write / write deadlocks, but read / write.

We have a mixture of LINQ and parameterized SQL queries at the moment. We have added with (nolock) to all the SQL queries. This may have helped some. We also had a single (very) poorly-written badge query that I fixed yesterday, which was taking upwards of 20 seconds to run every time, and was running every minute on top of that. I was hoping this was the source of some of the locking problems!

Unfortunately, I got another deadlock error about 2 hours ago. Same exact symptoms, same exact culprit write.

The truly strange thing is that the locking write SQL statement you see above is part of a very specific code path. It's only executed when a new answer is added to a question -- it updates the parent question with the new answer count and last date/user. This is, obviously, not that common relative to the massive number of reads we are doing! As far as I can tell, we're not doing huge numbers of writes anywhere in the app.

I realize that NOLOCK is sort of a giant hammer, but most of the queries we run here don't need to be that accurate. Will you care if your user profile is a few seconds out of date?

Using NOLOCK with LINQ is a bit more difficult, as Scott Hanselman discusses here.

We are flirting with the idea of using

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

on the base database context so that all our LINQ queries have this set. Without that, we'd have to wrap every LINQ call we make (well, the simple reading ones, which is the vast majority of them) in a 3-4 line transaction code block, which is ugly.
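
For comparison, the per-query hint and the session-level setting can be put side by side in T-SQL. This is only a minimal sketch, assuming the `[dbo].[Posts]` table from the queries above; the parameter value is a made-up example. The table hint affects a single table reference, while the `SET` statement applies to the current connection until it is changed again.

```sql
-- Hypothetical parameter value, for illustration only.
DECLARE @p0 int;
SET @p0 = 1;

-- Option 1: per-query table hint; affects only this reference to Posts.
SELECT [t0].[Id], [t0].[AnswerCount]
FROM [dbo].[Posts] AS [t0] WITH (NOLOCK)
WHERE [t0].[ParentId] = @p0;

-- Option 2: session-level setting; every later read on this connection
-- runs as READ UNCOMMITTED until the level is changed again.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT [t0].[Id], [t0].[AnswerCount]
FROM [dbo].[Posts] AS [t0]
WHERE [t0].[ParentId] = @p0;

-- Lists the options active for this session, including "isolation level".
DBCC USEROPTIONS;

-- Restore the SQL Server default so later statements are not affected.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```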

I guess I'm a little frustrated that trivial reads in SQL 2005 can deadlock on writes. I could see write/write deadlocks being a huge issue, but reads? We're not running a banking site here, we don't need perfect accuracy every time.

Ideas? Thoughts?

Are you instantiating a new LINQ to SQL DataContext object for every operation or are you perhaps sharing the same static context for all your calls?

Jeremy, we are sharing one static datacontext in the base Controller for the most part:

private DBContext _db;
/// <summary>
/// Gets the DataContext to be used by a Request's controllers.
/// </summary>
public DBContext DB
{
    get
    {
        if (_db == null)
        {
            _db = new DBContext() { SessionName = GetType().Name };
            //_db.ExecuteCommand("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED");
        }
        return _db;
    }
}

Do you recommend we create a new context for every Controller, or per Page, or .. more often?

Recommended answer

According to MSDN:

http://msdn.microsoft.com/en-us/library/ms191242.aspx

When either the READ COMMITTED SNAPSHOT or ALLOW SNAPSHOT ISOLATION database options are ON, logical copies (versions) are maintained for all data modifications performed in the database. Every time a row is modified by a specific transaction, the instance of the Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a link list. The newest row value is always stored in the current database and chained to the versioned rows stored in tempdb.

For short-running transactions, a version of a modified row may get cached in the buffer pool without getting written into the disk files of the tempdb database. If the need for the versioned row is short-lived, it will simply get dropped from the buffer pool and may not necessarily incur I/O overhead.

There appears to be a slight performance penalty for the extra overhead, but it may be negligible. We should test to make sure.
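
One way to put numbers on that overhead is to watch the version store in tempdb while the site runs under normal load. A minimal sketch, assuming the dynamic management views that ship with SQL Server 2005 and a login with VIEW SERVER STATE permission:

```sql
-- Current size of the row version store in tempdb, in KB.
SELECT [object_name], [counter_name], [cntr_value] AS [size_kb]
FROM sys.dm_os_performance_counters
WHERE [counter_name] = N'Version Store Size (KB)';

-- Number of row versions currently held. This scans the whole version
-- store, so it can be slow on a busy server; run it sparingly.
SELECT COUNT(*) AS [version_rows]
FROM sys.dm_tran_version_store;
```

Sampling the first query before and after enabling row versioning, and during peak traffic, gives a rough picture of how much tempdb space the versions consume.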

Try setting this option and REMOVE all NOLOCKs from code queries unless they are really necessary. NOLOCK hints, or global methods in the database context handler that override the transaction isolation level, are Band-Aids for the problem. They mask fundamental issues with our data layer and can lead to selecting unreliable data, whereas automatic select / update row versioning appears to be the solution.

ALTER DATABASE [StackOverflow.Beta] SET READ_COMMITTED_SNAPSHOT ON
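
Two practical details are worth sketching here, assuming the database name matches the statement above. The `ALTER DATABASE` statement waits until there are no other active connections in the database, so it is often run with a termination clause, and the result can be checked afterwards in `sys.databases`.

```sql
-- WITH ROLLBACK IMMEDIATE rolls back open transactions held by other
-- sessions in this database, so run it in a maintenance window.
ALTER DATABASE [StackOverflow.Beta]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

-- A value of 1 in is_read_committed_snapshot_on means the option is active.
SELECT [name], [is_read_committed_snapshot_on], [snapshot_isolation_state_desc]
FROM sys.databases
WHERE [name] = N'StackOverflow.Beta';
```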
