无锁队列――单生产者，多消费者

2022-01-21 00:00:00 queue atomic c++ lock-free

我正在寻找一种方法来实现支持单个生产者和多个消费者的无锁队列数据结构.我看过 Maged Michael 和 Michael Scott (1996) 的经典方法，但他们的版本使用链表.我想要一个使用有界循环缓冲区的实现.使用原子变量的东西?

I am looking for a method to implement lock-free queue data structure that supports single producer, and multiple consumers. I have looked at the classic method by Maged Michael and Michael Scott (1996) but their version uses linked lists. I would like an implementation that makes use of bounded circular buffer. Something that uses atomic variables?

顺便说一句，我不确定为什么这些经典方法是为需要大量动态内存管理的链表设计的.在多线程程序中，所有的内存管理例程都是序列化的.将无锁方法与动态数据结构结合使用，难道我们没有打败它们的好处吗?

On a side note, I am not sure why these classic methods are designed for linked lists that require a lot of dynamic memory management. In a multi-threaded program, all memory management routines are serialized. Aren't we defeating the benefits of lock-free methods by using them in conjunction with dynamic data structures?

我正在尝试在英特尔 64 位架构上使用 pthread 库在 C/C++ 中对此进行编码.

I am trying to code this in C/C++ using pthread library on a Intel 64-bit architecture.

谢谢你，希里什

推荐答案

循环缓冲区的使用使得锁成为必要，因为需要阻塞来防止头部越过尾部.但除此之外，头和尾指针可以很容易地以原子方式更新.或者在某些情况下，缓冲区可能太大以至于覆盖不是问题.(在现实生活中，你会在自动交易系统中看到这一点，循环缓冲区的大小可以容纳 X 分钟的市场数据.如果你落后 X 分钟，你会遇到比覆盖缓冲区更糟糕的问题).

The use of a circular buffer makes a lock necessary, since blocking is needed to prevent the head from going past the tail. But otherwise the head and tail pointers can easily be updated atomically. Or in some cases the buffer can be so large that overwriting is not an issue. (in real life you will see this in automated trading systems, with circular buffers sized to hold X minutes of market data. If you are X minutes behind, you have wayyyy worse problems than overwriting your buffer).

当我在 C++ 中实现 MS 队列时，我使用堆栈构建了一个无锁分配器，这很容易实现.如果我有 MSQueue，那么在编译时我知道 sizeof(MSQueue::node).然后我制作了一堆所需大小的 N 个缓冲区.N 可以增长，即如果 pop() 返回 null，很容易向堆请求更多块，并将这些块推入堆栈.除了对更多内存的可能阻塞调用之外，这是一个无锁操作.

When I implemented the MS queue in C++, I built a lock free allocator using a stack, which is very easy to implement. If I have MSQueue then at compile time I know sizeof(MSQueue::node). Then I make a stack of N buffers of the required size. The N can grow, i.e. if pop() returns null, it is easy to go ask the heap for more blocks, and these are pushed onto the stack. Outside of the possibly blocking call for more memory, this is a lock free operation.

请注意，T 不能有非平凡的 dtor.我研究了一个确实允许非平凡的 dtors 的版本，它确实有效.但我发现将 T 设为我想要的 T 的指针更容易，生产者释放所有权，消费者获得所有权.这当然需要使用无锁方法分配 T 本身，但我使用堆栈创建的相同分配器也可以在这里工作.

Note that the T cannot have a non-trivial dtor. I worked on a version that did allow for non-trivial dtors, that actually worked. But I found that it was easier just to make the T a pointer to the T that I wanted, where the producer released ownership, and the consumer acquired ownership. This of course requires that the T itself is allocated using lockfree methods, but the same allocator I made with the stack works here as well.

无论如何，无锁编程的重点不在于数据结构本身更慢.要点是:

In any case the point of lock-free programming is not that the data structures themselves are slower. The points are this:

无锁使我独立于调度程序.基于锁的编程依赖于调度程序来确保锁的持有者正在运行，以便他们可以释放锁.这就是导致优先级反转"的原因在 Linux 上，有一些锁定属性可以确保发生这种情况
如果我独立于调度程序，操作系统管理时间片的时间会容易得多，而且上下文切换会少得多
使用无锁方法编写正确的多线程程序更容易，因为我不必担心死锁、活锁、调度、同步等.这对于共享内存实现尤其如此，在共享内存实现中，进程可能会在持有共享锁时死亡内存，没有办法释放锁
无锁方法更容易扩展.事实上，我已经实现了使用网络消息传递的无锁方法.像这样的分布式锁是一场噩梦

也就是说，在很多情况下，基于锁的方法更可取和/或需要

That said, there are many cases where lock-based methods are preferable and/or required

更新昂贵或无法复制的内容时.大多数无锁方法使用某种版本控制，即复制对象，更新它，并检查共享版本是否仍然与复制时相同，然后使当前版本更新版本.否则再次复制它，应用更新，然后再次检查.继续这样做，直到它起作用.当对象很小但它们很大或包含文件句柄等时，这很好，不推荐
大多数类型都无法以无锁方式访问，例如任何 STL 容器.这些具有需要非原子访问的不变量，例如 assert(vector.size()==vector.end()-vector.begin()).因此，如果您要更新/读取共享的向量，则必须将其锁定.

相关文章