空间有限的优先级队列:寻找一个好的算法

2022-01-21 00:00:00 algorithm queue priority-queue c++

这不是家庭作业.

我正在使用一个小的优先队列"(目前实现为数组)来存储具有 smallest 值的最后 N 个项目.这有点慢 - O(N) 项插入时间.当前实现跟踪数组中最大的项目并丢弃任何不适合数组的项目，但我仍然想进一步减少操作数.

I'm using a small "priority queue" (implemented as array at the moment) for storing last N items with smallest value. This is a bit slow - O(N) item insertion time. Current implementation keeps track of largest item in array and discards any items that wouldn't fit into array, but I still would like to reduce number of operations further.

寻找符合以下要求的优先队列算法:

looking for a priority queue algorithm that matches following requirements:

队列可以实现为数组，它有固定的大小并且_不能_增长.严格禁止在任何队列操作期间进行动态内存分配.
任何不适合数组的东西都会被丢弃，但队列会保留所有遇到过的最小元素.
O(log(N)) 插入时间(即，将元素添加到队列中应该占用 O(log(N))).
(可选)O(1) 访问队列中*最大* 项(队列存储*最小* 项，因此最大的项将首先被丢弃，我需要它们来减少操作次数)
易于实施/理解.理想情况下――类似于二分搜索――一旦你理解了它，你就会永远记住它.
元素不需要以任何方式排序.我只需要保持 N 曾经遇到过的最小值.当我需要它们时，我会立即访问所有这些.所以从技术上讲，它不必是一个队列，我只需要存储 N 个最后的最小值.

queue can be implemented as array, which has fixed size and _cannot_ grow. Dynamic memory allocation during any queue operation is strictly forbidden.
Anything that doesn't fit into array is discarded, but queue keeps all smallest elements ever encountered.
O(log(N)) insertion time (i.e. adding element into queue should take up to O(log(N))).
(optional) O(1) access for *largest* item in queue (queue stores *smallest* items, so the largest item will be discarded first and I'll need them to reduce number of operations)
Easy to implement/understand. Ideally - something similar to binary search - once you understand it, you remember it forever.
Elements need not to be sorted in any way. I just need to keep N smallest value ever encountered. When I'll need them, I'll access all of them at once. So technically it doesn't have to be a queue, I just need N last smallest values to be stored.

我最初考虑使用二进制堆(它们可以通过数组轻松实现)，但显然当数组不能再增长时它们表现不佳.链表和数组需要额外的时间来移动东西.stl 优先级队列增长并使用动态分配(不过我可能错了).

I initially thought about using binary heaps (they can be easily implemented via arrays), but apparently they don't behave well when array can't grow anymore. Linked lists and arrays will require extra time for moving things around. stl priority queue grows and uses dynamic allocation (I may be wrong about it, though).

那么，还有其他想法吗?

So, any other ideas?

--编辑--
我对 STL 的实现不感兴趣.由于函数调用次数较多，STL 实现(由少数人推荐)的工作速度比当前使用的线性数组要慢一些.

--EDIT--
I'm not interested in STL implementation. STL implementation (suggested by a few people) works a bit slower than currently used linear array due to high number of function calls.

我对优先队列算法感兴趣，而不是实现.

I'm interested in priority queue algorithms, not implemnetations.

推荐答案

基于数组的堆似乎非常适合您的目的.我不知道你为什么拒绝他们.

Array based heaps seem ideal for your purpose. I am not sure why you rejected them.

您使用最大堆.

假设你有一个 N 元素堆(实现为一个数组)，其中包含迄今为止看到的 N 个最小元素.

Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.

当一个元素进入时，检查最大值(O(1) 时间)，如果大于则拒绝.

When an element comes in you check against the max (O(1) time), and reject if it is greater.

如果传入的值较低，则将根修改为新值并筛选此更改的值 - 最坏情况为 O(log N) 时间.

If the value coming in is lower, you modify the root to be the new value and sift-down this changed value - worst case O(log N) time.

筛选过程很简单:从 root 开始，在每个步骤中，您都将这个值与它的较大子级交换，直到恢复 max-heap 属性.

The sift-down process is simple: Starting at root, at each step you exchange this value with it's larger child until the max-heap property is restored.

因此，如果您使用 std::priority_queue，您将不必执行任何可能必须执行的删除.根据 std::priority_queue 的实现，这可能会导致内存分配/释放.

So, you will not have to do any deletes which you probably will have to, if you use std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.

所以你可以有如下代码:

So you can have the code as follows:

大小为 N 的已分配数组.
用你看到的前 N ??个元素填充它.
heapify(您应该在标准教科书中找到它，它使用 sift-down).这是 O(N).
现在你得到的任何新元素，要么在 O(1) 时间内拒绝它，要么在最坏的情况下通过筛选插入 O(logN) 时间.

不过，平均而言，您可能不必一直筛选新值，并且可能会比 O(logn) 平均插入时间更好(尽管我没有尝试证明这一点).

On an average, though, you probably will not have to sift-down the new value all the way down and might get better than O(logn) average insert time (though I haven't tried proving it).

你只分配一次大小为 N 的数组，任何插入都是通过交换数组的元素来完成的，因此之后没有动态内存分配.

You only allocate size N array once and any insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.

查看包含 heapify 和 sift-down 伪代码的 wiki 页面:http://en.wikipedia.org/wiki/Heapsort

Check out the wiki page which has pseudo code for heapify and sift-down: http://en.wikipedia.org/wiki/Heapsort

相关文章