Python set with the ability to pop a random element

2022-01-17 00:00:00 python random set

Problem description

I need a Python (2.7) object that behaves like a set (fast insertion, deletion, and membership checking) but can also return a random element. Answers to previous questions on Stack Overflow suggest things like:

import random
random.sample(mySet, 1)

But this is quite slow for large sets (it runs in O(n) time).
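
To make the cost concrete: a set has no positional indexing, so drawing a uniformly random element means first copying it into a sequence, which touches every element. The following is only a minimal sketch; the helper name `random_element` is hypothetical, and the conversion is written out explicitly because recent Python 3 releases no longer let `random.sample` take a set directly.

```python
import random

def random_element(s):
    # Copying the set into a tuple is O(n); this copy is what makes
    # every single draw linear in the size of the set.
    return random.choice(tuple(s))

my_set = set(range(1000))
picked = random_element(my_set)
```

The draw itself is uniform, but the O(n) copy is paid on every call.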

Other solutions aren't random enough (they depend on the internal representation of Python sets, which produces very non-random results):

for e in mySet:
    break
# e is now an element from mySet
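
A quick way to see why this is not random, as a sketch (the helper name `first_element` is hypothetical): as long as the set is not modified, its iteration order does not change, so taking the first iterated element returns the same value on every call.

```python
my_set = set(range(100))

def first_element(s):
    # Return whatever element the set's iterator happens to yield first.
    for e in s:
        return e

# The set is not modified between calls, so every call sees the same order
# and therefore returns the same element.
draws = {first_element(my_set) for _ in range(1000)}
distinct_results = len(draws)
```

Here `distinct_results` ends up as 1: a thousand "draws" all return the same element.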

I coded my own rudimentary class which has constant time lookup, deletion, and random values.

class randomSet:
    def __init__(self):
        self.dict = {}
        self.list = []

    def add(self, item):
        if item not in self.dict:
            self.dict[item] = len(self.list)
            self.list.append(item)

    def addIterable(self, item):
        for a in item:
            self.add(a)

    def delete(self, item):
        if item in self.dict:
            index = self.dict[item]
            if index == len(self.list)-1:
                del self.dict[self.list[index]]
                del self.list[index]
            else:
                self.list[index] = self.list.pop()
                self.dict[self.list[index]] = index
                del self.dict[item]

    def getRandom(self):
        if self.list:
            return self.list[random.randint(0,len(self.list)-1)]

    def popRandom(self):
        if self.list:
            index = random.randint(0,len(self.list)-1)
            if index == len(self.list)-1:
                del self.dict[self.list[index]]
                return self.list.pop()
            returnValue = self.list[index]
            self.list[index] = self.list.pop()
            self.dict[self.list[index]] = index
            del self.dict[returnValue]
            return returnValue

Are there any better implementations for this, or any big improvements to be made to this code?


Solution

I think the best way to do this is to use the MutableSet abstract base class in collections. Inherit from MutableSet and define add, discard, __len__, __iter__, and __contains__; also rewrite __init__ to optionally accept an iterable, just like the set constructor does. MutableSet provides built-in definitions of the remaining set methods and operators on top of those, so you get nearly the full set interface cheaply. (If you do this, addIterable is covered as well: MutableSet defines the in-place union operator |=, which adds every element of an iterable.)

discard in the standard set interface appears to be what you have called delete here, so rename delete to discard. Also, instead of duplicating the removal logic inside popRandom, you can define popRandom in terms of getRandom and discard:

def popRandom(self):
    item = self.getRandom()
    self.discard(item)
    return item

That way you don't have to maintain two separate item removal methods.

Finally, in your item removal method (delete now, discard according to the standard set interface), you don't need an if statement. Instead of testing whether index == len(self.list) - 1, simply swap the final item in the list with the item at the index of the list to be popped, and make the necessary change to the reverse-indexing dictionary. Then pop the last item from the list and remove it from the dictionary. This works whether index == len(self.list) - 1 or not:

def discard(self, item):
    if item in self.dict:
        index = self.dict[item]
        self.list[index], self.list[-1] = self.list[-1], self.list[index]
        self.dict[self.list[index]] = index
        del self.list[-1]                    # or in one line:
        del self.dict[item]                  # del self.dict[self.list.pop()]
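
Putting these suggestions together, one possible shape for the class is sketched below. This is an illustration under assumptions, not a definitive implementation: the class name `RandomSet` and the attribute names `_index` and `_items` are made up for the example, the import fallback is there so the same code loads on Python 2.7 and Python 3, and, unlike the original getRandom, this version raises IndexError on an empty set instead of returning None.

```python
import random
try:
    from collections.abc import MutableSet  # Python 3.3+
except ImportError:
    from collections import MutableSet      # Python 2.7

class RandomSet(MutableSet):
    """Set-like container with O(1) average add, discard, lookup and random choice."""

    def __init__(self, iterable=()):
        self._index = {}   # item -> position of that item in self._items
        self._items = []   # the items themselves, in arbitrary order
        for item in iterable:
            self.add(item)

    def __contains__(self, item):
        return item in self._index

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        if item not in self._index:
            self._index[item] = len(self._items)
            self._items.append(item)

    def discard(self, item):
        if item in self._index:
            index = self._index[item]
            last = self._items[-1]
            # Move the last item into the vacated slot, then drop the tail.
            # This also works when the item being removed is the last one.
            self._items[index] = last
            self._index[last] = index
            self._items.pop()
            del self._index[item]

    def getRandom(self):
        # random.choice raises IndexError if the list is empty.
        return random.choice(self._items)

    def popRandom(self):
        item = self.getRandom()
        self.discard(item)
        return item

s = RandomSet([1, 2, 3])
s |= [3, 4, 5]            # in-place union inherited from MutableSet
s.remove(1)               # inherited; raises KeyError if the item is missing
picked = s.popRandom()    # removes and returns one of 2, 3, 4, 5
```

Only the five abstract methods plus the two random helpers are written by hand; `|=`, `remove`, comparisons against other sets, and the binary set operators come from the base class.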
