我想根据相似的属性对元组进行分组
问题描述
我有一个元组列表.[ (1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2) ]
I have a list of tuples. [ (1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2) ]
我想根据连接的元组(具有相关值)将它们分组到列表中.
I want to group them into lists based on which tuples are connected (have related values).
所以最终结果是两个相关元组值的列表 =[ [1, 2, 3, 4, 8], [5, 6, 7] ]
So the end result is two lists of related tuple values = [ [1, 2, 3, 4, 8], [5, 6, 7] ]
我怎样才能写一个函数来做到这一点?这是一个求职面试问题.我试图在 Python 中做到这一点,但我很沮丧,只想看看答案背后的逻辑,所以即使是伪代码也会帮助我,所以我可以看到我做错了什么.
How can I write a function to do this? This was a job interview question. I was trying to do it in Python, but I'm frustrated and just want to see the logic behind the answer, so even psuedo code would help me, so I can see what I did wrong.
我只有几分钟的时间来现场做这件事,但这是我尝试过的:
I only had a few minutes to do this on the spot, but here is what I tried:
def find_partitions(connections):
theBigList = [] # List of Lists
list1 = [] # The initial list to store lists
theBigList.append(list1)
for list in theBigList:
list.append(connection[1[0], 1[1]])
for i in connections:
if i[0] in list or i[1] in list:
list.append(i[0], i[1])
else:
newList = []
theBigList.append(newList)
本质上,这家伙想要一个相关值列表的列表.我尝试使用for循环,但意识到它不起作用,然后时间用完了.
Essentially, the guy wanted an list of lists of values that were related. I tried to use a for loop, but realized that it wouldn't work, and then time ran out.
解决方案
在我们填写组件时,在每个阶段都需要考虑三种情况(因为您必须匹配重叠的组):
As we fill in the components, at each stage there are three cases to consider (as you will have to match up overlapping groups):
- x 或 y 均不在已找到的任何组件中.
- 两者都已经在不同的集合中,x 在 set_i 中,y 在 set_j 中.
- 其中一个或两个都在一个组件中,x 在 set_i 中或 y 在 set_i 中.
- Neither x or y are in any component already found.
- Both are already in different sets, x in set_i and y in set_j.
- Either one or both are in one component, x in set_i or y in a set_i.
我们可以使用内置的set
来帮助.(请参阅 @jwpat 和 @DSM 的更棘手的示例):
We can use the built-in set
to help. (see @jwpat's and @DSM's trickier examples):
def connected_components(lst):
components = [] # list of sets
for (x,y) in lst:
i = j = set_i = set_j = None
for k, c in enumerate(components):
if x in c:
i, set_i = k, c
if y in c:
j, set_j = k, c
#case1 (or already in same set)
if i == j:
if i == None:
components.append(set([x,y]))
continue
#case2
if i != None and j != None:
components = [components[k] for k in range(len(components)) if k!=i and k!=j]
components.append(set_i | set_j)
continue
#case3
if j != None:
components[j].add(x)
if i != None:
components[i].add(y)
return components
lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
connected_components(lst)
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])]
map(list, connected_components(lst))
# [[8, 1, 2, 3, 4], [5, 6, 7]]
connected_components([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
# [set([8, 1, 2, 3, 4]), set([5, 6, 7])] # @jwpat's example
connected_components([[1, 3], [2, 4], [3, 4]]
# [set([1, 2, 3, 4])] # @DSM's example
这肯定不是最有效的方法,但可能与他们所期望的相似.正如 Jon Clements 指出的那样,有一个用于这些类型计算的库:networkx,其中它们会更有效率.
This certainly won't be the most efficient method, but is perhaps one similar to what they would expect. As Jon Clements points out there is a library for these type of calculations: networkx, where they will be much more efficent.
相关文章