Python:父子层次结构的组合
问题描述
对于子父关系表 (csv),我正在尝试使用表中的所有数据收集可能的父子关系组合链.我正在尝试解决一个问题,即如果存在多个子父项(参见第 3 行和第 4 行),则迭代中不包含第二个子父项组合(第 4 行).
For a child-parent relationship table (csv), I am trying to gather possible parent to child relationship combination chains using all data in the table. I am trying against a problem where if multiple sub-parents exist (see rows 3 & 4), the second sub-parent combination (row 4) is not included in the iteration.
数据示例:
孩子,父母
A,B
A,C
B,D
B,C
C,D
预期的连锁结果:
D|B|A
D|C|B|A
D|C|A
实际链结结果:
D|B|A
D|C|A
代码
find= 'A' #The child for which the code should find all possible parent relationships
sequence = ''
with open('testing.csv','r') as f: #testing.csv = child,parent table (above example)
for row in f:
if row.strip().startswith(find):
parent = row.strip().split(',')[1]
sequence = parent + '|' + find
f1 = open('testing.csv','r')
for row in f1:
if row.strip().startswith(parent):
parent2 = row.strip().split(',')[1]
sequence = parent2 + '|' + sequence
parent = parent2
else:
continue
print sequence
解决方案
你看过 this 精彩的文章?真正理解python中的模式是必不可少的阅读.您的问题可以被认为是图问题 - 查找关系基本上是查找从子节点到父节点的所有路径.
Have you looked at this fantastic essay? It is essential reading to really understand patterns in python. Your problem can be thought of as a graph problem - finding the relationships is basically finding all paths from a child node to the parent node.
由于可能存在任意数量的嵌套(child->parent1->parent2...),因此您需要一个递归解决方案来查找所有路径.在您的代码中,您有 2 个 for
循环 - 正如您发现的那样,最多只会产生 3 级路径.
Since there could be an arbitrary amount of nesting (child->parent1->parent2...), you need a recursive solution to find all paths. In your code, you have 2 for
loops - which will only result in 3level paths at most as you found out.
下面的代码改编自上面的链接,以解决您的问题.find_all_paths
函数需要一个图形作为输入.
The code below was adapted from the link above to fix your issue. The function find_all_paths
requires a graph as an input.
让我们根据您的文件创建图表:
Let's create the graph from your file:
graph = {} # Graph is a dictionary to hold our child-parent relationships.
with open('testing.csv','r') as f:
for row in f:
child, parent = row.split(',')
graph.setdefault(parent, []).append(child)
print graph
使用您的示例,应打印:
with your sample, this should print:
{'C': ['A', 'B'], 'B': ['A'], 'D': ['B', 'C']}
以下代码直接来自论文:
The following code is straight from the essay:
def find_all_paths(graph, start, end, path=[]):
path = path + [start]
if start == end:
return [path]
if not graph.has_key(start):
return []
paths = []
for node in graph[start]:
if node not in path:
newpaths = find_all_paths(graph, node, end, path)
for newpath in newpaths:
paths.append(newpath)
return paths
for path in find_all_paths(graph, 'D', 'A'):
print '|'.join(path)
输出:
D|B|A
D|C|A
D|C|B|A
相关文章