Merging a list of dictionaries with duplicate ids - python3

2022-01-10 00:00:00 python merge list dictionary duplicates

Question

I have a list of dictionaries:

[{"id":"1", "name":"Alice", "age":"25", "languages":"German"},
 {"id":"1", "name":"Alice", "age":"25", "languages":"French"},
 {"id":"2", "name":"John", "age":"30", "languages":"English"},
 {"id":"2", "name":"John", "age":"30", "languages":"Spanish"}]

I'd like the end result to be (I am only considering the id when checking for duplicates):

[{"id":"1", "name":"Alice", "age":"25", "languages":"German, French"},
 {"id":"2", "name":"John", "age":"30", "languages":"English, Spanish"}]

Looking at similar questions, I thought that using a set might be the answer, but I haven't been able to implement it correctly.

Thanks in advance for your answers.


Solution

Being a little verbose here to help show the structure. There is definitely some cool lambda stuff you could do to solve this, and list comprehensions would make it more "pythonic". But here is a quick solution!

# Set up initial data
unmerged = [
    {"id":"1", "name":"Alice", "age":"25", "languages":"German"},
    {"id":"1", "name":"Alice", "age":"25", "languages":"French"},
    {"id":"2", "name":"John", "age":"30", "languages":"English"},
    {"id":"2", "name":"John", "age":"30", "languages":"Spanish"}]

# merge the data by your composite key of id-name-age
merged = {}
for entry in unmerged:
    entry_id = entry['id']
    entry_name = entry['name']
    entry_age = entry['age']
    entry_languages = entry['languages']
    # use a tuple so different field combinations can never concatenate into the same key
    composite_key = (entry_id, entry_name, entry_age)
    if composite_key in merged:
        merged[composite_key]['languages'].append(entry_languages)
    else:
        merged[composite_key] = {
            'id': entry_id,
            'name': entry_name,
            'age': entry_age,
            'languages': [entry_languages]
        }

# reconstruct your list with just your unique entries
cleaned = []
for value in merged.values():
    cleaned.append({
        'id': value['id'],
        'name': value['name'],
        'age': value['age'],
        'languages': ', '.join(value['languages']) # join the languages into one string with ", "
    })

for clean in cleaned:
    print(clean)

That then gives you your final output, where cleaned is your list of merged entries:

{'id': '1', 'name': 'Alice', 'age': '25', 'languages': 'German, French'}
{'id': '2', 'name': 'John', 'age': '30', 'languages': 'English, Spanish'}
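
Since the question only looks at the id when checking for duplicates, a shorter variant could key on the id alone. This is just a sketch, not the only way to do it: merge_by_id and its key, field and sep parameters are names made up for illustration, and it assumes every entry has both an id and a languages value. It also skips a language that was already collected for the same id, which is roughly what the set idea from the question would give you, but it keeps the original order.

```python
def merge_by_id(records, key="id", field="languages", sep=", "):
    """Merge dicts that share the same `key`, joining their `field` values.

    Hypothetical helper for illustration; not part of the original answer.
    """
    merged = {}  # plain dicts keep insertion order in Python 3.7+
    for record in records:
        record_id = record[key]
        if record_id not in merged:
            # First time this id shows up: copy the entry, hold the value in a list
            merged[record_id] = {**record, field: [record[field]]}
        elif record[field] not in merged[record_id][field]:
            # Repeated id: only add values that were not collected yet
            merged[record_id][field].append(record[field])
    # Turn each collected list back into one joined string
    return [{**entry, field: sep.join(entry[field])} for entry in merged.values()]


unmerged = [
    {"id": "1", "name": "Alice", "age": "25", "languages": "German"},
    {"id": "1", "name": "Alice", "age": "25", "languages": "French"},
    {"id": "2", "name": "John", "age": "30", "languages": "English"},
    {"id": "2", "name": "John", "age": "30", "languages": "Spanish"}]

cleaned = merge_by_id(unmerged)
for clean in cleaned:
    print(clean)
```

For an id that appears more than once, the name and age are taken from the first entry seen, so this only makes sense if those fields really are the same across duplicates.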

Thanks, and let me know if this helps!
