Find a list of words in a list of sentences and return the matching sentences

2022-03-12 00:00:00 python nlp nltk list-comprehension trigram

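Problem description

Given a list of sentences and a list of word groups (trigrams), how can I return the sentences in which all three words of a group are found?

Any suggestions are welcome. Example lists are below.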

listwords = [['people','suffering','acute'], ['Covid-19','Corona','like'], ['people','must','collectively']]

listsent = ['The number of people suffering acute hunger could almost double.',
            'Lockdowns and global economic recession have',
            'one more shock – like Covid-19 – to push them over the edge',
            'people must collectively act now to mitigate the impact']
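The output list should contain the first and last sentences, because each of them contains all three words of one group in listwords.

Expected output: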

['The number of people suffering acute hunger could almost double.',
 'people must collectively act now to mitigate the impact']

Solution

Welcome to Stack Overflow.

Try this solution:

listwords = [['people','suffering','acute'], ['Covid-19','Corona','like'], ['people','must','collectively']]

listsent = ['The number of people suffering acute hunger could almost double.',
            'Lockdowns and global economic recession have',
            'one more shock – like Covid-19 – to push them over the edge',
            'people must collectively act now to mitigate the impact']

# iterate through each sentence
for sentence in listsent:
    # iterate through each group of words
    for words in listwords:
        # check whether every word in the group occurs in the current sentence
        if all(word in sentence for word in words):
            print(sentence)

I commented the lines so you can follow what is happening.

The first part of the code iterates over each sentence in the list:

for sentence in listsent:

Then we need to iterate over the word groups in your list of words:

for words in listwords:

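This is where it gets interesting. Because you have a nested list, we need to check that all three words of the group are found in the sentence: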

if all(word in sentence for word in words):

Finally, you can print each sentence that contains all the words of a group:

print(sentence)
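
One thing to keep in mind about the check above: `word in sentence` is a substring test on the whole string, so a short word can match inside a longer one. If whole-word matching is what you want, a minimal sketch is to turn each sentence into a set of tokens first. The helper name `tokenize` and its regular expression are assumptions made for illustration; they are not part of the original answer, and because the sketch lower-cases the sentence, the words you look up would need to be lower-cased too.

```python
import re


def tokenize(sentence):
    # Hypothetical helper: lower-case tokens made of letters, digits, and hyphens.
    return set(re.findall(r"[A-Za-z0-9-]+", sentence.lower()))


sentence = 'people must collectively act now to mitigate the impact'

# Substring test: 'pact' is found inside 'impact'.
substring_hit = 'pact' in sentence

# Token test: 'pact' is not one of the words in the sentence.
token_hit = 'pact' in tokenize(sentence)

print(substring_hit, token_hit)
```

For the example data in the question both approaches select the same sentences, so this only matters when the word groups contain short or common fragments.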

You can also put this in a function and return the matching sentences as a new list:

listwords = [['people','suffering','acute'], ['Covid-19','Corona','like'], ['people','must','collectively']]

listsent = ['The number of people suffering acute hunger could almost double.',
            'Lockdowns and global economic recession have',
            'one more shock – like Covid-19 – to push them over the edge',
            'people must collectively act now to mitigate the impact']


def check_words(listwords, listsent):
    listsent_new = []
    # iterate through each sentence
    for sentence in listsent:
        # iterate through each group of words
        for words in listwords:
            # check whether every word in the group occurs in the current sentence
            if all(word in sentence for word in words):
                listsent_new.append(sentence)
                # stop after the first matching group so a sentence is added only once
                break
    return listsent_new


if __name__ == '__main__':
    print(check_words(listwords, listsent))
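
Since the question is tagged list-comprehension, the same logic can also be written as a single comprehension. This is a sketch of an equivalent formulation, not the original answer's code; the name `check_words_compact` is made up for illustration. It uses `any()` over the word groups, so a sentence is kept once even if several groups match it.

```python
listwords = [['people', 'suffering', 'acute'],
             ['Covid-19', 'Corona', 'like'],
             ['people', 'must', 'collectively']]

listsent = ['The number of people suffering acute hunger could almost double.',
            'Lockdowns and global economic recession have',
            'one more shock – like Covid-19 – to push them over the edge',
            'people must collectively act now to mitigate the impact']


def check_words_compact(listwords, listsent):
    # Keep a sentence if at least one word group is fully contained in it.
    return [
        sentence
        for sentence in listsent
        if any(all(word in sentence for word in words) for words in listwords)
    ]


result = check_words_compact(listwords, listsent)
print(result)
```

The third sentence is left out because it contains 'like' and 'Covid-19' but not 'Corona'.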
