使用Python进行MongoDB的聚合查询

2023-04-15 00:00:00 python 查询 聚合

首先,连接到MongoDB:

from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]

假设我们有一个名为“employees”的集合,其中包含以下文档:

employees = [
    {"name": "John", "department": "IT", "salary": 5000},
    {"name": "Jane", "department": "HR", "salary": 6000},
    {"name": "Bob", "department": "IT", "salary": 7000},
    {"name": "Alice", "department": "HR", "salary": 8000},
    {"name": "Tom", "department": "IT", "salary": 9000},
    {"name": "Jerry", "department": "HR", "salary": 10000}
]
db.employees.insert_many(employees)

这里是一些关于聚合查询的示例:

统计部门平均工资

pipeline = [
    {"$group": {"_id": "$department", "avg_salary": {"$avg": "$salary"}}}
]
result = db.employees.aggregate(pipeline)
for doc in result:
    print(doc)

输出结果:

{'_id': 'HR', 'avg_salary': 8666.666666666666}
{'_id': 'IT', 'avg_salary': 7000.0}

按部门分组并获取每个部门的前两个工资最高的员工

pipeline = [
    {"$sort": {"salary": -1}},
    {"$group": {"_id": "$department", "employees": {"$push": "$name"}}},
    {"$project": {"employees": {"$slice": ["$employees", 2]}}}
]
result = db.employees.aggregate(pipeline)
for doc in result:
    print(doc)

输出结果:

{'_id': 'HR', 'employees': ['Jerry', 'Alice']}
{'_id': 'IT', 'employees': ['Tom', 'Bob']}

获取公司工资最高的员工

pipeline = [
    {"$sort": {"salary": -1}},
    {"$limit": 1},
    {"$project": {"_id": 0, "employee_name": "$name"}}
]
result = db.employees.aggregate(pipeline)
for doc in result:
    print(doc)

输出结果:

{'employee_name': 'Jerry'}

这些都是聚合查询的简单示例。使用更复杂的管道可以实现各种不同的聚合查询。

相关文章