使用Python进行MongoDB的聚合查询
首先,连接到MongoDB:
from pymongo import MongoClient client = MongoClient("mongodb://localhost:27017/") db = client["mydatabase"]
假设我们有一个名为“employees”的集合,其中包含以下文档:
employees = [ {"name": "John", "department": "IT", "salary": 5000}, {"name": "Jane", "department": "HR", "salary": 6000}, {"name": "Bob", "department": "IT", "salary": 7000}, {"name": "Alice", "department": "HR", "salary": 8000}, {"name": "Tom", "department": "IT", "salary": 9000}, {"name": "Jerry", "department": "HR", "salary": 10000} ] db.employees.insert_many(employees)
这里是一些关于聚合查询的示例:
统计部门平均工资
pipeline = [ {"$group": {"_id": "$department", "avg_salary": {"$avg": "$salary"}}} ] result = db.employees.aggregate(pipeline) for doc in result: print(doc)
输出结果:
{'_id': 'HR', 'avg_salary': 8666.666666666666} {'_id': 'IT', 'avg_salary': 7000.0}
按部门分组并获取每个部门的前两个工资最高的员工
pipeline = [ {"$sort": {"salary": -1}}, {"$group": {"_id": "$department", "employees": {"$push": "$name"}}}, {"$project": {"employees": {"$slice": ["$employees", 2]}}} ] result = db.employees.aggregate(pipeline) for doc in result: print(doc)
输出结果:
{'_id': 'HR', 'employees': ['Jerry', 'Alice']} {'_id': 'IT', 'employees': ['Tom', 'Bob']}
获取公司工资最高的员工
pipeline = [ {"$sort": {"salary": -1}}, {"$limit": 1}, {"$project": {"_id": 0, "employee_name": "$name"}} ] result = db.employees.aggregate(pipeline) for doc in result: print(doc)
输出结果:
{'employee_name': 'Jerry'}
这些都是聚合查询的简单示例。使用更复杂的管道可以实现各种不同的聚合查询。
相关文章