MongoDB duplicates documents even after adding a unique key
I have created a collection and added a unique key like this:
db.user_services.createIndex({"uid":1 , "sid": 1},{unique:true,dropDups: true})
The collection, "user_services", looks something like this:
{
"_id" : ObjectId("55068b35f791c7f81000002d"),
"uid" : 15,
"sid" : 1,
"rate" : 5
},
{
"_id" : ObjectId("55068b35f791c7f81000002f"),
"uid" : 15,
"sid" : 1,
"rate" : 4
}
Problem:
I am using the PHP driver to insert documents with the same uid and sid, and they are being inserted.
What I want
- On the Mongo shell: add a unique key on uid and sid, with no duplicate documents sharing the same uid and sid.
- On the PHP side: something like MySQL's "insert (value) on duplicate key update rate=rate+1". That is, whenever I try to insert a document, it should be inserted if it is not already there; otherwise the rate field of the existing document should be updated.
Recommended answer
Congratulations, you appear to have found a bug. In my testing this only happens with MongoDB 3.0.0, or at least it is not present in MongoDB 2.6.6. The bug is now recorded as SERVER-17599.
NOTE: This is not actually an "issue"; it has been confirmed as "by design". The dropDups option was dropped in version 3.0.0, although it is still listed in the documentation.
The problem is that the index is not created, and the command errors, when you attempt to build it on a collection that already has duplicates on the "compound key" fields. For the collection above, the index creation should yield this in the shell:
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"errmsg" : "exception: E11000 duplicate key error dup key: { : 15.0, : 1.0 }",
"code" : 11000,
"ok" : 0
}
When there are no duplicates present, you can create the index as you are currently trying to, and it will be created.
So to work around this, first remove the duplicates with a procedure like this:
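// Group by the compound key and collect the _id of every document in each group
db.user_services.aggregate([
    { "$group": {
        "_id": { "uid": "$uid", "sid": "$sid" },
        "dups": { "$push": "$_id" },
        "count": { "$sum": 1 }
    }},
    // Keep only the groups that actually contain duplicates
    { "$match": { "count": { "$gt": 1 } }}
]).forEach(function(doc) {
    doc.dups.shift();    // keep the first document of each group
    db.user_services.remove({ "_id": {"$in": doc.dups }});
});
// With the duplicates gone, the unique index can be created
db.user_services.createIndex({"uid":1 , "sid": 1},{unique:true})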
Further inserts containing duplicate data will then be rejected, and the appropriate error will be reported.
The final note here is that "dropDups" is/was not a very elegant solution for removing duplicate data. You really want something with more control, as demonstrated above.
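As one illustration of what "more control" could mean, here is a minimal plain JavaScript sketch that decides which duplicates to delete while keeping the document with the highest rate in each (uid, sid) group. The helper name idsToRemove and the "keep the highest rate" policy are assumptions made for this example; it works on plain objects and does not talk to MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): given plain document objects,
// return the _id values to delete so that one document per (uid, sid) remains.
// Assumption: the document with the highest "rate" is the one worth keeping.
function idsToRemove(docs) {
    const groups = new Map();
    for (const doc of docs) {
        const key = JSON.stringify([doc.uid, doc.sid]);
        if (!groups.has(key)) groups.set(key, []);
        groups.get(key).push(doc);
    }
    const remove = [];
    for (const group of groups.values()) {
        // Sort a copy so the highest rate comes first, then mark the rest for removal
        const sorted = group.slice().sort((a, b) => b.rate - a.rate);
        for (const doc of sorted.slice(1)) remove.push(doc._id);
    }
    return remove;
}

// Sample data with string ids standing in for ObjectId values
const sample = [
    { _id: "a", uid: 15, sid: 1, rate: 5 },
    { _id: "b", uid: 15, sid: 1, rate: 4 },
    { _id: "c", uid: 16, sid: 1, rate: 2 }
];
console.log(idsToRemove(sample)); // [ 'b' ]
```

The resulting list could then be handed to a remove call with "$in", in the same way the shell procedure does with doc.dups.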
For the second part, rather than use .insert(), use the .update() method. It has an "upsert" option:
$collection->update(
array( "uid" => 1, "sid" => 1 ),
array( '$set' => $someData ),
array( 'upsert' => true )
);
So the "found" documents are "modified" and the documents not found are "inserted". Also see $setOnInsert for a way to set certain fields only when the document is actually inserted, and not when it is modified.
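To make the "modified" versus "inserted" behaviour concrete, the following is a rough in-memory sketch in plain JavaScript, applied to the rate counter from the question. It is not the driver's implementation: upsertSketch is a hypothetical name, the "created" field is invented for the example, and only flat fields with $set, $inc and $setOnInsert are handled.

```javascript
// In-memory sketch of upsert semantics; NOT the driver's implementation.
// upsertSketch is a hypothetical name. Only flat fields and the operators
// $set, $inc and $setOnInsert are handled.
function upsertSketch(collection, query, update) {
    const matches = (doc) => Object.keys(query).every((k) => doc[k] === query[k]);
    let doc = collection.find(matches);
    const inserted = doc === undefined;
    if (inserted) {
        // A new document starts from the equality fields of the query,
        // and $setOnInsert is applied only in this branch
        doc = Object.assign({}, query, update.$setOnInsert || {});
        collection.push(doc);
    }
    Object.assign(doc, update.$set || {});
    for (const [field, amount] of Object.entries(update.$inc || {})) {
        // A field that does not exist yet ends up equal to the increment
        doc[field] = (doc[field] || 0) + amount;
    }
    return inserted ? "inserted" : "modified";
}

const userServices = [];
const change = { $inc: { rate: 1 }, $setOnInsert: { created: "first-visit" } };
console.log(upsertSketch(userServices, { uid: 15, sid: 1 }, change)); // inserted
console.log(upsertSketch(userServices, { uid: 15, sid: 1 }, change)); // modified
console.log(userServices.length, userServices[0].rate);               // 1 2
```

The second call finds the existing document, so only $inc takes effect and no second document is created, which is the "insert or bump the rate" behaviour the question asks for.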
For your specific attempt, the correct syntax of .update() takes three arguments: "query", "update" and "options":
$collection->update(
array( "uid" => 1, "sid" => 1 ),
array(
'$set' => array( "field" => "this" ),
'$inc' => array( "counter" => 1 ),
'$setOnInsert' => array( "newField" => "another" )
),
array( "upsert" => true )
);
No two update operators in the same "update" document are allowed to access the same path. For example, a field that appears under $set cannot also appear under $inc.
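A small pre-flight check can catch this kind of conflict before the update is sent. The sketch below is plain JavaScript and purely illustrative: conflictingPaths is a hypothetical helper, not part of any driver, and it assumes that two paths clash when they are identical or when one is a dotted parent of the other.

```javascript
// Hypothetical checker, not part of any driver: report field paths that are
// touched more than once within a single update document.
function conflictingPaths(update) {
    const seen = [];      // { path, op } pairs visited so far
    const conflicts = [];
    for (const [op, fields] of Object.entries(update)) {
        for (const path of Object.keys(fields)) {
            for (const prev of seen) {
                // Assumption: same path, or one path is a dotted parent of the other
                const overlap = prev.path === path ||
                    prev.path.startsWith(path + ".") ||
                    path.startsWith(prev.path + ".");
                if (overlap) conflicts.push([prev.op + " " + prev.path, op + " " + path]);
            }
            seen.push({ path, op });
        }
    }
    return conflicts;
}

// The update document from the answer touches three different paths
const ok = { $set: { field: "this" }, $inc: { counter: 1 }, $setOnInsert: { newField: "another" } };
// Setting and incrementing the same field in one update is a conflict
const bad = { $set: { rate: 1 }, $inc: { rate: 1 } };
console.log(conflictingPaths(ok).length); // 0
console.log(conflictingPaths(bad));       // [ [ '$set rate', '$inc rate' ] ]
```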