Mongo Map Reduce first time
First-time Map/Reduce user here, using MongoDB. I have a lot of page visit data that I'd like to make sense of using Map/Reduce. Below is basically what I want to do, but as a total beginner at Map/Reduce, I think this is beyond my knowledge!
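- Go through all the pages visited in the last 30 days where external = true.
- Then, for each page, find all the visits.
- Group all visits by referral location.
- For each referral location, calculate how many went on to visit a page that has a certain "type" and also has a certain word in its "tags".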
The database and collection are organized as
$mongo->dbname->visits
An example document is:
{"url": "www.example.com", "type": "a", "refer": {"external": true, "domain": "twitter.com", "url": "http://www.twitter.com/page"}, "page": "1235", "user": "1232", "time": 1234567890}
And then I want to find documents of type b with a certain tag.
{"url": "www.example.com", "type": "b", "page": "745", "user": "1232", "time": 1234567890, "tags": ["a", "b", "c"]}
I'm using the normal Mongo PHP extension, if that makes a difference.
Recommended answer
OK, I've come up with something that I think may do what you want. Note that this may not work exactly, since I'm not 100% sure of your schema: your examples show refer available in type a but not in type b, and I'm not sure whether that's an omission, given that you want to view by referrer. Anyway, here's what I've come up with:
Map function:
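function() {
    // Emit the same shape that reduce returns: counters keyed by type and by tag.
    var obj = {
        "types": {},
        "tags": {}
    };
    obj.types[this.type] = 1;
    if (this.tags) {
        // tag is the key/index; this.tags[tag] is the tag value itself.
        for (var tag in this.tags) {
            obj.tags[this.tags[tag]] = 1;
        }
    }
    // Group by the referring URL.
    emit(this.refer.url, obj);
}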
Reduce function:
function(key, values) {
    var obj = {
        "types": {},
        "tags": {}
    };
    for (var i = 0; i < values.length; i++) {
        // Sum the per-type counters.
        for (var type in values[i].types) {
            // Parentheses matter: !type in obj.types would parse as (!type) in obj.types.
            if (!(type in obj.types)) {
                obj.types[type] = 0;
            }
            obj.types[type] += values[i].types[type];
        }
        // Sum the per-tag counters.
        for (var tag in values[i].tags) {
            if (!(tag in obj.tags)) {
                obj.tags[tag] = 0;
            }
            obj.tags[tag] += values[i].tags[tag];
        }
    }
    return obj;
}
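As a quick sanity check on the guard used when accumulating counters, here is a minimal sketch in plain JavaScript. The names addCount and counts are purely illustrative and not part of the functions above; the point is only that `!` binds tighter than `in`, so the membership test needs its own parentheses.

```javascript
// Assumption: addCount and counts are hypothetical names used only for this illustration.
function addCount(counts, key, n) {
    // Without the inner parentheses, `!key in counts` would parse as
    // `(!key) in counts`, i.e. it would test for a property named "false" or "true".
    if (!(key in counts)) {
        counts[key] = 0;
    }
    counts[key] += n;
    return counts;
}

var counts = addCount(addCount({}, "a", 1), "a", 2);
console.log(counts); // { a: 3 }
```

If the counter were never initialized, the first `+=` would add a number to undefined and produce NaN, which then sticks for every later addition.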
So basically, here's how it works. The map function uses refer.url as the key (my guess, based on your description), so the end result will look like an array of documents whose _id equals refer.url (that is, visits are grouped by URL). Each value is an object containing two sub-objects, types and tags. The reason for that shape is so that map and reduce emit objects of the same format. Other than that, I think it should be relatively self-explanatory (if anything is unclear, I can try to explain more).
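To make the output shape concrete, here is a rough simulation of the same idea in plain JavaScript, runnable outside Mongo. Everything in it is an assumption for illustration: the sample documents are made up (and, unlike the question's type b example, carry a refer field), emit is a stand-in that collects values per key, and reduce is called for every key, whereas Mongo may skip it for keys with a single emitted value.

```javascript
// Made-up sample documents, loosely shaped like the question's example.
var docs = [
    { type: "a", refer: { external: true, url: "http://www.twitter.com/page" } },
    { type: "b", refer: { external: true, url: "http://www.twitter.com/page" }, tags: ["a", "b"] },
    { type: "b", refer: { external: true, url: "http://example.org/" }, tags: ["b"] }
];

// Stand-in for Mongo's emit(): collect the emitted values under each key.
var emitted = {};
function emit(key, value) {
    (emitted[key] = emitted[key] || []).push(value);
}

// Same logic as the map function: one counter object per document.
function map() {
    var obj = { types: {}, tags: {} };
    obj.types[this.type] = 1;
    if (this.tags) {
        for (var i = 0; i < this.tags.length; i++) {
            obj.tags[this.tags[i]] = 1;
        }
    }
    emit(this.refer.url, obj);
}

// Same logic as the reduce function: sum the counters field by field.
function reduce(key, values) {
    var obj = { types: {}, tags: {} };
    values.forEach(function (value) {
        ["types", "tags"].forEach(function (field) {
            for (var name in value[field]) {
                obj[field][name] = (obj[field][name] || 0) + value[field][name];
            }
        });
    });
    return obj;
}

docs.forEach(function (doc) { map.call(doc); });

var results = Object.keys(emitted).map(function (key) {
    return { _id: key, value: reduce(key, emitted[key]) };
});
console.log(JSON.stringify(results, null, 2));
```

Because reduce returns the same shape that map emits, its output can be fed back into reduce again, which matters since Mongo may invoke reduce more than once for the same key.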
So let's implement this in PHP (assuming that $map and $reduce are strings containing the functions above, omitted here for terseness):
// $db is assumed to be the database handle, e.g. $mongo->dbname.
$mapFunc = new MongoCode($map);
$reduceFunc = new MongoCode($reduce);
$query = array(
    // Last 30 days, in seconds: 60 s * 60 min * 24 h * 30 days.
    'time' => array('$gte' => time() - (60*60*24*30)),
    'refer.external' => true
);
$collection = 'visits';
$command = array(
    'mapreduce' => $collection,
    'map' => $mapFunc,
    'reduce' => $reduceFunc,
    'query' => $query,
);
$statsInfo = $db->command($command);
// The command result names the collection that holds the output.
$statsCollection = $db->selectCollection($statsInfo['result']);
$stats = $statsCollection->find();
foreach ($stats as $stat) {
    echo $stat['_id'] . ' Visited ';
    foreach ($stat['value']['types'] as $type => $times) {
        echo "Type $type $times Times, ";
    }
    foreach ($stat['value']['tags'] as $tag => $times) {
        echo "Tag $tag $times Times, ";
    }
    echo "\n";
}
Note: I haven't tested this. It's just what I came up with based on my understanding of your schema and of Mongo's Map/Reduce implementation.
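The 30-day, external-referrer filter can also be sanity-checked outside Mongo. The sketch below is plain JavaScript and only mirrors the intent of the query; matchesQuery, THIRTY_DAYS, and the sample documents are hypothetical, and it assumes time is stored as Unix seconds, as in the example document.

```javascript
// Seconds in 30 days: 60 s * 60 min * 24 h * 30 days.
var THIRTY_DAYS = 60 * 60 * 24 * 30; // 2592000

// Hypothetical helper mirroring the filter: time >= cutoff and refer.external is true.
function matchesQuery(doc, nowSeconds) {
    return doc.time >= nowSeconds - THIRTY_DAYS &&
        Boolean(doc.refer) && doc.refer.external === true;
}

var now = 1234567890; // fixed "current" time so the example is reproducible
var sample = [
    { page: "1", time: now - 100, refer: { external: true } },             // recent, external
    { page: "2", time: now - THIRTY_DAYS - 1, refer: { external: true } }, // too old
    { page: "3", time: now - 100, refer: { external: false } },            // internal referrer
    { page: "4", time: now - 100 }                                         // no referrer at all
];

var kept = sample
    .filter(function (doc) { return matchesQuery(doc, now); })
    .map(function (doc) { return doc.page; });
console.log(kept); // [ '1' ]
```

Documents without a refer field never pass this filter, so with the schema as shown, type b visits would only be counted if they also record their referrer.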