Performance of Neo4j compared to MySQL (how can it be improved?)
This is a follow-up to "Can't reproduce/verify the performance claims in Graph Databases and Neo4j in Action books". I have updated the setup and tests, and don't want to change the original question too much.
The whole story (including scripts etc) is on https://baach.de/Members/jhb/neo4j-performance-compared-to-mysql
Short version: while trying to verify the performance claims made in the 'Graph Databases' book, I came to the following results (querying a random dataset containing n people, with 50 friends each):
My results for 100k people (times in seconds):

depth    neo4j       mysql      python
1        0.010       0.000      0.000
2        0.018       0.001      0.000
3        0.538       0.072      0.009
4        22.544      3.600      0.330
5        1269.942    180.143    0.758

"*": single run only
My results for 1 million people (times in seconds):

depth    neo4j       mysql      python
1        0.010       0.000      0.000
2        0.018       0.002      0.000
3        0.689       0.082      0.012
4        30.057      5.598      1.079
5        1441.397*   300.000    9.791

"*": single run only
Using 1.9.2 on 64-bit Ubuntu, I have set up neo4j.properties with these values:
neostore.nodestore.db.mapped_memory=250M
neostore.relationshipstore.db.mapped_memory=2048M
and neo4j-wrapper.conf with:
wrapper.java.initmemory=1024
wrapper.java.maxmemory=8192
My query to neo4j looks like this (using the REST api):
start person=node:node_auto_index(noscenda_name="person123") match (person)-[:friend]->()-[:friend]->(friend) return count(distinct friend);
Node_auto_index is in place, obviously.
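For reference, here is a minimal sketch of how such a query can be posted to the legacy Cypher REST endpoint of a 1.9 server. It assumes the default http://localhost:7474/db/data/cypher URL; it is not the benchmark script from the linked page, and the person name is just the example from the query above. Note that the measured time includes HTTP and JSON overhead:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Scanner;

public class CypherRestQuery
{
    public static void main( String[] args ) throws Exception
    {
        // Neo4j 1.9 Cypher HTTP endpoint (default location, assumed here)
        URL endpoint = new URL( "http://localhost:7474/db/data/cypher" );

        // Depth-2 friend-of-friend count, same query as in the question
        String payload = "{ \"query\": \"start person=node:node_auto_index(noscenda_name='person123') "
            + "match (person)-[:friend]->()-[:friend]->(friend) "
            + "return count(distinct friend)\" }";

        HttpURLConnection connection = (HttpURLConnection) endpoint.openConnection();
        connection.setRequestMethod( "POST" );
        connection.setRequestProperty( "Content-Type", "application/json" );
        connection.setRequestProperty( "Accept", "application/json" );
        connection.setDoOutput( true );

        long start = System.nanoTime();
        OutputStream out = connection.getOutputStream();
        out.write( payload.getBytes( "UTF-8" ) );
        out.close();

        // Read the whole JSON response
        String response = new Scanner( connection.getInputStream(), "UTF-8" )
            .useDelimiter( "\\A" ).next();
        long elapsedMs = ( System.nanoTime() - start ) / 1000000;

        System.out.println( response );
        System.out.println( "query took " + elapsedMs + " ms (including HTTP overhead)" );
    }
}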
What can I do to speed up Neo4j (so that it is faster than MySQL)?
There is also another benchmark on Stack Overflow with the same problem.
Recommended answer
I'm sorry you can't reproduce the results. However, on a MacBook Air (1.8 GHz i7, 4 GB RAM) with a 2 GB heap, GCR cache, but no warming of caches, and no other tuning, with a similarly sized dataset (1 million users, 50 friends per person), I repeatedly get approx 900 ms using the Traversal Framework on 1.9.2:
import java.util.Iterator;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.traversal.Evaluation;
import org.neo4j.graphdb.traversal.Evaluator;
import org.neo4j.graphdb.traversal.TraversalDescription;
import org.neo4j.kernel.Traversal;
import org.neo4j.kernel.Uniqueness;

import static org.neo4j.graphdb.DynamicRelationshipType.withName;
import static org.neo4j.helpers.collection.IteratorUtil.count;

public class FriendOfAFriendDepth4
{
    // Depth-first traversal over outgoing FRIEND relationships, visiting each
    // node at most once and including only paths of length 4.
    private static final TraversalDescription traversalDescription =
            Traversal.description()
                    .depthFirst()
                    .uniqueness( Uniqueness.NODE_GLOBAL )
                    .relationships( withName( "FRIEND" ), Direction.OUTGOING )
                    .evaluator( new Evaluator()
                    {
                        @Override
                        public Evaluation evaluate( Path path )
                        {
                            if ( path.length() >= 4 )
                            {
                                return Evaluation.INCLUDE_AND_PRUNE;
                            }
                            return Evaluation.EXCLUDE_AND_CONTINUE;
                        }
                    } );

    private final Index<Node> userIndex;

    public FriendOfAFriendDepth4( GraphDatabaseService db )
    {
        this.userIndex = db.index().forNodes( "user" );
    }

    // Returns the depth-4 paths starting from the user with the given name.
    public Iterator<Path> getFriends( String name )
    {
        return traversalDescription.traverse(
                userIndex.get( "name", name ).getSingle() )
                .iterator();
    }

    // Counts the distinct nodes reached at depth 4.
    public int countFriends( String name )
    {
        return count( traversalDescription.traverse(
                userIndex.get( "name", name ).getSingle() )
                .nodes().iterator() );
    }
}
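A minimal usage sketch, not part of the original answer, assuming an embedded 1.9 store at a placeholder path and a user named person123:

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class FriendOfAFriendDepth4Runner
{
    public static void main( String[] args )
    {
        // Placeholder path to the store containing the generated friend graph
        GraphDatabaseService db = new GraphDatabaseFactory()
            .newEmbeddedDatabase( "data/friends.db" );
        try
        {
            FriendOfAFriendDepth4 foaf = new FriendOfAFriendDepth4( db );

            long start = System.nanoTime();
            int friends = foaf.countFriends( "person123" );
            long elapsedMs = ( System.nanoTime() - start ) / 1000000;

            System.out.println( friends + " distinct nodes at depth 4 in "
                + elapsedMs + " ms" );
        }
        finally
        {
            db.shutdown();
        }
    }
}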
Cypher is slower, but nowhere near as slow as you suggest: approx 3 seconds:
START person=node:user(name={name})
MATCH (person)-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->(friend)
RETURN count(friend)
Kind regards
Ian
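For completeness, one way to run the parameterized Cypher query above from embedded Java in 1.9 is through the Cypher ExecutionEngine. The following sketch is not from the original answer; the store path and the name value are placeholders:

import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.helpers.collection.MapUtil;

public class FriendOfAFriendCypher
{
    public static void main( String[] args )
    {
        // Placeholder path to the embedded store
        GraphDatabaseService db = new GraphDatabaseFactory()
            .newEmbeddedDatabase( "data/friends.db" );
        try
        {
            ExecutionEngine engine = new ExecutionEngine( db );

            // Same depth-4 query as above, with the name passed as a parameter
            ExecutionResult result = engine.execute(
                "START person=node:user(name={name}) "
                + "MATCH (person)-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->(friend) "
                + "RETURN count(friend)",
                MapUtil.map( "name", "person123" ) );

            System.out.println( result.dumpToString() );
        }
        finally
        {
            db.shutdown();
        }
    }
}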