Basic Usage of HugeGraph

2022-04-25 00:00:00

HugeGraphServer
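HugeGraphServer (HGserver below) wraps a REST API that exposes several kinds of resources. The five most common are listed here (testing them in Postman is recommended):

graph: query vertices and edges (e.g. IP:8080/graphs/hugegraph/graph/vertices)
schema: query vertexlabels, propertykeys, edgelabels, and indexlabels (e.g. IP:8080/graphs/hugegraph/schema/vertexlabels)
gremlin: execute Gremlin statements, either synchronously or asynchronously (e.g. g.V())
traverser: various queries, including shortest path, crosspoints, and N-step reachable neighbors
task: query and delete asynchronous tasks
The first few are the usual CRUD operations plus some informational output. This article focuses on gremlin, traverser, and task,
and especially on asynchronous tasks.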
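
The graph-scoped resources above share one URL layout, which can be sketched as a small helper. This is only an illustration: the function name `graph_url` and the default host `http://localhost:8080` are assumptions, not part of HugeGraph. The synchronous gremlin endpoint lives at `/gremlin`, outside the `/graphs` prefix, so it is not covered here.

```python
def graph_url(path, graph="hugegraph", base="http://localhost:8080"):
    """Build the URL of a graph-scoped REST resource.

    `base` and `graph` are placeholders; replace them with your own
    server address and graph name.
    """
    return f"{base}/graphs/{graph}/{path.lstrip('/')}"


# Paths taken from the examples in the article.
vertices_url = graph_url("graph/vertices")
vertexlabels_url = graph_url("schema/vertexlabels")
tasks_url = graph_url("tasks")
```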

Gremlin (synchronous + asynchronous)

Synchronous mode
// Send a point lookup equivalent to g.V(vid)
GET http://IP:8080/gremlin?gremlin=hugegraph.traversal().V('1:jin')

// POST form. Empty fields can be omitted
POST http://IP:8080/gremlin
{
  "gremlin": "g.E('S1:jin>1>>S1:tom')",
  "bindings": {},
  "aliases": {
    "graph": "hugegraph",
    "g": "__g_hugegraph"
  }
}
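
The synchronous POST can be sketched in Python with the standard library. The function name `build_gremlin_request`, the default host, and the `Content-Type` header are assumptions for illustration. The body fields mirror the request shown in the article, and the request is only built, not sent.

```python
import json
import urllib.request


def build_gremlin_request(gremlin, graph="hugegraph", base="http://localhost:8080"):
    """Build (but do not send) a synchronous Gremlin POST request."""
    body = {
        "gremlin": gremlin,
        "bindings": {},
        # Alias the graph and its traverser so the statement can use `g`.
        "aliases": {"graph": graph, "g": f"__g_{graph}"},
    }
    return urllib.request.Request(
        f"{base}/gremlin",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = build_gremlin_request("g.E('S1:jin>1>>S1:tom')")
```

To actually send it against a running server, pass `req` to `urllib.request.urlopen`.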

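Asynchronous mode
// Asynchronous mode (apparently POST only, and the language field must be specified)
// [Here g.V() can be used directly; g is equivalent to hugegraph.traversal()]
POST http://IP:8080/graphs/hugegraph/jobs/gremlin
{
  "gremlin": "g.V().count()",
  "language": "gremlin-groovy"
}
// Returns a task ID; use tasks/taskID (described below) to query the result
{"task_id": 5}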
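
A minimal sketch of the asynchronous submission, again using only the standard library. `build_async_gremlin_request` and `parse_task_id` are hypothetical helper names, and the host and `Content-Type` header are assumptions. The endpoint, body fields, and response shape come from the example above.

```python
import json
import urllib.request


def build_async_gremlin_request(gremlin, graph="hugegraph", base="http://localhost:8080"):
    """Build (but do not send) an asynchronous Gremlin job request."""
    body = {"gremlin": gremlin, "language": "gremlin-groovy"}
    return urllib.request.Request(
        f"{base}/graphs/{graph}/jobs/gremlin",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def parse_task_id(response_text):
    """Extract the task ID from a response such as {"task_id": 5}."""
    return json.loads(response_text)["task_id"]


job_req = build_async_gremlin_request("g.V().count()")
task_id = parse_task_id('{"task_id": 5}')  # sample response from the article
```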

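There is a small difference, though: passing the earlier g.V(vid) statement directly raises an error, which is why the synchronous GET example uses the graph instance object directly and obtains its traverser before querying. According to the official documentation, however:

You can add aliases for the graph and its traverser via "aliases": {"graph": "hugegraph", "g": "__g_hugegraph"} and then operate through the aliases. Here hugegraph is a natively existing variable, while __g_hugegraph is an extra variable added by HugeGraphServer; every graph has a corresponding traverser object named in this format (__g_${graph}).
In this case the response body is structured differently from that of the other vertex or edge RESTful APIs, so users may need to parse it themselves.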

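Task (asynchronous)

Because Gremlin runs tasks synchronously by default, many moderately heavy computations, such as g.V().count(), time out or seriously degrade usability. HG therefore provides an asynchronous way to run tasks, along with endpoints for querying task information.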

// 1. List all tasks
GET http://IP:8080/graphs/hugegraph/tasks (optionally append "?status=SUCCESS")

// Response
{
  "tasks": [{
    "task_name": "INDEX_LABEL:3:createdByDateFull",
    "task_progress": 0,
    "task_create": 1546831919171,
    "task_status": "success",
    "task_update": 1546831919752,
    "task_retries": 0,
    "id": 3,
    "task_type": "rebuild_index",
    "task_callable": "com.baidu.hugegraph.job.schema.RebuildIndexCallable"
  }....}

// 2. Look up one specific task by passing its taskID
GET http://IP:8080/graphs/hugegraph/tasks/taskID

// Result of running g.V().count(); the returned value did not look right at first (cause to be confirmed?)
// Update: the statement should be changed to g.V().count().next(); by default it is the traverser that gets counted
{
  "task_name": "g.V().count()",
  "task_progress": 0,
  "task_create": 1546857683640,
  "task_status": "success",
  "task_update": 1546857684513,
  "task_result": "1",
  "task_retries": 0,
  "id": 5,
  "task_type": "gremlin",
  "task_callable": "com.baidu.hugegraph.api.job.GremlinAPI$GremlinJob",
  "task_input": "{\"gremlin\":\"g.V().count()\",\"bindings\":{},\"language\":\"gremlin-groovy\",\"aliases\":{\"hugegraph\":\"graph\"}}"
}
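
Since the result of an asynchronous job only appears once the task finishes, a client typically polls the task endpoint. The sketch below shows one way to do that. `wait_for_task` is a hypothetical helper, and `fetch_task` is any callable you supply that performs the GET on tasks/taskID and returns the decoded JSON. Only the "success" status appears in the article; treating "failed" and "cancelled" as terminal is an assumption.

```python
import time

# Only "success" is shown in the article; the other two are assumed.
TERMINAL_STATUSES = {"success", "failed", "cancelled"}


def wait_for_task(fetch_task, task_id, interval=1.0, max_polls=60):
    """Poll a task until it reaches a terminal status or polls run out.

    fetch_task(task_id) must return the task as a dict, e.g. the decoded
    body of GET /graphs/hugegraph/tasks/{task_id}.
    """
    for _ in range(max_polls):
        task = fetch_task(task_id)
        if task.get("task_status") in TERMINAL_STATUSES:
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


# Usage with a stand-in fetcher that finishes on the second poll.
_responses = iter([
    {"id": 5, "task_status": "running"},
    {"id": 5, "task_status": "success", "task_result": "1"},
])
finished = wait_for_task(lambda task_id: next(_responses), 5, interval=0)
```

Injecting the fetcher keeps the polling logic independent of any particular HTTP client.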

Traverser

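This resource essentially wraps graph algorithms that Gremlin handles awkwardly, such as PageRank, K-out, K-step neighbors, shortest path, all-paths queries, community detection, and batch queries. The effect is close to TigerGraph's function wrappers: pass in the relevant parameters and get the desired result directly, with no need to piece together lengthy Gremlin statements. The same approach can be used to wrap many common queries at the business layer, which can greatly improve usability and efficiency.

A few typical examples follow: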

// 1. Shortest path (returns one shortest path)
GET http://localhost:8080/graphs/hugegraph/traversers/shortestpath?source=1&target=12345&max_depth=5&direction=OUT
// Response
{
  "path": [1,27,76,582,12345]
}

// 2. K-step neighbors
GET http://localhost:8080/graphs/hugegraph/traversers/kneighbor?source=1&depth=5&direction=OUT
// Response
{
  "vertices": [12,27,556,1113,233] .....
}

// 3. Batch query of vertices/edges (pass multiple ids directly)
GET http://localhost:8080/graphs/hugegraph/traversers/vertices?ids="1:4.4.4.4"&ids="1:5.5.5.5"&ids="1:8.8.8.8"
// Response
{
  "vertices": [
    {"id": "1:4.4.4.4", "label": "book", "type": "vertex", "properties":{"name":[{"id": "..",…},
    {"id": "1:5.5.5.5", "label": "book", "type": "vertex", "properties":{"name":[{"id": "..",…},
    {"id": "1:8.8.8.8", "label": "book", "type": "vertex", "properties":{"name":[{"id": "..",…}
  ]
}
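
The traverser calls above are plain GET requests with query parameters, so a single helper can build all of them. This is a sketch under assumptions: `traverser_url` is a hypothetical name and the host is a placeholder. Parameter names are copied from the examples, and `doseq=True` expands a list into a repeated parameter such as `ids`. Note that `urlencode` percent-encodes the quotes and colons in the ids.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def traverser_url(name, params, graph="hugegraph", base="http://localhost:8080"):
    """Build a traverser GET URL; list values become repeated parameters."""
    query = urlencode(params, doseq=True)
    return f"{base}/graphs/{graph}/traversers/{name}?{query}"


shortest = traverser_url(
    "shortestpath",
    {"source": 1, "target": 12345, "max_depth": 5, "direction": "OUT"},
)
batch = traverser_url("vertices", {"ids": ['"1:4.4.4.4"', '"1:5.5.5.5"']})

# Decoding the query string recovers the original ids.
decoded_ids = parse_qs(urlparse(batch).query)["ids"]
```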

// 4.1 Get shard information (this seems to cover all vertices?... so what is the intent)
GET http://localhost:8080/graphs/hugegraph/traversers/vertices/shards?split_size=67108864
// Response
{
  "shards":[
    {
      "start": "0",
      "end": "1234567",
      "length": 0
    },
    {
      "start": "1234567",
      "end": "3456789",
      "length": 0
    }......]
}

// 4.2 Then use scan with a shard's start and end to fetch that batch of vertices
GET http://localhost:8080/graphs/hugegraph/traversers/vertices/scan?start=554189328&end=692736660
// Response (same as the multi-id batch query, so not repeated here)
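
Steps 4.1 and 4.2 combine naturally: each shard returned by the shards call supplies the start and end for one scan call. The sketch below turns a shards response into the matching scan URLs. `scan_urls` is a hypothetical helper and the host is a placeholder. The sample input reuses the two shards shown above.

```python
from urllib.parse import urlencode


def scan_urls(shards_response, graph="hugegraph", base="http://localhost:8080"):
    """Return one vertices/scan URL per shard in a shards response."""
    urls = []
    for shard in shards_response["shards"]:
        query = urlencode({"start": shard["start"], "end": shard["end"]})
        urls.append(f"{base}/graphs/{graph}/traversers/vertices/scan?{query}")
    return urls


sample = {
    "shards": [
        {"start": "0", "end": "1234567", "length": 0},
        {"start": "1234567", "end": "3456789", "length": 0},
    ]
}
urls = scan_urls(sample)
```

Fetching the URLs one by one, or from several workers, walks the vertex set in batches, which may be the purpose of the shard endpoint.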

For the related implementation, see HugeTraverser.java
————————————————
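Copyright notice: this is an original article by CSDN blogger "lingxingzhang", released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/guajidai0165/article/details/106960865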
