Google Datastore Study Notes

2022-04-08 00:00:00 query, data, multiple, specified, return

Using Google Cloud SQL on Google App Engine costs money, so I decided to learn about Datastore, the free non-relational database that Google provides.

Its features include:

No planned downtime
Atomic transactions
High availability of reads and writes
Strong consistency for reads and ancestor queries
Eventual consistency for all other queries

The official documentation describes Datastore as follows:
The Datastore holds data objects known as entities. An entity has one or more properties, named values of one of several supported data types: for instance, a property can be a string, an integer, or a reference to another entity. Each entity is identified by its kind, which categorizes the entity for the purpose of queries, and a key that uniquely identifies it within its kind. The Datastore can execute multiple operations in a single transaction. By definition, a transaction cannot succeed unless every one of its operations succeeds; if any of the operations fails, the transaction is automatically rolled back. This is especially useful for distributed web applications, where multiple users may be accessing or manipulating the same data at the same time.

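In short: Datastore stores data objects as entities. An entity has one or more properties, and each property value has one of the supported data types, such as a string, an integer, or a reference to another entity. The kind groups entities for querying, and the key uniquely identifies an entity within its kind. Several operations can be combined into one transaction, which succeeds only if every operation succeeds; if any operation fails, the whole transaction is rolled back. This matters in distributed web applications, where multiple users may read or modify the same data at the same time.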
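
To make the all-or-nothing behaviour concrete, here is a minimal sketch in plain Python. It is not the Datastore API: `ToyStore`, `run_in_transaction`, `transfer`, and the `Account` kind are hypothetical names invented for illustration, and the rollback is simulated by working on a copy of an in-memory dict.

```python
import copy


class ToyStore:
    """In-memory stand-in for a datastore; maps (kind, identifier) keys to property dicts."""

    def __init__(self):
        self.entities = {}

    def run_in_transaction(self, operations):
        """Apply every operation or none: work on a copy and commit only on success."""
        working_copy = copy.deepcopy(self.entities)
        try:
            for operation in operations:
                operation(working_copy)
        except Exception:
            return False  # rolled back: self.entities was never touched
        self.entities = working_copy
        return True


def transfer(amount):
    """Build two operations that move `amount` from Alice's account to Bob's."""

    def credit(entities):
        entities[("Account", "bob")]["balance"] += amount

    def debit(entities):
        account = entities[("Account", "alice")]
        if account["balance"] < amount:
            raise ValueError("insufficient funds")
        account["balance"] -= amount

    return [credit, debit]  # credit runs first, so a failed debit must undo it


store = ToyStore()
store.entities[("Account", "alice")] = {"balance": 50}
store.entities[("Account", "bob")] = {"balance": 10}

ok_small = store.run_in_transaction(transfer(30))   # both operations succeed
ok_large = store.run_in_transaction(transfer(100))  # debit fails, credit is undone
```

After the second call the balances are the same as after the first one, because the failed debit discards the credit that ran before it.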

Comparison with traditional databases

Unlike traditional relational databases, the Datastore uses a distributed architecture to automatically manage scaling to very large data sets. While the Datastore interface has many of the same features as traditional databases, it differs from them in the way it describes relationships between data objects. Entities of the same kind can have different properties, and different entities can have properties with the same name but different value types.

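That is, Datastore scales to very large data sets automatically thanks to its distributed architecture. It shares many features with relational databases; the biggest difference is how it describes relationships between data objects. Entities of the same kind can have different properties, and different entities can have properties with the same name but different value types.

These unique characteristics imply a different way of designing and managing data to take advantage of the ability to scale automatically. In particular, the Datastore differs from a traditional relational database in the following important ways: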

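In other words, data has to be designed and managed differently in order to benefit from Datastore's automatic scaling. The main differences from a traditional relational database are: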
The Datastore is designed to scale, allowing applications to maintain high performance as they receive more traffic:
Datastore writes scale by automatically distributing data as necessary.
Datastore reads scale because the only queries supported are those whose performance scales with the size of the result set (as opposed to the data set). This means that a query whose result set contains 100 entities performs the same whether it searches over a hundred entities or a million. This property is the key reason some types of queries are not supported.
Because all queries are served by pre-built indexes, the types of queries that can be executed are more restrictive than those allowed on a relational database with SQL. In particular, the following are not supported:
Join operations
Inequality filtering on multiple properties
Filtering of data based on results of a subquery
Unlike traditional relational databases, the Datastore doesn't require entities of the same kind to have a consistent property set (although you can choose to enforce such a requirement in your own application code).
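
The restriction on inequality filters can be pictured as a check performed before a query runs. The sketch below is my own illustration, not Datastore code: `check_filters` and the `(property, operator, value)` tuple format are assumptions, and the operator set is limited to `<`, `<=`, `>`, `>=`.

```python
INEQUALITY_OPERATORS = {"<", "<=", ">", ">="}


def check_filters(filters):
    """Raise ValueError if inequality filters touch more than one property.

    `filters` is a list of (property_name, operator, value) tuples.
    """
    inequality_properties = {
        name for name, operator, _ in filters if operator in INEQUALITY_OPERATORS
    }
    if len(inequality_properties) > 1:
        raise ValueError(
            "inequality filters on multiple properties: "
            + ", ".join(sorted(inequality_properties))
        )
    return True


# Several inequality filters on the same property, plus an equality filter: accepted.
allowed = [("age", ">=", 18), ("age", "<", 65), ("city", "=", "Paris")]

# Inequality filters on two different properties: rejected.
rejected = [("age", ">=", 18), ("salary", "<", 5000)]
```

Calling `check_filters(allowed)` returns `True`, while `check_filters(rejected)` raises `ValueError`.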

entity

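Objects in Datastore are called entities. An entity has one or more properties, and each property has one or more values. A property value can be an integer, a floating-point number, a string, a date, binary data, and so on. A query on a property with multiple values tests whether any of those values meets the query criteria.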

Note: Datastore does not require entities of a given kind to have the same properties, nor does it require the values of a given property to all have the same data type.
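
A small sketch of these two points, using plain dicts rather than real Datastore entities. The `Employee` data, the property names, and the helper `matches_equal` are all made up for illustration; the helper assumes that an entity lacking the filtered property never matches.

```python
def matches_equal(entity, property_name, wanted):
    """Return True if the property equals `wanted`, or if any of its multiple values does."""
    if property_name not in entity:
        return False  # assumption: entities lacking the property never match
    value = entity[property_name]
    values = value if isinstance(value, list) else [value]
    return wanted in values


# Two entities of the same (hypothetical) kind "Employee" with different property sets.
employees = [
    {"name": "Ann", "skills": ["python", "sql"], "hired": 2019},
    {"name": "Bob", "skills": "sql", "nickname": "bobby"},
]

sql_people = [e["name"] for e in employees if matches_equal(e, "skills", "sql")]
python_people = [e["name"] for e in employees if matches_equal(e, "skills", "python")]
hired_2019 = [e["name"] for e in employees if matches_equal(e, "hired", 2019)]
```

Ann matches the `sql` filter because one of her two `skills` values matches; Bob has no `hired` property at all, so he is simply absent from `hired_2019`.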
Kinds, keys, and identifiers
A kind denotes the type of an entity. For example, a human resources application might represent each employee with an entity of kind Employee.

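Each entity also has a key that uniquely identifies it. The key consists of the following components:

The kind of the entity

An identifier, which is either a key name string or an integer ID

An optional ancestor path that locates the entity within the Datastore hierarchy

The identifier is assigned when the entity is created. Because it is part of the entity's key, it is permanently bound to the entity and cannot be changed. It can be assigned in either of two ways:

Pass a string key name when constructing the entity

Use the default constructor, in which case Datastore automatically assigns an integer ID as the identifier.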
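
The two ways of assigning an identifier can be modelled roughly as follows. `ToyKeyFactory` and `make_key` are hypothetical names, and the sequential counter is only a simplification: the text above does not say how Datastore chooses the integer IDs it assigns.

```python
import itertools


class ToyKeyFactory:
    """Hands out (kind, identifier) keys; an integer ID is allocated when no name is given."""

    def __init__(self):
        self._next_id = itertools.count(1)  # simplification: sequential IDs

    def make_key(self, kind, name=None):
        if name is not None:
            if not isinstance(name, str):
                raise TypeError("a key name must be a string")
            return (kind, name)
        return (kind, next(self._next_id))


factory = ToyKeyFactory()
named_key = factory.make_key("Employee", "asalieri")  # caller-supplied key name
auto_key_1 = factory.make_key("Employee")             # automatically assigned integer ID
auto_key_2 = factory.make_key("Employee")
```

The keys are tuples, so they cannot be modified after creation, which mirrors the rule that an identifier never changes.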

Ancestor paths
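The entity hierarchy in Datastore resembles the directory structure of a file system. When you create an entity, you can designate another entity as its parent; the new entity is then a child of the parent entity (unlike in a file system, the parent entity need not actually exist). An entity without a parent is a root entity. The association between an entity and its parent is permanent and cannot be changed once the entity is created. Datastore never assigns the same numeric ID to two entities with the same parent, or to two root entities. An entity's parent, its parent's parent, and so on are its ancestors; likewise, its children, its children's children, and so on are its descendants. A root entity and all of its descendants belong to the same entity group. An entity's ancestor path is the sequence of entities from the root entity down to the entity itself.
The complete key includes the entity's ancestor path, written as a sequence of kind-identifier pairs: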

[Person:GreatGrandpa, Person:Grandpa, Person:Dad, Person:Me]
For a root entity, the complete key has the following form:
[Person:GreatGrandpa]
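
The key examples above can be reproduced with a small model. This is only a sketch of the idea: `ToyKey` and its methods are hypothetical and unrelated to the classes in the real client libraries.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ToyKey:
    """Immutable key: a kind, an identifier, and an optional parent key."""

    kind: str
    identifier: Union[str, int]
    parent: Optional["ToyKey"] = None

    def path(self):
        """Return the ancestor path from the root entity down to this key."""
        prefix = self.parent.path() if self.parent else []
        return prefix + [(self.kind, self.identifier)]

    def root(self):
        """Return the root key, which identifies the entity group."""
        return self if self.parent is None else self.parent.root()

    def __str__(self):
        return "[" + ", ".join(f"{kind}:{ident}" for kind, ident in self.path()) + "]"


great_grandpa = ToyKey("Person", "GreatGrandpa")
grandpa = ToyKey("Person", "Grandpa", parent=great_grandpa)
dad = ToyKey("Person", "Dad", parent=grandpa)
me = ToyKey("Person", "Me", parent=dad)
```

Here `str(me)` gives `[Person:GreatGrandpa, Person:Grandpa, Person:Dad, Person:Me]` and `str(great_grandpa)` gives `[Person:GreatGrandpa]`. Because the dataclass is frozen, the parent cannot be reassigned after creation, and all four keys share the same `root()`, that is, the same entity group.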
Queries and indexes
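Besides retrieving entities from Datastore directly by key, an application can run a query to retrieve them by the values of their properties. A query operates on entities of a given kind; it can specify filters on the entities' property values, keys, and ancestors, and it returns zero or more entities as results. A query can also specify sort orders for the results. The results include only entities that satisfy all of the query's filters. A query can return entire entities (similar to SELECT * in SQL), projected entities, or just entity keys.
A typical query includes the following parts:

An entity kind to which the query applies
Zero or more filters based on the entities' property values, keys, and ancestors (similar to WHERE in SQL)

Zero or more sort orders to sequence the results

When executed, the query retrieves all entities of the given kind that satisfy all of the given filters, sorted in the specified order.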

(To conserve memory and improve performance, a query should limit the number of results it returns whenever possible.)
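
The parts of a query (kind, filters, sort order, limit, keys-only results) can be sketched over an in-memory list. `run_query` and the sample data are hypothetical. Unlike Datastore, this toy scans every entity instead of using pre-built indexes, and it assumes that entities lacking the sort property are left out of sorted results.

```python
def run_query(entities, kind, filters=(), order=None, limit=None, keys_only=False):
    """Toy query over a list of dicts; each dict has "kind", "key", and "properties"."""
    results = [
        e for e in entities
        if e["kind"] == kind
        and all(predicate(e["properties"]) for predicate in filters)
    ]
    if order is not None:
        property_name, descending = order
        # Assumption: entities lacking the sort property are left out of sorted results.
        results = [e for e in results if property_name in e["properties"]]
        results.sort(key=lambda e: e["properties"][property_name], reverse=descending)
    if limit is not None:
        results = results[:limit]
    return [e["key"] for e in results] if keys_only else results


people = [
    {"kind": "Person", "key": "alice", "properties": {"age": 34, "city": "Paris"}},
    {"kind": "Person", "key": "bob", "properties": {"age": 27, "city": "Paris"}},
    {"kind": "Person", "key": "carol", "properties": {"age": 41, "city": "Rome"}},
    {"kind": "Pet", "key": "rex", "properties": {"age": 3, "city": "Paris"}},
]

# Kind + one filter + ascending sort, returning only keys.
parisians = run_query(
    people,
    kind="Person",
    filters=[lambda p: p.get("city") == "Paris"],
    order=("age", False),
    keys_only=True,
)

# Kind + descending sort + limit of one result.
oldest = run_query(people, kind="Person", order=("age", True), limit=1, keys_only=True)
```

`parisians` contains bob and then alice (sorted by age), and `oldest` contains only carol; the `Pet` entity is never returned because the query is restricted to the `Person` kind.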

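A query can also include an ancestor filter, which limits the results to the entity group descended from a specified ancestor. Such a query is called an ancestor query. The results of an ancestor query are strongly consistent; by contrast, an ordinary query is only eventually consistent, so its results may contain stale data.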
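
Which entities an ancestor filter selects can be shown with ancestor paths stored as tuples. This sketch only models the selection, not the consistency guarantees; `ancestor_query` and the sample paths are hypothetical, and it assumes the ancestor itself is part of the result.

```python
def ancestor_query(paths, ancestor):
    """Return the paths that start with `ancestor`, i.e. the ancestor and its descendants.

    Each path is a tuple of (kind, identifier) pairs from the root entity downwards.
    """
    n = len(ancestor)
    return [path for path in paths if path[:n] == ancestor]


grandpa = (("Person", "Grandpa"),)
dad = grandpa + (("Person", "Dad"),)
me = dad + (("Person", "Me"),)
stranger = (("Person", "Stranger"),)

all_paths = [grandpa, dad, me, stranger]
dads_group = ancestor_query(all_paths, dad)
grandpas_group = ancestor_query(all_paths, grandpa)
```

`dads_group` holds the paths of `dad` and `me`, while `grandpas_group` also includes `grandpa`; `stranger` belongs to a different entity group and appears in neither.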
