在 IndexedDB 中,有没有办法进行排序的复合查询?
假设一个表有姓名、ID、年龄、性别、教育等.ID 是键,该表还为姓名、年龄和性别建立索引.我需要所有 25 岁以上的男学生按姓名排序.
Say a table has, name, ID, age, sex, education, etc. ID is the key and the table is also indexed for name, age and sex. I need all male students, older than 25, sorted by their names.
这在 mySQL 中很容易:
This is easy in mySQL:
SELECT * FROM table WHERE age > 25 AND sex = "M" ORDER BY name
IndexDB 允许创建索引并根据该索引对查询进行排序.但它不允许多个查询,如年龄和性别.我发现了一个名为 queryIndexedDB (https://github.com/philikon/queryIndexedDB) 的小型库,它允许复合查询但不提供排序结果.
IndexDB allows creation of an index and orders the query based on that index. But it doesn't allow multiple queries like age and sex. I found a small library called queryIndexedDB (https://github.com/philikon/queryIndexedDB) which allows compound queries but doesn't provide sorted results.
那么有没有办法在使用 IndexedDB 的同时进行排序的复合查询?
So is there a way to make a sorted compound query, while using IndexedDB?
推荐答案
本回答中使用的术语复合查询是指在其 WHERE 子句中涉及多个条件的 SQL SELECT 语句.尽管 indexedDB 规范中未提及此类查询,但您可以通过使用由属性名称数组组成的 keypath 创建索引来近似复合查询的行为.
The term compound query as used in this answer refers to an SQL SELECT statement involving more than one condition in its WHERE clause. Although such queries are not mentioned in the indexedDB specification, you can approximate the behavior of a compound query by creating an index with a keypath that consists of an array of property names.
这与创建索引时使用多条目标志完全无关.多条目标志调整 indexedDB 如何在单个数组属性上创建索引.我们正在索引一个对象属性数组,而不是对象的单个数组属性的值.
This is completely unrelated to using the multi-entry flag when creating an index. The multi-entry flag adjusts how indexedDB creates an index over a single array property. We are indexing an array of object properties, not the values of a single array property of an object.
在此示例中,name"、gender"和age"对应于存储在学生对象存储中的学生对象的属性名称.
In this example, 'name', 'gender', and 'age' correspond to property names of student objects stored within the students object store.
// An example student object in the students store
var foo = {
'name': 'bar',
'age': 15,
'gender': 'M'
};
function myOnUpgradeNeeded(event) {
var db = event.target.result;
var students = db.createObjectStore('students');
var name = 'males25';
var keyPath = ['name', 'gender', 'age'];
students.createIndex(name, keyPath);
}
在索引上打开一个游标
然后你可以在索引上打开一个游标:
Opening a cursor on the index
You can then open a cursor on the index:
var students = transaction.objectStore('students');
var index = students.index('males25');
var lowerBound = ['AAAAA','male',26];
var upperBound = ['ZZZZZ','male',200];
var range = IDBKeyRange.bound(lowerBound, upperBound);
var request = index.openCursor(range);
然而,由于我将要解释的原因,这并不总是有效.
However, for reasons I am about to explain, this won't always work.
旁白:使用范围参数来 openCursor 或 get 是可选的.如果您不指定范围,则 IDBKeyRange.only
将隐式用于您.换句话说,你只需要对有界游标使用IDBKeyRange
.
Aside: using a range parameter to openCursor or get is optional. If you do not specify a range, then IDBKeyRange.only
is implicitly used for you. In other words, you only need to use IDBKeyRange
for bounded cursors.
索引就像对象存储,但不是直接可变的.相反,您在引用的对象存储上使用 CRUD(创建读取更新删除)操作,然后 indexedDB 自动将更新级联到索引.
Indices are like object stores but are not directly mutable. Instead, you use CRUD (create read update delete) operations on the referenced object store, and then indexedDB automatically cascades updates to the index.
理解排序是理解索引的基础.索引基本上只是一个经过特殊排序的对象集合.从技术上讲,它也被过滤了,但我稍后会谈到这一点.通常,当您在索引上打开游标时,您是根据索引的顺序进行迭代的.该顺序可能并且很可能不同于被引用对象存储中的对象的顺序.顺序很重要,因为这样可以提高迭代效率,并允许自定义下限和上限,这仅在特定于索引的顺序的上下文中才有意义.
Understanding sorting is fundamental to understanding indices. An index is basically just a specially sorted collection of objects. Technically, it is also filtered, but I'll touch on that in a moment. Generally, when you open a cursor on an index, you are iterating according to the index's order. This order could be, and probably is, different than the order of the objects in the referenced object store. The order is important because this allows iteration to be more efficient, and allows a custom lower and upper bound that only makes sense in the context of an index-specific order.
索引中的对象在存储发生更改时进行排序.当您将一个对象添加到存储中时,它会被添加到索引中的适当位置.排序归结为一个比较函数,类似于 Array.prototype.sort,它比较两个项目并返回一个对象是否小于另一个、大于另一个或相等.因此,我们可以通过深入了解比较函数的更多细节来更好地理解排序行为.
The objects in the index are sorted at the time changes to the store occur. When you add an object to the store, it is added to the proper position in the index. Sorting boils down to a comparison function, similar to Array.prototype.sort, that compares two items and returns whether one object is less than the other one, greater than the other one, or equal. So we can understand sorting behavior better by diving into more details on comparison functions.
这意味着,例如,'Z' 小于 'a',并且 string '10' 大于 string '020'.
This means, for example, that 'Z' is less than 'a' and that the string '10' is greater than the string '020'.
例如,规范指定字符串类型值如何出现在日期类型值之前或之后.值包含什么并不重要,重要的是类型.
For example, the specification specifies how a string-type value comes before or after a date-type value. It does not matter what the values contain, just the types.
IndexedDB 不会为您强制类型.你可以在这里用脚射击自己.您通常不想比较不同的类型.
IndexedDB does not coerce types for you. You can shoot yourself in the foot here. You generally never want to be comparing different types.
正如我所提到的,索引可能并不总是包含引用对象存储中的所有对象.将对象放入对象存储时,如果该对象缺少索引所基于的属性的值,则该对象将不会出现在索引中.例如,如果我们有一个不知道年龄的学生,我们将其插入到学生存储中,该特定学生将不会出现在男性 25 索引中.
As I mentioned, indices may not always include all objects from the referenced object store. When you put an object into an object store, the object will not appear in the index if it has missing values for the properties upon which the index is based. For example, if we have a student where we don't know the age, and we insert this into the students store, the particular student will not appear in the males25 index.
当您想知道为什么在索引上迭代游标时对象没有出现时请记住这一点.
Remember this when you wonder why an object doesn't appear when iterating a cursor on the index.
还要注意 null 和空字符串之间的细微差别.空字符串不是缺失值.属性为空字符串的对象仍然可以出现在基于该属性的索引中,但如果该属性存在但未定义或不存在,则不会出现在索引中.如果它不在索引中,则在索引上迭代游标时您将看不到它.
Also note the subtle difference between null and an empty string. An empty string is not a missing value. An object with an empty string for a property could still appear in an index based on that property, but will not appear in the index if the property is present but undefined or not present. And if it is not in the index, you won't see it when iterating a cursor over the index.
创建下限或上限以在该范围内打开光标时使用的下限或上限时,您必须为数组键路径中的每个属性指定一个有效值.否则,您将收到某种类型的 Javascript 错误(因浏览器而异).例如,您不能创建诸如 IDBKeyRange.only([undefined, 'male', 25])
这样的范围,因为 name 属性未定义.
You must specify a valid value for each property in the array keypath when creating a lower or upper bound to use in a range for when opening a cursor over that range. Otherwise, you will get some type of Javascript error (varies by browser). For example, you cannot create a range such as IDBKeyRange.only([undefined, 'male', 25])
because the name property is undefined.
令人困惑的是,如果您指定了错误的类型值,例如 IDBKeyRange.only(['male', 25])
,其中 name 未定义,您将获胜不会得到上述意义上的错误,但会得到无意义的结果.
Confusingly, if you specify the wrong type of value, such as IDBKeyRange.only(['male', 25])
, where name is undefined, you won't get an error in the above sense, but you will get nonsensical results.
这个一般规则有一个例外:您可以比较不同长度的数组.因此,从技术上讲,您可以省略范围中的属性,前提是您从数组的end 开始这样做,并且您适当地截断了数组.例如,您可以使用 IDBKeyRange.only(['josh','male'])
.
There is an exception to this general rule: you can compare arrays of different lengths. Therefore, you technically can omit properties from the range, provided that you do so from the end of the array, and that you appropriately truncate the array. For example, you could use IDBKeyRange.only(['josh','male'])
.
indexedDB 规范为排序数组:
Array 类型的值与 Array 类型的其他值的比较如下:
Values of type Array are compared to other values of type Array as follows:
- 设 A 为第一个数组值,B 为第二个数组值.
- 令长度为 A 的长度和 B 的长度中的较小者.
- 让我成为 0.
- 如果A的第i个值小于B的第i个值,则A小于比 B. 跳过剩余的步骤.
- 如果 A 的第 i 个值大于 B 的第 i 个值,则 A 大于 B.跳过其余步骤.
- 将 i 增加 1.
- 如果 i 不等于长度,则返回第 4 步.否则继续下一步.
- 如果 A 的长度小于 B 的长度,则 A 小于 B.如果 A 的长度大于 B 的长度,则 A 大于 B.否则 A 和 B 相等.
问题在于第 4 步和第 5 步:跳过其余步骤.这基本上意味着,如果我们比较两个数组的顺序,例如 [1,'Z'] 和 [0,'A'],则该方法仅考虑第一个元素,因为在该点 1 >0. 由于短路评估(规范中的第 4 步和第 5 步),它永远不会检查 Z 与 A.
The catch is in steps 4 and 5: Skip the remaining steps. What this basically means is that if we are comparing two arrays for order, such as [1,'Z'] and [0,'A'], the method only considers the first element because at that point 1 is > 0. It never gets around to checking Z vs A because of short-circuited evaluation (steps 4 and 5 in the spec).
所以,前面的例子是行不通的.它实际上更像下面这样:
So, the earlier example is not going to work. It actually works more like the following:
WHERE (students.name >= 'AAAAA' && students.name <= 'ZZZZZ') ||
(students.name >= 'AAAAA' && students.name <= 'ZZZZZ' &&
students.gender >= 'male' && students.gender <= 'male') ||
(students.name >= 'AAAAA' && students.name <= 'ZZZZZ' &&
students.gender >= 'male' && students.gender <= 'male' &&
students.age >= 26 && students.age <= 200)
如果您对 SQL 或一般编程中的此类布尔子句有任何经验,那么您应该已经认识到不一定涉及完整的条件集.这意味着您将无法获得所需的对象列表,这就是您无法真正获得与 SQL 复合查询相同的行为的原因.
If you have any experience with such Boolean clauses in SQL or in general programming, then you already should recognize how the full set of conditions are not necessarily involved. That means you will not get the list of objects you want, and this is why you cannot truly get the same behavior as SQL compound queries.
在当前实现中,您无法轻易避免这种短路行为.在最坏的情况下,您必须将存储/索引中的所有对象加载到内存中,然后使用您自己的自定义排序函数对集合进行排序.
You cannot easily avoid this short-circuiting behavior in the current implementation. In the worst case you have to load all objects from the store/index into memory and then sort the collection using your own custom sorting function.
有一些方法可以最大限度地减少或避免一些短路问题:
There are ways to minimize or avoid some of the short-circuiting issues:
例如,如果您使用 index.get(array) 或 index.openCursor(array),则没有短路问题.要么是完整的匹配,要么不是完整的匹配.在这种情况下,比较函数只是评估两个值是否相同,而不是评估一个值是大于还是小于另一个.
For example, if you are using index.get(array) or index.openCursor(array), then there is no short-circuiting concern. There is either an entire match or not an entire match. In this case, the comparison function is only evaluating whether two values are the same, not whether one is greater than or less than the other.
其他需要考虑的技术:
- 将关键路径的元素从最窄到最宽重新排列.主要是在范围内提供早期钳位,以切断一些不需要的短路结果.
- 将包装的对象存储在使用特殊自定义属性的存储中,以便可以使用非数组键路径(非复合索引)对其进行排序,或者可以使用不受短路行为.
- 使用多个索引.这会导致爆炸式索引问题.请注意,此链接是关于另一个 no-sql 数据库,但相同的概念和解释适用于 indexedDB,并且该链接是一个合理(且冗长而复杂)的解释,因此我在此不再重复.
- indexedDB(规范和 Chrome 实现)的创建者之一最近建议使用 cursor.continue:https://gist.github.com/inexorabletash/704e9688f99ac12dd336
- Rearrange the elements of the keypath from narrowest to widest. Basically provide early clamps on ranges that cut off some of the unwanted results of short-circuiting.
- Store a wrapped object in a store that uses specially customized properties so that it can be sorted using a non-array keypath (a non-compound index), or, can make use of a compound index that is not affected by the short-circuiting behavior.
- Use multiple indices. This leads to the exploding index problem. Note this link is about another no-sql database, but the same concepts and explanation applies to indexedDB, and the link is a reasonable (and lengthy and complicated) explanation so I am not repeating it here.
- One of the creators of indexedDB (the spec, and the Chrome implementation) recently suggested using cursor.continue: https://gist.github.com/inexorabletash/704e9688f99ac12dd336
cmp 函数提供了一种快速简单的方法检查排序的工作原理.例如:
The cmp function provides a quick and simple way to examine how sorting works. For example:
var a = ['Hello',1];
var b = ['World',2];
alert(indexedDB.cmp(a,b));
indexedDB.cmp 函数的一个很好的特性是它的签名与 Array.prototype.sort.您可以轻松地从控制台测试值,而无需处理连接/模式/索引等等.此外,indexedDB.cmp 是同步的,因此您的测试代码不需要涉及异步回调/承诺.
One nice property of the indexedDB.cmp function is that its signature is the same as the function parameter to Array.prototype.sort. You can easily test values from the console without dealing with connections/schemas/indices and all that. Furthermore, indexedDB.cmp is synchronous, so your test code does not need to involve async callbacks/promises.
相关文章