redis - using hashes

2022-01-13 00:00:00 python django redis memcached nosql

Problem description

I'm implementing a social stream and a notification system for my web application using Redis. I'm new to Redis, and I have some doubts about hashes and their efficiency.

I've read this awesome Instagram post, and I planned to implement a similar solution to minimize storage.

As mentioned in their blog, they did it like this:

To take advantage of the hash type, we bucket all our Media IDs into buckets of 1000 (we just take the ID, divide by 1000 and discard the remainder). That determines which key we fall into; next, within the hash that lives at that key, the Media ID is the lookup key within the hash, and the user ID is the value. An example, given a Media ID of 1155315, which means it falls into bucket 1155 (1155315 / 1000 = 1155):

HSET "mediabucket:1155" "1155315" "939"
HGET "mediabucket:1155" "1155315"
> "939"
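The bucketing rule above can be sketched in Python. The helper name below is hypothetical, and the commented-out client calls assume the redis-py library:

```python
# Sketch of the Instagram bucketing scheme: integer-divide the Media ID
# by the bucket size to pick which hash key it falls into, and use the
# full Media ID as the field inside that hash.

BUCKET_SIZE = 1000

def media_bucket(media_id: int) -> tuple:
    """Return (hash key, field) for a given Media ID."""
    bucket = media_id // BUCKET_SIZE           # 1155315 // 1000 == 1155
    return f"mediabucket:{bucket}", str(media_id)

key, field = media_bucket(1155315)
print(key, field)  # mediabucket:1155 1155315

# With a real redis-py client this would become:
#   r = redis.Redis()
#   r.hset(key, field, "939")
#   r.hget(key, field)   # b"939"
```

Any Media ID then maps deterministically to one of a much smaller set of hash keys, each holding at most 1000 fields.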

So instead of having 1000 separate keys, they store them in one hash with a thousand lookup keys. My doubt is: why can't we increase the number of lookup keys even further?

For example: Media ID 1155315 divided by 10000 would fall into mediabucket:115, or an even larger bucket.

Why are they settling on one hash bucket with 1000 lookup keys? Why can't they have one hash bucket with 100000 lookup keys? Is that related to efficiency?

I need your suggestions for implementing an efficient method in my web application.

P.S. Please don't say that Stack Overflow is not for asking for suggestions; I don't know where else to find help.

Thanks!


Solution

Yes, it's related to efficiency.

We asked the always-helpful Pieter Noordhuis, one of Redis' core developers, for input, and he suggested we use Redis hashes. Hashes in Redis are dictionaries that can be encoded in memory very efficiently; the Redis setting 'hash-zipmap-max-entries' configures the maximum number of entries a hash can have while still being encoded efficiently. We found this setting was best around 1000; any higher and the HSET commands would cause noticeable CPU activity. For more details, you can check out the zipmap source file.
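Note that the zipmap encoding from the Instagram post was later replaced in Redis, so the setting goes by a different name in current versions. A redis.conf sketch (the 1000 matches the post's recommendation; the shipped default is smaller, so verify the exact directive name and default against your Redis version):

```
# redis.conf -- a hash stays in the compact encoding only while BOTH
# limits hold; crossing either converts it to a regular hash table.
# Historical name: hash-zipmap-max-entries; later: hash-max-ziplist-entries;
# Redis 7+: hash-max-listpack-entries.
hash-max-ziplist-entries 1000
hash-max-ziplist-value 64
```

If a single field value exceeds the value limit, the whole hash is converted too, so both thresholds matter when sizing buckets.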

Small hashes are encoded in a special way (zipmaps) that is memory-efficient but makes operations O(N) instead of O(1). So with one zipmap with 100k fields, instead of 100 zipmaps with 1k fields each, you gain no memory benefit, but all your operations get 100 times slower.
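The O(N) point can be illustrated without a Redis server: a zipmap is essentially a flat sequence scanned linearly on every lookup, which a plain Python list of pairs mimics. This is a model of the complexity argument only, not Redis's actual data structure:

```python
# Illustrative model: a zipmap-like hash stored as a flat list of
# (field, value) pairs, so every lookup is a linear scan -> O(N).

def zipmap_hget(pairs, field):
    """Linear scan over (field, value) pairs, like a zipmap lookup."""
    for f, v in pairs:
        if f == field:
            return v
    return None

# One big "zipmap" with 100k fields: a lookup may scan up to 100k pairs.
big = [(str(i), str(i)) for i in range(100_000)]

# 100 small "zipmaps" of 1k fields each: pick the bucket in O(1),
# then scan at most 1k pairs.
small = {b: [(str(b * 1000 + i), str(b * 1000 + i)) for i in range(1000)]
         for b in range(100)}

def bucketed_hget(buckets, field):
    return zipmap_hget(buckets[int(field) // 1000], field)

assert zipmap_hget(big, "99999") == "99999"      # scanned ~100k pairs
assert bucketed_hget(small, "99999") == "99999"  # scanned <= 1k pairs
```

Both layouts hold the same 100k pairs, so memory is comparable; what changes is the worst-case scan length per operation, which is why the bucket size is capped near the encoding threshold.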
