JavaScript string compression with localStorage

I am using localStorage in a project, and it will need to store lots of data, mostly of type int, bool and string. I know that JavaScript strings are Unicode, but when stored in localStorage, do they stay Unicode? If so, is there a way I could compress the string to use all of the data in a Unicode byte, or should I just use base64 and have less compression? All of the data will be stored as one large string.

Now that I think about it, base64 wouldn't do much compression at all: the data is already effectively in base 64 (a-zA-Z0-9 ;: is 65 characters).

Recommended answer

"When stored in localStorage, do they stay Unicode?"

The Web Storage working draft defines local storage values as DOMString. DOMStrings are defined as sequences of 16-bit units using the UTF-16 encoding. So yes, they stay Unicode.
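To illustrate what "sequences of 16-bit units" means in practice, here is a minimal sketch that inspects the UTF-16 code units of ordinary JavaScript strings. The helper name `codeUnits` is made up for this example; it only uses the standard `length` and `charCodeAt` members.

```javascript
// Hypothetical helper: list the 16-bit code units that make up a string.
function codeUnits(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    units.push(text.charCodeAt(i)); // always a value in 0..0xFFFF
  }
  return units;
}

// A character in the Basic Multilingual Plane takes one 16-bit unit.
const single = codeUnits('\u00e9');     // U+00E9

// A character outside it takes two units (a surrogate pair).
const pair = codeUnits('\u{1F600}');    // U+1F600
```

The second case is why an encoding that packs arbitrary bits into code units has to be careful around the surrogate range 0xD800..0xDFFF.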

"Is there a way I could compress the string to use all of the data in a Unicode byte...?"

"Base32k" encoding should give you 15 bits per character. A base32k-type encoding takes advantage of the 16 bits in each UTF-16 code unit, but gives up one bit to avoid tripping on double-word characters (surrogate pairs). If your original data is base64 encoded, it only uses 6 bits per character. Re-encoding those 6 bits per character as base32k should shrink the string to 6/15 = 40% of its original length in characters. See http://lists.xml.org/archives/xml-dev/200307/msg00505.html and http://lists.xml.org/archives/xml-dev/200307/msg00507.html.
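The 15-bits-per-character idea can be sketched as follows. This is not the scheme from the linked posts: the offset mapping (each 15-bit value `v` becomes code unit `0x4000 + v`), the leading header character that records the padding, and the names `encodeBase32k` / `decodeBase32k` are all assumptions made for illustration.

```javascript
// Assumption: map each 15-bit value v (0..32767) to the code unit 0x4000 + v.
// The resulting range 0x4000..0xBFFF stays below the surrogates (0xD800..0xDFFF).
const OFFSET = 0x4000;

// Pack bytes (Uint8Array or array of 0..255) into 15-bit code units.
function encodeBase32k(bytes) {
  const units = [];
  let acc = 0;     // bit accumulator
  let accBits = 0; // number of valid bits currently in acc
  for (const byte of bytes) {
    acc = (acc << 8) | byte;
    accBits += 8;
    if (accBits >= 15) {
      accBits -= 15;
      units.push(OFFSET + (acc >>> accBits)); // emit the top 15 bits
      acc &= (1 << accBits) - 1;              // keep the remainder
    }
  }
  let padBits = 0;
  if (accBits > 0) {
    padBits = 15 - accBits;                   // zero-fill the last unit
    units.push(OFFSET + (acc << padBits));
  }
  // The first character records how many padding bits the last unit holds.
  units.unshift(OFFSET + padBits);
  return units.map((u) => String.fromCharCode(u)).join('');
}

// Reverse of encodeBase32k: returns a Uint8Array.
function decodeBase32k(text) {
  if (text.length === 0) return new Uint8Array(0);
  const padBits = text.charCodeAt(0) - OFFSET;
  const totalBits = (text.length - 1) * 15 - padBits;
  const bytes = new Uint8Array(totalBits >> 3);
  let acc = 0;
  let accBits = 0;
  let pos = 0;
  for (let i = 1; i < text.length; i++) {
    acc = (acc << 15) | (text.charCodeAt(i) - OFFSET);
    accBits += 15;
    while (accBits >= 8 && pos < bytes.length) {
      accBits -= 8;
      bytes[pos++] = (acc >>> accBits) & 0xff;
      acc &= (1 << accBits) - 1;
    }
  }
  return bytes;
}
```

With this sketch, 15 bytes (which base64 would write as 20 characters) become 8 data characters plus the one header character, which matches the 6/15 ratio apart from the fixed header. Some code units in the chosen range are unassigned in Unicode; they are still valid UTF-16 code units, but whether every storage backend treats them identically is something to verify in the target browsers.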

For even further reduction in size, you can decode your base64 strings into their full 8-bit binary form, compress that with a known compression algorithm (for example, a JavaScript implementation of gzip), and then base32k encode the compressed output.
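The decode, compress, re-encode pipeline described above might be wired together roughly like this. The function names `base64ToBytes` and `packForStorage` are hypothetical, and the `compress` and `encode` arguments stand in for whatever gzip/deflate library and base32k-style encoder you choose; neither is provided here. The sketch assumes the global `atob` function, which browsers and recent Node.js versions expose.

```javascript
// Decode a base64 string into its raw 8-bit bytes.
function base64ToBytes(base64) {
  const binary = atob(base64); // one character per byte, values 0..255
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// compress and encode are caller-supplied (assumed) functions:
//   compress(Uint8Array) -> Uint8Array  (e.g. from a gzip/deflate library)
//   encode(Uint8Array)   -> string      (e.g. a base32k-style encoder)
function packForStorage(base64, compress, encode) {
  return encode(compress(base64ToBytes(base64)));
}
```

The returned string is what would be passed to `localStorage.setItem`. Reading it back applies the inverse steps in the opposite order: decode the string to bytes, then decompress.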
