What character replacements should be performed to make base 64 encoding URL safe?

2022-01-21 00:00:00 encoding perl url base64 php

In looking at URL-safe base 64 encoding, I've found it to be a very non-standard thing. Despite the copious number of built-in functions that PHP has, there isn't one for URL-safe base 64 encoding. On the manual page for base64_encode(), most of the comments suggest using that function, wrapped with strtr():

    function base64_url_encode($input) {
        return strtr(base64_encode($input), '+/=', '-_,');
    }

The only Perl module I could find in this area is MIME::Base64::URLSafe, which performs the following replacement internally:

    sub encode ($) {
        my $data = encode_base64($_[0], '');
        $data =~ tr|+/=|-_|d;
        return $data;
    }

Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely, rather than replacing it with ',' (comma) as the PHP function does. The equals sign is a padding character, so the Perl module restores it as needed on decode, but this difference makes the two implementations incompatible.
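To make the incompatibility concrete, here is a rough Python emulation of the two conventions. The function names `php_comment_style` and `perl_module_style` are made up for illustration; they only mimic the character replacements described above and are not the PHP or Perl code itself.

```python
from base64 import b64encode


def php_comment_style(data: bytes) -> str:
    """Mimic strtr(base64_encode($input), '+/=', '-_,'): '=' becomes ','."""
    return b64encode(data).decode("ascii").translate(str.maketrans("+/=", "-_,"))


def perl_module_style(data: bytes) -> str:
    """Mimic tr|+/=|-_|d: '+' and '/' are translated, '=' is deleted."""
    return b64encode(data).decode("ascii").translate(str.maketrans("+/", "-_", "="))


sample = b"\xfb\xff"  # standard base64 of these two bytes is "+/8="
php_token = php_comment_style(sample)    # "-_8,"
perl_token = perl_module_style(sample)   # "-_8"
```

The same two bytes yield different tokens, so a decoder written for one convention cannot be assumed to accept the other.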

Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding around, prompting someone to put up the following functions to remove the padding. They show prominently in Google results for 'python base64 url safe':

    from base64 import urlsafe_b64encode, urlsafe_b64decode

    def uri_b64encode(s):
        return urlsafe_b64encode(s).strip('=')

    def uri_b64decode(s):
        return urlsafe_b64decode(s + '=' * (4 - len(s) % 4))
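The snippet above appears to target Python 2. In Python 3, urlsafe_b64encode() returns bytes, so calling .strip('=') with a str argument raises TypeError. It also appends four '=' characters when the length is already a multiple of four. A minimal Python 3 sketch of the same idea follows; the names `uri_b64encode_py3` and `uri_b64decode_py3` are my own, not from any library.

```python
from base64 import urlsafe_b64encode, urlsafe_b64decode


def uri_b64encode_py3(data: bytes) -> str:
    """Encode bytes with the URL-safe alphabet and drop trailing '=' padding."""
    return urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def uri_b64decode_py3(text: str) -> bytes:
    """Restore the padding that was dropped, then decode."""
    # -len(text) % 4 is 0 when the length is already a multiple of four,
    # so no superfluous padding is appended in that case.
    return urlsafe_b64decode(text + "=" * (-len(text) % 4))
```

For example, `uri_b64encode_py3(b"\xfb\xff")` gives `"-_8"`, and `uri_b64decode_py3("-_8")` returns the original two bytes.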

The desire here is to have a string that can be included in a URL without further encoding, hence the ditching or translation of the characters '+', '/', and '='. Since there isn't a defined standard, what is the right way?

Recommended answer

There does appear to be a standard: RFC 3548, Section 4, "Base 64 Encoding with URL and Filename Safe Alphabet":

This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in table 2.

+ and / should be replaced by - (minus) and _ (understrike) respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.
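As one possible shape for such a wrapper, the sketch below normalizes tokens from the variants mentioned in the question (comma as padding, padding stripped, or the standard alphabet) into the RFC 3548 URL-safe form with '=' padding. The function name `normalize_to_rfc3548` is hypothetical, and the set of input variants it handles is an assumption based only on the examples in this thread.

```python
from base64 import urlsafe_b64decode


def normalize_to_rfc3548(token: str) -> str:
    """Return the token in the URL-safe alphabet with '=' padding restored."""
    # Treat ',' as padding (the PHP comment convention), then drop all padding.
    core = token.replace(",", "=").rstrip("=")
    # Map the standard alphabet's 62nd and 63rd characters to the URL-safe ones.
    core = core.translate(str.maketrans("+/", "-_"))
    # Re-pad to a multiple of four characters.
    return core + "=" * (-len(core) % 4)


# All three spellings of the same two bytes normalize to one token.
tokens = [normalize_to_rfc3548(t) for t in ("-_8,", "-_8", "+/8=")]
decoded = urlsafe_b64decode(tokens[0])
```

After normalization, a standard decoder such as Python's urlsafe_b64decode() can be used regardless of which library produced the token.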

Note that this requires that you URL encode the (pad) = characters, but I prefer that over URL encoding the + and / characters from the standard base64 alphabet.
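A small illustration of that trade-off, using Python's urllib.parse.quote() as the percent-encoder (the choice of `safe=""` is mine, to force every reserved character to be escaped):

```python
from base64 import b64encode, urlsafe_b64encode
from urllib.parse import quote

data = b"\xfb\xff"

standard = b64encode(data).decode("ascii")          # "+/8="
url_safe = urlsafe_b64encode(data).decode("ascii")  # "-_8="

# With the standard alphabet, three of the four characters need escaping.
quoted_standard = quote(standard, safe="")  # "%2B%2F8%3D"
# With the URL-safe alphabet, only the padding character is escaped.
quoted_url_safe = quote(url_safe, safe="")  # "-_8%3D"
```

With the URL-safe alphabet, only the trailing padding is affected by percent-encoding; the data characters pass through unchanged.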
