Using the HTTP Range header with a range specifier other than bytes?

2022-01-17 00:00:00 http-headers http ajax pagination

The core question is about the use of HTTP headers, including Range, If-Range, Accept-Ranges, and a user-defined range specifier.

Here is a contrived example to help illustrate my question. Assume I have a Web 2.0 style application that displays some sort of human-readable document. These documents are editorially broken up into pages (similar to articles you see on news websites). For this example, assume:

  • There is a document titled "HTTP Range Question" that is broken up into three pages.
  • The shell page (/document/shell/http-range-question) knows the meta information about the document, including the number of pages.
  • The first readable page of the document is loaded during the page onload event via an ajax GET and inserted into the page.
  • A UI control that looks like [ 1 2 3 All ] is at the bottom of the page. Clicking a number displays that readable page (also loaded via ajax), and clicking "All" displays the entire document. Assume these URLs for the 1, 2, 3 and All use cases:
    • /document/content/http-range-question?page=1
    • /document/content/http-range-question?page=2
    • /document/content/http-range-question?page=3
    • /document/content/http-range-question

Now to the question. Can I use the HTTP Range header instead of part of the URL (e.g. a query string parameter)? Maybe something like this on the GET /document/content/http-range-question request:

      Range: page=1
      

It looks like the spec only defines byte ranges, as in the example below, so even if I made my ajax calls work with my browser and server code, anything in the middle (e.g. a caching proxy server) could break the contract.

      Range: bytes=0-499
      

Any opinions or real-world examples of custom range specifiers?

Update: I did find a similar question about the Range header (Paging in a Rest Collection) where they mention that Dojo's JsonRestStore uses a custom Range header value.

      Range: items=0-24
      

Recommended answer

Absolutely. You are free to specify any range units you like.

From RFC 2616:

      3.12 Range Units

      HTTP/1.1 allows a client to request that only part (a range of) the
      response entity be included within the response. HTTP/1.1 uses range
      units in the Range (section 14.35) and Content-Range (section 14.16)
      header fields. An entity can be broken down into subranges according
      to various structural units.

        range-unit       = bytes-unit | other-range-unit
        bytes-unit       = "bytes"
        other-range-unit = token
      

      The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1
      implementations MAY ignore ranges specified using other units.

The key piece is the last paragraph. What it says is that when the HTTP/1.1 spec was written, only the "bytes" token was defined. But, as you can see from the other-range-unit rule, you are free to come up with your own token specifiers.
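
As a rough illustration of what "token" allows, here is a minimal sketch that checks whether a proposed unit name fits the token syntax. The function name is_valid_range_unit is made up for this example; the character set is the usual HTTP token set (letters, digits, and a handful of punctuation marks, with no separators or whitespace).

```python
import re

# A token is one or more visible ASCII characters, excluding separators
# such as "=", ",", "/", quotes, brackets and whitespace.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_range_unit(unit: str) -> bool:
    """Return True if `unit` is syntactically usable as a range unit name."""
    return _TOKEN.fullmatch(unit) is not None


print(is_valid_range_unit("page"))      # True
print(is_valid_range_unit("items"))     # True
print(is_valid_range_unit("page num"))  # False: contains a space
print(is_valid_range_unit("page=1"))    # False: "=" is a separator
```

So unit names like "page" or "items" are fine syntactically; whether anything else on the wire understands them is a separate matter.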

Coming up with your own range specifiers does mean that you have to control both the client and the server code that use them. So, if you own the backend piece that exposes the "/document/content/http-range-question" URI, you are good to go; presumably you're using a modern web framework that lets you inspect the incoming request headers. You could then read the Range value and use it to perform the backing query correctly.
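
The server-side step might look something like the sketch below. It is framework-agnostic and assumes you can get the raw Range header value as a string (or None). The names CustomRange, parse_custom_range and select_pages are hypothetical, and the accepted syntax (unit=first or unit=first-last) is just one possible convention that happens to cover both "page=1" and "items=0-24".

```python
import re
from typing import List, NamedTuple, Optional


class CustomRange(NamedTuple):
    unit: str
    first: int
    last: Optional[int]  # None for a single value such as "page=1"


# unit "=" first [ "-" last ], e.g. "page=1" or "items=0-24" (assumed syntax).
_RANGE = re.compile(r"\s*([A-Za-z][A-Za-z0-9_-]*)\s*=\s*(\d+)(?:-(\d+))?\s*")


def parse_custom_range(header: Optional[str]) -> Optional[CustomRange]:
    """Parse a Range header value; return None if absent or not understood."""
    if header is None:
        return None
    match = _RANGE.fullmatch(header)
    if match is None:
        return None
    unit, first, last = match.groups()
    if last is not None and int(last) < int(first):
        return None
    return CustomRange(unit, int(first), int(last) if last is not None else None)


def select_pages(pages: List[str], header: Optional[str]) -> List[str]:
    """Return the requested page(s), or the whole document as a fallback."""
    parsed = parse_custom_range(header)
    if parsed is None or parsed.unit != "page" or parsed.first < 1:
        return pages  # ignore units or values we do not understand
    last = parsed.first if parsed.last is None else parsed.last
    return pages[parsed.first - 1:last]  # pages are numbered from 1


document = ["first page", "second page", "third page"]
print(select_pages(document, "page=2"))       # ['second page']
print(select_pages(document, "bytes=0-499"))  # all three pages
```

Falling back to the full document for anything unrecognised mirrors the "MAY ignore" wording in the spec, but returning an error instead would be an equally reasonable choice.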

Furthermore, if you control the AJAX code that makes requests to the backend, you should be able to set the Range header yourself.
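
Setting the header on the client is a one-liner in most HTTP libraries. Below is a small sketch using Python's standard urllib.request as a stand-in for the AJAX call; build_page_request is a made-up helper and example.com is a placeholder host. Nothing is actually sent, since the request object is only built and inspected.

```python
import urllib.request


def build_page_request(base_url: str, page: int) -> urllib.request.Request:
    """Build a GET request that asks for one page via a custom Range unit."""
    request = urllib.request.Request(base_url, method="GET")
    request.add_header("Range", f"page={page}")
    return request


# Placeholder URL; the request is not sent until urlopen() is called on it.
req = build_page_request(
    "http://example.com/document/content/http-range-question", 2
)
print(req.get_method())         # GET
print(req.get_header("Range"))  # page=2
```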

However, there is a potential downside, which you anticipate in your question: you may break caching. If you use a custom range unit, any caches between your client and the origin servers "MAY ignore ranges specified using [units other than 'bytes']". So, for example, if you had a Squid/Varnish cache between the frontend and backend, there's no guarantee that the results you're hoping for will be served from the cache!
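
To make the failure mode concrete, here is a toy model of it. This is not how Squid or Varnish are implemented; UrlKeyedCache is a deliberately naive, hypothetical cache that keys entries on the URL alone and never looks at the Range header, which is one way "ignoring" an unknown unit could play out.

```python
from typing import Callable, Dict

PAGES = ["page one", "page two", "page three"]


def origin(url: str, headers: Dict[str, str]) -> str:
    """Toy origin server: honours 'Range: page=N', else returns everything."""
    value = headers.get("Range", "")
    if value.startswith("page="):
        return PAGES[int(value[len("page="):]) - 1]
    return "\n".join(PAGES)


class UrlKeyedCache:
    """Deliberately naive cache that ignores the Range header entirely."""

    def __init__(self, upstream: Callable[[str, Dict[str, str]], str]) -> None:
        self._upstream = upstream
        self._store: Dict[str, str] = {}

    def get(self, url: str, headers: Dict[str, str]) -> str:
        if url not in self._store:  # the cache key is the URL alone
            self._store[url] = self._upstream(url, headers)
        return self._store[url]


cache = UrlKeyedCache(origin)
url = "/document/content/http-range-question"
first = cache.get(url, {"Range": "page=1"})
second = cache.get(url, {"Range": "page=2"})
print(first)   # page one
print(second)  # page one (stale: the cache never looked at the Range header)
```

With the page number in the URL instead, the two requests would have had different cache keys and the second one would have reached the origin.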

You might also consider an alternative implementation where, rather than using a query string, you make the page a "parameter" of the URI path; e.g.: /document/content/http-range-question/page/1. This would likely be a little more work for you server-side, but it's HTTP/1.1 compliant and caches should treat it properly.
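
The extra server-side work is mostly routing. A minimal sketch, assuming a hand-rolled regex route rather than any particular framework's router (match_document_route and the exact pattern are illustrative only):

```python
import re
from typing import Optional, Tuple

# Hypothetical route: /document/content/<slug>[/page/<n>]
_ROUTE = re.compile(
    r"/document/content/(?P<slug>[^/]+)(?:/page/(?P<page>\d+))?/?"
)


def match_document_route(path: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (slug, page) for a matching path; page is None for 'All'."""
    match = _ROUTE.fullmatch(path)
    if match is None:
        return None
    page = match.group("page")
    return (match.group("slug"), int(page) if page is not None else None)


print(match_document_route("/document/content/http-range-question/page/1"))
# ('http-range-question', 1)
print(match_document_route("/document/content/http-range-question"))
# ('http-range-question', None)
```

Because each page now has its own URI, an intermediary that caches by URL keeps the pages apart without needing to understand any custom header.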

Hope this helps.
