Are upper bounds of indexed ranges always assumed to be exclusive?

2022-01-24 00:00:00 indexing range collections java

So in Java, whenever an indexed range is given, the upper bound is almost always exclusive.

From java.lang.String:

substring(int beginIndex, int endIndex)

Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1.

From java.util.Arrays:

copyOfRange(T[] original, int from, int to)

from - the initial index of the range to be copied, inclusive
to - the final index of the range to be copied, exclusive.

From java.util.BitSet:

set(int fromIndex, int toIndex)

fromIndex - index of the first bit to be set.
toIndex - index after the last bit to be set.

As you can see, it does look like Java tries to make it a consistent convention that upper bounds are exclusive.
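
The three documented methods above can be exercised side by side. This is only a small sketch: the class and helper names (`ExclusiveUpperBoundDemo`, `substringDemo`, and so on) are made up for illustration, while `substring`, `copyOfRange`, and `BitSet.set` are the real JDK calls quoted above. In each case the range (2, 4) covers indices 2 and 3 only.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical demo class; only the JDK calls inside are from the question.
class ExclusiveUpperBoundDemo {
    // Characters at indices 2 and 3; index 4 is excluded.
    static String substringDemo() {
        return "abcdef".substring(2, 4);
    }

    // Elements at indices 2 and 3; index 4 is excluded.
    static Integer[] copyOfRangeDemo() {
        Integer[] original = {0, 1, 2, 3, 4, 5};
        return Arrays.copyOfRange(original, 2, 4);
    }

    // Bits 2 and 3 are set; bit 4 is left clear.
    static BitSet bitSetDemo() {
        BitSet bits = new BitSet();
        bits.set(2, 4);
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(substringDemo());                    // cd
        System.out.println(Arrays.toString(copyOfRangeDemo())); // [2, 3]
        System.out.println(bitSetDemo());                       // {2, 3}
    }
}
```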

My questions are:

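  • Is this the official authoritative recommendation?
  • Are there notable violations that we should be wary of?
  • Is there a name for this kind of system? (along the lines of "0-based" vs "1-based")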

CLARIFICATION: I fully understand that a collection of N objects in a 0-based system is indexed 0..N-1. My question is that if a range (2,4) is given, it can contain either 3 items or 2, depending on the system. What do you call these systems?

AGAIN, the issue is not the "first index 0, last index N-1" vs "first index 1, last index N" system; that's known as the 0-based vs 1-based system.

The issue is the "there are 3 elements in (2,4)" vs "there are 2 elements in (2,4)" systems. What do you call these, and is one officially sanctioned over the other?
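
To make the two readings of (2,4) concrete, here is a minimal sketch using `IntStream.range` (exclusive upper bound) and `IntStream.rangeClosed` (inclusive upper bound), both of which exist in the JDK since Java 8. The wrapper class and method names are hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical wrapper around the two JDK range factories.
class RangeKinds {
    // Upper bound excluded: (2, 4) yields 2, 3.
    static long exclusiveCount(int from, int to) {
        return IntStream.range(from, to).count();
    }

    // Upper bound included: (2, 4) yields 2, 3, 4.
    static long inclusiveCount(int from, int to) {
        return IntStream.rangeClosed(from, to).count();
    }

    public static void main(String[] args) {
        System.out.println(exclusiveCount(2, 4)); // 2
        System.out.println(inclusiveCount(2, 4)); // 3
    }
}
```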

Recommended answer

Credit goes to FredOverflow, who said in his comment that this is called the "half-open range". So presumably, Java Collections can be described as "0-based with half-open ranges".

I've compiled some discussions about half-open vs closed ranges from elsewhere:

siliconbrain.com - 16 good reasons to use half-open ranges (edited for conciseness):

  • The number of elements in the range [n, m) is just m-n (and not m-n+1).
  • The empty range is [n, n) (and not [n, n-1], which can be a problem if n is an iterator already pointing to the first element of a list, or if n == 0).
  • For floats you can write [13, 42) (instead of [13, 41.999999999999]).
  • The +1 and -1 adjustments are almost never needed when handling ranges. This is an advantage if they are expensive (as they are for dates).
  • If you write a find in a range, the fact that nothing was found can easily be indicated by returning the end as the found position: if( find( [begin, end) ) == end) nothing found.
  • In languages that start array subscripts at 0 (like C, C++, Java, NCL), the upper bound is equal to the size.
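
A few of these points can be sketched in Java. The helper below is hypothetical (it is not a JDK method); it searches the half-open range [begin, end) of an array and returns `end` when nothing is found, mirroring the `find( [begin, end) ) == end` idiom from the list. Note that the JDK's own search methods generally signal "not found" differently (for example, `indexOf` returns -1), so this is an illustration of the idea rather than of Java convention.

```java
// Hypothetical helpers illustrating half-open range arithmetic.
class HalfOpenSearch {
    // Number of elements in [begin, end): no +1 correction needed.
    static int size(int begin, int end) {
        return end - begin;
    }

    // Searches a[begin, end); returns the index of target, or end if absent.
    static int find(int[] a, int begin, int end, int target) {
        for (int i = begin; i < end; i++) {
            if (a[i] == target) {
                return i;
            }
        }
        return end; // "not found" is signalled by the exclusive bound itself
    }

    public static void main(String[] args) {
        int[] data = {5, 6, 7, 8};
        System.out.println(size(1, 3));                   // 2
        System.out.println(find(data, 1, 3, 7));          // 2 (found)
        System.out.println(find(data, 1, 3, 8) == 3);     // true (8 lies outside [1, 3))
        System.out.println(find(data, 0, data.length, 8)); // 3 (upper bound equals the array size)
    }
}
```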


Half-open versus closed ranges

Advantages of half-open ranges:

  • Empty ranges are valid: [0 .. 0]
  • Easy for subranges to go to the end of the original: [x .. $]
  • Easy to split ranges: [0 .. x] and [x .. $]
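
The notation in that list is not Java, but the same properties can be seen with `List.subList(fromIndex, toIndex)`, whose upper bound is also exclusive. The sketch below uses hypothetical helper names (`head`, `tail`, `rejoin`) and writes `list.size()` where the list above writes `$`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helpers showing that half-open ranges split without overlap or gap.
class SplitDemo {
    // [0 .. x): the first x elements.
    static <T> List<T> head(List<T> list, int x) {
        return list.subList(0, x);
    }

    // [x .. size): everything from index x to the end.
    static <T> List<T> tail(List<T> list, int x) {
        return list.subList(x, list.size());
    }

    // Concatenating the two halves gives back the original contents.
    static <T> List<T> rejoin(List<T> list, int x) {
        List<T> joined = new ArrayList<>(head(list, x));
        joined.addAll(tail(list, x));
        return joined;
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(head(letters, 2));                   // [a, b]
        System.out.println(tail(letters, 2));                   // [c, d, e]
        System.out.println(rejoin(letters, 2).equals(letters)); // true
        System.out.println(letters.subList(0, 0).isEmpty());    // true (empty range is valid)
    }
}
```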

Advantages of closed ranges:

  • Symmetry.
  • Arguably easier to read.
  • ['a' ... 'z'] does not require an awkward + 1 after 'z'.
  • [0 ... uint.max] is possible.

That last point is very interesting. It's really awkward to write a numberIsInRange(int n, int min, int max) predicate with a half-open range if Integer.MAX_VALUE could legally be in a range.
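
A rough sketch of why this is awkward, using hypothetical predicate names. With an `int` exclusive bound there is no value one past `Integer.MAX_VALUE`, so `Integer.MAX_VALUE + 1` silently wraps to `Integer.MIN_VALUE` and the range becomes empty. Widening the exclusive bound to `long` is one possible workaround, shown here as an assumption rather than a recommended practice.

```java
// Hypothetical predicates comparing closed and half-open range checks.
class RangePredicates {
    // Closed range [min, max]: Integer.MAX_VALUE can be a legal upper bound.
    static boolean inClosed(int n, int min, int max) {
        return min <= n && n <= max;
    }

    // Half-open range [min, maxExclusive) with an int bound.
    static boolean inHalfOpen(int n, int min, int maxExclusive) {
        return min <= n && n < maxExclusive;
    }

    // Half-open range with a long exclusive bound, so MAX_VALUE + 1 is representable.
    static boolean inHalfOpenWide(int n, int min, long maxExclusive) {
        return min <= n && n < maxExclusive;
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        System.out.println(inClosed(max, 0, max));                 // true
        System.out.println(inHalfOpen(max, 0, max + 1));           // false: max + 1 overflows
        System.out.println(inHalfOpenWide(max, 0, (long) max + 1)); // true
    }
}
```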
