An iterator which mutates and returns the same object. Bad practice?

2022-01-10 00:00:00 iterator java guava

I'm writing GC friendly code to read and return to the user a series of byte[] messages. Internally I reuse the same ByteBuffer which means I'll repeatedly return the same byte[] instance most of the time.

I'm considering writing cautionary javadoc and exposing this to the user as an Iterator<byte[]>. AFAIK it won't violate the Iterator contract, but the user could certainly be surprised if they call Lists.newArrayList(myIterator) and get back a List populated with the same byte[] in each position!
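To make the surprise concrete, here is a minimal sketch of the pitfall. ReusingIterator, collect and collectCopies are hypothetical names invented for illustration, and a plain one-byte array stands in for the ByteBuffer-backed messages. The collect method does with a plain ArrayList what Lists.newArrayList(myIterator) would do: it stores each returned reference as-is.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedBufferDemo {

    /** Hypothetical iterator that overwrites and returns one shared byte[] on every call. */
    static final class ReusingIterator implements Iterator<byte[]> {
        private final byte[] shared = new byte[1]; // reused for every "message"
        private final int count;
        private int next = 0;

        ReusingIterator(int count) {
            this.count = count;
        }

        @Override
        public boolean hasNext() {
            return next < count;
        }

        @Override
        public byte[] next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            shared[0] = (byte) next++; // mutate in place instead of allocating
            return shared;
        }
    }

    /** Naive collection: stores the same reference once per element. */
    static List<byte[]> collect(Iterator<byte[]> it) {
        List<byte[]> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    /** Defensive collection: copies each message before advancing the iterator. */
    static List<byte[]> collectCopies(Iterator<byte[]> it) {
        List<byte[]> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next().clone());
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> naive = collect(new ReusingIterator(3));
        // Every slot holds the same array, so only the last message survives.
        System.out.println(naive.get(0)[0] + " " + naive.get(1)[0] + " " + naive.get(2)[0]); // prints "2 2 2"

        List<byte[]> copied = collectCopies(new ReusingIterator(3));
        System.out.println(copied.get(0)[0] + " " + copied.get(1)[0] + " " + copied.get(2)[0]); // prints "0 1 2"
    }
}
```

Any caller that holds on to an element past the next call to next() has to copy it, which is exactly the rule the cautionary javadoc would need to spell out.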

The question: is it bad practice for a class that may mutate and return the same object to implement the Iterator interface?

  • If so, what is the best alternative? "Don't mutate/reuse your objects" is an easy answer. But it doesn't address the cases when reuse is very desirable.

  • If not, how do you justify violating the principle of least astonishment?

Two minor notes:

  • I'm using Guava's AbstractIterator so remove() isn't really of concern.

  • In my use case the user is me and the visibility of this class will be limited, but I've tried to ask this generally enough to apply more broadly.

Update: I'm accepting Louis' answer because it has 3x more votes than Keith's, but note that in my use case I'm planning to take the code that I left in a comment on Keith's answer to production.

Solution

EnumMap did essentially exactly this in its entrySet() iterator, which causes confusing, crazy, depressing bugs to this day.

If I were you, I just wouldn't use an Iterator -- I'd write a different API (possibly quite dissimilar from Iterator, even) and implement that. For example, you might write a new API that takes as input the ByteBuffer to write the message into, so users of the API could control whether or not the buffer gets reused. That seems reasonably intuitive (the user can write code that obviously and cleanly reuses the ByteBuffer), without creating unnecessarily cluttered code.
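One way the suggested API could look, as a rough sketch only: MessageReader, readInto and ArrayMessageReader are hypothetical names invented here, and the in-memory array of messages stands in for whatever the real source is. The caller passes in the ByteBuffer, so reuse happens visibly in the caller's own code.

```java
import java.nio.ByteBuffer;

public class MessageReaderDemo {

    /** Hypothetical replacement for Iterator<byte[]>: the caller supplies the buffer. */
    interface MessageReader {
        /**
         * Clears dst, writes the next message into it, and flips it for reading.
         * Returns false (leaving dst untouched) when no messages remain.
         */
        boolean readInto(ByteBuffer dst);
    }

    /** Toy implementation backed by in-memory messages, for illustration only. */
    static final class ArrayMessageReader implements MessageReader {
        private final byte[][] messages;
        private int next = 0;

        ArrayMessageReader(byte[]... messages) {
            this.messages = messages;
        }

        @Override
        public boolean readInto(ByteBuffer dst) {
            if (next >= messages.length) {
                return false;
            }
            dst.clear();
            dst.put(messages[next++]); // throws BufferOverflowException if dst is too small
            dst.flip();
            return true;
        }
    }

    /** Caller that chooses to reuse a single buffer for every message. */
    static int totalBytes(MessageReader reader) {
        ByteBuffer buf = ByteBuffer.allocate(64); // allocated once, reused in the loop
        int total = 0;
        while (reader.readInto(buf)) {
            total += buf.remaining();
        }
        return total;
    }

    public static void main(String[] args) {
        MessageReader reader = new ArrayMessageReader(new byte[] {1, 2}, new byte[] {3, 4, 5});
        System.out.println(totalBytes(reader)); // prints "5"
    }
}
```

A caller that wants to keep every message can pass a freshly allocated buffer on each call instead, so the decision to reuse is never hidden behind the interface.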
