Java中不同类型的线程安全集
似乎有很多不同的实现和方法可以在 Java 中生成线程安全的 Set.一些例子包括
There seems to be a lot of different implementations and ways to generate thread-safe Sets in Java. Some examples include
1) CopyOnWriteArraySet
2) Collections.synchronizedSet(Set set)
3) ConcurrentSkipListSet
4) Collections.newSetFromMap(new ConcurrentHashMap())
5) 与(4)类似的方式生成的其他Set
5) Other Sets generated in a way similar to (4)
这些示例来自 并发模式:Java 6 中的并发集实现
有人能简单解释一下这些例子和其他例子的区别、优点和缺点吗?我无法理解和保持 Java Std Docs 中的所有内容.
Could someone please simply explain the differences, advantages, and disadvantage of these examples and others? I'm having trouble understanding and keeping straight everything from the Java Std Docs.
推荐答案
CopyOnWriteArraySet
是一个非常简单的实现——它基本上有一个数组中的元素列表,当更改列表时,它会复制该数组.此时正在运行的迭代和其他访问继续使用旧数组,避免了读取器和写入器之间同步的必要性(尽管写入本身需要同步).通常快速的集合操作(尤其是contains()
)在这里非常慢,因为数组将在线性时间内被搜索.
The
CopyOnWriteArraySet
is a quite simple implementation - it basically has a list of elements in an array, and when changing the list, it copies the array. Iterations and other accesses which are running at this time continue with the old array, avoiding necessity of synchronization between readers and writers (though writing itself needs to be synchronized). The normally fast set operations (especiallycontains()
) are quite slow here, as the arrays will be searched in linear time.
仅将其用于非常小的集合,这些集合将经常读取(迭代)并且很少更改.(Swing 的监听器集就是一个例子,但它们并不是真正的集,无论如何都应该只在 EDT 中使用.)
Use this only for really small sets which will be read (iterated) often and changed seldom. (Swing's listener-sets would be an example, but these are not really sets, and should be only used from the EDT anyway.)
Collections.synchronizedSet
将简单地围绕原始集合的每个方法包装一个同步块.您不应直接访问原始集.这意味着集合中没有两个方法可以同时执行(一个会阻塞直到另一个完成) - 这是线程安全的,但是如果多个线程正在使用该集合,您将不会有并发性.如果使用迭代器,在迭代器调用之间修改集合时,通常仍需要外部同步以避免 ConcurrentModificationExceptions.性能将与原始集合的性能相似(但有一些同步开销,如果同时使用会阻塞).
Collections.synchronizedSet
will simply wrap a synchronized-block around each method of the original set. You should not access the original set directly. This means that no two methods of the set can be executed concurrently (one will block until the other finishes) - this is thread-safe, but you will not have concurrency if multiple threads are using the set. If you use the iterator, you usually still need to synchronize externally to avoid ConcurrentModificationExceptions when modifying the set between iterator calls. The performance will be like the performance of the original set (but with some synchronization overhead, and blocking if used concurrently).
如果您只有低并发,并且希望确保所有更改对其他线程立即可见,请使用此选项.
Use this if you only have low concurrency, and want to be sure all changes are immediately visible to the other threads.
ConcurrentSkipListSet
是并发SortedSet
实现,大部分基本操作都在O(log n) 内.它允许并发添加/删除和读取/迭代,其中迭代可能会或可能不会告诉自迭代器创建以来的更改.批量操作只是多个单个调用,而不是原子完成的——其他线程可能只观察到其中的一些.
ConcurrentSkipListSet
is the concurrent SortedSet
implementation, with most basic operations in O(log n). It allows concurrent adding/removing and reading/iteration, where iteration may or may not tell about changes since the iterator was created. The bulk operations are simply multiple single calls, and not done atomically – other threads may observe only some of them.
显然,只有当您的元素有一定的总顺序时,您才能使用它.对于不太大的集合(因为 O(log n)),这看起来是高并发情况下的理想选择.
Obviously you can use this only if you have some total order on your elements. This looks like an ideal candidate for high-concurrency situations, for not-too-large sets (because of the O(log n)).
对于 ConcurrentHashMap
(以及从它派生的 Set):这里最基本的选项是(平均而言,如果你有一个良好且快速的 hashCode()
) 在 O(1) 中(但当许多键具有相同的哈希码时可能会退化为 O(n)),例如 HashMap/HashSet.写入的并发性有限(表已分区,写入访问将在所需的分区上同步),而读取访问与自身和写入线程完全并发(但可能尚未看到当前更改的结果书面).迭代器自创建以来可能会或可能不会看到更改,并且批量操作不是原子的.调整大小很慢(对于 HashMap/HashSet),因此您应该尝试通过估计创建时所需的大小来避免这种情况(并使用大约 1/3 的大小,因为它会在 3/4 满时调整大小).
For the ConcurrentHashMap
(and the Set derived from it): Here most basic options are (on average, if you have a good and fast hashCode()
) in O(1) (but might degenerate to O(n) when many keys have the same hash code), like for HashMap/HashSet. There is a limited concurrency for writing (the table is partitioned, and write access will be synchronized on the needed partition), while read access is fully concurrent to itself and the writing threads (but might not yet see the results of the changes currently being written). The iterator may or may not see changes since it was created, and bulk operations are not atomic.
Resizing is slow (as for HashMap/HashSet), thus you should try to avoid this by estimating the needed size on creation (and using about 1/3 more of that, as it resizes when 3/4 full).
当您拥有大型集合、良好(且快速)的哈希函数并且可以在创建映射之前估计集合大小和所需的并发性时使用此选项.
Use this when you have large sets, a good (and fast) hash function and can estimate the set size and needed concurrency before creating the map.
还有其他可以在这里使用的并发地图实现吗?
Are there other concurrent map implementations one could use here?
相关文章