ConcurrentHashMap: avoid extra object creation with "putIfAbsent"?

I am aggregating multiple values per key in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this:

class Aggregator {
    protected ConcurrentHashMap<String, List<String>> entries =
                            new ConcurrentHashMap<String, List<String>>();
    public Aggregator() {}

    public void record(String key, String value) {
        List<String> newList =
                    Collections.synchronizedList(new ArrayList<String>());
        List<String> existingList = entries.putIfAbsent(key, newList);
        List<String> values = existingList == null ? newList : existingList;
        values.add(value);
    }
}

The problem I see is that every time this method runs, I need to create a new instance of ArrayList, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of structure without having to synchronize the record method? I am somewhat surprised that putIfAbsent does not return the newly created element, and that there is no way to defer instantiation until the value is actually needed.

Recommended answer

Java 8 introduced an API that addresses this exact problem, allowing a one-line solution:

public void record(String key, String value) {
    entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
}
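To make the one-liner concrete, here is a minimal, self-contained sketch that exercises it from several threads. The class name Java8Aggregator, the count method, and the runDemo driver are hypothetical names added for illustration and are not part of the original answer. It relies on the documented behaviour that ConcurrentHashMap.computeIfAbsent is performed atomically, so the mapping function is applied at most once per key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around the one-line Java 8 solution.
class Java8Aggregator {
    private final ConcurrentHashMap<String, List<String>> entries =
            new ConcurrentHashMap<String, List<String>>();

    public void record(String key, String value) {
        // The lambda only runs when the key has no list yet.
        entries.computeIfAbsent(key,
                k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
    }

    // Illustrative helper: number of values recorded for a key.
    public int count(String key) {
        List<String> values = entries.get(key);
        return values == null ? 0 : values.size();
    }

    // Illustrative driver: several threads record into the same key.
    static int runDemo(int threads, int perThread) {
        final Java8Aggregator agg = new Java8Aggregator();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    agg.record("shared", id + ":" + i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return agg.count("shared");
    }
}
```

Calling Java8Aggregator.runDemo(4, 1000) should report 4000 recorded values, because no value is lost to a discarded list.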

For Java 7:

public void record(String key, String value) {
    List<String> values = entries.get(key);
    if (values == null) {
        entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
        // At this point, there will definitely be a list for the key.
        // We don't know or care which thread's new object is in there, so:
        values = entries.get(key);
    }
    values.add(value);
}

This is the standard code pattern when populating a ConcurrentHashMap.

The putIfAbsent(K, V) method either stores your value object or, if another thread got there first, ignores it and leaves the existing value in place. Either way, after the call to putIfAbsent(K, V), get(key) returns the same list to every thread (provided entries are never removed from the map), so the code above is thread-safe.
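As a variation on the same pattern, the return value of putIfAbsent can stand in for the second get: it returns the previously mapped value, or null if the new value was stored. The sketch below is an alternative, not the answer's own code; the class name Java7Aggregator and the snapshot helper are hypothetical names added for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical pre-Java 8 aggregator that uses putIfAbsent's return value.
class Java7Aggregator {
    private final ConcurrentMap<String, List<String>> entries =
            new ConcurrentHashMap<String, List<String>>();

    public void record(String key, String value) {
        List<String> values = entries.get(key);
        if (values == null) {
            // Allocate only when the key appears to be missing.
            List<String> fresh = Collections.synchronizedList(new ArrayList<String>());
            // null means our list was stored; otherwise another thread won the race.
            List<String> winner = entries.putIfAbsent(key, fresh);
            values = (winner == null) ? fresh : winner;
        }
        values.add(value);
    }

    // Illustrative helper: a copy of the values recorded for a key.
    public List<String> snapshot(String key) {
        List<String> values = entries.get(key);
        if (values == null) {
            return new ArrayList<String>();
        }
        synchronized (values) {
            return new ArrayList<String>(values);
        }
    }
}
```

This version does not depend on a later get seeing the entry, so it also behaves sensibly if some other code removes keys from the map.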

The only wasted overhead occurs when another thread adds an entry for the same key at the same time: you may end up throwing away the newly created value, but that only happens if there is no entry yet and your thread loses the race, which is typically rare.
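One rough way to check how often a list is actually thrown away is to count allocations and compare them with the number of keys. The sketch below instruments the Java 7 pattern with a counter; CountingAggregator and its accessor methods are hypothetical names added for illustration, and the counter is not part of the original pattern.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical instrumented version of the get / putIfAbsent / get pattern.
class CountingAggregator {
    private final ConcurrentHashMap<String, List<String>> entries =
            new ConcurrentHashMap<String, List<String>>();
    private final AtomicInteger allocations = new AtomicInteger();

    public void record(String key, String value) {
        List<String> values = entries.get(key);
        if (values == null) {
            // Count every list we create, whether or not it ends up in the map.
            allocations.incrementAndGet();
            entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
            values = entries.get(key);
        }
        values.add(value);
    }

    public int allocations() {
        return allocations.get();
    }

    public int keys() {
        return entries.size();
    }

    // Lists that were created but lost the race and were discarded.
    public int wasted() {
        return allocations.get() - entries.size();
    }
}
```

With a single thread, wasted() is always 0, since a list is created only for a key that is not yet present. Under contention it can rise above 0, but only for threads that race on a key's very first insertion.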
