如何使用 LinkedHashMap 获取子图?

2022-01-24 00:00:00 dictionary performance iteration java

目前,我使用 TreeMap 来存储一些 x 和 y 坐标,但与 ArrayListHashMap 相比,迭代速度非常慢.我使用它是因为我需要 subMap() 方法,因此即使不存在确切的 X 值(键),我也可以获得确定范围内的 X 值.

Currently, I'm using TreeMap to store some x and y coordinates but the iteration is very slow compared to an ArrayList or HashMap. I'm using it because I need the subMap() method so I can get X values in a determined range even if the exact X value (key) doesn't exists.

LinkedHashMap 的速度几乎与 HashMap 相同,我可以按插入顺序迭代键(我需要插入顺序或比较器的顺序,因为它在TreeMap),但我没有 submap() 方法.在 TreeMap 中,我可以非常快速地生成子图.

LinkedHashMap has almost the same speed of a HashMap and I can iterate the keys in the insertion order (I need insertion order or order by comparator as it is done in TreeMap) but I don't have a submap() method. In TreeMap I can generate submaps very fast.

是否有任何数据结构或某种方式可以比 TreeMap 更快地存储有序值(通过插入顺序或比较器),即使确切值不在地图中,也可以在某个范围内获取子图?我的意思是,也许我想要 2 到 25 之间的值,但 2 不存在,最接近的是 3,因此它将返回一个从 3 到 25 的子图.或者以某种方式将此功能添加到 LinkedHashMap?

Is there any data structure or some way to store ordered values (by insertion order or comparator) faster than TreeMap that allows to get submaps in a range even if the exact value is not in the map? I mean, maybe I want values between 2 and 25 but 2 doesn't exist, the nearest is 3 so it will return a submap from 3 to 25. Or some way to add this functionality to LinkedHashMap?

推荐答案

今天我终于得到了我的问题的答案.经过几次 HashMap 测试,LinkedHashMapTreeMapArrayList 慢得多,我只想将它们用于创建 subMaps() 的能力.所以我创建了一个扩展 ArrayList 的新类,它在 this answer 我创建了通过值而不是索引获取子列表的快速方法.这是完整的课程:

Today I finally got the answer to my problem. After a few test HashMap, LinkedHashMap and TreeMap are way slower than ArrayList and I wanted to use them just for the ability to create subMaps(). So I created a new class extending ArrayList which gives me a very good performance and with the help of this answer I created fast way of getting a sublist by values not by index. Here is the complete class:

/**
 * The purpose of this class is to be a faster replacement to a {@link java.util.TreeMap} with
 *  the ability to get sublist containing a range of x values. ArrayList access time is O(1) while
 *  {@link java.util.TreeMap} is O(log(n)). When large data is handled the impact on performance is
 *  noticeable.
 */
public class XYDataset extends ArrayList<PointValue> {

    private final float COMPARISON_THRESHOLD = 0.01f;

    final Comparator<PointValue> comparator = new Comparator<PointValue>() {
        @Override
        public int compare(PointValue lhs, PointValue rhs) {
            if (Math.abs(lhs.getX() - rhs.getX()) < COMPARISON_THRESHOLD) return 0;
            return lhs.getX() < rhs.getX() ? -1 : 1;
        }
    };

    public XYDataset(int capacity) {
        super(capacity);
    }

    public XYDataset() {
    }

    public XYDataset(Collection<? extends PointValue> collection) {
        super(collection);
    }

    @Override
    public List<PointValue> subList(int start, int end) {
        return super.subList(start, end);
    }

    /**
     * Generate a sublist containing the range of x values passed
     * @param x1 lower x value
     * @param x2 upper x value
     * @return sublist containing x values from x1 to x2
     */
    public List<PointValue> subList(float x1, float x2){
        /**
         * Collections.binarySearch() returns the index of the search key, if it is contained in the list;
         *  otherwise it returns (-(insertion point) - 1).
         * The insertion point is defined as the point at which the key would be inserted into the list:
         *  the index of the first element greater than the key, or list.size() if all elements in the list
         *  are less than the specified key. Note that this guarantees that the return value will be >= 0 if
         *  and only if the key is found.
         */
        int n1 = Collections.binarySearch(this, new PointValue(x1, 0), comparator);
        int n2 = Collections.binarySearch(this, new PointValue(x2, 0), comparator);

        /**
         * Example, we assume the list is sorted. Based on (https://stackoverflow.com/questions/19198586/search-sorted-listlong-for-closest-and-less-than)
         *
         * long X = 500;
         * List<Long> foo = new Arraylist<>();
         * foo.add(450L);
         * foo.add(451L);
         * foo.add(499L);
         * foo.add(501L);
         * foo.add(550L);
         *
         * If we search for something that isn't in the list you can work backward from the return value
         *  to the index you want. If you search for 500 in your example list, the algorithm would return (-3 - 1) = -4.
         * Thus, you can add 1 to get back to the insertion point (-3), and then multiply by -1 and subtract 1 to get
         *  the index BEFORE the first element GREATER than the one you searched for, which will either be an index that
         *  meets your 2 criteria OR -1 if all elements in the list are greater than the one you searched for.
         */
        if(n1 < 0) n1 = -n1-1;
        if(n2 < 0) n2 = -n2-1;

        return this.subList(n1, n2);
    }
}

PointValue 只是一个包含 x 和 y 坐标的类.所以现在我只需调用 subList() 传递我想要的 x 坐标范围.在我的情况下,插入顺序也被排序,这对于使用 Collections.binarySearch()

PointValue is a just a class that contains x and y coordinates. So now I just call subList() passing the range of x coordinates I want. In my case the insertion order is also sorted which is important in order to use Collections.binarySearch()

相关文章