双向地图是否有更有效的实现?

2021-12-22 00:00:00 data-structures c++ c++11 map bimap

我创建了一个简单的双向映射类，它通过内部存储两个具有相反键/值类型的std::map实例来工作，并提供一个用户友好的界面:

template类 Bimap{std::map地图1;std::map地图2;//...};

有没有更有效的方法来实现不需要两倍内存的双向映射?
bimap 通常是如何实现的?

<小时>

bimap 元素应该是可变的还是不可变的?(更改 map1 中的一个元素应该会更改 map2 中的键，但键是常量，这是不可能的 - 解决方案是什么?)
元素的所有权也是另一个问题:当用户在 bimap 中插入键值对时，bimap 应该制作该键值对的副本并存储它，然后内部的第二个映射(与反转的键/值)不应复制而是指向原始对.如何实现?

<小时>

编辑 2:

我在代码审查中发布了一个可能的实现.

解决方案

在双映射的所有简单实现中双重存储数据存在一定的问题.如果您可以将其分解为来自外部的指针双映射，那么您可以轻松地忽略这一点，并像 Arkaitz Jimenez 一样简单地保留 std::map<A*,B*> 形式的两个映射已经建议(尽管与他的回答相反，您必须关心外部存储以避免 A->A* 查找).但是如果你有指针，为什么不简单地存储一个 std::pair<A,B> 在你本来存储 A 和 B 的地方 分开?

使用 std::map 而不是 std::map 会更好例如，将允许通过新创建的具有相同内容的字符串而不是指向创建该对的原始字符串的指针来查找与字符串关联的元素.但是习惯上为每个条目存储密钥的完整副本，并且仅依靠哈希来找到正确的存储桶.这样，即使在散列冲突的情况下，返回的项目也将是正确的...

如果你想让它又快又脏，有这个

<块引用>

骇人听闻的解决方案:

创建两个地图 std::mapmapA 和 std::map地图B.插入时对要插入的两个元素进行散列以获取相应映射的键.

void insert(const A &a, const B &b) {size_t hashA = std::hash(a);size_t hashB = std::hash(b);mapA.insert({hashB, a});mapB.insert({hashA, b});}

查找的实现方式类似.

使用 multimap 而不是 map 并通过在其他地图中查找来验证您获得的每个元素(从以下位置获取候选 bmapA，散列 b 并查看 mapB 如果它匹配想要的键，否则迭代到下一个候选 b)这是一个有效的实现 -但在我看来仍然是骇人听闻的...

通过使用用于比较条目(见上文)的元素副本作为唯一存储，您可以获得更好的解决方案.但是，要解决这个问题有点困难.详细说明:

<块引用>

更好的解决方案:

创建两组对作为 std::set 和 std::set> 并重载 operator< 和 operator== 以仅考虑对的第一个元素(或提供相应的比较类).有必要创建成对集而不是映射(内部看起来类似)，因为我们需要保证 A 和 B 将在内存中处于恒定位置.在插入 pair 后，我们将其拆分为适合上述集合的两个元素.

 std::set>地图A;std::set地图B;void insert(const A &a, const B &b) {auto aitr = mapA.insert({b, nullptr}).first;//创建第一对B *bp = &(aitr->first);//获取我们存储的 b 副本的指针auto bitr = mapB.insert({a, bp}).first;//插入第二对 {a,pointer_to_b}A *ap = &(bitr->first);//更新 mapA 中的指针以指向 aaitr->second = ap;}

<块引用>

现在可以简单地通过简单的 std::set 查找和指针取消引用来完成查找.

这个更好的解决方案类似于 boost 使用的解决方案 - 即使它们使用一些匿名指针作为对的第二个元素，因此必须使用 reinterpret_casts.

请注意，对的 .second 部分需要是可变的(所以我不确定 std::pair 是否可以使用)，或者您必须即使对于这个简单的插入，添加另一层抽象(std::set> mapA).在这两种解决方案中，您都需要临时元素来返回对元素的非常量引用.

I created a simple bidirectional map class that works by internally storing two std::map instances, with opposite key/value types, and providing an user-friendly interface:

template<class T1, class T2> class Bimap
{
    std::map<T1, T2> map1;
    std::map<T2, T1> map2;
    // ...
};

Is there a more efficient method of implementing a bidirectional map that doesn't require twice the memory?
How is a bimap usually implemented?

EDIT:

Should bimap element be mutable or immutable? (Changing one element in map1 should change the key in map2, but keys are const and that's impossible - what's the solution?)
Ownership of elements is also another problem: when an user inserts a key-value pair in the bimap, the bimap should make a copy of that key-value pair and store it, then the internal second map (with inverted key/value) should not copy but point to the original pair. How can this be achieved?

EDIT 2:

I've posted a possible implementation I made on Code Review.

解决方案

There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*> like Arkaitz Jimenez already suggested (though contrary to his answer you have to care about the storage from outside to avoid a A->A* lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B> at the point where you would otherwise store A and B separately?

It would be nice to have std::map<A,B*> instead of std::map<A*,B*> as this would allow for example the lookup of an element associated to an string by a newly created string with the same content instead of the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash-collision...

If you want to have it quick and dirty though, there is this

hackish solution:

Create two maps std::map<size_t, A> mapA and std::map<size_t, B> mapB. Upon insertion hash both elements that are to be inserted to get the keys to the respective maps.
void insert(const A &a, const B &b) {
    size_t hashA = std::hash<A>(a);
    size_t hashB = std::hash<B>(b);

    mapA.insert({hashB, a});
    mapB.insert({hashA, b});
}
Lookup is implemented analogously.

Using a multimap instead of a map and verifying every element you get with a lookup in the respectively other map (get candidate b from mapA, hash b and look in mapB if it matches the wanted key, iterate to the next candidate b otherwise) this is a valid implementation - but still hackish in my opinion...

You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as only storage. It is a bit harder to get your head around that though. To elaborate:

a nicer solution:

Create two sets of pairs as std::set<pair<A, B*>> and std::set<pair<B, A*>> and overload the operator< and operator== to only take the first element of the pairs into account (or provide an corresponding comparion class). It is necessary to create sets of pairs instead of maps (which internally look similarly) because we need a guarantee that A and B will be at constant positions in memory. Upon insertion of an pair<A, B> we split it into two elements that fit into the above sets.

  std::set<pair<B, A*>> mapA;
  std::set<pair<A, B*>> mapB;

  void insert(const A &a, const B &b) {
      auto aitr = mapA.insert({b, nullptr}).first; // creates first pair
      B *bp = &(aitr->first);  // get pointer of our stored copy of b
      auto bitr = mapB.insert({a, bp}).first; 
      // insert second pair {a, pointer_to_b}
      A *ap = &(bitr->first);  // update pointer in mapA to point to a
      aitr->second = ap;
  }

Lookup can now simply be done by a simple std::set lookup and a pointer dereference.

This nicer solution is similar to the solution that boost uses - even though they use some annonymized pointers as second elements of the pairs and thus have to use reinterpret_casts.

Note that the .second part of the pairs need to be mutable (so I'm not sure std::pair can be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.

相关文章