C++ unordered_map using a custom class type as the key

285

I am trying to use a custom class as the key of an unordered_map, as follows:

#include <iostream>
#include <algorithm>
#include <unordered_map>

using namespace std;

class node;
class Solution;

class Node {
public:
    int a;
    int b; 
    int c;
    Node(){}
    Node(vector<int> v) {
        sort(v.begin(), v.end());
        a = v[0];       
        b = v[1];       
        c = v[2];       
    }

    bool operator==(Node i) {
        if ( i.a==this->a && i.b==this->b &&i.c==this->c ) {
            return true;
        } else {
            return false;
        }
    }
};

int main() {
    unordered_map<Node, int> m;    

    vector<int> v;
    v.push_back(3);
    v.push_back(8);
    v.push_back(9);
    Node n(v);

    m[n] = 0;

    return 0;
}

However, g++ gives me the following error:

In file included from /usr/include/c++/4.6/string:50:0,
                 from /usr/include/c++/4.6/bits/locale_classes.h:42,
                 from /usr/include/c++/4.6/bits/ios_base.h:43,
                 from /usr/include/c++/4.6/ios:43,
                 from /usr/include/c++/4.6/ostream:40,
                 from /usr/include/c++/4.6/iostream:40,
                 from 3sum.cpp:4:
/usr/include/c++/4.6/bits/stl_function.h: In member function ‘bool std::equal_to<_Tp>::operator()(const _Tp&, const _Tp&) const [with _Tp = Node]’:
/usr/include/c++/4.6/bits/hashtable_policy.h:768:48:   instantiated from ‘bool std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_M_compare(const _Key&, std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_Hash_code_type, std::__detail::_Hash_node<_Value, false>*) const [with _Key = Node, _Value = std::pair<const Node, int>, _ExtractKey = std::_Select1st<std::pair<const Node, int> >, _Equal = std::equal_to<Node>, _H1 = std::hash<Node>, _H2 = std::__detail::_Mod_range_hashing, std::__detail::_Hash_code_base<_Key, _Value, _ExtractKey, _Equal, _H1, _H2, std::__detail::_Default_ranged_hash, false>::_Hash_code_type = long unsigned int]’
/usr/include/c++/4.6/bits/hashtable.h:897:2:   instantiated from ‘std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node* std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_M_find_node(std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node*, const key_type&, typename std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Hash_code_type) const [with _Key = Node, _Value = std::pair<const Node, int>, _Allocator = std::allocator<std::pair<const Node, int> >, _ExtractKey = std::_Select1st<std::pair<const Node, int> >, _Equal = std::equal_to<Node>, _H1 = std::hash<Node>, _H2 = std::__detail::_Mod_range_hashing, _Hash = std::__detail::_Default_ranged_hash, _RehashPolicy = std::__detail::_Prime_rehash_policy, bool __cache_hash_code = false, bool __constant_iterators = false, bool __unique_keys = true, std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Node = std::__detail::_Hash_node<std::pair<const Node, int>, false>, std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::key_type = Node, typename std::_Hashtable<_Key, _Value, _Allocator, _ExtractKey, _Equal, _H1, _H2, _Hash, _RehashPolicy, __cache_hash_code, __constant_iterators, __unique_keys>::_Hash_code_type = long unsigned int]’
/usr/include/c++/4.6/bits/hashtable_policy.h:546:53:   instantiated from ‘std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::mapped_type& std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::operator[](const _Key&) [with _Key = Node, _Pair = std::pair<const Node, int>, _Hashtable = std::_Hashtable<Node, std::pair<const Node, int>, std::allocator<std::pair<const Node, int> >, std::_Select1st<std::pair<const Node, int> >, std::equal_to<Node>, std::hash<Node>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, false, false, true>, std::__detail::_Map_base<_Key, _Pair, std::_Select1st<_Pair>, true, _Hashtable>::mapped_type = int]’
3sum.cpp:149:5:   instantiated from here
/usr/include/c++/4.6/bits/stl_function.h:209:23: error: passing ‘const Node’ as ‘this’ argument of ‘bool Node::operator==(Node)’ discards qualifiers [-fpermissive]
make: *** [threeSum] Error 1

I guess I need to tell C++ how to hash the class Node, but I am not quite sure how to do that. How can I accomplish this?

— 钟fr

The third template argument is the hash function you need to supply.

— chrisaycock

cppreference has a simple and practical example of how to do this: en.cppreference.com/w/cpp/container/unordered_map/unordered_map

— jogojapan 2013

Answers:

484

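To be able to use std::unordered_map (or one of the other unordered associative containers) with a user-defined key type, you need to define two things:

A hash function; this must be a class that overloads operator() and calculates the hash value given an object of the key type. One particularly straightforward way of doing this is to specialize the std::hash template for your key type.
A comparison function for equality; this is required because the hash table cannot rely on the hash function always producing a unique hash value for every distinct key (i.e., it must be able to deal with collisions), so it needs a way to check whether two given keys match exactly. You can implement this as a class that overloads operator(), as a specialization of std::equal_to, or (easiest of all) by overloading operator==() for your key type, as you did already. Note, however, that the operator must be a const member function, ideally taking its argument by const reference; the missing const is what the "discards qualifiers" error in your output is about.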
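
Applied to the Node class from the question, the two requirements might look roughly like the sketch below. It assumes the input vector always holds at least three values, borrows the multiply-by-31 combiner shown later in this answer, and uses a hypothetical helper count_after_insert() purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// The Node class from the question, reduced to what the container needs.
struct Node {
    int a = 0, b = 0, c = 0;

    Node() = default;
    explicit Node(std::vector<int> v) {
        assert(v.size() >= 3);  // assumption: at least three values are passed
        std::sort(v.begin(), v.end());
        a = v[0];
        b = v[1];
        c = v[2];
    }

    // Requirement 2: equality. The const reference parameter and the trailing
    // const matter, because std::equal_to calls this on const objects.
    bool operator==(const Node& other) const {
        return a == other.a && b == other.b && c == other.c;
    }
};

// Requirement 1: a hash function, here as a specialization of std::hash.
namespace std {
template <>
struct hash<Node> {
    size_t operator()(const Node& n) const {
        size_t res = 17;
        res = res * 31 + hash<int>()(n.a);
        res = res * 31 + hash<int>()(n.b);
        res = res * 31 + hash<int>()(n.c);
        return res;
    }
};
}  // namespace std

// Hypothetical helper: inserts the "same" node twice and reports the map size.
int count_after_insert() {
    std::unordered_map<Node, int> m;
    m[Node(std::vector<int>{9, 3, 8})] = 0;
    m[Node(std::vector<int>{3, 8, 9})] = 1;  // equal key after sorting: overwrites
    return static_cast<int>(m.size());
}
```

With both pieces in place, std::unordered_map<Node, int> should compile without extra template arguments, and the two insertions above land on a single entry.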

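The difficulty with the hash function is that if your key type consists of several members, you will usually have the hash function calculate hash values for the individual members and then somehow combine them into one hash value for the entire object. For good performance (i.e., few collisions) you should think carefully about how to combine the individual hash values, so that different objects do not end up with the same output too often.

A fairly good starting point for a hash function is one that uses bit shifting and bitwise XOR to combine the individual hash values. For example, assuming a key type like this: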

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key &other) const
  { return (first == other.first
            && second == other.second
            && third == other.third);
  }
};

Here is a simple hash function (adapted from the one used in the cppreference example for user-defined hash functions):

namespace std {

  template <>
  struct hash<Key>
  {
    std::size_t operator()(const Key& k) const
    {
      using std::size_t;
      using std::hash;
      using std::string;

      // Compute individual hash values for first,
      // second and third and combine them using XOR
      // and bit shifting:

      return ((hash<string>()(k.first)
               ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
  };

}

With this in place, you can instantiate a std::unordered_map for the key type:

int main()
{
  std::unordered_map<Key,std::string> m6 = {
    { {"John", "Doe", 12}, "example"},
    { {"Mary", "Sue", 21}, "another"}
  };
}

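It will automatically use std::hash<Key> as defined above for the hash value calculations, and the operator== defined as a member function of Key for equality checks.

If you don't want to specialize the template inside the std namespace (although it is perfectly legal in this case), you can define the hash function as a separate class and add it to the template argument list for the map: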

struct KeyHasher
{
  std::size_t operator()(const Key& k) const
  {
    using std::size_t;
    using std::hash;
    using std::string;

    return ((hash<string>()(k.first)
             ^ (hash<string>()(k.second) << 1)) >> 1)
             ^ (hash<int>()(k.third) << 1);
  }
};

int main()
{
  std::unordered_map<Key,std::string,KeyHasher> m6 = {
    { {"John", "Doe", 12}, "example"},
    { {"Mary", "Sue", 21}, "another"}
  };
}

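How to define a better hash function? As said above, defining a good hash function is important to avoid collisions and get good performance. For a really good one you need to take into account the distribution of possible values of all fields and define a hash function that projects that distribution onto a space of possible results that is as wide and as evenly distributed as possible.

This can be difficult. The XOR/bit-shifting method above is probably not a bad start. For a slightly better start, you may use the hash_value and hash_combine function templates from the Boost library. The former behaves similarly to std::hash for standard types (recently also including tuples and other useful standard types); the latter helps you combine individual hash values into one. Here is a rewrite of the hash function that uses the Boost helper functions: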

#include <boost/functional/hash.hpp>

struct KeyHasher
{
  std::size_t operator()(const Key& k) const
  {
      using boost::hash_value;
      using boost::hash_combine;

      // Start with a hash value of 0.
      std::size_t seed = 0;

      // Modify 'seed' by XORing and bit-shifting in
      // one member of 'Key' after the other:
      hash_combine(seed,hash_value(k.first));
      hash_combine(seed,hash_value(k.second));
      hash_combine(seed,hash_value(k.third));

      // Return the result.
      return seed;
  }
};
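
To make the "XORing and bit-shifting" step less opaque, the sketch below imitates what a hash_combine-style helper does to the seed. It is an assumption-laden stand-in, not Boost's code: combine_sketch, KeyHasherSketch and lookup_with_sketch are hypothetical names, the mixing line follows the formula documented for older Boost releases (newer releases may mix bits differently), and std::hash is used in place of boost::hash_value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct Key {
    std::string first;
    std::string second;
    int         third;

    bool operator==(const Key& other) const {
        return first == other.first && second == other.second && third == other.third;
    }
};

// Hypothetical stand-in for a hash_combine-style helper: hashes 'value' and
// folds the result into 'seed' with an additive constant, shifts and XOR.
template <typename T>
void combine_sketch(std::size_t& seed, const T& value) {
    seed ^= std::hash<T>()(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

struct KeyHasherSketch {
    std::size_t operator()(const Key& k) const {
        std::size_t seed = 0;
        combine_sketch(seed, k.first);
        combine_sketch(seed, k.second);
        combine_sketch(seed, k.third);
        return seed;
    }
};

// Hypothetical usage: build a small map and look one entry up again.
std::string lookup_with_sketch() {
    std::unordered_map<Key, std::string, KeyHasherSketch> m;
    m[Key{"John", "Doe", 12}] = "example";
    m[Key{"Mary", "Sue", 21}] = "another";
    assert(m.size() == 2);
    return m.at(Key{"Mary", "Sue", 21});
}
```

Because the running seed feeds back into every step, the order of the members influences the result, which is the property the plain XOR approach lacks.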

Here is a rewrite that does not use Boost, yet still uses a good method of combining the hashes:

namespace std
{
    template <>
    struct hash<Key>
    {
        size_t operator()( const Key& k ) const
        {
            // Compute individual hash values for first, second and third
            // http://stackoverflow.com/a/1646913/126995
            size_t res = 17;
            res = res * 31 + hash<string>()( k.first );
            res = res * 31 + hash<string>()( k.second );
            res = res * 31 + hash<int>()( k.third );
            return res;
        }
    };
}

— Jogojapan

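Can you explain why it is necessary to shift the bits in KeyHasher?

— Chani 2014

If you don't shift and the two strings are the same, the XOR causes them to cancel each other out. So hash("a","a",1) would be the same as hash("b","b",1). The order also wouldn't matter, so hash("a","b",1) would be the same as hash("b","a",1).

— Buge 2014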
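
The effect described in the comment above can be reproduced with a small sketch. Both function names are hypothetical: naive_hash XORs the member hashes without shifting, and shifted_hash repeats the combiner from the answer for comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical naive combiner: XOR of the member hashes without any shifting.
std::size_t naive_hash(const std::string& first, const std::string& second, int third) {
    std::hash<std::string> hs;
    return hs(first) ^ hs(second) ^ std::hash<int>()(third);
}

// The shift-and-XOR combiner from the answer, for comparison.
std::size_t shifted_hash(const std::string& first, const std::string& second, int third) {
    std::hash<std::string> hs;
    return ((hs(first) ^ (hs(second) << 1)) >> 1) ^ (std::hash<int>()(third) << 1);
}

void check_naive_collisions() {
    // Equal strings cancel out under XOR, leaving only the hash of 'third'.
    assert(naive_hash("a", "a", 1) == naive_hash("b", "b", 1));
    // XOR is commutative, so swapping the two strings changes nothing.
    assert(naive_hash("a", "b", 1) == naive_hash("b", "a", 1));
}
```

shifted_hash can be expected to tell these inputs apart in most cases, although no hash function can rule out collisions entirely.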

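I'm just learning C++, and one thing I keep struggling with is: where do I put the code? I have written a std::hash specialization for my key, as you have done. I put it at the bottom of my Key.cpp file, but I am getting the following error: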

Error 57 error C2440: 'type cast' : cannot convert from 'const Key' to 'size_t'	c:\program files (x86)\microsoft visual studio 10.0\vc\include\xfunctional

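I'm guessing the compiler isn't finding my hash method? Should I be adding anything to my Key.h file?

— 2015

@Ben Putting it into the .h file is right. std::hash is not actually a struct, but a template (specialization) for a struct. So it isn't an implementation; it is turned into one when the compiler needs it. Templates should always go into header files. See also stackoverflow.com/questions/495021/...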

— jogojapan

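@nightfury find() returns an iterator that points to an "entry" of the map. An entry is a std::pair consisting of key and value. So if you do auto iter = m6.find({"John","Doe",12}); you will get the key in iter->first and the value (i.e. the string "example") in iter->second. If you want the string directly, you can use either m6.at({"John","Doe",12}) (which throws an exception if the key doesn't exist) or m6[{"John","Doe",12}] (which creates an empty value if the key doesn't exist).

— jogojapan 2015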
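
The three access styles from the comment above, put side by side in a minimal sketch. The helper names make_map, via_find, via_at and via_subscript, the alias KeyMap and the "<missing>" placeholder are hypothetical; Key and KeyHasher repeat the definitions from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

struct Key {
    std::string first;
    std::string second;
    int         third;

    bool operator==(const Key& other) const {
        return first == other.first && second == other.second && third == other.third;
    }
};

struct KeyHasher {
    std::size_t operator()(const Key& k) const {
        using std::hash;
        using std::string;
        return ((hash<string>()(k.first) ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
};

using KeyMap = std::unordered_map<Key, std::string, KeyHasher>;

KeyMap make_map() {
    KeyMap m = {
        { {"John", "Doe", 12}, "example" },
        { {"Mary", "Sue", 21}, "another" }
    };
    assert(m.size() == 2);
    return m;
}

// find(): returns an iterator; end() signals "not found", nothing is inserted.
std::string via_find(const KeyMap& m, const Key& k) {
    auto iter = m.find(k);
    if (iter == m.end()) return "<missing>";
    return iter->second;  // iter->first would be the key
}

// at(): returns the value directly, throws std::out_of_range if the key is absent.
std::string via_at(const KeyMap& m, const Key& k) {
    try {
        return m.at(k);
    } catch (const std::out_of_range&) {
        return "<missing>";
    }
}

// operator[]: returns the value, inserting an empty string if the key is absent.
std::string via_subscript(KeyMap& m, const Key& k) {
    return m[k];
}
```

Note that only via_subscript needs a non-const map, because operator[] may modify the container.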

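I think jogojapan gave a very good and exhaustive answer. You should definitely take a look at it before reading my post. However, I'd like to add the following:

You can define a comparison function for an unordered_map separately, instead of using the equality comparison operator (operator==). This can be helpful, for example, if you want to use the latter to compare all members of two Node objects with each other, but use only some specific members as the key of an unordered_map.
You can also use lambda expressions instead of defining the hash and comparison functions.

All in all, for your Node class, the code could be written as follows: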

using h = std::hash<int>;
auto hash = [](const Node& n){return ((17 * 31 + h()(n.a)) * 31 + h()(n.b)) * 31 + h()(n.c);};
auto equal = [](const Node& l, const Node& r){return l.a == r.a && l.b == r.b && l.c == r.c;};
std::unordered_map<Node, int, decltype(hash), decltype(equal)> m(8, hash, equal);

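Notes:

I just reused the hashing method from the end of jogojapan's answer, but you can find an idea for a more general solution here (if you don't want to use Boost).
My code is maybe a bit too minified. For a slightly more readable version, please see this code on Ideone.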
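
A more spread-out sketch of the same idea, not necessarily identical to the Ideone version, might look like the following. The simplified Node struct and the function name lambda_map_demo are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Simplified stand-in for the Node class from the question.
struct Node {
    int a;
    int b;
    int c;
};

int lambda_map_demo() {
    // Hash: same multiply-by-31 scheme as the one-liner above.
    auto node_hash = [](const Node& n) -> std::size_t {
        std::hash<int> h;
        std::size_t res = 17;
        res = res * 31 + h(n.a);
        res = res * 31 + h(n.b);
        res = res * 31 + h(n.c);
        return res;
    };

    // Equality: defined next to the map rather than as Node::operator==.
    auto node_equal = [](const Node& l, const Node& r) -> bool {
        return l.a == r.a && l.b == r.b && l.c == r.c;
    };

    // The lambda objects have to be passed to the constructor, which is why
    // an initial bucket count (here 8) must be given as well.
    std::unordered_map<Node, int, decltype(node_hash), decltype(node_equal)>
        m(8, node_hash, node_equal);

    m[Node{3, 8, 9}] = 0;
    m[Node{3, 8, 9}] += 5;  // same key, so this updates the existing entry
    m[Node{1, 2, 3}] = 7;
    assert(m.size() == 2);
    return m.at(Node{3, 8, 9});
}
```

Since Node has no operator== here, the map relies entirely on the two lambdas, which illustrates the first point of this answer.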

— honk

Where does the 8 come from? What does it signify?

— AndiChin

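@WhalalalalalalaCHen: Please take a look at the documentation of the unordered_map constructor. The 8 is the so-called "bucket count". A bucket is a slot in the container's internal hash table; see e.g. unordered_map::bucket_count for more information.

— honk

@WhalalalalalalaCHen: I picked 8 at random. Depending on the content you want to store in your unordered_map, the bucket count can affect the performance of the container.

— honk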
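
To see what the bucket count argument does, one can query bucket_count() right after construction and again after inserting elements. The sketch below uses hypothetical helper names and an int-to-string map for brevity; the exact numbers returned are implementation-defined, only the lower bounds are guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: how many buckets does a fresh map with this hint have?
std::size_t initial_buckets(std::size_t requested) {
    std::unordered_map<int, std::string> m(requested);
    // The constructor argument is a minimum; implementations may round up.
    assert(m.bucket_count() >= requested);
    return m.bucket_count();
}

// Hypothetical helper: bucket count after inserting 'count' elements.
std::size_t buckets_after_inserts(std::size_t requested, int count) {
    std::unordered_map<int, std::string> m(requested);
    for (int i = 0; i < count; ++i) {
        m[i] = "value";
    }
    // The container rehashes on its own to keep the load factor within
    // max_load_factor(), which is 1.0 by default.
    return m.bucket_count();
}
```

In other words, the 8 is only a starting hint: the table grows automatically as elements are added, so the value mostly matters for avoiding early rehashes.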