1. Modification of hash table
Note: Here we use a hash bucket to encapsulate unordered_map and unordered_set.
1.1. Hash table node structure
template<class T> struct HashNode { T _data; HashNode<T>* _next; HashNode(const T& data) :_data(data) , _next(nullptr) {} };
Because we want to reuse the hash table, that is, use the same hash table code to encapsulate unordered_map and unordered_set, so here we change the template parameter to T, which is the data type to be stored. For unordered_set, T is directly the data type to be stored; for unordered_map, T is of pair type.
In the insertion method, we use parameter constructors, and when creating nodes, we directly assign data through the constructor, so we also implement a constructor here.
1.2. Iterator
Iterator core source code:
template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc> struct __hashtable_iterator { typedef hashtable<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> hashtable; typedef __hashtable_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> iterator; typedef __hashtable_const_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> const_iterator; typedef __hashtable_node<Value> node; typedef forward_iterator_tag iterator_category; typedef Value value_type; node* cur; hashtable* ht; __hashtable_iterator(node* n, hashtable* tab) : cur(n), ht(tab) {} __hashtable_iterator() {} reference operator*() const { return cur->val; } #ifndef __SGI_STL_NO_ARROW_OPERATOR pointer operator->() const { return &(operator*()); } #endif /* __SGI_STL_NO_ARROW_OPERATOR */ iterator& operator++(); iterator operator++(int); bool operator==(const iterator& it) const { return cur == ; } bool operator!=(const iterator& it) const { return cur != ; } }; template <class V, class K, class HF, class ExK, class EqK, class A> __hashtable_iterator<V, K, HF, ExK, EqK, A>& __hashtable_iterator<V, K, HF, ExK, EqK, A>::operator++() { const node* old = cur; cur = cur->next; if (!cur) { size_type bucket = ht->bkt_num(old->val); while (!cur && ++bucket < ht->()) cur = ht->buckets[bucket]; } return *this; }
Iterator implementation ideas analysis:
- The large framework implemented by iterator is consistent with the iterator idea of list. It encapsulates the pointer of a node with a type, and then implements it through reloading operators. The behavior of iterators accessing like pointers. It should be noted that the iterator of the hash table is a one-way iterator.
- The difficulty of this is the implementation of operator++. There is a pointer to a node in the iterator. If there are still nodes under the current bucket, the pointer to the node can point to the next node. If the current bucket is finished, you need to find a way to calculate and find the next bucket. The difficulty of this is instead a problem of structural design. Referring to the above source code, we can know that in the iterator, in addition to the pointer with nodes, there are also pointers of the hash table object. In this way, when the current bucket is finished, it will be relatively easier to calculate the next bucket. Use the key value to calculate the current bucket position and then find the next bucket that is not empty.
- begin() returns the iterator constructed by the first node pointer in the first non-empty bucket. This end() return iterator can be represented by a null pointer.
- The iterator of unordered_map does not support modifying the key, but can modify the value. We can change the first parameter of the pair of the first template parameter of unordered_map to const K. HashTable<K, pair<const K, V>, MapKeyOfT, Hash> _ht; (The key is not allowed to be modified because the address stored in the hash table is mapped through the Key. If the key is modified, the structure of the hash table is destroyed).
- The iterator of unordered_set does not support modification. We can change the second template parameter of unordered_set to const K, HashTable<K, const K, SetKeyOfT, Hash> _ht; (same as unordered_map).
Specific code:
// Pre-declaration template<class K, class T, class KeyOfT, class Hash> class HashTable; template<class K, class T, class Ptr, class Ref, class KeyOfT, class Hash> struct HTIterator { typedef HashNode<T> Node; typedef HTIterator<K, T, Ptr, Ref, KeyOfT, Hash> Self; Node* _node; const HashTable<K, T, KeyOfT, Hash>* _pht; HTIterator(Node* node, const HashTable<K, T, KeyOfT, Hash>* pht) :_node(node) ,_pht(pht) {} Self& operator++() { if (_node->_next) { //There are nodes in the current bucket _node = _node->_next; } else { //The current bucket is finished, find the next bucket that is not empty KeyOfT kot; Hash hs; size_t hashi = hs(kot(_node->_data)) % _pht->_tables.size(); ++hashi; while (hashi < _pht->_tables.size()) { if (_pht->_tables[hashi]) { break; } ++hashi; } if (hashi == _pht->_tables.size()) { _node = nullptr; //end() } else { _node = _pht->_tables[hashi]; } } return *this; } Ref operator*() { return _node->_data; } Ptr operator->() { return &_node->_data; } bool operator!=(const Self& s) { return _node != s._node; } };
Note: The hash table needs to be predeclared here, because the hash table is used in the iterator, but the compiler searches upwards when compiling, and the hash table is below, an error will be reported because it cannot be found. Putting the hash table on top is not possible, because the iterator will also be encapsulated in the hash table. If the hash table is searched upwards on the top, the iterator will not be found. In short, there must be a predeclaration. In addition, when overloading the ++ operator in the iterator, in order to determine the location of the current node, a private member of the hash table is accessed, so a friend declaration is required in the hash table later.
1.3. Hash table structure
template<class K, class T, class KeyOfT, class Hash> class HashTable { // Friendship statement template<class K, class T, class Ptr, class Ref, class KeyOfT, class Hash> friend struct HTIterator; typedef HashNode<T> Node; //The node does not want to be accessed by the outside world public: typedef HTIterator<K, T, T*, T&, KeyOfT, Hash> Iterator; //Iterator needs to be accessed by the outside world typedef HTIterator<K, T, const T*, const T&, KeyOfT, Hash> ConstIterator; Iterator Begin() { if (_n == 0) //No valid data { return End(); } for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; if (cur) { return Iterator(cur, this); } } return End(); } Iterator End() { return Iterator(nullptr, this); } ConstIterator Begin() const { if (_n == 0) return End(); for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; if (cur) { return ConstIterator(cur, this); } } return End(); } ConstIterator End() const { return ConstIterator(nullptr, this); } HashTable() { _tables.resize(10, nullptr); } ~HashTable() { for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; while (cur) { Node* next = cur->_next; delete cur; cur = next; } _tables[i] = nullptr; } } pair<Iterator,bool> Insert(const T& data) { KeyOfT kot; Iterator it = Find(kot(data)); //Do not if (it != End()) { return make_pair(it,false); } Hash hs; size_t hashi = hs(kot(data)) % _tables.size(); //Load factor==1 expansion if (_n == _tables.size()) { // New nodes are required and old nodes are released, which is less efficient // HashTable<K, V, Hash> newHT; // for (size_t i = 0; i < _tables.size(); i++) // { // Node* cur = _tables[i]; // while (cur) // { // (cur->_kv); // cur = cur->_next; // } // } // _tables.swap(newHT._tables); vector<Node*> newtables(_tables.size() * 2, nullptr); for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; while (cur) { Node* next = cur->_next; //The nodes in the old table are remapping the location in the new table size_t hashi = hs(kot(cur->_data)) % (); cur->_next = newtables[hashi]; newtables[hashi] = cur; cur = next; } //The nodes have been moved to the new table, the old table is empty _tables[i] = nullptr; } _tables.swap(newtables); } //Head plug Node* newnode = new Node(data); newnode->_next = _tables[hashi]; _tables[hashi] = newnode; ++_n; return make_pair(Iterator(newnode,this),true); } Iterator Find(const K& key) { KeyOfT kot; Hash hs; size_t hashi = hs(key) % _tables.size(); Node* cur = _tables[hashi]; while (cur) { if (kot(cur->_data) == key) { return Iterator(cur,this); } cur = cur->_next; } return End(); } bool Erase(const K& key) { KeyOfT kot; Hash hs; size_t hashi = hs(key) % _tables.size(); Node* prev = nullptr; Node* cur = _tables[hashi]; while (cur) { if (kot(cur->_data) == key) { if (prev == nullptr) { _tables[hashi] = cur->_next; } else { prev->_next = cur->_next; } delete cur; --_n; return true; } prev = cur; cur = cur->_next; } return false; } private: vector<Node*> _tables; //Pointer Array size_t _n; //The number of data stored in the table }; }
Why are KeyOfT template parameters required:
Compared with map and set, the simulation implementation class structure of unordered_map and unordered_set is more complex, but the large framework and ideas are completely similar. Because HashTable implements generics and does not know whether the T parameters are K or pair, then when inserting the insert, it is necessary to convert the K object into a shaping modulus and K comparison equally (deduplication), because the value of the pair does not need to participate in the calculation modulus, and the pair supports the key and value equally by default. But in fact, what we need is that we only need to compare K objects at any time. Therefore, we implement a functor of MapKeyOfT and SetKeyOfT in the unordered_map and unordered_set layers respectively to pass it to the KeyOfT of HashTable. Then, the K object in the T-type object is taken out through the KeyOfT functor in the HashTable, and then convert it into a shaping modulus and K comparison equally.
Modification of return value:
Here, in order to conform to the use of unordered_map and unordered_set, the return value of the Find method is changed to an iterator. In order to implement the [ ] operator overload of unordered_map, the return value of the Insert method is the pair type, where the value of the first attribute of the returned pair object is the iterator of the new insert node/original node, and the value of the second attribute is bool type, which means whether the insertion is successful.
1.4. Complete code
namespace hash_bucket { template<class T> struct HashNode { T _data; HashNode<T>* _next; HashNode(const T& data) :_data(data) , _next(nullptr) {} }; // Pre-declaration template<class K, class T, class KeyOfT, class Hash> class HashTable; template<class K, class T, class Ptr, class Ref, class KeyOfT, class Hash> struct HTIterator { typedef HashNode<T> Node; typedef HTIterator<K, T, Ptr, Ref, KeyOfT, Hash> Self; Node* _node; const HashTable<K, T, KeyOfT, Hash>* _pht; HTIterator(Node* node, const HashTable<K, T, KeyOfT, Hash>* pht) :_node(node) ,_pht(pht) {} Self& operator++() { if (_node->_next) { //There are nodes in the current bucket _node = _node->_next; } else { //The current bucket is finished, find the next bucket that is not empty KeyOfT kot; Hash hs; size_t hashi = hs(kot(_node->_data)) % _pht->_tables.size(); ++hashi; while (hashi < _pht->_tables.size()) { if (_pht->_tables[hashi]) { break; } ++hashi; } if (hashi == _pht->_tables.size()) { _node = nullptr; //end() } else { _node = _pht->_tables[hashi]; } } return *this; } Ref operator*() { return _node->_data; } Ptr operator->() { return &_node->_data; } bool operator!=(const Self& s) { return _node != s._node; } }; template<class K, class T, class KeyOfT, class Hash> class HashTable { // Friendship statement template<class K, class T, class Ptr, class Ref, class KeyOfT, class Hash> friend struct HTIterator; typedef HashNode<T> Node; //The node does not want to be accessed by the outside world public: typedef HTIterator<K, T, T*, T&, KeyOfT, Hash> Iterator; //Iterator needs to be accessed by the outside world typedef HTIterator<K, T, const T*, const T&, KeyOfT, Hash> ConstIterator; Iterator Begin() { if (_n == 0) //No valid data { return End(); } for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; if (cur) { return Iterator(cur, this); } } return End(); } Iterator End() { return Iterator(nullptr, this); } ConstIterator Begin() const { if (_n == 0) return End(); for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; if (cur) { return ConstIterator(cur, this); } } return End(); } ConstIterator End() const { return ConstIterator(nullptr, this); } HashTable() { _tables.resize(10, nullptr); } ~HashTable() { for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; while (cur) { Node* next = cur->_next; delete cur; cur = next; } _tables[i] = nullptr; } } pair<Iterator,bool> Insert(const T& data) { KeyOfT kot; Iterator it = Find(kot(data)); //Do not if (it != End()) { return make_pair(it,false); } Hash hs; size_t hashi = hs(kot(data)) % _tables.size(); //Load factor==1 expansion if (_n == _tables.size()) { // New nodes are required and old nodes are released, which is less efficient // HashTable<K, V, Hash> newHT; // for (size_t i = 0; i < _tables.size(); i++) // { // Node* cur = _tables[i]; // while (cur) // { // (cur->_kv); // cur = cur->_next; // } // } // _tables.swap(newHT._tables); vector<Node*> newtables(_tables.size() * 2, nullptr); for (size_t i = 0; i < _tables.size(); i++) { Node* cur = _tables[i]; while (cur) { Node* next = cur->_next; //The nodes in the old table are remapping the location in the new table size_t hashi = hs(kot(cur->_data)) % (); cur->_next = newtables[hashi]; newtables[hashi] = cur; cur = next; } //The nodes have been moved to the new table, the old table is empty _tables[i] = nullptr; } _tables.swap(newtables); } //Head plug Node* newnode = new Node(data); newnode->_next = _tables[hashi]; _tables[hashi] = newnode; ++_n; return make_pair(Iterator(newnode,this),true); } Iterator Find(const K& key) { KeyOfT kot; Hash hs; size_t hashi = hs(key) % _tables.size(); Node* cur = _tables[hashi]; while (cur) { if (kot(cur->_data) == key) { return Iterator(cur,this); } cur = cur->_next; } return End(); } bool Erase(const K& key) { KeyOfT kot; Hash hs; size_t hashi = hs(key) % _tables.size(); Node* prev = nullptr; Node* cur = _tables[hashi]; while (cur) { if (kot(cur->_data) == key) { if (prev == nullptr) { _tables[hashi] = cur->_next; } else { prev->_next = cur->_next; } delete cur; --_n; return true; } prev = cur; cur = cur->_next; } return false; } private: vector<Node*> _tables; //Pointer Array size_t _n; //The number of data stored in the table }; }
2. Unordered_map implementation
There is no difficulty in implementing it here, just put a layer of shell directly, and all calls will eventually adjust the hash table method, so I won't go into details here, just put the code directly.
#include"" namespace bit { template<class K, class V, class Hash = HashFunc<K>> class unordered_map { struct MapKeyOfT { const K& operator()(const pair<K, V>& kv) { return ; } }; public: typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT, Hash>::Iterator iterator; typedef typename hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT, Hash>::ConstIterator const_iterator; iterator begin() { return _ht.Begin(); } iterator end() { return _ht.End(); } const_iterator begin() const { return _ht.Begin(); } const_iterator end() const { return _ht.End(); } V& operator[](const K& key) { pair<iterator, bool> ret = _ht.Insert(make_pair(key, V())); return ->second; } pair<iterator, bool> insert(const pair<K, V>& kv) { return _ht.Insert(kv); } iterator find(const K& key) { return _ht.Find(key); } bool erase(const K& key) { return _ht.Erase(key); } private: hash_bucket::HashTable<K, pair<const K, V>, MapKeyOfT, Hash> _ht; }; void test_map() { unordered_map<string, string> dict; ({ "sort", "Sorting" }); ({ "left", "left" }); ({ "right", "right" }); dict["left"] = "Left, remaining"; dict["insert"] = "insert"; dict["string"]; unordered_map<string, string>::iterator it = (); while (it != ()) { // Cannot modify first, second //it->first += 'x'; it->second += 'x'; cout << it->first << ":" << it->second << endl; ++it; } cout << endl; } }
Implementation of unordered_set
Like unordered_map, it is to directly put a shell on it. All calls will eventually adjust the hash table method, so I won't go into details here, just put the code directly.
#include"" namespace bit { template<class K, class Hash = HashFunc<K>> class unordered_set { struct SetKeyOfT { const K& operator()(const K& key) { return key; } }; public: typedef typename hash_bucket::HashTable<K, const K, SetKeyOfT, Hash>::Iterator iterator; typedef typename hash_bucket::HashTable<K, const K, SetKeyOfT, Hash>::ConstIterator const_iterator; iterator begin() { return _ht.Begin(); } iterator end() { return _ht.End(); } const_iterator begin() const { return _ht.Begin(); } const_iterator end() const { return _ht.End(); } pair<iterator, bool> insert(const K& key) { return _ht.Insert(key); } iterator find(const K& key) { return _ht.Find(key); } bool erase(const K& key) { return _ht.Erase(key); } private: hash_bucket::HashTable<K, const K, SetKeyOfT, Hash> _ht; }; void Print(const unordered_set<int>& s) { unordered_set<int>::const_iterator it = (); while (it != ()) { // *it += 1; cout << *it << " "; ++it; } cout << endl; } struct Date { int _year; int _month; int _day; bool operator==(const Date& d) const { return _year == d._year && _month == d._month && _day == d._day; } }; struct HashDate { size_t operator()(const Date& key) { // 112 // 121 return (key._year * 31 + key._month) * 31 + key._day; } }; void test_set() { unordered_set<int> s; int a[] = { 4, 2, 6, 1, 3, 5, 15, 7, 16, 14, 3,3,15 }; for (auto e : a) { (e); } for (auto e : s) { cout << e << " "; } cout << endl; unordered_set<int>::iterator it = (); while (it != ()) { //*it += 1; cout << *it << " "; ++it; } cout << endl; unordered_set<Date, HashDate> us; ({ 2024, 7, 25 }); ({ 2024, 7, 26 }); Print(s); } }
This is the end of this article about the implementation of unordered encapsulation in C++. For more related content of unordered encapsulation, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!