A std::map that keeps track of the order of insertion?

113

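I currently have a std::map<std::string,int> that stores an integer value for each unique string identifier, and I look values up by that string. It does mostly what I want, except that it does not keep track of the insertion order. So when I iterate over the map to print the values, they come out sorted by string; but I want them ordered by (first) insertion.

I thought about using a vector<pair<string,int>> instead, but I need to look up the string and increment the integer value about 10,000,000 times, so I don't know whether a std::vector would be significantly slower.

Is there a way to do this with std::map, or is there another std container that better suits my need?

[I'm on GCC 3.4, and I probably have no more than 50 pairs of values in my std::map.]

Thanks.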

— polyglot
source

8

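Part of std::map's fast lookup time comes from the fact that it is kept sorted, so it can do a binary search. You just can't have your cake and eat it too!

— bobobobo, 2012

1

What did you end up using back then?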

— aggsol

56

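If you have only 50 values in the std::map, you could copy them to a std::vector before printing and sort them with std::sort using an appropriate functor.

Or you could use boost::multi_index. It allows several indexes on one container. In your case it could look like the following: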

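#include <string>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>

using std::string;
using namespace boost::multi_index;

struct value_t {
    string s;
    int    i;
};
struct string_tag {};
typedef multi_index_container<
    value_t,
    indexed_by<
        random_access<>, // this index represents insertion order
        hashed_unique< tag<string_tag>, member<value_t, string, &value_t::s> > // lookup by string
    >
> values_t;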

— Kirill V. Lyadvinsky
source

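That's great! Boost even has a member selector to do the job!

— xtofl

2

Yes, multi_index is my favorite feature in Boost :)

— Kirill V. Lyadvinsky

3

@Kristo: this is not about the size of the container, it's about reusing an existing implementation for exactly this problem. That's nice. Admittedly, C++ isn't a functional language, so the syntax is somewhat elaborate.

— xtofl

4

Since when is programming about saving keystrokes?

— GManNickG

1

Thanks for posting this. Is there a "Boost multi-index for dummies" book? I could use it...

— don bright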

25

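You could combine a std::vector with a std::tr1::unordered_map (a hash table). Here's a link to Boost's documentation for unordered_map. You can use the vector to keep track of the insertion order and the hash table to do the frequent lookups. If you're doing hundreds of thousands of lookups, the difference between O(log n) lookup for std::map and O(1) for a hash table might be significant.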

std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;

// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");

/* Increment things in myTable 100000 times */

// Print the final results.
for (int i = 0; i < insertOrder.size(); ++i)
{
    const std::string &s = insertOrder[i];
    std::cout << s << ' ' << myTable[s] << '\n';
}
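
The "increment things" step elided above also has to decide when a key is new, so that it is appended to the order vector exactly once. A minimal sketch of that step follows. It assumes a C++11 compiler and uses std::unordered_map in place of std::tr1::unordered_map; the OrderedCounter type and its member names are hypothetical, not part of the answer.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper type (not from the answer): counts per key and
// remembers the order in which keys were first seen.
struct OrderedCounter {
    std::vector<std::string> insertOrder;         // keys, in first-insertion order
    std::unordered_map<std::string, long> table;  // key -> running count

    // Add `delta` to the count for `key`, recording the key on first sight.
    void increment(const std::string& key, long delta = 1) {
        std::unordered_map<std::string, long>::iterator it = table.find(key);
        if (it == table.end()) {
            it = table.insert(std::make_pair(key, 0L)).first;
            insertOrder.push_back(key);
        }
        it->second += delta;
    }

    // Return (key, count) pairs in first-insertion order.
    std::vector<std::pair<std::string, long> > snapshot() const {
        std::vector<std::pair<std::string, long> > out;
        out.reserve(insertOrder.size());
        for (std::size_t i = 0; i < insertOrder.size(); ++i) {
            out.push_back(std::make_pair(insertOrder[i], table.at(insertOrder[i])));
        }
        return out;
    }
};
```

Calling increment("foo"), increment("bar"), increment("foo", 2) and then snapshot() yields foo before bar, because the vector only grows when a key is seen for the first time.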

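— Michael Kristofik
source

4

@xtofl, how does that make my answer unhelpful and thus worthy of a downvote? Is my code incorrect in some way?

— Michael Kristofik

This is the best way to do it. Very cheap memory cost (for only 50 strings!), it lets std::map work as it's supposed to (i.e. by sorting itself as you insert), and it has a fast runtime. (I read this after writing my own version, in which I used std::list!)

— bobobobo, 2012

I think std::vector versus std::list is a matter of taste, and it is not clear which is better. (A vector has random access, which isn't needed, and contiguous memory, which isn't needed either. A list stores the order without the cost of either of those two features, e.g. reallocation while growing.)

— Oliver Schönrock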

14

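Keep a parallel list<string> insertionOrder.

When it is time to print, iterate over the list and do the lookups in the map.

// Walk insertionOrder; the lookup itself happens in the map (called counts here).
for (std::list<std::string>::const_iterator it = insertionOrder.begin();
     it != insertionOrder.end(); ++it)
    std::cout << *it << ' ' << counts[*it] << '\n';

— bobobobo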
source

1

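This was my first thought too, but it duplicates the keys in a second container, right? With a std::string key that's not great, is it?

— Oliver Schönrock

2

@OliverSchonrock As of C++17, you can use std::string_view for the map's keys, referring to the std::string in the insertionOrder list. This avoids copying, but you need to make sure that the insertionOrder elements outlive the map keys that refer to them.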
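
The string_view idea can be sketched as follows, assuming C++17. The class and member names are hypothetical. It relies on std::list never relocating its elements, so a view into a stored string stays valid for as long as that element stays in the list.

```cpp
#include <list>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical wrapper: keys live in the list; the map holds string_views
// that point into those list elements, so each key is stored only once.
class ViewKeyedCounter {
public:
    ViewKeyedCounter() = default;
    // Copying would leave the copy's views pointing at the original's strings.
    ViewKeyedCounter(const ViewKeyedCounter&) = delete;
    ViewKeyedCounter& operator=(const ViewKeyedCounter&) = delete;

    // Increment the count for `key`, appending it to the order list on first use.
    void increment(std::string_view key) {
        auto it = counts_.find(key);
        if (it == counts_.end()) {
            order_.emplace_back(key);  // the only copy of the key
            it = counts_.emplace(std::string_view(order_.back()), 0).first;
        }
        ++it->second;
    }

    // Current count for `key`, or 0 if it was never inserted.
    int count(std::string_view key) const {
        auto it = counts_.find(key);
        return it == counts_.end() ? 0 : it->second;
    }

    // Keys in first-insertion order.
    const std::list<std::string>& order() const { return order_; }

private:
    // Declared before counts_, so it is destroyed after the map that refers to it.
    std::list<std::string> order_;
    std::unordered_map<std::string_view, int> counts_;
};
```

Erasing a key would have to remove the map entry before the list element, for the lifetime reason given in the comment above.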

— flyx

I ended up writing a container that integrates the map and the list into one: codereview.stackexchange.com/questions/233177/… No duplication.

— Oliver Schönrock, 2019

10

Tessil has a very nice implementation of an ordered map (and set), released under the MIT license. You can find it here: ordered-map

Map example

#include <iostream>
#include <string>
#include <cstdlib>
#include "ordered_map.h"

int main() {
    tsl::ordered_map<char, int> map = {{'d', 1}, {'a', 2}, {'g', 3}};
    map.insert({'b', 4});
    map['h'] = 5;
    map['e'] = 6;

    map.erase('a');

    // {d, 1} {g, 3} {b, 4} {h, 5} {e, 6}
    for(const auto& key_value : map) {
        std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
    }

    map.unordered_erase('b');

    // Break order: {d, 1} {g, 3} {e, 6} {h, 5}
    for(const auto& key_value : map) {
        std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
    }
}

— aggsol
source

4

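If you need both lookup strategies, you will end up with two containers. You can use a vector holding your actual values (ints), and put a map<string, vector<T>::difference_type> next to it that returns the index into the vector.

To round that off, you can encapsulate both in one class.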
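
A minimal sketch of such a class, under a few assumptions that are not in the answer: the class name IndexedCounter is hypothetical, the index type is std::size_t rather than difference_type, and the vector also stores a copy of each key so that entries can be printed by name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: a vector keeps values in insertion order, and a map
// next to it translates a key into the index of its value in the vector.
class IndexedCounter {
public:
    typedef std::pair<std::string, int> Entry;

    // Add `delta` to the value for `key`, creating the entry on first use.
    void add(const std::string& key, int delta) {
        std::map<std::string, std::size_t>::iterator it = index_.find(key);
        if (it == index_.end()) {
            // The new entry goes at the back, so its index is the current size.
            it = index_.insert(std::make_pair(key, entries_.size())).first;
            entries_.push_back(Entry(key, 0));
        }
        entries_[it->second].second += delta;
    }

    // Entries in first-insertion order.
    const std::vector<Entry>& entries() const { return entries_; }

private:
    std::vector<Entry> entries_;
    std::map<std::string, std::size_t> index_;
};
```

Because nothing is ever erased here, the stored indices stay valid; supporting removal would mean renumbering the indices after the erased position.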

But I believe Boost has a container with multiple indices.

— xtofl
source

3

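What you want (without resorting to Boost) is what I call an "ordered hash", which is essentially a mashup of a hash and a linked list with string or integer keys (or both at the same time). An ordered hash maintains the order of the elements during iteration while keeping the raw performance of a hash.

I've been putting together a relatively new C++ snippet library that fills in what I see as holes in the C++ language for C++ library developers. Go here:

https://github.com/cubiclesoft/cross-platform-cpp

Grab:

templates/detachable_ordered_hash.cpp
templates/detachable_ordered_hash.h
templates/detachable_ordered_hash_util.h

If user-controlled data will be placed into the hash, you might also want:

security/security_csprng.cpp
security/security_csprng.h

Invoke it: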

#include "templates/detachable_ordered_hash.h"
...
// The 47 is the nearest prime to a power of two
// that is close to your data size.
//
// If your brain hurts, just use the lookup table
// in 'detachable_ordered_hash.cpp'.
//
// If you don't care about some minimal memory thrashing,
// just use a value of 3.  It'll auto-resize itself.
int y;
CubicleSoft::OrderedHash<int> TempHash(47);
// If you need a secure hash (many hashes are vulnerable
// to DoS attacks), pass in two randomly selected 64-bit
// integer keys.  Construct with CSPRNG.
// CubicleSoft::OrderedHash<int> TempHash(47, Key1, Key2);
CubicleSoft::OrderedHashNode<int> *Node;
...
// Push() for string keys takes a pointer to the string,
// its length, and the value to store.  The new node is
// pushed onto the end of the linked list and wherever it
// goes in the hash.
y = 80;
TempHash.Push("key1", 5, y++);
TempHash.Push("key22", 6, y++);
TempHash.Push("key3", 5, y++);
// Adding an integer key into the same hash just for kicks.
TempHash.Push(12345, y++);
...
// Finding a node and modifying its value.
Node = TempHash.Find("key1", 5);
Node->Value = y++;
...
Node = TempHash.FirstList();
while (Node != NULL)
{
  if (Node->GetStrKey())  printf("%s => %d\n", Node->GetStrKey(), Node->Value);
  else  printf("%d => %d\n", (int)Node->GetIntKey(), Node->Value);

  Node = Node->NextList();
}

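I ran into this SO thread during my research phase, while checking whether anything like OrderedHash already existed without requiring me to drop in a massive library. I was disappointed. So I wrote my own. And now I've shared it.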

— CubicleSoft
source

2

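You cannot do that with a map alone, but you could use two separate structures, a map and a vector, and keep them synchronized: when you delete from the map, find and delete the element in the vector as well. Or you could create a map<string, pair<int,int>>, store the map's size() on insert to record the position alongside the int value, and then sort by the position member when you print.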
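
The second variant (recording the position at insertion and sorting only when printing) can be sketched like this, assuming C++11. The names Slot, PositionedMap, addTo and inInsertionOrder are hypothetical, and a small struct stands in for pair<int,int> for readability. Using size() as the position only works as long as nothing is erased.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Slot {
    int value;             // the counter itself
    std::size_t position;  // rank of the key's first insertion
};
typedef std::map<std::string, Slot> PositionedMap;

// Add `delta` to the value for `key`; a new key records the map's size as its position.
void addTo(PositionedMap& m, const std::string& key, int delta) {
    PositionedMap::iterator it = m.find(key);
    if (it == m.end()) {
        Slot s = { 0, m.size() };  // size() before insertion = insertion rank
        it = m.insert(std::make_pair(key, s)).first;
    }
    it->second.value += delta;
}

// Copy out (key, value) pairs sorted by recorded position instead of by key.
std::vector<std::pair<std::string, int>> inInsertionOrder(const PositionedMap& m) {
    std::vector<const PositionedMap::value_type*> rows;
    rows.reserve(m.size());
    for (const auto& kv : m) rows.push_back(&kv);
    std::sort(rows.begin(), rows.end(),
              [](const PositionedMap::value_type* a, const PositionedMap::value_type* b) {
                  return a->second.position < b->second.position;
              });
    std::vector<std::pair<std::string, int>> out;
    out.reserve(rows.size());
    for (const auto* r : rows) out.emplace_back(r->first, r->second.value);
    return out;
}
```

The sort runs once per print rather than once per lookup, so the 10,000,000 increments still go through the ordinary map lookup.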

— Faisal Vali
source

2

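Another way to implement this is with a map instead of a vector. I will show you this approach and discuss the differences:

Just create a class that has two maps behind the scenes.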

#include <map>
#include <string>

using namespace std;

class SpecialMap {
  // usual stuff...

 private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> data_;
};

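Then you can expose an iterator over data_ in the proper order. The way to do that is to iterate through insertion_order_ and, for each element you get from that iteration, do a lookup in data_ with the value from insertion_order_.

You can use the more efficient hash_map for data_, since you never iterate over data_ directly; insertion_order_ has to stay an ordered map because it is walked in key order.

To do insertions, you can have a method like this: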

void SpecialMap::Insert(const string& key, int value) {
  // This may be an over simplification... You ought to check
  // if you are overwriting a value in data_ so that you can update
  // insertion_order_ accordingly
  insertion_order_[counter_++] = key;
  data_[key] = value;
}

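There are a lot of ways to make the design better and to worry about performance, but this is a good skeleton to get you started on implementing this functionality yourself. You can make it templated, and you might actually store pairs as values in data_ so that you can easily reference the entry in insertion_order_. But I leave these design issues as an exercise :-).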
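
One way the skeleton might be filled in is sketched below. It adds the overwrite check that the comment in Insert asks for and a hypothetical InOrder() method that walks insertion_order_ and looks each key up in data_. Keeping the first-insertion position on overwrite is an assumption, chosen to match what the question asks for.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

class SpecialMap {
public:
    SpecialMap() : counter_(0) {}

    // Insert or overwrite. A key keeps the position of its first insertion.
    void Insert(const std::string& key, int value) {
        std::map<std::string, int>::iterator it = data_.find(key);
        if (it == data_.end()) {
            insertion_order_[counter_++] = key;
            data_[key] = value;
        } else {
            it->second = value;  // existing key: update the value only
        }
    }

    // Walk insertion_order_ (sorted by counter) and look each key up in data_.
    std::vector<std::pair<std::string, int> > InOrder() const {
        std::vector<std::pair<std::string, int> > out;
        for (std::map<int, std::string>::const_iterator it = insertion_order_.begin();
             it != insertion_order_.end(); ++it) {
            out.push_back(std::make_pair(it->second, data_.find(it->second)->second));
        }
        return out;
    }

private:
    int counter_;
    std::map<int, std::string> insertion_order_;
    std::map<std::string, int> data_;
};
```

Returning a vector keeps the sketch short; a real iterator over insertion_order_ would avoid the copy.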

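Update: I suppose I should say something about the efficiency of using a map versus a vector for insertion_order_

Lookups directly into data_ cost the same in both cases: O(log n) with the std::map shown above, or O(1) on average if data_ is a hash map
Inserts in the vector approach are amortized O(1); inserts in the map approach are O(log n)
Deletes in the vector approach are O(n), because you have to scan for the item to remove. With the map approach they are O(log n).

If you are not going to use deletes that much, you should probably use the vector approach. The map approach would be better if you were supporting a different ordering (such as priority) instead of insertion order.

— Tom
source

The map approach is also better if you need to get items by "insertion id". For example, if you want the item that was inserted 5th, you do a lookup in insertion_order_ with key 5 (or 4, depending on where counter_ starts). With the vector approach, if the 5th item has been deleted, you would actually get the item that was inserted 6th.

— Tom, 2009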

2

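Here is a solution that requires only the standard template library, without using Boost's multi_index:
You can use a std::map<std::string,int> and a vector<data>, where the map stores the index of the data's position in the vector and the vector stores the data in insertion order. Here, data access has O(log n) complexity. Displaying the data in insertion order has O(n) complexity. Data insertion has O(log n) complexity.

For example: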

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct data {
    int value;
    std::string s;
};

typedef std::map<std::string, int> MapIndex;  // maps a string to the index of its
                                              // data in VectorData
typedef std::vector<data> VectorData;         // stores the data in insertion order

void display_data_according_insertion_order(const VectorData& vectorData) {
    for (VectorData::const_iterator it = vectorData.begin(); it != vectorData.end(); ++it) {
        std::cout << it->value << ' ' << it->s << std::endl;
    }
}

int lookup_string(const std::string& s, const MapIndex& mapIndex) {
    MapIndex::const_iterator pt = mapIndex.find(s);
    if (pt != mapIndex.end()) return pt->second;
    else return -1;  // signifies that the key does not exist in the map
}

int insert_value(const data& d, MapIndex& mapIndex, VectorData& vectorData) {
    if (mapIndex.find(d.s) == mapIndex.end()) {
        // The data is appended at the back, so its index is the size of the
        // vector before insertion.
        mapIndex.insert(std::make_pair(d.s, static_cast<int>(vectorData.size())));
        vectorData.push_back(d);
        return 1;
    }
    else return 0;  // insertion failed: the string is already present, and the
                    // map stores unique keys
}

— Himanshu Pandey
source

1

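This is somewhat related to Faisal's answer. You can create a wrapper class around a map and a vector and easily keep them synchronized. Proper encapsulation lets you control the access methods and hence which container gets used: the vector or the map. This avoids using Boost or anything like it.

— Polaris878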
source

1

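One thing you need to consider is the small number of data elements you are using. It is possible that using just the vector will be faster. A map has some overhead that can make lookups in small data sets more expensive than in a simple vector. So, if you know that you will always be using around the same number of elements, do some benchmarking and see whether the performance of the map and the vector is what you really think it is. You may find that a lookup in a vector with only 50 elements costs nearly the same as one in the map.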
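
For reference, the vector-only variant that such a benchmark would compare against the map could look like the sketch below. FlatCounter and increment are hypothetical names. It makes no claim about which container is faster; that is what the measurement is for.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, int> > FlatCounter;

// Linear scan over the vector. A miss appends the key with a count of 1,
// so the vector stays in first-insertion order without any second container.
void increment(FlatCounter& c, const std::string& key) {
    for (std::size_t i = 0; i < c.size(); ++i) {
        if (c[i].first == key) {
            ++c[i].second;
            return;
        }
    }
    c.push_back(std::make_pair(key, 1));
}
```

With at most 50 entries, each call compares against at most 50 strings, and printing in insertion order is a plain loop over the vector.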

— Chad Simpkins
source

1

// It should be like this, man!

// This keeps insertion at O(log N), and deletion is O(log N) as well.

class SpecialMap {
private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> insertion_order_reverse_look_up; // <- for fast delete
  map<string, Data> data_;
};
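
A sketch of how the reverse lookup map makes deletion O(log N) follows. The class name ErasableSpecialMap and the method names are hypothetical, and int stands in for the Data type, which the answer leaves undefined.

```cpp
#include <map>
#include <string>
#include <vector>

class ErasableSpecialMap {
public:
    ErasableSpecialMap() : counter_(0) {}

    // Insert or overwrite; a new key gets the next sequence number.
    void Insert(const std::string& key, int value) {
        if (data_.find(key) == data_.end()) {
            insertion_order_[counter_] = key;
            insertion_order_reverse_look_up_[key] = counter_;
            ++counter_;
        }
        data_[key] = value;
    }

    // The reverse map yields the sequence number directly, so no scan is needed.
    bool Erase(const std::string& key) {
        std::map<std::string, int>::iterator rev = insertion_order_reverse_look_up_.find(key);
        if (rev == insertion_order_reverse_look_up_.end()) return false;
        insertion_order_.erase(rev->second);
        insertion_order_reverse_look_up_.erase(rev);
        data_.erase(key);
        return true;
    }

    // Remaining keys in insertion order.
    std::vector<std::string> KeysInOrder() const {
        std::vector<std::string> keys;
        for (std::map<int, std::string>::const_iterator it = insertion_order_.begin();
             it != insertion_order_.end(); ++it) {
            keys.push_back(it->second);
        }
        return keys;
    }

private:
    int counter_;
    std::map<int, std::string> insertion_order_;
    std::map<std::string, int> insertion_order_reverse_look_up_;  // for fast delete
    std::map<std::string, int> data_;
};
```

Sequence numbers are never reused after an erase, so the remaining keys keep their relative order.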

— Ka Yan
source

0

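Use boost::multi_index with map and list indices.

— Vladimir Voznesensky
source

-1

A map of pair (str, int) plus a static int that is incremented on each insert call and indexes the pairs of data. Maybe put in a struct that can return the static int value through an index() member?

— Mike
source

2

You should add an example.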

— m02ph3u5
