How do I create a nested dict in Python?

149

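I have two CSV files: "Data" and "Mapping":

The "Mapping" file has 4 columns: Device_Name, GDN, Device_Type, and Device_OS. All four columns are populated.
The "Data" file has these same columns, with the Device_Name column populated and the other three columns blank.
I want my Python code to open both files and, for each Device_Name in the Data file, map its GDN, Device_Type, and Device_OS values from the Mapping file.

I know how to use a dict when only 2 columns are present (1 needs to be mapped), but I don't know how to accomplish this when 3 columns need to be mapped.

The following is the code with which I tried to accomplish the mapping of Device_Type: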

x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
    file_map = csv.reader(in_file1, delimiter=',')
    for row in file_map:
       typemap = [row[0],row[2]]
       x.append(typemap)

with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
    writer = csv.writer(out_file, delimiter=',')
    for row in csv.reader(in_file2, delimiter=','):
         try:
              row[27] = x[row[11]]
         except KeyError:
              row[27] = ""
         writer.writerow(row)

It returns an AttributeError.

After some research, I think I need to create a nested dict, but I don't have any idea how to do this.

— atams

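The Device_Name column is the key in both files. On this key I want to map the Device_OS, GDN, and Device_Type values from the Mapping file to the Data file.

— atams, 2013

Do you want to do something like row[27] = x[row[11]]["Device_OS"]?

— Janne Karila

See also: searching keys in nested dictionaries with Python dpath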

— dreftymac

A nested dict is not strictly necessary. You could use pandas read_csv to make Device_Name the index, then join the two dataframes directly on their Device_Name index.

— smci

307

A nested dict is a dictionary within a dictionary. A very simple thing.

>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d
{'dict1': {'innerkey': 'value'}}

You can also use a defaultdict from the collections package to make creating nested dictionaries easier.

>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d  # currently a defaultdict type
defaultdict(<class 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d)  # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}

You can populate it however you want.

I would recommend something like the following in your code:

d = {}  # can use defaultdict(dict) instead

for row in file_map:
    # derive row key from something; here the first column is assumed to be the key
    row_key = row[0]
    # when using defaultdict, we can skip the next step creating a dictionary on row_key
    d[row_key] = {}
    for idx, col in enumerate(row):
        d[row_key][idx] = col

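According to your comment:

Maybe the code above is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i j k l, and b.csv also has these columns. i is the key column for both CSVs. The j k l columns are empty in a.csv but populated in b.csv. I want to map the values of the j k l columns from b.csv to a.csv, using 'i' as the key column.

My suggestion would be something like this (without using defaultdict):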

a_file = "path/to/a.csv"
b_file = "path/to/b.csv"

# read from file a.csv
with open(a_file) as f:
    # skip headers
    next(f)
    # get first column as keys; build the set now, while the file is still open
    keys = set(line.strip().split(',')[0] for line in f)

# create empty dictionary:
d = {}

# read from file b.csv
with open(b_file) as f:
    # gather headers except first key header (strip the trailing newline first)
    headers = next(f).strip().split(',')[1:]
    # iterate lines
    for line in f:
        # gather the columns
        cols = line.strip().split(',')
        # check to make sure this key should be mapped.
        if cols[0] not in keys:
            continue
        # add key to dict
        d[cols[0]] = dict(
            # inner keys are the header names, values are columns
            (headers[idx], v) for idx, v in enumerate(cols[1:]))

Please note, though, that for parsing CSV files there is the csv module.
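
As a rough sketch of what that could look like with csv.DictReader: the file contents below are made-up sample data held in io.StringIO so the snippet runs on its own, and the names mapping and filled are only illustrative. With real files you would pass the opened file objects instead.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the Mapping and Data files.
mapping_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "alpha,G1,router,ios\n"
    "beta,G2,switch,nxos\n"
)
data_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "alpha,,,\n"
    "gamma,,,\n"
)

# Outer key: Device_Name. Inner dict: the remaining columns of that row.
mapping = {}
for row in csv.DictReader(mapping_csv):
    name = row.pop("Device_Name")
    mapping[name] = dict(row)

# Fill the blank columns of each data row; unknown devices stay blank.
filled = []
for row in csv.DictReader(data_csv):
    row.update(mapping.get(row["Device_Name"], {}))
    filled.append(dict(row))

print(mapping["alpha"]["Device_OS"])
print(filled[0])
```

The inner dicts are keyed by header name, so a lookup such as mapping["alpha"]["Device_OS"] reads the same way as the nested dict built by hand above.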

— Inbar Rose

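Maybe the code above is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i j k l, and b.csv also has these columns. i is the key column for both CSVs. The j k l columns are empty in a.csv but populated in b.csv. I want to map the values of the j k l columns from b.csv to a.csv, using 'i' as the key column.

— atams, 2013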

64

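Update: for nested dictionaries of arbitrary depth, go to this answer.

Use defaultdict from the collections module.

Performance: it avoids an explicit "if key not in dict" check before every insert, which adds up when the data set is large.

Low maintenance: it makes the code more readable and easy to extend.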

from collections import defaultdict

target_dict = defaultdict(dict)
target_dict[key1][key2] = val
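
To make the comparison concrete, here is a small sketch that builds the same nested dict twice, once with the explicit membership check and once with defaultdict(dict). The records list is invented sample data, not something from the question.

```python
from collections import defaultdict

# Hypothetical (outer key, inner key, value) triples.
records = [
    ("dev1", "GDN", "G1"),
    ("dev1", "Device_OS", "ios"),
    ("dev2", "GDN", "G2"),
]

# Plain dict: the inner dict has to be created before it can be filled.
plain = {}
for outer, inner, value in records:
    if outer not in plain:
        plain[outer] = {}
    plain[outer][inner] = value

# defaultdict(dict): a missing outer key gets an empty dict automatically.
target_dict = defaultdict(dict)
for outer, inner, value in records:
    target_dict[outer][inner] = value

print(dict(target_dict) == plain)
```

Both loops produce the same mapping; the defaultdict version simply drops the check.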

— Junchen

3

from collections import defaultdict; target_dict = defaultdict(dict); target_dict['1']['2'] gives me KeyError: '2'

— 鹰头鹰嘴'17

1

You have to assign a value before you can get it.
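
A short sketch of why that happens: defaultdict(dict) only supplies a default at the outer level, and the inner object is an ordinary dict. The variable names raised, outer_keys, found and missing are just for illustration.

```python
from collections import defaultdict

target_dict = defaultdict(dict)

# Only the outer level has a default; the inner lookup hits a plain dict.
try:
    target_dict['1']['2']
    raised = False
except KeyError:
    raised = True

# The failed lookup still created the outer key '1'.
outer_keys = list(target_dict)

# After an assignment the lookup works; .get() reads without raising.
target_dict['1']['2'] = 'val'
found = target_dict['1']['2']
missing = target_dict['1'].get('3')

print(raised, outer_keys, found, missing)
```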

— Junchen

24

For arbitrary levels of nesting:

In [1]: import collections

In [2]: def nested_dict():
   ...:     return collections.defaultdict(nested_dict)
   ...:

In [3]: a = nested_dict()

In [4]: a
Out[4]: defaultdict(<function __main__.nested_dict>, {})

In [5]: a['a']['b']['c'] = 1

In [6]: a
Out[6]:
defaultdict(<function __main__.nested_dict>,
            {'a': defaultdict(<function __main__.nested_dict>,
                         {'b': defaultdict(<function __main__.nested_dict>,
                                      {'c': 1})})})

— Andrew

2

What the answer above does with a two-line function, you can also do with a one-line lambda, as shown in this answer.
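
For reference, a minimal sketch of the lambda form. The to_plain helper is a hypothetical addition, not part of either answer, included only to show the result as ordinary dicts.

```python
from collections import defaultdict

# One-line equivalent of the two-line nested_dict function.
nested_dict = lambda: defaultdict(nested_dict)

a = nested_dict()
a['a']['b']['c'] = 1


def to_plain(d):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(d, defaultdict):
        return {k: to_plain(v) for k, v in d.items()}
    return d


print(to_plain(a))
```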

— Acumenus

3

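It is important to remember that when using defaultdict and similar nested dict modules such as nested_dict, looking up a nonexistent key may inadvertently create a new key entry in the dict and cause a lot of havoc.

Here is a Python 3 example with the nested_dict module: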

import nested_dict as nd
nest = nd.nested_dict()
nest['outer1']['inner1'] = 'v11'
nest['outer1']['inner2'] = 'v12'
print('original nested dict: \n', nest)
try:
    nest['outer1']['wrong_key1']
except KeyError as e:
    print('exception missing key', e)
print('nested dict after lookup with missing key.  no exception raised:\n', nest)

# Instead, convert back to normal dict...
nest_d = nest.to_dict(nest)
try:
    print('converted to normal dict. Trying to lookup Wrong_key2')
    nest_d['outer1']['wrong_key2']
except KeyError as e:
    print('exception missing key', e)
else:
    print(' no exception raised:\n')

# ...or use dict.keys to check if key in nested dict
print('checking with dict.keys')
print(list(nest['outer1'].keys()))
if 'wrong_key3' in nest['outer1'].keys():
    print('found wrong_key3')
else:
    print(' did not find wrong_key3')

The output is:

original nested dict:   {"outer1": {"inner2": "v12", "inner1": "v11"}}

nested dict after lookup with missing key.  no exception raised:  
{"outer1": {"wrong_key1": {}, "inner2": "v12", "inner1": "v11"}} 

converted to normal dict. 
Trying to lookup Wrong_key2 

exception missing key 'wrong_key2' 

checking with dict.keys 

['wrong_key1', 'inner2', 'inner1']  
did not find wrong_key3
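
The same pitfall can be reproduced with only the standard library. The sketch below uses collections.defaultdict instead of the nested_dict module, and the key names are made up for the demonstration.

```python
from collections import defaultdict

nest = defaultdict(dict)
nest['outer1']['inner1'] = 'v11'

# Indexing a missing outer key silently inserts it.
_ = nest['typo']
after_index = sorted(nest)

# Membership tests and .get() do not insert anything.
has_other = 'other' in nest
got = nest.get('other')
after_safe = sorted(nest)

# Converting to a plain dict restores KeyError for missing keys.
plain = dict(nest)
try:
    plain['another']
    raised = False
except KeyError:
    raised = True

print(after_index, has_other, got, after_safe, raised)
```

Checking with in or .get() before indexing, or converting to a plain dict once the structure is built, avoids the accidental entries.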

— Skysail