Access nested dictionary items via a list of keys?


143

I have a complex dictionary structure which I would like to access via a list of keys to address the correct item.

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}    

maplist = ["a", "r"]

or

maplist = ["b", "v", "y"]

I have written the following code, which works, but I'm sure there is a better and more efficient way to do this, if anyone has an idea.

# Get a given data from a dictionary with position provided as a list
def getFromDict(dataDict, mapList):    
    for k in mapList: dataDict = dataDict[k]
    return dataDict

# Set a given data in a dictionary with position provided as a list
def setInDict(dataDict, mapList, value): 
    for k in mapList[:-1]: dataDict = dataDict[k]
    dataDict[mapList[-1]] = value

Answers:


230

Use reduce() to traverse the dictionary:

from functools import reduce  # forward compatibility for Python 3
import operator

def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)

and reuse getFromDict to find the location to store the value for setInDict():

def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

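All but the last element in mapList are needed to find the "parent" dictionary to add the value to; the last element is then used to set the value on the right key.

Demo: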

>>> getFromDict(dataDict, ["a", "r"])
1
>>> getFromDict(dataDict, ["b", "v", "y"])
2
>>> setInDict(dataDict, ["b", "v", "w"], 4)
>>> import pprint
>>> pprint.pprint(dataDict)
{'a': {'r': 1, 's': 2, 't': 3},
 'b': {'u': 1, 'v': {'w': 4, 'x': 1, 'y': 2, 'z': 3}, 'w': 3}}

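Note that the Python PEP8 style guide prescribes snake_case names for functions. The above works equally well for lists or a mix of dictionaries and lists, so the names should really be get_by_path() and set_by_path():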

from functools import reduce  # forward compatibility for Python 3
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value
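
To illustrate the point about mixed structures, here is a minimal sketch that applies the two functions above to a dictionary containing lists. The `config` data is a made-up sample, not part of the original question; integer path items index into lists, string items into dictionaries.

```python
from functools import reduce
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

# Hypothetical sample data mixing dictionaries and lists.
config = {"servers": [{"name": "alpha", "ports": [80, 443]},
                      {"name": "beta", "ports": [8080]}]}

# Integer items index into lists, string items into dictionaries.
first_port = get_by_path(config, ["servers", 0, "ports", 1])

# Replace the only port of the second server in place.
set_by_path(config, ["servers", 1, "ports", 0], 9090)
```

An empty path returns the root object itself, because reduce() falls back to its initial value when the sequence is empty.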

1
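How reliable is this traversal for arbitrarily nested structures? Does it also work for mixed dictionaries with nested lists? How do I modify getFromDict() to supply a default_value, with default_value defaulting to None? I am new to Python, with many years of PHP development and C development before that.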
Dmitriy Sintsov

2
A nested mapped set should also create non-existent nodes, imo: lists for integer keys, dictionaries for string keys.
Dmitriy Sintsov

1
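@user1353510: as it happens, regular indexing syntax is used here, so it will support lists inside dictionaries too. Just pass in integer indices for those.
Martijn Pieters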

1
@user1353510: for a default value, use try:, except (KeyError, IndexError): return default_value around the current return line.
Martijn Pieters
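
A minimal sketch of what this comment describes, assuming the default is passed as an extra `default_value` parameter (the parameter name comes from the question in the comments; the sample data is made up):

```python
from functools import reduce
import operator

def get_by_path(root, items, default_value=None):
    """Return the nested item, or default_value if a key or index is missing."""
    try:
        return reduce(operator.getitem, items, root)
    except (KeyError, IndexError):
        # KeyError: missing dictionary key; IndexError: list index out of range.
        return default_value

# Hypothetical sample data.
data = {"b": {"v": {"y": 2}}, "l": [10]}

found = get_by_path(data, ["b", "v", "y"])
missing_key = get_by_path(data, ["b", "nope"])
missing_index = get_by_path(data, ["l", 5], default_value=0)
```

Only the two exceptions named in the comment are caught here. Indexing into a value that is not subscriptable at all, such as an integer leaf, raises TypeError, which this sketch lets propagate.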

1
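@Georgy: using dict.get() changes the semantics, as that returns None rather than raising KeyError for missing names. Any subsequent names then trigger an AttributeError. operator is part of the standard library; there is no need to avoid it here.
Martijn Pieters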
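
The difference in semantics mentioned in the last comment can be seen in a small sketch (the sample dictionary is made up):

```python
data = {"a": {"r": 1}}

# Plain indexing reports the missing key at the point where it is missing.
try:
    data["missing"]["r"]
except KeyError as exc:
    key_err = exc

# dict.get() hides the miss by returning None; the failure only shows up
# one step later, when .get() is called on that None.
try:
    data.get("missing").get("r")
except AttributeError as exc:
    attr_err = exc
```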

40
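  1. The accepted solution won't work directly on Python 3; it needs a from functools import reduce.
  2. Also, it seems more pythonic to use a for loop. See the quote from "What's New In Python 3.0":

    Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.

  3. Next, the accepted solution doesn't set non-existing nested keys (it raises a KeyError); see @eafit's answer for a solution.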

So why not use the method suggested in kolergy's question for getting a value:

def getFromDict(dataDict, mapList):    
    for k in mapList: dataDict = dataDict[k]
    return dataDict

And the code from @eafit's answer for setting a value:

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

Both work straight away in Python 2 and 3.
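
As a quick illustration of point 3, this sketch uses the two functions above on a made-up dictionary in which the intermediate keys do not exist yet:

```python
def getFromDict(dataDict, mapList):
    for k in mapList:
        dataDict = dataDict[k]
    return dataDict

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        # setdefault() returns the existing child, or inserts and returns {}.
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

data = {"a": {"r": 1}}

# "b" and "v" do not exist yet; nested_set creates them on the way down.
nested_set(data, ["b", "v", "y"], 2)
value = getFromDict(data, ["b", "v", "y"])
```

Note that setdefault() only creates keys that are missing. If an intermediate key already holds a non-dict value, that value is returned as is and the next step fails.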


6
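I prefer this solution, but be careful. If I'm not mistaken, since Python dictionaries are not immutable, getFromDict has the potential to destroy the caller's dataDict. I would copy.deepcopy(dataDict) first. Of course, (as written) this behavior is desired in the second function.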
Dylan F
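
The concern in this comment can be checked with a small experiment. This is only a sketch on a made-up dictionary: the getter's loop rebinds its local name without assigning into any dictionary, while the setter ends with an item assignment.

```python
import copy

def getFromDict(dataDict, mapList):
    for k in mapList:
        dataDict = dataDict[k]      # rebinds the local name only
    return dataDict

def setInDict(dataDict, mapList, value):
    for k in mapList[:-1]:
        dataDict = dataDict[k]
    dataDict[mapList[-1]] = value   # item assignment mutates the nested dict

data = {"b": {"v": {"y": 2}}}
snapshot = copy.deepcopy(data)

result = getFromDict(data, ["b", "v", "y"])
unchanged_after_get = (data == snapshot)

setInDict(data, ["b", "v", "y"], 99)
changed_after_set = (data != snapshot)
```

In this run the getter leaves the caller's dictionary equal to the snapshot. One caveat: when the returned value is itself a dict or list, it is the same object that lives inside the original structure, so modifying the result afterwards would also modify the caller's data.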

15

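Using reduce is clever, but the OP's set method may have issues if the parent keys do not already exist in the nested dictionary. Since this is the first SO post I saw on this subject in my Google search, I would like to make it slightly better.

The set method in "Setting a value in a nested python dictionary given a list of indices and value" seems more robust to missing parent keys. To copy it over: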

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

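Also, it can be convenient to have a method that traverses the key tree and gets all the absolute key paths, for which I have created:

from functools import reduce  # reduce is not a builtin on Python 3

def keysInDict(dataDict, parent=[]):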
    if not isinstance(dataDict, dict):
        return [tuple(parent)]
    else:
        return reduce(list.__add__, 
            [keysInDict(v,parent+[k]) for k,v in dataDict.items()], [])

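One use of it is to convert the nested tree to a pandas DataFrame, using the following code (assuming that all leaves in the nested dictionary have the same depth).

import numpy as np
import pandas as pd

def dict_to_df(dataDict):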
    ret = []
    for k in keysInDict(dataDict):
        v = np.array( getFromDict(dataDict, k), )
        v = pd.DataFrame(v)
        v.columns = pd.MultiIndex.from_product(list(k) + [v.columns])
        ret.append(v)
    return reduce(pd.DataFrame.join, ret)
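
To show what keysInDict returns, here is a small sketch on a shortened version of the question's dictionary. Each leaf yields one tuple holding the full key path from the root.

```python
from functools import reduce

def keysInDict(dataDict, parent=[]):
    if not isinstance(dataDict, dict):
        # A leaf: report the path that led here.
        return [tuple(parent)]
    else:
        # Concatenate the path lists of all children.
        return reduce(list.__add__,
            [keysInDict(v, parent + [k]) for k, v in dataDict.items()], [])

dataDict = {"a": {"r": 1, "s": 2}, "b": {"u": 1, "v": {"x": 1, "y": 2}}}
paths = keysInDict(dataDict)
```

The order of the paths follows the iteration order of the dictionaries. An empty nested dictionary contributes no path at all, since it has no leaves.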

Why arbitrarily limit the length of the 'keys' argument to 2 or more for nested_set?
alancalvitti


3

How about using recursive functions?

To get a value:

def getFromDict(dataDict, maplist):
    first, rest = maplist[0], maplist[1:]

    if rest: 
        # if `rest` is not empty, run the function recursively
        return getFromDict(dataDict[first], rest)
    else:
        return dataDict[first]

And to set a value:

def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]

    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                # if the key is not a dict, then make it a dict
                dataDict[first] = {}
        except KeyError:
            # if key doesn't exist, create one
            dataDict[first] = {}

        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value
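
A short sketch of how the recursive setter behaves on a made-up dictionary. It creates missing intermediate dictionaries, and it also replaces an existing non-dict value when the path has to continue through it.

```python
def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]
    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                dataDict[first] = {}
        except KeyError:
            dataDict[first] = {}
        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value

data = {"a": {"r": 1}}
setInDict(data, ["b", "v", "y"], 2)   # creates "b" and "v"
setInDict(data, ["a", "r", "x"], 5)   # the integer at a -> r is replaced by a dict
```

The second call silently discards the old value 1, which may or may not be what a caller wants.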

2

Pure Python style, without any import:

def nested_set(element, value, *keys):
    if type(element) is not dict:
        raise AttributeError('nested_set() expects dict as first argument.')
    if len(keys) < 2:
        raise AttributeError('nested_set() expects at least three arguments, not enough given.')

    _keys = keys[:-1]
    _element = element
    for key in _keys:
        _element = _element[key]
    _element[keys[-1]] = value

example = {"foo": { "bar": { "baz": "ok" } } }
keys = ['foo', 'bar']
nested_set(example, "yay", *keys)
print(example)

Output:

{'foo': {'bar': 'yay'}}

2

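An alternative way, if you don't want an error to be raised when one of the keys is absent (so that your main code can run without interruption):

def get_value(self,your_dict,*keys):
    # written as a method; drop `self` to use it as a plain function
    curr_dict = your_dict
    v = None
    for k in keys:
        if not isinstance(curr_dict,dict):
            return None  # the path continues past a non-dict value
        v = curr_dict.get(k,None)
        if v is None:
            break
        curr_dict = v
    return v

In this case, if any of the input keys is not present, None is returned, which can be used as a check in your main code to perform an alternative task.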
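
A minimal sketch of this idea as a plain function (no `self` parameter, which is an adaptation made here), run on a made-up dictionary. It also shows one limitation: a key whose stored value is None cannot be told apart from a missing key.

```python
def get_value(your_dict, *keys):
    """Follow keys through nested dicts; return None as soon as a step fails."""
    curr = your_dict
    for k in keys:
        if not isinstance(curr, dict):
            return None        # the path continues past a non-dict value
        curr = curr.get(k)
        if curr is None:
            return None        # missing key (or a stored None)
    return curr

data = {"b": {"v": {"y": 2}}, "empty": None}

hit = get_value(data, "b", "v", "y")
miss = get_value(data, "b", "nope", "y")
too_deep = get_value(data, "b", "v", "y", "z")
stored_none = get_value(data, "empty")
```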


1

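Instead of taking a performance hit each time you want to look up a value, how about flattening the dictionary once and then simply looking up a key like b:v:y: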

def flatten(mydict):
  new_dict = {}
  for key,value in mydict.items():
    if type(value) == dict:
      _dict = {':'.join([key, _key]):_value for _key, _value in flatten(value).items()}
      new_dict.update(_dict)
    else:
      new_dict[key]=value
  return new_dict

dataDict = {
"a":{
    "r": 1,
    "s": 2,
    "t": 3
    },
"b":{
    "u": 1,
    "v": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "w": 3
    }
}    

flat_dict = flatten(dataDict)
print(flat_dict)
{'b:w': 3, 'b:u': 1, 'b:v:y': 2, 'b:v:x': 1, 'b:v:z': 3, 'a:r': 1, 'a:s': 2, 'a:t': 3}

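This way you can simply look up items using flat_dict['b:v:y'], which will give you 2.

And instead of traversing the dictionary on each lookup, you may be able to speed this up by flattening the dictionary once and saving the output, so that a lookup from a cold start means loading the flattened dictionary and simply performing a key/value lookup with no traversal.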
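
One possible way to save and reload the flattened dictionary is JSON; this is an assumption made for illustration, as the answer does not name a storage format. In the sketch a string stands in for a file on disk, and it relies on all keys being strings that do not contain ':' themselves.

```python
import json

def flatten(mydict):
    new_dict = {}
    for key, value in mydict.items():
        if type(value) == dict:
            _dict = {':'.join([key, _key]): _value
                     for _key, _value in flatten(value).items()}
            new_dict.update(_dict)
        else:
            new_dict[key] = value
    return new_dict

dataDict = {"a": {"r": 1}, "b": {"u": 1, "v": {"x": 1, "y": 2}}}

# Flatten once and serialize; the string stands in for a saved file.
saved = json.dumps(flatten(dataDict))

# Later, for example after a cold start: load and look up directly.
flat_dict = json.loads(saved)
value = flat_dict["b:v:y"]
```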


1

Solved it with recursion:

def get(d,l):
    if len(l)==1: return d[l[0]]
    return get(d[l[0]],l[1:])

Using your example:

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}
maplist1 = ["a", "r"]
maplist2 = ["b", "v", "y"]
print(get(dataDict, maplist1)) # 1
print(get(dataDict, maplist2)) # 2

1

How about checking and then setting a dict element without processing all the indexes twice?

Solution:

def nested_yield(nested, keys_list):
    """
    Get current nested data by send(None) method. Allows change it to Value by calling send(Value) next time
    :param nested: list or dict of lists or dicts
    :param keys_list: list of indexes/keys
    """
    if not len(keys_list):  # assign to 1st level list
        if isinstance(nested, list):
            while True:
                nested[:] = yield nested
        else:
            raise IndexError('Only lists can take element without key')


    last_key = keys_list.pop()
    for key in keys_list:
        nested = nested[key]

    while True:
        try:
            nested[last_key] = yield nested[last_key]
        except IndexError as e:
            print('no index {} in {}'.format(last_key, nested))
            yield None

Workflow example:

ny = nested_yield(nested_dict, nested_address)
data_element = ny.send(None)
if data_element:
    # process element
    ...
else:
    # extend/update nested data
    ny.send(new_data_element)
    ...
ny.close()

Test:

>>> cfg= {'Options': [[1,[0]],[2,[4,[8,16]]],[3,[9]]]}
>>> ny = nested_yield(cfg, ['Options',1,1,1])
>>> ny.send(None)
[8, 16]
>>> ny.send('Hello!')
'Hello!'
>>> cfg
{'Options': [[1, [0]], [2, [4, 'Hello!']], [3, [9]]]}
>>> ny.close()

1

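Very late to the party, but posting in case this helps someone in the future. For my use case, the following function worked best. It can pull any data type out of a dictionary.

dict is the dictionary containing our value

list is a list of "steps" towards our value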

def getnestedvalue(dict, list):

    length = len(list)
    try:
        for depth, key in enumerate(list):
            if depth == length - 1:
                output = dict[key]
                return output
            dict = dict[key]
    except (KeyError, TypeError):
        return None

    return None
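
A usage sketch of this function on a made-up record. The logic is the same as above; the parameters are renamed here (`data`, `steps`) only to avoid shadowing the built-in names dict and list.

```python
def getnestedvalue(data, steps):
    # Same logic as above, with parameters renamed to avoid shadowing builtins.
    length = len(steps)
    try:
        for depth, key in enumerate(steps):
            if depth == length - 1:
                return data[key]
            data = data[key]
    except (KeyError, TypeError):
        return None
    return None

# Hypothetical sample data.
record = {"user": {"tags": ["x", "y"], "age": 30}}

tag = getnestedvalue(record, ["user", "tags", 1])
missing = getnestedvalue(record, ["user", "email"])
too_deep = getnestedvalue(record, ["user", "age", "years"])
```

A list index that is out of range raises IndexError, which is not among the caught exceptions, so it still propagates to the caller.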

1

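It is satisfying to see these answers with two static methods for setting and getting nested attributes. These solutions are way better than using nested trees https://gist.github.com/hrldcpr/2012250

Here's my implementation.

Usage:

To set a nested attribute, call sattr(my_dict, 1, 2, 3, 5), which is equal to my_dict[1][2][3]=5

To get a nested attribute, call gattr(my_dict, 1, 2)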

def gattr(d, *attrs):
    """
    This method receives a dict and a list of attributes and returns the innermost value of the given dict
    """
    try:
        for at in attrs:
            d = d[at]
        return d
    except(KeyError, TypeError):
        return None


def sattr(d, *attrs):
    """
    Adds "val" to dict in the hierarchy mentioned via *attrs
    For ex:
    sattr(animals, "cat", "leg","fingers", 4) is equivalent to animals["cat"]["leg"]["fingers"]=4
    This method creates necessary objects until it reaches the final depth
    This behaviour is also known as autovivification and plenty of implementation are around
    This implementation addresses the corner case of replacing existing primitives
    https://gist.github.com/hrldcpr/2012250#gistcomment-1779319
    """
    for attr in attrs[:-2]:
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]
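
A usage sketch of the two functions, following the animals example from the docstring (the starting dictionary is made up). The last positional argument of sattr is the value; everything before it is the path.

```python
def gattr(d, *attrs):
    try:
        for at in attrs:
            d = d[at]
        return d
    except (KeyError, TypeError):
        return None

def sattr(d, *attrs):
    for attr in attrs[:-2]:
        # Replace anything that is not a dict (including a missing key).
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]

animals = {"cat": "meow"}

# "cat" currently holds a string; sattr replaces it with a dict on the way down.
sattr(animals, "cat", "leg", "fingers", 4)

fingers = gattr(animals, "cat", "leg", "fingers")
no_dog = gattr(animals, "dog", "leg")
```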

1

I suggest you use python-benedict to access nested items using keypaths.

Install it using pip:

pip install python-benedict

Then:

from benedict import benedict

dataDict = benedict({
    "a":{
        "r": 1,
        "s": 2,
        "t": 3,
    },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3,
        },
        "w": 3,
    },
}) 

print(dataDict['a.r'])
# or
print(dataDict['a', 'r'])

Here is the full documentation: https://github.com/fabiocaccamo/python-benedict


0

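If you also want to be able to work with arbitrary json, including nested lists and dicts, and to handle invalid lookup paths gracefully, here's my solution: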

from functools import reduce


def get_furthest(s, path):
    '''
    Gets the furthest value along a given key path in a subscriptable structure.

    subscriptable, list -> any
    :param s: the subscriptable structure to examine
    :param path: the lookup path to follow
    :return: a tuple of the value at the furthest valid key, and whether the full path is valid
    '''

    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)

    return reduce(step_key, path, (s, True))


def get_val(s, path):
    val, successful = get_furthest(s, path)
    if successful:
        return val
    else:
        raise LookupError('Invalid lookup path: {}'.format(path))


def set_val(s, path, value):
    get_val(s, path[:-1])[path[-1]] = value
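
A usage sketch of get_furthest on a made-up JSON-like document, showing the (value, success) tuple for a valid path and for a path that fails at its last key:

```python
from functools import reduce

def get_furthest(s, path):
    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)
    return reduce(step_key, path, (s, True))

# Hypothetical sample document.
doc = {"items": [{"id": 1, "name": "first"}, {"id": 2}]}

ok = get_furthest(doc, ["items", 0, "name"])
partial = get_furthest(doc, ["items", 1, "name"])
```

Here `partial` carries the last object that could be reached together with False. Only LookupError (KeyError and IndexError) is handled; a TypeError, for instance from applying a string key to a list, is not a LookupError and would propagate.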

0

A method that concatenates strings:

def get_sub_object_from_path(dict_name, map_list):
    for i in map_list:
        _string = "['%s']" % i
        dict_name += _string
    value = eval(dict_name)
    return value
#Sample:
_dict = {'new': 'person', 'time': {'for': 'one'}}
map_list = ['time', 'for']
print(get_sub_object_from_path("_dict",map_list))
#Output:
#one

0

Extending @DomTomCat's and others' approach, these functional (i.e., they return modified data via deepcopy without affecting the input) setter and mapper work for nested dict and list.

Setter:

from copy import deepcopy

def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            # recurse with set_at_path itself so the copied result is returned
            return {k:(set_at_path(v,keys[1:],value) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [set_at_path(x[1],keys[1:],value) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=value
        return data

Mapper:

def map_at_path(data0, keys, f):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(map_at_path(v,keys[1:],f) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [map_at_path(x[1],keys[1:],f) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=f(data[keys[-1]])
        return data
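
A short sketch of the functional behaviour described above, on a made-up structure. The setter is written here to recurse into itself, and the check is that the input stays untouched while the returned copy carries the change.

```python
from copy import deepcopy

def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys) > 1:
        if isinstance(data, dict):
            return {k: (set_at_path(v, keys[1:], value) if k == keys[0] else v)
                    for k, v in data.items()}
        if isinstance(data, list):
            return [set_at_path(x, keys[1:], value) if i == keys[0] else x
                    for i, x in enumerate(data)]
    else:
        data[keys[-1]] = value
        return data

original = {"b": {"v": [1, 2, 3]}}
updated = set_at_path(original, ["b", "v", 1], 20)
```

In this form, an intermediate key that is missing from a dictionary is not created; the comprehension simply returns a copy without the change.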

0

You can use the eval function in Python.

def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__':None}, {'nest':nest})

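Explanation

For your example query: maplist = ["b", "v", "y"]

nestq will be "nest['b']['v']['y']", where nest is the nested dictionary.

The eval builtin function executes the given string. However, it is important to be careful about possible vulnerabilities that arise from the use of eval. Discussion can be found here: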

  1. https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
  2. https://www.journaldev.com/22504/python-eval-function

In the nested_parse() function, I have made sure that no __builtins__ globals are available and that the only local variable available is the nest dictionary.
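
A small sketch of nested_parse in use, on a made-up dictionary. It also shows one practical limit of building the expression as text: a key that contains a quote character no longer produces a valid expression.

```python
def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__': None}, {'nest': nest})

dataDict = {"b": {"v": {"y": 2}}}
value = nested_parse(dataDict, ["b", "v", "y"])

# A key containing a single quote breaks the generated expression text.
try:
    nested_parse({"it's": 1}, ["it's"])
    quote_ok = True
except SyntaxError:
    quote_ok = False
```

Because the keys are joined as text, this approach only accepts string keys; integer list indices would need a different way of building the expression.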


Licensed under cc by-sa 3.0 with attribution required.