Python dictionary: get a list of values for a list of keys


179

Is there a built-in/quick way to use a list of keys to a dictionary to get a list of the corresponding items?

For instance, I have:

>>> mydict = {'one': 1, 'two': 2, 'three': 3}
>>> mykeys = ['three', 'one']

How can I use mykeys to get the corresponding values in the dictionary as a list?

>>> mydict.WHAT_GOES_HERE(mykeys)
[3, 1]

Answers:



107

Besides a list comprehension, there are a couple of other ways:

  • Build a list and raise an exception if a key is not found: map(mydict.__getitem__, mykeys)
  • Build a list with None for any key that is not found: map(mydict.get, mykeys)

Alternatively, operator.itemgetter can return a tuple:

from operator import itemgetter
myvalues = itemgetter(*mykeys)(mydict)
# use `list(...)` if list is required

Note: in Python 3, map returns an iterator rather than a list. Use list(map(...)) to get a list.
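As a minimal sketch of the options above on Python 3, reusing mydict and mykeys from the question (the extra key 'four' and the variable names are only illustrative):

```python
from operator import itemgetter

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']

# Strict lookup: raises KeyError if any key is missing.
strict = list(map(mydict.__getitem__, mykeys))

# Lenient lookup: a missing key ('four' here) becomes None.
lenient = list(map(mydict.get, mykeys + ['four']))

# itemgetter returns a tuple when given two or more keys...
as_tuple = itemgetter(*mykeys)(mydict)

# ...but a bare value when given exactly one key.
single = itemgetter('one')(mydict)

print(strict)    # [3, 1]
print(lenient)   # [3, 1, None]
print(as_tuple)  # (3, 1)
print(single)    # 1
```

Because of the single-key case, wrapping the itemgetter result in list(...) is only safe when the key list is known to hold at least two keys.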


54

A comparison of speed:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec  7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[1]: l = [0,1,2,3,2,3,1,2,0]
In[2]: m = {0:10, 1:11, 2:12, 3:13}
In[3]: %timeit [m[_] for _ in l]  # list comprehension
1000000 loops, best of 3: 762 ns per loop
In[4]: %timeit map(lambda _: m[_], l)  # using 'map'
1000000 loops, best of 3: 1.66 µs per loop
In[5]: %timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
1000000 loops, best of 3: 1.65 µs per loop
In[6]: %timeit map(m.__getitem__, l)
The slowest run took 4.01 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 853 ns per loop
In[7]: %timeit map(m.get, l)
1000000 loops, best of 3: 908 ns per loop
In[33]: from operator import itemgetter
In[34]: %timeit list(itemgetter(*l)(m))
The slowest run took 9.26 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 739 ns per loop

So the list comprehension and itemgetter are the fastest ways to do this.

Update: for large random lists and maps I got somewhat different results:

Python 2.7.11 |Anaconda 2.4.1 (64-bit)| (default, Dec  7 2015, 14:10:42) [MSC v.1500 64 bit (AMD64)] on win32
In[2]: import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)
%timeit f(m)
%timeit list(itemgetter(*l)(m))
%timeit [m[_] for _ in l]  # list comprehension
%timeit map(m.__getitem__, l)
%timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
%timeit map(m.get, l)
%timeit map(lambda _: m[_], l)
1000 loops, best of 3: 1.14 ms per loop
1000 loops, best of 3: 1.68 ms per loop
100 loops, best of 3: 2 ms per loop
100 loops, best of 3: 2.05 ms per loop
100 loops, best of 3: 2.19 ms per loop
100 loops, best of 3: 2.53 ms per loop
100 loops, best of 3: 2.9 ms per loop

So in this case the clear winner is f = operator.itemgetter(*l); f(m), and the clear outsider is map(lambda _: m[_], l).

Update for Python 3.6.4:

import numpy.random as nprnd
l = nprnd.randint(1000, size=10000)
m = dict([(_, nprnd.rand()) for _ in range(1000)])
from operator import itemgetter
import operator
f = operator.itemgetter(*l)
%timeit f(m)
%timeit list(itemgetter(*l)(m))
%timeit [m[_] for _ in l]  # list comprehension
%timeit list(map(m.__getitem__, l))
%timeit list(m[_] for _ in l)  # a generator expression passed to a list constructor.
%timeit list(map(m.get, l))
%timeit list(map(lambda _: m[_], l))
1.66 ms ± 74.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.1 ms ± 93.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.58 ms ± 88.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.36 ms ± 60.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.98 ms ± 142 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.7 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.14 ms ± 62.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So the results for Python 3.6.4 are almost the same.
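The measurements above rely on IPython's %timeit and NumPy. For readers without either, a rough standard-library sketch of the same comparison might look like this; the seed, the repeat count of 100 and the candidates dictionary are assumptions of this sketch, and the absolute numbers will differ between machines and Python versions:

```python
import random
import timeit
from operator import itemgetter

random.seed(0)  # fixed seed so repeated runs use the same data
m = {k: random.random() for k in range(1000)}
l = [random.randrange(1000) for _ in range(10000)]

# Each candidate takes the dict and returns the values for the keys in l.
candidates = {
    'itemgetter (prebuilt)': itemgetter(*l),
    'list comprehension': lambda d: [d[k] for k in l],
    'map __getitem__': lambda d: list(map(d.__getitem__, l)),
    'map get': lambda d: list(map(d.get, l)),
}

expected = [m[k] for k in l]
for name, func in candidates.items():
    # Check that the candidate is correct before timing it.
    assert list(func(m)) == expected
    seconds = timeit.timeit(lambda: func(m), number=100)
    print('{:<22} {:.3f} ms per call'.format(name, seconds / 100 * 1000))
```

Note that the prebuilt itemgetter excludes the cost of constructing it, which matches the f(m) measurement above rather than the list(itemgetter(*l)(m)) one.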


15

Here are three ways.

Raising KeyError when a key is not found:

result = [mapping[k] for k in iterable]

Default values for missing keys:

result = [mapping.get(k, default_value) for k in iterable]

Skipping missing keys:

result = [mapping[k] for k in iterable if k in mapping]
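To make the difference concrete, here is a small sketch that runs all three variants against the same input. The sample data, the missing key 'four' and the default of 0 are made up for illustration:

```python
mapping = {'one': 1, 'two': 2, 'three': 3}
iterable = ['three', 'four', 'one']  # 'four' is deliberately missing
default_value = 0

# Strict: fails on the first missing key.
try:
    strict = [mapping[k] for k in iterable]
except KeyError as exc:
    strict = 'KeyError: {}'.format(exc)

# Default: missing keys are replaced by default_value.
with_default = [mapping.get(k, default_value) for k in iterable]

# Skip: missing keys are dropped from the result.
skipped = [mapping[k] for k in iterable if k in mapping]

print(strict)        # KeyError: 'four'
print(with_default)  # [3, 0, 1]
print(skipped)       # [3, 1]
```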

found_keys = mapping.keys() & iterable gives TypeError: unsupported operand type(s) for &: 'list' and 'list' on Python 2.7; `found_keys = [key for key in mapping.keys() if key in iterable]` works best
NotGaeL
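On Python 3 the intersection mentioned in this comment does work, because keys() returns a set-like view instead of a list. A brief sketch, with sample data assumed for illustration:

```python
mapping = {'one': 1, 'two': 2, 'three': 3}
iterable = ['three', 'four', 'one']

# Python 3: keys() is a set-like view, and & accepts any iterable on the right.
found_keys = mapping.keys() & iterable  # a set, so order is not preserved

# Order-preserving alternative that also works on Python 2.7.
found_in_order = [k for k in iterable if k in mapping]

print(sorted(found_keys))  # ['one', 'three']
print(found_in_order)      # ['three', 'one']
```

The list-based form walks the requested keys rather than the whole mapping, so it also keeps the order of iterable.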


7

Try this:

mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one'] # if there are many keys, use a set

[mydict[k] for k in mykeys]
=> [3, 1]

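@PeterDeGlopper, you're confused. items() is preferred, it doesn't have to make an additional lookup, and there's no len(mydict)*len(mykeys) operation here! (notice that I'm using a set)
Óscar López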

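@ÓscarLópez Yes, you are inspecting every element of the dictionary. iteritems doesn't yield them until they're needed, so it avoids constructing an intermediate list, but you still run "k in mykeys" (order len(mykeys), since it's a list) for every k in mydict. Completely unnecessary, compared to the simple list comprehension that just runs over mykeys.
Peter DeGlopper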

@inspectorG4dget @PeterDeGlopper the membership test on mykeys takes constant time; I'm using a set, not a list
Óscar López

2
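Converting the OP's list to a set at least makes it linear, but it's still linear on the wrong data structure, and it loses the order. Consider the case of a 10k dictionary and 2 keys in mykeys. Your solution makes 10k set membership tests, compared to two dictionary lookups for the simple list comprehension. In general it seems safe to assume that the number of keys will be smaller than the number of dictionary elements, and if it's not, your approach will omit repeated elements.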
Peter DeGlopper
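The trade-off discussed in these comments can be sketched as follows. The scanning variant is a reconstruction of the approach being debated (it is not shown in the answer text above), and the repeated key in mykeys is added only to expose the difference:

```python
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one', 'three']  # repeated key, and order matters

# Scans the whole dict: len(mydict) set membership tests.
# Results follow dict order (insertion order on Python 3.7+) and drop repeats.
keyset = set(mykeys)
by_scanning = [v for k, v in mydict.items() if k in keyset]

# Looks up only the requested keys: len(mykeys) dict lookups.
# Results follow the order of mykeys and keep repeats.
by_lookup = [mydict[k] for k in mykeys]

print(by_scanning)  # [1, 3] on Python 3.7+
print(by_lookup)    # [3, 1, 3]
```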


1

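Pandas does this very elegantly, though of course list comprehensions will always be more technically Pythonic. I don't have time to put in a speed comparison right now (I'll come back later and put it in):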

import pandas as pd
mydict = {'one': 1, 'two': 2, 'three': 3}
mykeys = ['three', 'one']
temp_df = pd.DataFrame([mydict])  # one-row DataFrame; columns are the dict keys
# You can export DataFrames to a number of formats, using a NumPy array here.
temp_df[mykeys].values[0]
# Returns: array([3, 1])

# If you want a dict then use this instead:
# temp_df[mykeys].to_dict(orient='records')[0]
# Returns: {'three': 3, 'one': 1}

-1

Or just mydict.keys(). That's a built-in method call for dictionaries. Also explore mydict.values() and mydict.items().

//Ah, the OP's post confused me.


5
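The built-in methods are useful, but they don't give a list of the corresponding items for a given list of keys. This answer is not a correct answer to this particular question.
stenix 2015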

-1

Following the closure of "Python: efficient way to create a list from dict values with a given order"

Retrieving the values for the keys without building a list:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

try:
    from collections.abc import Sequence  # Python 3.3+
except ImportError:
    from collections import Sequence  # Python 2


class DictListProxy(Sequence):
    def __init__(self, klist, kdict, *args, **kwargs):
        super(DictListProxy, self).__init__(*args, **kwargs)
        self.klist = klist
        self.kdict = kdict

    def __len__(self):
        return len(self.klist)

    def __getitem__(self, key):
        return self.kdict[self.klist[key]]


myDict = {'age': 'value1', 'size': 'value2', 'weigth': 'value3'}
order_list = ['age', 'weigth', 'size']

dlp = DictListProxy(order_list, myDict)

print(','.join(dlp))
print()
print(dlp[1])

Output:

value1,value3,value2

value3

Which matches the order given by the list.


-2
from functools import reduce  # required on Python 3; reduce is a builtin on Python 2
reduce(lambda x,y: mydict.get(y) and x.append(mydict[y]) or x, mykeys,[])

In case there are keys that are not in the dict.
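One caveat worth checking before using this one-liner: it tests the truthiness of mydict.get(y), so a key that is present but maps to a falsy value (0, '', None) is skipped just like a missing key. A small sketch with made-up data, next to a membership-based alternative:

```python
from functools import reduce  # reduce is not a builtin on Python 3

mydict = {'zero': 0, 'one': 1, 'three': 3}
mykeys = ['three', 'zero', 'missing', 'one']

# The reduce one-liner treats falsy values like missing keys.
via_reduce = reduce(
    lambda x, y: mydict.get(y) and x.append(mydict[y]) or x, mykeys, [])

# A membership test keeps falsy values and skips only absent keys.
via_membership = [mydict[k] for k in mykeys if k in mydict]

print(via_reduce)      # [3, 1]
print(via_membership)  # [3, 0, 1]
```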

Licensed under cc by-sa 3.0 with attribution required.