What is the fastest (to access) struct-like object in Python?

Question 1

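I'm optimizing some code whose main bottleneck is running through and accessing a very large list of struct-like objects. Currently I'm using namedtuples for readability. But some quick benchmarking with 'timeit' shows that this is really the wrong way to go where performance is a factor: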

Named tuple with a, b, c:

>>> timeit("z = a.c", "from __main__ import a")
0.38655471766332994

Class using __slots__, with a, b, c:

>>> timeit("z = b.c", "from __main__ import b")
0.14527461047146062

Dictionary with keys a, b, c:

>>> timeit("z = c['c']", "from __main__ import c")
0.11588272541098377

Tuple with three values, using a constant key:

>>> timeit("z = d[2]", "from __main__ import d")
0.11106188992948773

List with three values, using a constant key:

>>> timeit("z = e[2]", "from __main__ import e")
0.086038238242508669

Tuple with three values, using a local key:

>>> timeit("z = d[key]", "from __main__ import d, key")
0.11187358437882722

List with three values, using a local key:

>>> timeit("z = e[key]", "from __main__ import e, key")
0.088604143037173344

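First of all, is there anything about these little timeit tests that would render them invalid? I ran each one several times to make sure no random system event had thrown them off, and the results were almost identical.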

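It would appear that dictionaries offer the best balance between performance and readability, with classes coming in second. This is unfortunate, since for my purposes I also need the object to be sequence-like; hence my choice of namedtuple.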

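Lists are substantially faster, but constant keys are unmaintainable; I'd have to create a bunch of index constants, i.e. KEY_1 = 1, KEY_2 = 2, etc., which is also not ideal.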

Am I stuck with these choices, or is there an alternative that I've missed?

Question 2

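One thing to bear in mind is that namedtuples are optimised for access as tuples. If you change your accessor to a[2] instead of a.c, you'll see performance similar to plain tuples. The reason is that the name accessors effectively translate into calls to self[idx], so they pay both the indexing price and the name lookup price.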
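A minimal sketch for checking this on your own interpreter. The `Record` type and the `rec` instance are hypothetical names chosen for illustration, and the `globals=` argument to `timeit` assumes Python 3.5 or later. The absolute numbers will vary by machine and Python version, and on newer CPython releases the gap may be much smaller than in the timings quoted here.

```python
from collections import namedtuple
from timeit import timeit

# Hypothetical three-field record, mirroring the a/b/c layout in the question.
Record = namedtuple("Record", "a b c")
rec = Record(1, 2, 3)

# Both expressions return the same value; only the lookup path differs.
by_name = timeit("rec.c", globals={"rec": rec}, number=100000)
by_index = timeit("rec[2]", globals={"rec": rec}, number=100000)

print("rec.c  : %.4f s" % by_name)
print("rec[2] : %.4f s" % by_index)
```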

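If your usage pattern is such that access by name is common but access as a tuple isn't, you could write a quick equivalent to namedtuple that does things the opposite way: it defers index lookups to access by name. However, you'll then pay the price on the index lookups. For example, here's a quick implementation: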

import textwrap

def makestruct(name, fields):
    # Build a __slots__ class whose index access defers to attribute lookup.
    fields = fields.split()
    template = textwrap.dedent("""\
    class {name}(object):
        __slots__ = {fields!r}
        def __init__(self, {args}):
            {self_fields} = {args}
        def __getitem__(self, idx):
            return getattr(self, fields[idx])
    """).format(
        name=name,
        fields=fields,
        args=','.join(fields),
        self_fields=','.join('self.' + f for f in fields))
    # The generated class looks up `fields` in this namespace at call time.
    d = {'fields': fields}
    # exec(code, namespace) is accepted by both Python 2 and Python 3.
    exec(template, d)
    return d[name]

But the timings are very bad when __getitem__ must be called:

namedtuple.a  :  0.473686933517 
namedtuple[0] :  0.180409193039
struct.a      :  0.180846214294
struct[0]     :  1.32191514969

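That is, the same performance as a __slots__ class for attribute access (unsurprisingly, since that's what it is), but a huge penalty for index-based access because of the double lookup. (It's worth noting that __slots__ doesn't actually help much speed-wise. It saves memory, but the access time is about the same without it.)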

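A third option would be to duplicate the data, e.g. subclass list and store the values both in attributes and in the list data. However, you don't actually get list-equivalent performance. There's a big speed hit just from having subclassed (it brings in checks for pure-Python overloads). So struct[0] still takes around 0.5s in this case (compared with 0.18 for a raw list), and you double the memory usage, so this may not be worth it.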
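A rough sketch of what that duplicated-data approach might look like. The class name `ListStruct` and the fields a, b, c are illustrative assumptions, not the code that was actually timed above.

```python
class ListStruct(list):
    """Hypothetical record that keeps each value twice:
    once in the underlying list and once as an attribute."""

    __slots__ = ("a", "b", "c")

    def __init__(self, a, b, c):
        # Store the values in the list part for index access...
        list.__init__(self, (a, b, c))
        # ...and again as attributes for access by name.
        self.a, self.b, self.c = a, b, c


s = ListStruct(1, 2, 3)
print(s[2], s.c)
```

Note that the two copies are independent: assigning to `s[2]` or `s.c` afterwards does not update the other one, which is another cost of this design besides the memory.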

Question 3

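This question is fairly old (in internet time), so I thought I'd try duplicating your test today, with both regular CPython (2.7.6) and PyPy (2.2.1), to see how the various methods compare. (I also added an indexed lookup for the named tuple.)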

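This is a bit of a micro-benchmark, so YMMV, but PyPy seemed to speed up named tuple access by a factor of 30 compared with CPython (whereas dictionary access was only sped up by a factor of 3).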

from collections import namedtuple

STest = namedtuple("TEST", "a b c")
a = STest(a=1,b=2,c=3)

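class Test(object):
    __slots__ = ["a","b","c"]

    def __init__(self):
        # Assign per instance so the slots are actually used. On Python 3,
        # class-level a=1 conflicts with __slots__ and raises ValueError.
        self.a = 1
        self.b = 2
        self.c = 3

b = Test()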

c = {'a':1, 'b':2, 'c':3}

d = (1,2,3)
e = [1,2,3]
f = (1,2,3)
g = [1,2,3]
key = 2

if __name__ == '__main__':
    from timeit import timeit

    print("Named tuple with a, b, c:")
    print(timeit("z = a.c", "from __main__ import a"))

    print("Named tuple, using index:")
    print(timeit("z = a[2]", "from __main__ import a"))

    print("Class using __slots__, with a, b, c:")
    print(timeit("z = b.c", "from __main__ import b"))

    print("Dictionary with keys a, b, c:")
    print(timeit("z = c['c']", "from __main__ import c"))

    print("Tuple with three values, using a constant key:")    
    print(timeit("z = d[2]", "from __main__ import d"))

    print("List with three values, using a constant key:")
    print(timeit("z = e[2]", "from __main__ import e"))

    print("Tuple with three values, using a local key:")
    print(timeit("z = d[key]", "from __main__ import d, key"))

    print("List with three values, using a local key:")
    print(timeit("z = e[key]", "from __main__ import e, key"))

CPython results:

Named tuple with a, b, c:
0.124072679784
Named tuple, using index:
0.0447055962367
Class using __slots__, with a, b, c:
0.0409136944224
Dictionary with keys a, b, c:
0.0412045334915
Tuple with three values, using a constant key:
0.0449477955531
List with three values, using a constant key:
0.0331083467148
Tuple with three values, using a local key:
0.0453569025139
List with three values, using a local key:
0.033030056702

PyPy results:

Named tuple with a, b, c:
0.00444889068604
Named tuple, using index:
0.00265598297119
Class using __slots__, with a, b, c:
0.00208616256714
Dictionary with keys a, b, c:
0.013897895813
Tuple with three values, using a constant key:
0.00275301933289
List with three values, using a constant key:
0.002760887146
Tuple with three values, using a local key:
0.002769947052
List with three values, using a local key:
0.00278806686401

Question 4

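This question may be obsolete soon. The CPython developers have evidently made significant improvements to the performance of accessing namedtuple values by attribute name. The changes are scheduled for release in Python 3.8, near the end of October 2019.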

See: https://bugs.python.org/issue32492 and https://github.com/python/cpython/pull/10495.

Question 5

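A few points and ideas:

You're timing access to the same index many times in a row. Your actual program probably uses random or linear access, which will behave differently. In particular, there will be more CPU cache misses. You might get slightly different results with your actual program.
OrderedDictionary is written as a wrapper around dict, so it will be slower than dict. That's a non-solution.
Have you tried both new-style and old-style classes? (New-style classes inherit from object; old-style classes don't.)
Have you tried using psyco or Unladen Swallow? (2020 update: both projects are dead.)
Does your inner loop modify the data or just access it? It might be possible to transform the data into the most efficient form before entering the loop, while using the most convenient form elsewhere in the program.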
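The last point above can be sketched as follows, assuming the inner loop only reads a single field. The `Record` type, the `records` list, and the choice of field `c` are made up for illustration.

```python
from collections import namedtuple

# Convenient form used throughout the rest of the (hypothetical) program.
Record = namedtuple("Record", "a b c")
records = [Record(i, i * 2, i * 3) for i in range(1000)]

# Convert once, outside the hot loop: pull the needed field into a plain list.
c_values = [r.c for r in records]

# The inner loop now iterates over plain values with no attribute lookups.
total = 0
for value in c_values:
    total += value

print(total)
```

Whether the one-off conversion pays for itself depends on how many times the loop runs over the data, so it is worth timing against your real workload.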

Question 6

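I would be tempted to either (a) invent some kind of workload-specific caching and offload storage and retrieval of the data to a memcachedb-like process, to improve scalability rather than performance alone, or (b) rewrite it as a C extension with native data storage. An ordered-dictionary type, perhaps.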

You could start with this: http://www.xs4all.nl/~anthon/Python/ordereddict/

Question 7

You can make your classes sequence-like (indexable and iterable) by adding __iter__ and __getitem__ methods.
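A minimal sketch of that idea, assuming a __slots__ class with fields a, b, c. The class name `Point3` and the added `__len__` are illustrative choices. As other answers in this thread point out, index access through `__getitem__` written in Python is likely to be noticeably slower than attribute access.

```python
class Point3(object):
    """Hypothetical slotted class that also behaves like a sequence."""

    __slots__ = ("a", "b", "c")

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __getitem__(self, idx):
        # Map the position to a field name, then do a normal attribute lookup.
        return getattr(self, self.__slots__[idx])

    def __iter__(self):
        # Yield the values in field order so unpacking and list() work.
        for name in self.__slots__:
            yield getattr(self, name)

    def __len__(self):
        return len(self.__slots__)


p = Point3(1, 2, 3)
x, y, z = p          # iteration enables unpacking
print(p.c, p[2], list(p))
```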

Would an OrderedDict work? There are several implementations available, and one is included in the collections module as of Python 3.1.
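For reference, a small sketch using the standard-library `collections.OrderedDict`. The variable names are illustrative. It gives access by key and a stable field order, but not direct positional indexing, so this may or may not fit the sequence-like requirement in the question.

```python
from collections import OrderedDict

# Fields keep their insertion order: a, b, c.
rec = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

by_name = rec["c"]              # access by key
in_order = list(rec.values())   # values in field order

print(by_name, in_order)
```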