Are list comprehensions and functional functions faster than "for loops"?

155

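In terms of performance in Python, is a list comprehension, or a function like map(), filter() and reduce(), faster than a for loop? Why, technically, do they run at C speed, while the for loop runs at Python virtual machine speed?

Suppose that in a game I'm developing I need to draw complex and huge maps using for loops. This question would definitely be relevant: if a list comprehension, for example, is indeed faster, it would be a much better option for avoiding lag (despite the visual complexity of the code).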

146

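The following are rough guidelines and educated guesses based on experience. You should timeit or profile your concrete use case to get hard numbers, and those numbers may occasionally disagree with what follows.

A list comprehension is usually a tiny bit faster than the precisely equivalent for loop (one that actually builds a list), most likely because it doesn't have to look up the list and its append method on every iteration. However, a list comprehension still does a bytecode-level loop: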

>>> dis.dis(<the code object for `[x for x in range(10)]`>)
 1           0 BUILD_LIST               0
             3 LOAD_FAST                0 (.0)
       >>    6 FOR_ITER                12 (to 21)
             9 STORE_FAST               1 (x)
            12 LOAD_FAST                1 (x)
            15 LIST_APPEND              2
            18 JUMP_ABSOLUTE            6
       >>   21 RETURN_VALUE
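As a minimal sketch of how one might measure this with timeit: the function names and the input size below are illustrative assumptions, and the numbers printed will vary with the machine and the Python version.

```python
import timeit

def build_with_loop(n):
    # Explicit loop: looks up result.append on every iteration.
    result = []
    for x in range(n):
        result.append(x * 2)
    return result

def build_with_comprehension(n):
    # Comprehension: appends through a dedicated bytecode instruction.
    return [x * 2 for x in range(n)]

if __name__ == "__main__":
    for func in (build_with_loop, build_with_comprehension):
        # Take the minimum of several repeats to reduce timing noise.
        best = min(timeit.repeat(lambda: func(1000), number=200, repeat=3))
        print(f"{func.__name__}: {best:.4f} s for 200 calls")
```

Both functions return the same list, so any difference in the timings comes from how the list is built rather than from the work done per element.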

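Using a list comprehension in place of a loop that doesn't build a list, nonsensically accumulating a list of meaningless values and then throwing the list away, is often slower because of the overhead of creating and extending the list. List comprehensions aren't magic that is inherently faster than a good old loop.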
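To illustrate the anti-pattern described here, a small sketch follows. draw_tile is a hypothetical stand-in for any call made only for its side effect; it is not taken from the question.

```python
def draw_tile(tile, canvas):
    # Hypothetical drawing call: works by side effect and returns None.
    canvas.append(tile.upper())

tiles = ["grass", "water", "rock"]

# Anti-pattern: the comprehension builds a throwaway list of None values.
canvas_a = []
discarded = [draw_tile(t, canvas_a) for t in tiles]

# Plain loop: same side effects, no extra list to allocate and discard.
canvas_b = []
for t in tiles:
    draw_tile(t, canvas_b)

print(discarded)             # [None, None, None]
print(canvas_a == canvas_b)  # True
```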

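As for the functional list-processing functions: while these are written in C and probably outperform equivalent functions written in Python, they are not necessarily the fastest option. Some speedup is expected if the function being applied is written in C too. But in most cases, when a lambda (or other Python function) is used, the overhead of repeatedly setting up Python stack frames and so on eats up any savings. Simply doing the same work inline, without function calls (for example, a list comprehension instead of map or filter), is often slightly faster.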
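The three cases in this paragraph can be compared with a sketch like the following. The data and function names are assumptions chosen for illustration, and no particular ranking is guaranteed on a given interpreter.

```python
import timeit

words = ["alpha", "beta", "gamma"] * 100

def with_map_builtin():
    # map applying a function implemented in C (len).
    return list(map(len, words))

def with_map_lambda():
    # map applying a Python-level lambda: one Python call per item.
    return list(map(lambda w: len(w) + 1, words))

def with_comprehension():
    # Same work as the lambda version, done inline without a call.
    return [len(w) + 1 for w in words]

if __name__ == "__main__":
    for func in (with_map_builtin, with_map_lambda, with_comprehension):
        best = min(timeit.repeat(func, number=200, repeat=3))
        print(f"{func.__name__}: {best:.4f} s for 200 calls")
```

Note that with_map_builtin does slightly different work (no + 1); it is included only to show the case where the mapped function is itself written in C.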

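"Suppose that in a game I'm developing I need to draw complex and huge maps using for loops. This question would definitely be relevant: if a list comprehension, for example, is indeed faster, it would be a much better option for avoiding lag (despite the visual complexity of the code)."

Chances are, if code like this isn't already fast enough when written in good, non-"optimized" Python, no amount of Python-level micro-optimization is going to make it fast enough, and you should start thinking about dropping down to C. While extensive micro-optimizations can often speed up Python code considerably, there is a low (in absolute terms) limit to this. Moreover, even before you hit that ceiling, it becomes simply more cost-efficient (a 15% speedup versus a 300% speedup for the same effort) to bite the bullet and write some C.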

25

If you check the information on python.org, you can see this summary:

Version                     Time (seconds)
Basic loop                  3.47
Eliminate dots              2.45
Local variable & no dots    1.79
Using map function          0.54

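But you really should read the above article in detail to understand the cause of the performance difference.

I also strongly suggest timing your code with timeit. At the end of the day, there can be situations where, for example, you need to break out of the for loop when a condition is met. That could potentially be faster than finding the result by calling map.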
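A sketch of the early-exit case mentioned above. is_blocked and the cell data are hypothetical; the counters only record how often the predicate runs.

```python
def is_blocked(cell, counter):
    # Hypothetical predicate; counter[0] records how many times it was called.
    counter[0] += 1
    return cell == "wall"

cells = ["floor", "floor", "wall"] + ["floor"] * 1000

# for loop: stops at the first match.
loop_calls = [0]
found_loop = False
for cell in cells:
    if is_blocked(cell, loop_calls):
        found_loop = True
        break

# map forced into a list: the predicate runs for every cell.
map_calls = [0]
found_map = True in list(map(lambda c: is_blocked(c, map_calls), cells))

print(found_loop, loop_calls[0])  # True 3
print(found_map, map_calls[0])    # True 1003
```

In Python 3, map is lazy, so wrapping it in any() instead of list() would also stop at the first match; the comparison above applies when the whole result is materialized first.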

— Anthony Kong
source

17

While that page is a good read and partly related, just quoting those numbers is not helpful, and possibly even misleading.

1

This gives no indication of what you're timing. Relative performance will vary greatly depending on what's in the loop/listcomp/map.

— user2357112 supports Monica, 2014

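@delnan I agree. I have modified my answer to urge the OP to read the documentation to understand the difference in performance.

— Anthony Kong

@user2357112 You have to read the wiki page I linked for context. I posted it for the OP's reference.

— Anthony Kong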

13

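You ask specifically about map(), filter() and reduce(), but I assume you want to know about functional programming in general. Having tested this myself on the problem of computing the distances between all points in a set of points, functional programming (using the starmap function from the built-in itertools module) turned out to be slightly slower than for loops (taking 1.25 times as long, in fact). Here is the sample code I used: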

import itertools, time, math, random

class Point:
    def __init__(self,x,y):
        self.x, self.y = x, y

point_set = (Point(0, 0), Point(0, 1), Point(0, 2), Point(0, 3))
n_points = 100
pick_val = lambda : 10 * random.random() - 5
large_set = [Point(pick_val(), pick_val()) for _ in range(n_points)]
# the distance function
f_dist = lambda x0, x1, y0, y1: math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
# go through each point, get its distance from all remaining points
f_pos = lambda p1, p2: (p1.x, p2.x, p1.y, p2.y)

extract_dists = lambda x: itertools.starmap(f_dist, 
                          itertools.starmap(f_pos, 
                          itertools.combinations(x, 2)))

print('Distances:', list(extract_dists(point_set)))

t0_f = time.time()
list(extract_dists(large_set))
dt_f = time.time() - t0_f

Is the functional version faster than the procedural version?

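def extract_dists_procedural(pts):
    n_pts = len(pts)
    l = []
    for k_p1 in range(n_pts - 1):
        # start at k_p1 + 1 so each unordered pair is visited once,
        # matching itertools.combinations(pts, 2) in the functional version
        for k_p2 in range(k_p1 + 1, n_pts):
            # take the square root so both versions compute the same value
            l.append(math.sqrt((pts[k_p1].x - pts[k_p2].x) ** 2 +
                               (pts[k_p1].y - pts[k_p2].y) ** 2))
    return l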

t0_p = time.time()
list(extract_dists_procedural(large_set)) 
    # using list() on the assumption that
    # it eats up as much time as in the functional version

dt_p = time.time() - t0_p

f_vs_p = dt_p / dt_f
if f_vs_p >= 1.0:
    print('Time benefit of functional programming:', f_vs_p, 
          'times as fast for', n_points, 'points')
else:
    print('Time penalty of functional programming:', 1 / f_vs_p, 
          'times as slow for', n_points, 'points')
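Before comparing timings, it helps to confirm that both versions compute the same values. The following is a minimal sketch of such a check; it assumes points are plain (x, y) tuples rather than Point objects, and the helper names are illustrative.

```python
import itertools
import math

def dist(p, q):
    # Euclidean distance between two (x, y) tuples.
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def dists_functional(points):
    # Every unordered pair once, via combinations and starmap.
    return list(itertools.starmap(dist, itertools.combinations(points, 2)))

def dists_procedural(points):
    # Same pairs, in the same order, with explicit nested loops.
    result = []
    for i in range(len(points) - 1):
        for j in range(i + 1, len(points)):
            result.append(dist(points[i], points[j]))
    return result

points = [(0, 0), (0, 1), (0, 2), (0, 3)]
assert dists_functional(points) == dists_procedural(points)
print(dists_functional(points))  # [1.0, 2.0, 3.0, 1.0, 2.0, 1.0]
```

Only once the outputs agree does a timing ratio between the two say anything about loops versus itertools.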

— andreipmbcn
source

2

This looks rather convoluted and doesn't answer the question. Can you pare it down so that it makes more sense?

— Aaron Hall

2

@AaronHall I actually find andreipmbcn's answer rather interesting because it's a non-trivial example. It's code we can play with.

— Anthony Kong

@AaronHall, do you want me to edit the text so that it sounds clearer and more straightforward, or do you want me to edit the code?

— andreipmbcn, 2014

9

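I wrote a simple script to test the speed, and this is what I found. The for loop was actually fastest in my case. That really surprised me; check it out below (it calculates a sum of squares).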

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        i = i**2
        a += i
    return a

def square_sum3(numbers):
    sqrt = lambda x: x**2
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([int(i)**2 for i in numbers]))


time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])

0:00:00.302000 #Reduce
0:00:00.144000 #For loop
0:00:00.318000 #Map
0:00:00.390000 #List comprehension

— alpiii
source

With Python 3.6.1 the differences are not so big: reduce and map go down to 0.24, and the list comprehension to 0.29. The for loop is higher, at 0.18.

— jjmerelo

Eliminating the int in square_sum4 also makes it quite a bit faster, and only a bit slower than the for loop.

— jjmerelo

6

I have modified @Alisa's code and used cProfile to show why the list comprehension is faster:

from functools import reduce
import datetime

def reduce_(numbers):
    return reduce(lambda sum, next: sum + next * next, numbers, 0)

def for_loop(numbers):
    a = []
    for i in numbers:
        a.append(i*i)  # square, to match the other versions
    a = sum(a)
    return a

def map_(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def list_comp(numbers):
    return(sum([i*i for i in numbers]))

funcs = [
        reduce_,
        for_loop,
        map_,
        list_comp
        ]

if __name__ == "__main__":
    # [1, 2, 5, 3, 1, 2, 5, 3]
    import cProfile
    for f in funcs:
        print('=' * 25)
        print("Profiling:", f.__name__)
        print('=' * 25)
        pr = cProfile.Profile()
        for i in range(10**6):
            pr.runcall(f, [1, 2, 5, 3, 1, 2, 5, 3])
        pr.create_stats()
        pr.print_stats()

Here are the results:

=========================
Profiling: reduce_
=========================
         11000000 function calls in 1.501 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.162    0.000    1.473    0.000 profiling.py:4(reduce_)
  8000000    0.461    0.000    0.461    0.000 profiling.py:5(<lambda>)
  1000000    0.850    0.000    1.311    0.000 {built-in method _functools.reduce}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: for_loop
=========================
         11000000 function calls in 1.372 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.879    0.000    1.344    0.000 profiling.py:7(for_loop)
  1000000    0.145    0.000    0.145    0.000 {built-in method builtins.sum}
  8000000    0.320    0.000    0.320    0.000 {method 'append' of 'list' objects}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: map_
=========================
         11000000 function calls in 1.470 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.264    0.000    1.442    0.000 profiling.py:14(map_)
  8000000    0.387    0.000    0.387    0.000 profiling.py:15(<lambda>)
  1000000    0.791    0.000    1.178    0.000 {built-in method builtins.sum}
  1000000    0.028    0.000    0.028    0.000 {method 'disable' of '_lsprof.Profiler' objects}


=========================
Profiling: list_comp
=========================
         4000000 function calls in 0.737 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    0.318    0.000    0.709    0.000 profiling.py:18(list_comp)
  1000000    0.261    0.000    0.261    0.000 profiling.py:19(<listcomp>)
  1000000    0.131    0.000    0.131    0.000 {built-in method builtins.sum}
  1000000    0.027    0.000    0.027    0.000 {method 'disable' of '_lsprof.Profiler' objects}

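IMHO:

reduce and map are in general pretty slow. Not only that, using sum on the iterator that map returns is slow compared to using sum on a list
for_loop uses append, which is of course slow to some extent
the list comprehension not only spent the least time building the list, it also makes sum much quicker than it is with map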

— tjysdsg
source

5

Adding a twist to Alphii's answer: the for loop would actually be second best, and about 6 times slower than map

from functools import reduce
import datetime


def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next**2, numbers, 0)


def square_sum2(numbers):
    a = 0
    for i in numbers:
        a += i**2
    return a

def square_sum3(numbers):
    a = 0
    map(lambda x: a+x**2, numbers)
    return a

def square_sum4(numbers):
    a = 0
    return [a+i**2 for i in numbers]

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])

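The main change has been to eliminate the slow sum calls, as well as the probably unnecessary int() in the last case. Putting the for loop and map on the same terms makes for a fairer comparison, actually. Remember that lambdas are a functional concept and theoretically shouldn't have side effects, but, well, they can have side effects such as adding to a. Results in this case with Python 3.6.1, Ubuntu 14.04, Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz: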

0:00:00.257703 #Reduce
0:00:00.184898 #For loop
0:00:00.031718 #Map
0:00:00.212699 #List comprehension

— jjmerelo
source

2

square_sum3 and square_sum4 are incorrect. They will not give the sum. The answer below from @alisca chen is actually correct.
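One reason the map version above appears so fast can be shown with a short sketch. In Python 3, map returns a lazy iterator, so a map call whose result is never consumed does not call the function at all; separately, a lambda body is an expression, so a + x**2 computes a value without assigning to a. The record_square helper below is illustrative and not part of the answer.

```python
calls = []

def record_square(x):
    # Records each call so we can see when the work actually happens.
    calls.append(x)
    return x ** 2

numbers = [1, 2, 5, 3]

lazy = map(record_square, numbers)  # Python 3: nothing has run yet
calls_before = len(calls)

total = sum(lazy)                   # consuming the iterator does the work
calls_after = len(calls)

print(calls_before, calls_after, total)  # 0 4 39
```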

— ShikharDua

3

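I modified some of @alpiii's code and found that the list comprehension is a little faster than the for loop. The earlier result might have been caused by int(), which makes the comparison between the list comprehension and the for loop unfair.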

from functools import reduce
import datetime

def time_it(func, numbers, *args):
    start_t = datetime.datetime.now()
    for i in range(numbers):
        func(args[0])
    print (datetime.datetime.now()-start_t)

def square_sum1(numbers):
    return reduce(lambda sum, next: sum+next*next, numbers, 0)

def square_sum2(numbers):
    a = []
    for i in numbers:
        a.append(i*i)  # square, to match the other versions
    a = sum(a)
    return a

def square_sum3(numbers):
    sqrt = lambda x: x*x
    return sum(map(sqrt, numbers))

def square_sum4(numbers):
    return(sum([i*i for i in numbers]))

time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])

0:00:00.101122 #Reduce
0:00:00.089216 #For loop
0:00:00.101532 #Map
0:00:00.068916 #List comprehension

— Alisca Chen
source