Iterating over a numpy array

135

Is there a less verbose alternative to this:

for x in xrange(array.shape[0]):
    for y in xrange(array.shape[1]):
        do_stuff(x, y)

I came up with this:

for x, y in itertools.product(map(xrange, array.shape)):
    do_stuff(x, y)

This saves one level of indentation, but it is still pretty ugly.

I'm hoping for something that looks like this pseudocode:

for x, y in array.indices:
    do_stuff(x, y)

Does anything like that exist?

python numpy

— Ram Rachum

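I'm on Python 2.7 and am using the itertools solution; I read in the comments that using itertools would be faster. However (perhaps because I'm on 2.7), I also had to unpack the map in the for loop: for x, y in itertools.product(*map(xrange, array.shape)):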

— ALM
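For readers on Python 3, where xrange no longer exists, the unpacked itertools version from the comment above might look like the sketch below. The example array and the visited list are made up for illustration; they stand in for the question's array and do_stuff.

```python
import itertools
import numpy as np

array = np.arange(6).reshape(2, 3)  # hypothetical example array

visited = []
# map(range, array.shape) yields one range per axis; the * unpacks them
# so that product receives range(2) and range(3) as separate arguments.
for x, y in itertools.product(*map(range, array.shape)):
    visited.append((x, y))  # stand-in for do_stuff(x, y)

print(visited)
```

Without the *, product receives a single iterable (the map object itself), so the loop yields 1-tuples each holding a range rather than index pairs.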

There is a page in the NumPy reference called "Iterating ...

— Casey

Related: stackoverflow.com/questions/29493183/...

— Eulenfuchswiesel

187

I think you're looking for ndenumerate.

>>> import numpy
>>> a = numpy.array([[1,2],[3,4],[5,6]])
>>> for (x,y), value in numpy.ndenumerate(a):
...  print x,y
... 
0 0
0 1
1 0
1 1
2 0
2 1

Regarding performance: it is somewhat slower than a list comprehension.

X = np.zeros((100, 100, 100))

%timeit list([((i,j,k), X[i,j,k]) for i in range(X.shape[0]) for j in range(X.shape[1]) for k in range(X.shape[2])])
1 loop, best of 3: 376 ms per loop

%timeit list(np.ndenumerate(X))
1 loop, best of 3: 570 ms per loop

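If you are worried about performance, you can optimise a bit further by looking at the implementation of ndenumerate, which does two things: it converts its argument to an array and then loops over it. If you know you already have an array, you can use the .coords attribute of the flat iterator directly.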

a = X.flat
%timeit list([(a.coords, x) for x in a.flat])
1 loop, best of 3: 305 ms per loop

— 西吉
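As a rough illustration of the .coords idea in the answer above, the sketch below pairs each coordinate with its value by reading the flat iterator's coords before advancing it, which is the order needed for the coordinate to belong to the value that follows. The small array X and the names it and pairs are assumptions chosen for the example, not part of the answer.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)  # small hypothetical array

it = X.flat  # flat iterator over X
pairs = []
for _ in range(X.size):
    coords = it.coords  # index of the element about to be returned
    value = next(it)    # returns that element and advances the iterator
    pairs.append((coords, value))

print(pairs[0], pairs[-1])
```

Reading coords after the value has been fetched would give the index of the next element instead, so the order of the two lines inside the loop matters.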

1

Note that this works but is very slow. You are better off iterating manually.

— 马蒂

43

If you only need the indices, you could try numpy.ndindex:

>>> a = numpy.arange(9).reshape(3, 3)
>>> [(x, y) for x, y in numpy.ndindex(a.shape)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

— 哨兵
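Since ndindex yields plain index tuples, the same loop works for any number of dimensions, and each tuple can be used to index the array directly. A minimal sketch, assuming a made-up 3-D array and a running total in place of real per-element work:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # hypothetical 3-D array

total = 0
for idx in np.ndindex(a.shape):
    # idx is a tuple such as (1, 2, 3); indexing with it returns one element
    total += a[idx]

print(total)
```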

15

See nditer

import numpy as np
Y = np.array([3,4,5,6])
for y in np.nditer(Y, op_flags=['readwrite']):
    y += 3

Y == np.array([6, 7, 8, 9])

y = 3 would not work; use y *= 0 and y += 3 instead.

— C19

2

Or use y[...] = 3

— Donald Hobson
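To make the assignment point concrete, here is a small sketch of the y[...] = 3 form, reusing the array from the nditer answer. Inside the loop, y is a zero-dimensional view into Y, so writing through it with [...] changes Y, whereas a plain y = 3 only rebinds the loop variable.

```python
import numpy as np

Y = np.array([3, 4, 5, 6])
for y in np.nditer(Y, op_flags=['readwrite']):
    y[...] = 3  # writes through the view into Y

print(Y)
```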