Splitting a list into N parts of approximately equal length

149

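What is the best way to divide a list into roughly equal parts? For example, if the list has 7 elements and is split into 2 parts, we want 3 elements in one part, and the other should have 4 elements.

I'm looking for something like even_split(L, n) that breaks L into n parts.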

def chunks(L, n):
    """ Yield successive n-sized chunks from L.
    """
    for i in range(0, len(L), n):
        yield L[i:i+n]

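The code above gives chunks of 3, rather than 3 chunks. I could simply transpose (iterate over this and take the first element of each column, call that part one, then take the second and put it in part two, and so on), but that destroys the ordering of the items.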

python list chunks

— wim

65

This code is broken due to rounding errors. Do not use it!!!

assert len(chunkIt([1,2,3], 10)) == 10  # fails

Here's one that could work:

def chunkIt(seq, num):
    avg = len(seq) / float(num)
    out = []
    last = 0.0

    while last < len(seq):
        out.append(seq[int(last):int(last + avg)])
        last += avg

    return out

Testing:

>>> chunkIt(range(10), 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
>>> chunkIt(range(11), 3)
[[0, 1, 2], [3, 4, 5, 6], [7, 8, 9, 10]]
>>> chunkIt(range(12), 3)
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

— Max Shawabkeh

9

Your example doesn't work for >>> chunkIt(range(8), 6) => [[0], [1], [2, 3], [4], [5], [6], [7]]

— nopper, 2013

1

@nopper, I added an "if num == 1:" conditional to handle that edge case.

— paulie4, 2013

24

New visitors: please don't use or upvote this code, it's broken. For example, chunkIt(range(10), 9) should return 9 parts, but it doesn't.

— wim

3

This comment thread is really confusing since the answer has been edited several times. Is this a good answer? Not a good answer?

— conchoecia

6

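@conchoecia Not a good answer, keep scrolling down. It has only been edited once so far, and that was a trivial edit (2-space indent changed to 4). Unfortunately the OP "user248237dfsf" hasn't been seen on the site for over 3 years, so there is little hope of getting the accepted answer changed.

— wim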
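
Since the thread disagrees about which answers are correct, a small property check can settle it for any candidate. The sketch below is not from the thread: is_even_split, passes_all and fixed_size are hypothetical names, and the three properties checked (exactly n chunks, order preserved, lengths differing by at most one) are an assumed reading of "roughly equal".

```python
def is_even_split(parts, seq, n):
    """Return True if parts is a valid near-equal split of seq into n chunks."""
    parts = [list(p) for p in parts]
    sizes = [len(p) for p in parts]
    flat = [x for p in parts for x in p]
    return (
        len(parts) == n                      # exactly n chunks
        and flat == list(seq)                # order and content preserved
        and max(sizes) - min(sizes) <= 1     # lengths differ by at most one
    )


def passes_all(split_fn, max_len=30):
    """Try split_fn(seq, n) for every 1 <= n <= len(seq) < max_len."""
    for size in range(1, max_len):
        seq = list(range(size))
        for n in range(1, size + 1):
            if not is_even_split(split_fn(seq, n), seq, n):
                return False
    return True


def fixed_size(seq, n):
    # Deliberately wrong candidate: yields chunks OF n, not n chunks.
    return [seq[i:i + n] for i in range(0, len(seq), n)]


print(passes_all(fixed_size))  # False
```

Any function with the signature f(seq, n) from this page can be passed to passes_all in place of fixed_size.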

182

You can write it fairly simply as a list generator:

def split(a, n):
    k, m = divmod(len(a), n)
    return (a[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n))

Example:

>>> list(split(range(11), 3))
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]

— tixxit

Insert n = min(n, len(a)) # don't create empty buckets on line 1 to avoid creating empty buckets in scenarios like list(split(range(X, Y))) where X < Y

— abanana

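It seems I can't edit my comment. I should add that my previous amendment could raise a division by zero error if the list is empty, so that needs to be either controlled externally or added to the solution.

— abanana '17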
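
The two comments above can be combined into one guarded variant. This is only a sketch of what the commenter describes: it returns a list rather than a generator, and the ValueError for n < 1 is an added assumption, not something the thread specifies.

```python
def split(a, n):
    """Split a into at most n contiguous parts whose lengths differ by at most one."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    n = min(n, len(a))  # don't create empty buckets
    if n == 0:          # empty input: nothing to split, and divmod(0, 0) would fail
        return []
    k, m = divmod(len(a), n)
    return [a[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n)]


print(split(list(range(11)), 3))   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]
print(split([1, 2, 3], 10))        # [[1], [2], [3]]
print(split([], 4))                # []
```

Note that clamping n means the caller may get fewer than n parts back; whether that or a run of empty parts is preferable depends on the use case.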

4

Out of N answers on SO, this is the only one that has passed all my tests. gj!

— avishayp

2

stackoverflow.com/a/37414115/210971 uses the same method, but also works for an empty list and a split count of 0.

— LookAheadAtYourTypes

Beautiful! Also, n can be made to work as batch_size by swapping k and n in the return statement :)

— haraprasadj

161

This is the raison d'être for numpy.array_split*:

>>> import numpy as np
>>> print(*np.array_split(range(10), 3))
[0 1 2 3] [4 5 6] [7 8 9]
>>> print(*np.array_split(range(10), 4))
[0 1 2] [3 4 5] [6 7] [8 9]
>>> print(*np.array_split(range(10), 5))
[0 1] [2 3] [4 5] [6 7] [8 9]

*credit to Zero Piraeus in Room 6

— wim

1

What is the * in print for?

— yuqli

2

Hey @yuqli, it converts a list of something into individual arguments to a function. Try print(L) and print(*L). Also see stackoverflow.com/a/36908/2184122 or search for "python use of asterisk".

— Robert Lugg
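
To make the difference concrete, here is a short illustration of argument unpacking with plain lists. The helper count_args is a hypothetical name used only to show how many arguments the callee receives.

```python
L = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(L)              # one argument: the whole list of chunks
print(*L)             # three arguments: one per chunk
print(*L, sep=" | ")  # same three arguments, custom separator


def count_args(*args):
    """Return how many positional arguments were received."""
    return len(args)


print(count_args(L))   # 1
print(count_args(*L))  # 3
```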

121

As long as you don't want anything silly like continuous chunks:

>>> def chunkify(lst,n):
...     return [lst[i::n] for i in xrange(n)]
... 
>>> chunkify(range(13), 3)
[[0, 3, 6, 9, 12], [1, 4, 7, 10], [2, 5, 8, 11]]

— job

14

I wouldn't say continuous chunks are silly. Perhaps you'd like to keep the chunks sorted (i.e. chunk[0] < chunk[1]), for example.

— tixxit, 2010

1

I was kidding. But if you really don't care, this way with a list comprehension is nice and concise.

— job

3

This is n

— smci, 2014

8

Sending this output into "zip" gives you your ordered list: zip(*chunkify(range(13), 3)) results in [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]

— 一族
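
Note that the zip result quoted above stops at 11: zip truncates to the shortest chunk, so element 12 is lost. A possible workaround, sketched here as an assumption rather than something proposed in the thread, is itertools.zip_longest with the padding filtered out afterwards (which only works if None is not a legitimate element).

```python
from itertools import zip_longest


def chunkify(lst, n):
    """Strided split: chunk i holds elements i, i+n, i+2n, ..."""
    return [lst[i::n] for i in range(n)]


strided = chunkify(list(range(13)), 3)

# Plain zip stops at the shortest chunk, so 12 is dropped.
print(list(zip(*strided)))  # [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]

# zip_longest pads short chunks with None; filter the padding back out.
restored = [x for row in zip_longest(*strided) for x in row if x is not None]
print(restored == list(range(13)))  # True
```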

2

This solution works fine, until you need the order of the list to stay the same.

— s7anley

18

Changing the code to yield n chunks rather than chunks of n:

def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    newn = int(len(l) / n)
    for i in xrange(0, n-1):
        yield l[i*newn:i*newn+newn]
    yield l[n*newn-newn:]

l = range(56)
three_chunks = chunks (l, 3)
print three_chunks.next()
print three_chunks.next()
print three_chunks.next()

which gives:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]

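This assigns the extra elements to the final group, which is not perfect but well within your specification of "roughly N equal parts" :-) By that, I mean 56 elements would be better as (19,19,18), whereas this gives (18,18,20).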
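
The two size patterns mentioned above can be reproduced with a few lines of integer arithmetic. This is an illustrative sketch only; truncating_sizes and balanced_sizes are hypothetical helper names that compute chunk lengths, not the chunks themselves.

```python
def truncating_sizes(total, n):
    """Sizes when every chunk gets total // n and the last takes the rest."""
    base = total // n
    return [base] * (n - 1) + [total - base * (n - 1)]


def balanced_sizes(total, n):
    """Sizes when the remainder is spread one element at a time."""
    base, extra = divmod(total, n)
    return [base + 1] * extra + [base] * (n - extra)


print(truncating_sizes(56, 3))  # [18, 18, 20]
print(balanced_sizes(56, 3))    # [19, 19, 18]
```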

You can get more balanced output with the following code:

#!/usr/bin/python
def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    newn = int(1.0 * len(l) / n + 0.5)
    for i in xrange(0, n-1):
        yield l[i*newn:i*newn+newn]
    yield l[n*newn-newn:]

l = range(56)
three_chunks = chunks (l, 3)
print three_chunks.next()
print three_chunks.next()
print three_chunks.next()

This outputs:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
[19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]
[38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]

— 紫罗兰色

This gives me a strange result. for p in chunks(range(54), 3): print len(p) returns

Fixed, it was the final yield.

— paxdiablo，2010年

See also the link

— 独处

For practical purposes, this is the most useful answer. Thanks!

— mVChr, 2014

When I use this with for x in chunks(mylist, num): print x, I get the desired chunks, but between them I get an empty list. Any idea why? That is, I get lots of [], one after each chunk.

— synaptik

12

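If you divide n elements into roughly k chunks, you can make n % k of the chunks 1 element bigger than the others to distribute the extra elements.

The following code gives you the lengths of the chunks: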

[(n // k) + (1 if i < (n % k) else 0) for i in range(k)]

Example: n=11, k=3 results in [4, 4, 3]

You can then easily calculate the start indices of the chunks:

[i * (n // k) + min(i, n % k) for i in range(k)]

Example: n=11, k=3 results in [0, 4, 8]

Using the start of the i+1th chunk as the boundary, the ith chunk of a list l with len n is

l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)]

As a final step, create a list of all the chunks using a list comprehension:

[l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)] for i in range(k)]

Example: n=11, k=3, l=range(n) results in [range(0, 4), range(4, 8), range(8, 11)]
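
The steps above can be tied together in a short runnable sketch. The helper name chunk_layout is hypothetical; the two list comprehensions inside it are the ones given in this answer, and the final check simply confirms that each start index equals the total length of the chunks before it.

```python
from itertools import accumulate


def chunk_layout(n, k):
    """Return (lengths, starts) for splitting n elements into k chunks."""
    lengths = [(n // k) + (1 if i < (n % k) else 0) for i in range(k)]
    starts = [i * (n // k) + min(i, n % k) for i in range(k)]
    return lengths, starts


n, k = 11, 3
l = list(range(n))
lengths, starts = chunk_layout(n, k)
chunks = [l[s:s + size] for s, size in zip(starts, lengths)]

print(lengths)  # [4, 4, 3]
print(starts)   # [0, 4, 8]
print(chunks)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]

# Each start equals the sum of the lengths before it.
assert starts == [0] + list(accumulate(lengths))[:-1]
```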

— Pe

6

This does the split with a single expression:

>>> myList = range(18)
>>> parts = 5
>>> [myList[(i*len(myList))//parts:((i+1)*len(myList))//parts] for i in range(parts)]
[[0, 1, 2], [3, 4, 5, 6], [7, 8, 9], [10, 11, 12, 13], [14, 15, 16, 17]]

The list in this example has size 18 and is divided into 5 parts. The sizes of the parts differ by no more than one element.
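
The "differ by no more than one element" claim is easy to check by wrapping the expression in a function. The name split_by_expression is a hypothetical wrapper around the same expression, shown only as a sketch.

```python
def split_by_expression(seq, parts):
    """Cut seq at the integer boundaries i * len(seq) // parts."""
    size = len(seq)
    return [seq[i * size // parts:(i + 1) * size // parts] for i in range(parts)]


sizes = [len(p) for p in split_by_expression(list(range(18)), 5)]
print(sizes)                          # [3, 4, 3, 4, 4]
print(max(sizes) - min(sizes) <= 1)   # True
```

Because the boundaries are computed with integer arithmetic, there is no floating-point accumulation to go wrong.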

— 比塔哥拉斯

6

See more_itertools.divide:

import more_itertools as mit

n = 2

[list(x) for x in mit.divide(n, range(5, 11))]
# [[5, 6, 7], [8, 9, 10]]

[list(x) for x in mit.divide(n, range(5, 12))]
# [[5, 6, 7, 8], [9, 10, 11]]

Install via > pip install more_itertools.

— pylang

4

Here is one that adds None to make the lists equal length

>>> from itertools import izip_longest
>>> def chunks(l, n):
    """ Yield n successive chunks from l. Pads extra spaces with None
    """
    return list(zip(*izip_longest(*[iter(l)]*n)))

>>> l=range(54)

>>> chunks(l,3)
[(0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51), (1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52), (2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53)]

>>> chunks(l,4)
[(0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52), (1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53), (2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, None), (3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, None)]

>>> chunks(l,5)
[(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50), (1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51), (2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52), (3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53), (4, 9, 14, 19, 24, 29, 34, 39, 44, 49, None)]

— John La Rooy

4

Here's my solution:

def chunks(l, amount):
    if amount < 1:
        raise ValueError('amount must be positive integer')
    chunk_len = len(l) // amount
    leap_parts = len(l) % amount
    remainder = amount // 2  # make it symmetrical
    i = 0
    while i < len(l):
        remainder += leap_parts
        end_index = i + chunk_len
        if remainder >= amount:
            remainder -= amount
            end_index += 1
        yield l[i:end_index]
        i = end_index

produces

    >>> list(chunks([1, 2, 3, 4, 5, 6, 7], 3))
    [[1, 2], [3, 4, 5], [6, 7]]

— 莱特鲁巴赫

4

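Here's a generator that can handle any positive (integer) number of chunks. If the number of chunks is greater than the input list length, some chunks will be empty. This algorithm alternates between short and long chunks rather than segregating them.

I've also included some code for testing the ragged_chunks function.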

''' Split a list into "ragged" chunks

    The size of each chunk is either the floor or ceiling of len(seq) / chunks

    chunks can be > len(seq), in which case there will be empty chunks

    Written by PM 2Ring 2017.03.30
'''

def ragged_chunks(seq, chunks):
    size = len(seq)
    start = 0
    for i in range(1, chunks + 1):
        stop = i * size // chunks
        yield seq[start:stop]
        start = stop

# test

def test_ragged_chunks(maxsize):
    for size in range(0, maxsize):
        seq = list(range(size))
        for chunks in range(1, size + 1):
            minwidth = size // chunks
            #ceiling division
            maxwidth = -(-size // chunks)
            a = list(ragged_chunks(seq, chunks))
            sizes = [len(u) for u in a]
            deltas = all(minwidth <= u <= maxwidth for u in sizes)
            assert all((sum(a, []) == seq, sum(sizes) == size, deltas))
    return True

if test_ragged_chunks(100):
    print('ok')

We can make this slightly more efficient by moving the multiplication into the range call, but I think the previous version is more readable (and DRYer).

def ragged_chunks(seq, chunks):
    size = len(seq)
    start = 0
    for i in range(size, size * chunks + 1, size):
        stop = i // chunks
        yield seq[start:stop]
        start = stop
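
To see the alternation and the empty chunks described at the top of this answer, the first version can be run on two small inputs. The function is repeated here so the snippet is self-contained; the inputs are illustrative choices, not taken from the answer.

```python
def ragged_chunks(seq, chunks):
    size = len(seq)
    start = 0
    for i in range(1, chunks + 1):
        stop = i * size // chunks
        yield seq[start:stop]
        start = stop


# 7 elements in 5 chunks: short and long chunks are interleaved.
print([len(c) for c in ragged_chunks(list(range(7)), 5)])  # [1, 1, 2, 1, 2]

# More chunks than elements: some chunks come back empty.
print(list(ragged_chunks([1, 2, 3], 5)))  # [[], [1], [], [2], [3]]
```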

— PM 2Ring

3

Take a look at numpy.split:

>>> import numpy
>>> a = numpy.array([1,2,3,4])
>>> numpy.split(a, 2)
[array([1, 2]), array([3, 4])]

— 达洛洛格姆

5

numpy.array_split() is even more suitable because it splits roughly.

— Yariv, 2013

11

This doesn't work if the array size isn't divisible by the number of splits.

— 2013

1

This is the wrong answer: your solution returns a list of ndarrays, not a list of lists.

— ChłopZ Lasu

3

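Implementation using the numpy.linspace method.

Just specify the number of parts the array should be divided into. The divisions will be of nearly equal size.

Example: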

import numpy as np
a = np.arange(10)
print("Input array:", a)
parts = 3
i = np.linspace(np.min(a), np.max(a) + 1, parts + 1)
i = np.array(i, dtype='uint16')  # indices must be integers
split_arr = []
for ind in range(i.size - 1):
    split_arr.append(a[i[ind]:i[ind + 1]])
print("Array split in to %d parts : " % parts, split_arr)

Gives:

Input array: [0 1 2 3 4 5 6 7 8 9]
Array split in to 3 parts :  [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8, 9])]

— amit12690

3

My solution, easy to understand

def split_list(lst, n):
    splitted = []
    for i in reversed(range(1, n + 1)):
        split_point = len(lst)//i
        splitted.append(lst[:split_point])
        lst = lst[split_point:]
    return splitted

The shortest one-liner on this page (written by my girl)

def split(l, n):
    return [l[int(i*len(l)/n):int((i+1)*len(l)/n-1)] for i in range(n)]

— ChłopZ Lasu

FYI: your one-liner is broken and yields wrong results. The other one works beautifully.

— Paulo Freitas

2

Using a list comprehension:

def divide_list_to_chunks(list_, n):
    return [list_[start::n] for start in range(n)]

— Liscju

This doesn't address the problem of making all the chunks even.

— SuperBiasedMan, 2015

0

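Another way would be something like this. The idea here is to use grouper, but get rid of None. In this case all the "small_parts" are formed from elements in the first part of the list, and the "larger_parts" from the later part of the list. The length of the "larger parts" is len(small_parts) + 1. We need to treat x as two different sub-parts.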

from itertools import izip_longest

import numpy as np

def grouper(n, iterable, fillvalue=None): # This is grouper from itertools
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

def another_chunk(x,num):
    extra_ele = len(x)%num #gives number of parts that will have an extra element 
    small_part = int(np.floor(len(x)/num)) #gives number of elements in a small part

    new_x = list(grouper(small_part,x[:small_part*(num-extra_ele)]))
    new_x.extend(list(grouper(small_part+1,x[small_part*(num-extra_ele):])))

    return new_x

The way I have it set up returns a list of tuples:

>>> x = range(14)
>>> another_chunk(x,3)
[(0, 1, 2, 3), (4, 5, 6, 7, 8), (9, 10, 11, 12, 13)]
>>> another_chunk(x,4)
[(0, 1, 2), (3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13)]
>>> another_chunk(x,5)
[(0, 1), (2, 3, 4), (5, 6, 7), (8, 9, 10), (11, 12, 13)]
>>>

— 阿卡瓦尔

0

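Here's another variant that spreads the "remaining" elements evenly among all the chunks, one at a time until there are none left. In this implementation, the larger chunks come first.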

def chunks(l, k):
  """ Yield k successive chunks from l."""
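  if k < 1:
    yield []
    return  # raising StopIteration inside a generator is an error since Python 3.7
  n = len(l)
  avg = n // k  # integer division keeps the slice indices integral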
  remainders = n % k
  start, end = 0, avg
  while start < n:
    if remainders > 0:
      end = end + 1
      remainders = remainders - 1
    yield l[start:end]
    start, end = end, end+avg

For example, generating 4 chunks from a list of 14 elements:

>>> list(chunks(range(14), 4))
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10], [11, 12, 13]]
>>> map(len, list(chunks(range(14), 4)))
[4, 4, 3, 3]

— 杰里斯

0

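Same as job's answer, but takes into account lists with a size smaller than the number of chunks.

def chunkify(lst, n):
    return [lst[i::n] for i in range(n if n < len(lst) else len(lst))]

If n (the number of chunks) is 7 and lst (the list to divide) is [1, 2, 3], the chunks are [[1], [2], [3]] instead of [[1], [2], [3], [], [], [], []]

— Ilya Tuvschev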

0

You could also use:

split=lambda x,n: x if not x else [x[:n]]+[split([] if not -(len(x)-n) else x[-(len(x)-n):],n)][0]

split([1,2,3,4,5,6,7,8,9],2)

[[1, 2], [3, 4], [5, 6], [7, 8], [9]]

— Carlos del Ojo

0

def evenly(l, n):
    len_ = len(l)
    split_size = len_ // n
    split_size = n if not split_size else split_size
    offsets = [i for i in range(0, len_, split_size)]
    return [l[offset:offset + split_size] for offset in offsets]

Example:

l = [a for a in range(97)] should be made up of 10 parts, each with 9 elements except the last one.

Output:

[[0, 1, 2, 3, 4, 5, 6, 7, 8],
 [9, 10, 11, 12, 13, 14, 15, 16, 17],
 [18, 19, 20, 21, 22, 23, 24, 25, 26],
 [27, 28, 29, 30, 31, 32, 33, 34, 35],
 [36, 37, 38, 39, 40, 41, 42, 43, 44],
 [45, 46, 47, 48, 49, 50, 51, 52, 53],
 [54, 55, 56, 57, 58, 59, 60, 61, 62],
 [63, 64, 65, 66, 67, 68, 69, 70, 71],
 [72, 73, 74, 75, 76, 77, 78, 79, 80],
 [81, 82, 83, 84, 85, 86, 87, 88, 89],
 [90, 91, 92, 93, 94, 95, 96]]

— 埃姆雷姆拉

0

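Let's say you want to split the list [1, 2, 3, 4, 5, 6, 7, 8] into lists of 3 elements

like [[1, 2, 3], [4, 5, 6], [7, 8]], where, if fewer than 3 elements remain at the end, they are grouped together.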

my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_list2 = [my_list[i:i+3] for i in range(0, len(my_list), 3)]
print(my_list2)

Output: [[1, 2, 3], [4, 5, 6], [7, 8]]

Here the length of one part is 3. Replace 3 with your own chunk size.

— Ali Sajjad

0

1>

import numpy as np

data # your array

total_length = len(data)
separate = 10
sub_array_size = total_length // separate
safe_separate = sub_array_size * separate

splited_lists = np.split(np.array(data[:safe_separate]), separate)
# np.concatenate takes a single sequence of arrays as its first argument
splited_lists[separate - 1] = np.concatenate((splited_lists[separate - 1],
    np.array(data[safe_separate:total_length])))

splited_lists # your output

2>

splited_lists = np.array_split(np.array(data), separate)

— Bbake Waikhom

0

from typing import List

def chunk_array(array: List, n: int) -> List[List]:
    chunk_size = len(array) // n
    chunks = []
    i = 0
    while i < len(array):
        # if fewer than chunk_size elements are left, add the remainder to the last chunk
        if len(array) - i < chunk_size:
            chunks[-1].extend(array[i:i + chunk_size])
            break
        else:
            chunks.append(array[i:i + chunk_size])
            i += chunk_size
    return chunks

This is my version (inspired by Max's)

— 索拉鲍丹

-1

Rounding the linspace and using it as an index is a simpler solution than the one amit12690 proposes.

function chunks=chunkit(array,num)

index = round(linspace(0,size(array,2),num+1));

chunks = cell(1,num);

for x = 1:num
chunks{x} = array(:,index(x)+1:index(x+1));
end
end

— 疾风

-1

#!/usr/bin/python


first_names = ['Steve', 'Jane', 'Sara', 'Mary','Jack','Bob', 'Bily', 'Boni', 'Chris','Sori', 'Will', 'Won','Li']

def chunks(l, n):
    for i in range(0, len(l), n):
        # Create an index range for l of n items:
        yield l[i:i+n]

result = list(chunks(first_names, 5))
print(result)

Picked from this link, and this was what helped me. I had a pre-defined list.

— Swateek

-1

Say you want to split into 5 parts:

p1, p2, p3, p4, p5 = np.split(df, 5)

— 丹尼尔

4

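This doesn't provide an answer to the question, e.g. how you would write it if you did not know in advance that you wanted to split into five pieces. Also, you are (I'm guessing) assuming numpy and maybe a pandas dataframe. The OP is asking about a generic list.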

— NickD

-1

I've written the code for this case myself:

import math

def chunk_ports(port_start, port_end, portions):
    if port_end < port_start:
        return None

    total = port_end - port_start + 1

    fractions = int(math.floor(float(total) / portions))

    results = []

    # No enough to chuck.
    if fractions < 1:
        return None

    # Reverse, so any additional items would be in the first range.
    _e = port_end
    for i in range(portions, 0, -1):
        print "i", i

        if i == 1:
            _s = port_start
        else:
            _s = _e - fractions + 1

        results.append((_s, _e))

        _e = _s - 1

    results.reverse()

    return results

chunk_ports(1, 10, 9) would return

[(1, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (10, 10)]

— 陈密码

-1

This code works for me (Python3-compatible):

def chunkify(tab, num):
    return [tab[i*num: i*num+num] for i in range(len(tab)//num+(1 if len(tab)%num else 0))]

Example (for the bytearray type, but it works for lists as well):

b = bytearray(b'\x01\x02\x03\x04\x05\x06\x07\x08')
>>> chunkify(b,3)
[bytearray(b'\x01\x02\x03'), bytearray(b'\x04\x05\x06'), bytearray(b'\x07\x08')]
>>> chunkify(b,4)
[bytearray(b'\x01\x02\x03\x04'), bytearray(b'\x05\x06\x07\x08')]

— grafi71

-1

This one provides chunks of length <= n, >= 0

import math

def chunkify(lst, n):
    num_chunks = int(math.ceil(len(lst) / float(n))) if n < len(lst) else 1
    return [lst[n*i:n*(i+1)] for i in range(num_chunks)]

For example

>>> chunkify(range(11), 3)
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
>>> chunkify(range(11), 8)
[[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10]]

— Anthony Manning-Franklin

-1

I tried most of the solutions, but they didn't work for my case, so I wrote a new function that works for most cases and for any type of array:

import math

def chunkIt(seq, num):
    seqLen = len(seq)
    total_chunks = math.ceil(seqLen / num)
    items_per_chunk = num
    out = []
    last = 0

    while last < seqLen:
        out.append(seq[last:(last + items_per_chunk)])
        last += items_per_chunk

    return out

— Angelo Polotto