Finding the mode of a list

126

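Given a list of items, recall that the mode of the list is the item that occurs most often.

I would like to know how to create a function that can find the mode of a list but that displays a message if the list does not have a mode (e.g., all the items in the list only appear once). I want to make this function without importing any functions. I'm trying to write my own function from scratch.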

python mode

— bluelantern
source

Sorry, but can you explain what exactly you mean by "mode of the list"?

— Vikas 2012

5

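@Vikas: the mode is the most frequently occurring element (if any). If there is more than one such element, some definitions extend it to take the arithmetic mean of all of them.

— Jeremy Roman

So many wrong answers here! For example, assert mode([1, 1, 1]) is None and assert mode([1, 2, 3, 4]) is None. For a number to be a mode, it must occur more times than at least one other number in the list, and it must not be the only number in the list.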

— lifebalance

156

You can use the max function with a key. Have a look at "python max function using 'key' and lambda expression".

max(set(lst), key=lst.count)

— David Dao
source

6

David, this is the correct answer for the OP, given that the OP does not want any extra imports.

— Jason Parham 2015

12

It seems to me that this would run in O(n**2). Does it?

— lirtosiast 2015

7

This has quadratic runtime.

— Padraic Cunningham 2015
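The quadratic cost mentioned in these comments comes from calling lst.count once per distinct element, and each call scans the whole list. A minimal sketch of a single-pass alternative that still uses no imports is shown below. The name mode_linear and the choice to return None for an empty list are assumptions made for illustration, not part of the original answer.

```python
def mode_linear(lst):
    """Return one most frequent item of lst using a single counting pass."""
    if not lst:
        return None  # assumption: an empty list has no mode
    counts = {}
    for item in lst:
        # dict lookups are O(1) on average, so this loop is O(n) overall
        counts[item] = counts.get(item, 0) + 1
    # max() scans the distinct keys once; ties go to the key inserted first
    return max(counts, key=counts.get)


print(mode_linear([1, 2, 2, 3, 3, 3]))  # 3
```

The items must be hashable for this to work, which is also required by the set-based one-liner.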

20

You could also just use max(lst, key=lst.count). (And I really would not name a list list.)

— Stefan Pochmann

2

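Can somebody explain how this works for bimodal distributions? For example, a = [22, 33, 11, 22, 11]; print(max(set(a), key=a.count)) returns 11. Will it always return the smallest mode? And if so, why?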

— battey
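On the tie-breaking question: max returns the first maximal element it encounters, and the iteration order of a set is an implementation detail, so getting 11 here is probably not something to rely on. One way to make the choice explicit is to iterate over the distinct values in sorted order. The sketch below assumes the elements are mutually comparable and the list is non-empty; smallest_mode and largest_mode are illustrative names.

```python
def smallest_mode(lst):
    # sorted() fixes the iteration order; max() keeps the first maximal
    # element it sees, so ties resolve to the smallest value
    return max(sorted(set(lst)), key=lst.count)


def largest_mode(lst):
    # iterating in descending order makes ties resolve to the largest value
    return max(sorted(set(lst), reverse=True), key=lst.count)


a = [22, 33, 11, 22, 11]
print(smallest_mode(a))  # 11
print(largest_mode(a))   # 22
```

Both functions raise ValueError on an empty list, because max is called on an empty sequence.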

99

You can use the Counter class supplied in the collections package, which has a mode-esque function:

from collections import Counter
data = Counter(your_list_in_here)
data.most_common()   # Returns all unique items and their counts
data.most_common(1)  # Returns the highest occurring item

Note: Counter is new in Python 2.7 and is not available in earlier versions.

— Christian Witts
source

19

The question states that the user wants to write the function from scratch, i.e., without importing anything.

— dbliss 2015

3

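Your last line returns a list containing a tuple that holds a mode and its frequency. To get just a mode, use Counter(your_list_in_here).most_common(1)[0][0]. If there is more than one mode, this returns an arbitrary one.

— Rory Daulton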

1

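Suppose there are n most common modes. If Counter(your_list_in_here).most_common(1)[0][0] gets you the first mode, how would you get another most common mode? Just replace the last 0 with 1? One could write a function that customizes the mode to one's liking.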

1

If there is more than one mode, how can I return the largest of them?

— Akin Hwan
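One possible way to answer the last two comments is to collect every item whose count equals the highest count, then pick whichever mode is wanted. This is only a sketch: all_modes is a hypothetical helper name, and returning an empty list for empty input is an assumption.

```python
from collections import Counter


def all_modes(data):
    """Return every item that shares the highest count."""
    counts = Counter(data)
    if not counts:
        return []  # assumption: an empty input has no modes
    top = max(counts.values())
    # Counter keeps keys in first-seen order (Python 3.7+)
    return [item for item, n in counts.items() if n == top]


data = [1, 1, 2, 3, 3]
print(all_modes(data))       # [1, 3]
print(max(all_modes(data)))  # 3
```

Note that changing the last index in most_common(1)[0][0] from 0 to 1 gives the frequency of the first mode, not a second mode.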

59

Python 3.4 includes the function statistics.mode, so it is straightforward:

>>> from statistics import mode
>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
 3

The list can contain elements of any type, not just numbers:

>>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
 'red'

— jabaldonedo
source

17

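This throws an error when using mode([1, 1, 1, 1, 2, 3, 3, 3, 3, 4]), where 1 and 3 occur the same number of times. Ideally, it should return the smallest of the numbers that share the highest count. StatisticsError: no unique mode; found 2 equally common values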

— aman_novice '16

4

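I haven't used this 3.4 statistics package, but scipy.stats.mode will return the smallest value in this situation, here 1. In certain cases, however, I would prefer the error to be thrown...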

— wordsmith

2

@aman_novice, this issue was resolved in Python 3.8. docs.python.org/3/library/statistics.html#statistics.mode

— Michael D

2

Python 3.8 also added multimode, which returns all the modes when there is more than one.

— stason
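A short illustration of the Python 3.8+ behaviour described in these comments, using the same data that raised StatisticsError on older versions. It assumes Python 3.8 or later, where statistics.multimode exists.

```python
from statistics import mode, multimode  # multimode requires Python 3.8+

data = [1, 1, 1, 1, 2, 3, 3, 3, 3, 4]

# Since Python 3.8, mode() returns the first mode encountered instead of raising
print(mode(data))       # 1

# multimode() returns all modes, in order of first occurrence
print(multimode(data))  # [1, 3]

# For empty input, multimode() returns an empty list
print(multimode([]))    # []
```

This does not satisfy the "no imports" constraint of the question, but it is handy as a reference for checking a hand-written function.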

30

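Taking a leaf from some statistics software, namely SciPy and MATLAB: these just return the smallest most common value, so if two values occur equally often, the smaller of the two is returned. Hopefully an example will help: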

>>> from scipy.stats import mode

>>> mode([1, 2, 3, 4, 5])
(array([ 1.]), array([ 1.]))

>>> mode([1, 2, 2, 3, 3, 4, 5])
(array([ 2.]), array([ 2.]))

>>> mode([1, 2, 2, -3, -3, 4, 5])
(array([-3.]), array([ 2.]))

Is there any reason why you can't follow this convention?

— Chris
source

4

Why is only the smallest mode returned when there are multiple modes?

— zyxue

@zyxue simple statistical convention

— chrisfs

2

@chrisfs and how would you make it return the largest mode if there are multiple modes?

— Akin Hwan

25

There are many simple ways to find the mode of a list in Python, such as:

import statistics
statistics.mode([1,2,3,3])  # returns 3

Or, you could find the max by its count:

max(array, key = array.count)

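The problem with these two methods is that they don't work with multiple modes. The first returns an error, while the second returns the first mode.

In order to find all the modes of a set, you could use this function: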

def mode(array):
    most = max(list(map(array.count, array)))
    return list(set(filter(lambda x: array.count(x) == most, array)))

— Mathwizurd
source

3

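Using mode, this gives an error when two elements occur the same number of times.

— Abhishek Mishra

Sorry, I saw this comment really late. statistics.mode(array) returns an error with multiple modes, but none of the other methods do.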

— mathwizurd

8

Extending the community answer, which does not work when the list is empty, here is working code for mode:

def mode(arr):
    if arr == []:
        return None
    else:
        return max(set(arr), key=arr.count)

— Kardi Teknomo
source

3

In case you are interested in the smallest mode, the largest mode, or all modes:

def get_small_mode(numbers, out_mode):
    counts = {k:numbers.count(k) for k in set(numbers)}
    modes = sorted(dict(filter(lambda x: x[1] == max(counts.values()), counts.items())).keys())
    if out_mode=='smallest':
        return modes[0]
    elif out_mode=='largest':
        return modes[-1]
    else:
        return modes

— 塔舒卡
source

2

I wrote this handy function to find the mode.

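def mode(nums):
    # map each distinct item to its number of occurrences
    corresponding = {}
    occurances = []
    for i in nums:
        count = nums.count(i)
        corresponding.update({i: count})

    for i in corresponding:
        freq = corresponding[i]
        occurances.append(freq)

    maxFreq = max(occurances)

    # list() is needed on Python 3, where keys() and values() are views
    keys = list(corresponding.keys())
    values = list(corresponding.values())

    # position of the first item that has the highest count
    index_v = values.index(maxFreq)
    return keys[index_v]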

— 用户名
source

2

This method will fail if two items have the same number of occurrences.

— akshaynagpal 2014

2

Short, but somewhat ugly:

def mode(arr) :
    m = max([arr.count(a) for a in arr])
    return [x for x in arr if arr.count(x) == m][0] if m>1 else None

Using a dictionary, slightly less ugly:

def mode(arr) :
    f = {}
    for a in arr : f[a] = f.get(a,0)+1
    m = max(f.values())
    t = [(x,f[x]) for x in f if f[x]==m]
    return t[0][0] if m > 1 else None

— Carl
source

2

A bit longer, but it can return multiple modes and also works on strings or on a mix of data types.

def getmode(inplist):
    '''with list of items as input, returns mode
    '''
    dictofcounts = {}
    listofcounts = []
    for i in inplist:
        countofi = inplist.count(i) # count items for each item in list
        listofcounts.append(countofi) # add counts to list
        dictofcounts[i]=countofi # add counts and item in dict to get later
    maxcount = max(listofcounts) # get max count of items
    if maxcount ==1:
        print("There is no mode for this dataset, values occur only once")
    else:
        modelist = [] # if more than one mode, add to list to print out
        for key, item in dictofcounts.items():
            if item ==maxcount: # get item from original list with most counts
                modelist.append(str(key))
        print("The mode(s) are:",' and '.join(modelist))
        return modelist

— 廷普约翰斯
source

2

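For a number to be a mode, it must occur more times than at least one other number in the list, and it must not be the only number in the list. So, I refactored @mathwizurd's answer (to use the set difference method) as follows: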

def mode(array):
    '''
    returns a set containing valid modes
    returns a message if no valid mode exists
      - when all numbers occur the same number of times
      - when only one number occurs in the list 
      - when no number occurs in the list 
    '''
    most = max(map(array.count, array)) if array else None
    mset = set(filter(lambda x: array.count(x) == most, array))
    return mset if set(array) - mset else "list does not have a mode!"

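These checks all evaluate to True, since none of these inputs has a valid mode:

mode([]) == "list does not have a mode!"
mode([1]) == "list does not have a mode!"
mode([1, 1]) == "list does not have a mode!"
mode([1, 1, 2, 2]) == "list does not have a mode!"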

— lifebalance
source

1

Why not just

def print_mode (thelist):
  counts = {}
  for item in thelist:
    counts [item] = counts.get (item, 0) + 1
  maxcount = 0
  maxitem = None
  for k, v in counts.items ():
    if v > maxcount:
      maxitem = k
      maxcount = v
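  if maxcount == 1:
    print("All values only appear once")
  elif list(counts.values()).count(maxcount) > 1:
    print("List has multiple modes")
  else:
    print("Mode of list:", maxitem)

It doesn't have a few of the error checks that it should have, but it will find the mode without importing any functions, and it will print a message if all values appear only once. It will also detect multiple items sharing the same maximum count, although it wasn't clear whether you wanted that.

— lxop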
source

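So what I'm trying to do is detect multiple items that have the same count, and then display all the items with that same count.

— bluelantern 2012

Have you actually tried this yourself? Extending my code here so that it prints all items with the same count is fairly straightforward.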

— lxop
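The extension discussed in these comments could look roughly like the sketch below, which keeps the dict-counting idea from the answer and collects every item that reaches the maximum count. The function name modes_or_message, the "List is empty" message, and the choice to return values rather than print them are assumptions made for illustration.

```python
def modes_or_message(thelist):
    # count occurrences without importing anything
    counts = {}
    for item in thelist:
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        return "List is empty"  # assumption: message for empty input
    maxcount = max(counts.values())
    if maxcount == 1:
        return "All values only appear once"
    # every item that shares the highest count, in first-seen order
    return [item for item, n in counts.items() if n == maxcount]


print(modes_or_message([1, 1, 2, 2, 3]))  # [1, 2]
print(modes_or_message([1, 2, 3]))        # All values only appear once
```

Returning either a list or a string mirrors the style of the thread. A caller that needs to tell the two cases apart reliably might prefer an exception or a None return instead.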

1

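This function returns the mode or modes of a list, however many there are, as well as the frequency of the mode or modes in the dataset. If there is no mode (i.e. all items occur only once), the function returns an error string. This is similar to A_nagpal's function above but is, in my humble opinion, more complete, and I think it is easier to understand for any Python novices (such as yours truly) reading this question.

def l_mode(list_in):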
    count_dict = {}
    for e in (list_in):   
        count = list_in.count(e)
        if e not in count_dict.keys():
            count_dict[e] = count
    max_count = 0 
    for key in count_dict: 
        if count_dict[key] >= max_count:
            max_count = count_dict[key]
    corr_keys = [] 
    for corr_key, count_value in count_dict.items():
        if count_dict[corr_key] == max_count:
            corr_keys.append(corr_key)
    if max_count == 1 and len(count_dict) != 1: 
        return 'There is no mode for this data set. All values occur only once.'
    else: 
        corr_keys = sorted(corr_keys)
        return corr_keys, max_count

— 用户名
source

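I say this only because you said "the function returns an error string". The line that reads return 'There is no mode for this data set. All values occur only once.' can be turned into an error message with a traceback: write the if condition, and on the next, indented line write raise ValueError('There is no mode for this data set. All values occur only once.'). Here is a list of the different types of errors you can raise.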
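The suggestion in this comment could be sketched as follows: the same "no mode" condition as in l_mode, but raising ValueError instead of returning a string. The name l_mode_strict, the single-pass counting, and the extra check for an empty list are assumptions added for illustration, not part of the original answer.

```python
def l_mode_strict(list_in):
    """Variant of l_mode that raises instead of returning a message string."""
    if not list_in:
        # assumption: treat empty input as an error as well
        raise ValueError('Cannot compute the mode of an empty list.')
    count_dict = {}
    for e in list_in:
        count_dict[e] = count_dict.get(e, 0) + 1
    max_count = max(count_dict.values())
    # same "no mode" condition as in the answer above
    if max_count == 1 and len(count_dict) != 1:
        raise ValueError('There is no mode for this data set. '
                         'All values occur only once.')
    modes = sorted(k for k, v in count_dict.items() if v == max_count)
    return modes, max_count


try:
    l_mode_strict([1, 2, 3])
except ValueError as exc:
    print(exc)  # There is no mode for this data set. All values occur only once.

print(l_mode_strict([3, 1, 3, 1, 2]))  # ([1, 3], 2)
```

With an exception, the caller cannot mistake the message for a result, and the traceback points at the call that supplied the data.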

1

This will return all modes:

def mode(numbers):
    largestCount = 0
    modes = []
    for x in numbers:
        if x in modes:
            continue
        count = numbers.count(x)
        if count > largestCount:
            del modes[:]
            modes.append(x)
            largestCount = count
        elif count == largestCount:
            modes.append(x)
    return modes

— Tim Orton
source

1

Simple code that finds the mode of a list without any imports:

nums = #your_list_goes_here
nums.sort()
counts = dict()
for i in nums:
    counts[i] = counts.get(i, 0) + 1
mode = max(counts, key=counts.get)

In the case of multiple modes, it should return the smallest one.

— baby_yoda
source

0

def mode(inp_list):
    sort_list = sorted(inp_list)
    dict1 = {}
    for i in sort_list:        
            count = sort_list.count(i)
            if i not in dict1.keys():
                dict1[i] = count

    maximum = 0 #no. of occurences
    max_key = -1 #element having the most occurences

    for key in dict1:
        if(dict1[key]>maximum):
            maximum = dict1[key]
            max_key = key 
        elif(dict1[key]==maximum):
            if(key<max_key):
                maximum = dict1[key]
                max_key = key

    return max_key

— Akshaynagpal
source

0

def mode(data):
    lst =[]
    hgh=0
    for i in range(len(data)):
        lst.append(data.count(data[i]))
    m= max(lst)
    ml = [x for x in data if data.count(x)==m ] #to find most frequent values
    mode = []
    for x in ml: #to remove duplicates of mode
        if x not in mode:
            mode.append(x)
    return mode
print(mode([1,2,2,2,2,7,7,5,5,5,5]))

— Venkata Prasanth T
source

0

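Here is a simple function that gets the first mode that occurs in a list. It builds a dictionary with the list elements as keys and their number of occurrences as values, and then reads the dictionary values to get the mode.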

def findMode(readList):
    numCount={}
    highestNum=0
    for i in readList:
        if i in numCount.keys(): numCount[i] += 1
        else: numCount[i] = 1
    for i in numCount.keys():
        if numCount[i] > highestNum:
            highestNum=numCount[i]
            mode=i
    if highestNum != 1: print(mode)
    elif highestNum == 1: print("All elements of list appear once.")

— 短信冯·德·坦
source

0

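If you want a clear approach that is useful for the classroom and uses only lists and dictionaries built by comprehension, you can do the following: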

def mode(my_list):
    # Form a new list with the unique elements
    unique_list = sorted(list(set(my_list)))
    # Create a comprehensive dictionary with the uniques and their count
    appearance = {a:my_list.count(a) for a in unique_list} 
    # Calculate max number of appearances
    max_app = max(appearance.values())
    # Return the elements of the dictionary that appear that # of times
    return {k: v for k, v in appearance.items() if v == max_app}

— 玛丽亚·弗朗西斯·加斯卡
source

0

#function to find mode
def mode(data):  
    modecnt=0
#for count of number appearing
    for i in range(len(data)):
        icount=data.count(data[i])
#for storing count of each number in list will be stored
        if icount>modecnt:
#the loop activates if current count if greater than the previous count 
            mode=data[i]
#here the mode of number is stored 
            modecnt=icount
#count of the appearance of number is stored
    return mode
print(mode(data1))

You should explain your answer with comments or more details.

— Michael

0

Here is how you can find the mean, median and mode of a list:

import numpy as np
from scipy import stats

#to take input
size = int(input())
numbers = list(map(int, input().split()))

print(np.mean(numbers))
print(np.median(numbers))
print(int(stats.mode(numbers)[0]))

— 潘卡
source

0

import numpy as np
def get_mode(xs):
    values, counts = np.unique(xs, return_counts=True)
    max_count_index = np.argmax(counts) #return the index with max value counts
    return values[max_count_index]
print(get_mode([1,7,2,5,3,3,8,3,2]))

— sim卡
source

0

For those looking for the smallest mode, e.g. in the case of a bimodal distribution, using numpy:

import numpy as np
mode = np.argmax(np.bincount(your_list))

— V3K3R
source

0

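The mode of a data set is the member that occurs most frequently in the set. If two members appear most often, with the same number of occurrences, then the data has two modes. This is called bimodal.

If there are more than two modes, the data is called multimodal. If all the members of the data set appear the same number of times, then the data set has no mode at all.

The following function modes() can be used to find the mode(s) in a given list of data: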

import numpy as np; import pandas as pd

def modes(arr):
    df = pd.DataFrame(arr, columns=['Values'])
    dat = pd.crosstab(df['Values'], columns=['Freq'])
    if len(np.unique((dat['Freq']))) > 1:
        mode = list(dat.index[np.array(dat['Freq'] == max(dat['Freq']))])
        return mode
    else:
        print("There is NO mode in the data set")

Output:

# For a list of numbers in x as
In [1]: x = [2, 3, 4, 5, 7, 9, 8, 12, 2, 1, 1, 1, 3, 3, 2, 6, 12, 3, 7, 8, 9, 7, 12, 10, 10, 11, 12, 2]
In [2]: modes(x)
Out[2]: [2, 3, 12]
# For a list of repeated numbers in y as
In [3]: y = [2, 2, 3, 3, 4, 4, 10, 10]
In [4]: modes(y)
There is NO mode in the data set
# For a list of strings/characters in z as
In [5]: z = ['a', 'b', 'b', 'b', 'e', 'e', 'e', 'd', 'g', 'g', 'c', 'g', 'g', 'a', 'a', 'c', 'a']
In [6]: modes(z)
Out[6]: ['a', 'g']

If we do not want to import numpy or pandas to call any function from these packages, then to get the same output, the modes() function can be written as:

def modes(arr):
    cnt = []
    for i in arr:
        cnt.append(arr.count(i))
    uniq_cnt = []
    for i in cnt:
        if i not in uniq_cnt:
            uniq_cnt.append(i)
    if len(uniq_cnt) > 1:
        m = []
        for i in list(range(len(cnt))):
            if cnt[i] == max(uniq_cnt):
                m.append(arr[i])
        mode = []
        for i in m:
            if i not in mode:
                mode.append(i)
        return mode
    else:
        print("There is NO mode in the data set")

— ub
source