How do I count unique values in a list?

126

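So I'm trying to make a program that asks the user for input and stores the values in an array/list.
Then, when a blank line is entered, it tells the user how many of those values are unique.
I'm building this for real-life reasons, not as a problem set.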

enter: happy
enter: rofl
enter: happy
enter: mpg8
enter: Cpp
enter: Cpp
enter:
There are 4 unique words!

My code is as follows:

# ask for input
ipta = raw_input("Word: ")

# create list 
uniquewords = [] 
counter = 0
uniquewords.append(ipta)

a = 0   # loop thingy
# while loop to ask for input and append in list
while ipta: 
  ipta = raw_input("Word: ")
  new_words.append(input1)
  counter = counter + 1

for p in uniquewords:

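...and that's about all I've gotten so far.
I'm not sure how to count the number of unique words in a list.
If someone could post the solution so I can learn from it, or at least show me how it would be done, that would be great. Thanks!

— Joel Aqu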

4

Could you fix the indentation in your code sample? It matters in Python!

— codebox

1

You've removed your code instead of editing it to make it readable! Having the code there would be very helpful.

1

@codebox Sorry, doing that now.

— Joel Aqu

244

Alternatively, refactor the code with collections.Counter:

from collections import Counter

words = ['a', 'b', 'c', 'a']

Counter(words).keys()    # the unique elements (same members as set(words))
Counter(words).values()  # how many times each element occurs

Output (Python 2, where the key order is arbitrary):

['a', 'c', 'b']
[2, 1, 1]

— Vidul

46

Doesn't answer Joel's question, but it's exactly what I was looking for, thanks!

— Huw Walters

Perfect. And bull's-eye. Thanks @Vidul

— Parag Tyagi

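Counter(words).values() is nice. Are we assuming the counts come in the order in which each word first appears in the list? I mean, I'm assuming the counts will be given for a, then b, then c, then d...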

— Monica Heddneck

Note: if you want this as a plain dictionary such as count_dict = {'a': 2, 'b': 1, 'c': 1}, you can do count_dict = dict(Counter(words).items())

— Peter
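
On the ordering question raised in the comments: Counter is a dict subclass, so on Python 3.7 and later its keys and values come back in the order each element was first seen. The following is a minimal sketch under that assumption; the sample list is taken from the question, and the variable names are only illustrative.

```python
from collections import Counter

words = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]
counts = Counter(words)

# On Python 3.7+ a Counter keeps keys in first-seen order,
# so keys and values line up with the order of first appearance.
unique_words = list(counts)          # ['happy', 'rofl', 'mpg8', 'Cpp']
frequencies = list(counts.values())  # [2, 1, 1, 2]
num_unique = len(counts)             # 4

print("There are %d unique words!" % num_unique)
```

On older interpreters (including Python 2) the order is not guaranteed, so pair each word with its count via counts.items() rather than relying on position.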

219

You can use a set to remove duplicates, then the len function to count the elements in the set:

len(set(new_words))

— codebox

37

import numpy as np
values, counts = np.unique(words, return_counts=True)

— James Hirschorn

16

Use a set:

words = ['a', 'b', 'c', 'a']
unique_words = set(words)             # == set(['a', 'b', 'c'])
unique_word_count = len(unique_words) # == 3

With this, your solution could be as simple as:

words = []
ipta = raw_input("Word: ")

while ipta:
  words.append(ipta)
  ipta = raw_input("Word: ")

unique_word_count = len(set(words))

print "There are %d unique words!" % unique_word_count

— Linus Thiel
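
The answers in this thread use Python 2 (raw_input and the print statement). As a rough Python 3 sketch of the same set-based idea, the helper names read_words and count_unique below are hypothetical, and the words come from any iterable of lines so the example runs without a terminal.

```python
def read_words(lines):
    """Collect stripped lines until the first blank one (hypothetical helper)."""
    words = []
    for line in lines:
        word = line.strip()
        if not word:
            break  # a blank line ends the input
        words.append(word)
    return words


def count_unique(words):
    """Number of distinct values: a set drops the duplicates."""
    return len(set(words))


# Sample input from the question, followed by a blank line.
sample = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp", "", "ignored"]
words = read_words(sample)
print("There are %d unique words!" % count_unique(words))  # There are 4 unique words!
```

For interactive use, something like read_words(iter(lambda: input("Word: "), "")) should behave like the raw_input loop above.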

6

aa="XXYYYSBAA"
bb=dict(zip(list(aa),[list(aa).count(i) for i in list(aa)]))
print(bb)
# output:
# {'X': 2, 'Y': 3, 'S': 1, 'B': 1, 'A': 2}

— MadJayhawk

1

Please explain how this differs from the other answers.
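
One difference worth noting: the zip/count version above calls list.count once per character, so it rescans the whole sequence every time. A single-pass alternative might look like the sketch below. The function name count_occurrences is an assumption for illustration, and the printed key order assumes Python 3.7 or later.

```python
def count_occurrences(items):
    """Build a value -> frequency dict in one pass over items."""
    counts = {}
    for item in items:
        # dict.get returns 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


bb = count_occurrences("XXYYYSBAA")
print(bb)       # {'X': 2, 'Y': 3, 'S': 1, 'B': 1, 'A': 2}
print(len(bb))  # 5 distinct characters
```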

4

For an ndarray, there is a numpy function called unique:

np.unique(array_name)

Example:

>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])

For a pandas Series, there is a method called value_counts():

Series_name.value_counts()

— username

1

ipta = raw_input("Word: ") ## asks for input
words = [] ## creates list
unique_words = set(words)

— username

1

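Although a set is the easiest way, you could also use a dictionary with some_dict.has_key(key) (Python 2 only; in Python 3 write key in some_dict) and populate the dictionary with only unique keys and values.

Assuming you have already populated words[] with the user's input, create a dictionary that maps each unique word in the list to a number: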

word_map = {}
i = 1
for j in range(len(words)):
    if not word_map.has_key(words[j]):
        word_map[words[j]] = i
        i += 1
num_unique_words = len(word_map) # or num_unique_words = i - 1, however you prefer

— JMB
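
A dictionary can also do the deduplication directly. The sketch below, assuming Python 3.7 or later (where dicts keep insertion order), uses dict.fromkeys to keep each word once in first-seen order and then rebuilds a word-to-number mapping like word_map above. The sample words are taken from the question.

```python
words = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]

# dict.fromkeys keeps one key per distinct word, in first-seen order.
unique_in_order = list(dict.fromkeys(words))
print(unique_in_order)       # ['happy', 'rofl', 'mpg8', 'Cpp']
print(len(unique_in_order))  # 4

# Map each unique word to a 1-based number, as in the loop above.
word_map = {word: i for i, word in enumerate(unique_in_order, start=1)}
```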

1

Another approach, using pandas:

import pandas as pd

LIST = ["a","a","c","a","a","v","d"]
counts,values = pd.Series(LIST).value_counts().values, pd.Series(LIST).value_counts().index
df_results = pd.DataFrame(list(zip(values,counts)),columns=["value","count"])

You can then export the results in any format you like.

— HazimoRa3d

1

How about:

import pandas as pd
#List with all words
words=[]

#Code for adding words
words.append('test')


#When Input equals blank:
pd.Series(words).nunique()

It returns how many unique values there are in the list.

— john_data

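Welcome to StackOverflow! It looks like this solution assumes the pandas framework. It would be better to mention that in the answer, since it may not be obvious to other users.

— Sergey Shubin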

0

The following should work. The lambda function filters out the duplicate words.

inputs=[]
input = raw_input("Word: ").strip()
while input:
    inputs.append(input)
    input = raw_input("Word: ").strip()
uniques=reduce(lambda x,y: ((y in x) and x) or x+[y], inputs, [])
print 'There are', len(uniques), 'unique words'

— John Wang
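
In Python 3, reduce is no longer a builtin and has to be imported from functools. A rough Python 3 sketch of the same fold, with the hypothetical helper name unique_in_order and the sample words from the question, could look like this:

```python
from functools import reduce


def unique_in_order(words):
    """Fold the words into a list, skipping any word already collected."""
    return reduce(
        lambda seen, word: seen if word in seen else seen + [word],
        words,
        [],
    )


inputs = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]
uniques = unique_in_order(inputs)
print("There are", len(uniques), "unique words")  # There are 4 unique words
```

Each step scans the accumulated list, so for long inputs len(set(inputs)) is the cheaper way to get just the count.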

0

I'd use a set myself, but here's yet another way:

uniquewords = []
while True:
    ipta = raw_input("Word: ")
    if ipta == "":
        break
    if ipta not in uniquewords:
        uniquewords.append(ipta)
print "There are", len(uniquewords), "unique words!"

— Nicola Musatti

0

ipta = raw_input("Word: ") ## asks for input
words = [] ## creates list

while ipta: ## while loop to ask for input and append in list
  words.append(ipta)  ## store the current word once
  ipta = raw_input("Word: ")  ## a blank line ends the loop without being stored
#Create a set, sets do not have repeats
unique_words = set(words)

print "There are " +  str(len(unique_words)) + " unique words!"

— Curious