How do I compare two strings in Python?

Question 1

I have two strings like

string1="abc def ghi"

and

string2="def ghi abc"

How can I determine that these two strings are the same, without breaking up the words?

Question 2

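It seems the question is not about string equality but about set equality. You can compare the strings this way only by splitting them and converting the results to sets: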

s1 = 'abc def ghi'
s2 = 'def ghi abc'
set1 = set(s1.split(' '))
set2 = set(s2.split(' '))
print(set1 == set2)

The result will be

True
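
One caveat, offered as a small sketch rather than as part of the answer above: split(' ') splits on every single space, so repeated spaces produce empty strings, whereas split() with no argument collapses runs of whitespace. The helper name same_words is made up for illustration.

```python
def same_words(a, b):
    """Return True if both strings contain the same set of words.

    Hypothetical helper: uses split() with no argument so that repeated
    spaces, tabs, or newlines do not produce empty-string "words".
    """
    return set(a.split()) == set(b.split())

s1 = 'abc def ghi'
s2 = 'def  ghi abc'   # note the double space

print(set(s2.split(' ')))                        # includes an empty string ''
print(set(s1.split(' ')) == set(s2.split(' ')))  # False
print(same_words(s1, s2))                        # True
```

If the input may contain irregular spacing, the argument-free split() is probably the safer choice.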

Question 3

If you want to know whether the two strings are equal, you can simply do

print(string1 == string2)

But if you want to know whether they both contain the same characters, each occurring the same number of times, you can use collections.Counter, for example

>>> string1, string2 = "abc def ghi", "def ghi abc"
>>> from collections import Counter
>>> Counter(string1) == Counter(string2)
True
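
The same idea also works at the word level. As a sketch (the name same_word_counts is an assumption for illustration), counting words instead of characters distinguishes strings that merely share the same letters:

```python
from collections import Counter

def same_word_counts(a, b):
    """True if both strings contain the same words the same number of times."""
    return Counter(a.split()) == Counter(b.split())

print(Counter("abc def") == Counter("abd cef"))        # True: same characters
print(same_word_counts("abc def", "abd cef"))          # False: different words
print(same_word_counts("abc def ghi", "def ghi abc"))  # True
```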

Question 4

>>> s1="abc def ghi"
>>> s2="def ghi abc"
>>> s1 == s2  # For string comparison 
False
>>> sorted(list(s1)) == sorted(list(s2)) # For comparing if they have same characters. 
True
>>> sorted(list(s1))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> sorted(list(s2))
[' ', ' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

Question 5

Something like this:

if string1 == string2:
    print('they are the same')

Update: if you want to see whether each word of one string also occurs in the other string:

elem1 = string1.split()
elem2 = string2.split()

# Print every word of string1 that also appears in string2.
for item in elem1:
    if item in elem2:
        print(item)

Question 6

For this, you can use difflib from Python's standard library

from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

Then call similar() as

similar(string1, string2)

It returns a ratio between 0 and 1; compare it as ratio >= threshold to decide whether the strings match.
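
A minimal sketch of that threshold check follows. The helper name is_match and the default threshold of 0.8 are assumptions chosen for illustration; a suitable threshold depends on your data.

```python
from difflib import SequenceMatcher

def is_match(a, b, threshold=0.8):
    """Hypothetical helper: True if the similarity ratio reaches threshold."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(is_match("Since 1958.", "Since 1958"))  # True, the ratio is about 0.95
print(is_match("abc", "xyz"))                 # False, the ratio is 0.0
```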

Question 7

Equality by direct comparison:

string1 = "sample"
string2 = "sample"

if string1 == string2 :
    print("Strings are equal with text : ", string1," & " ,string2)
else :
    print ("Strings are not equal")

Equality of word sets:

string1 = 'abc def ghi'
string2 = 'def ghi abc'

set1 = set(string1.split(' '))
set2 = set(string2.split(' '))

print(set1 == set2)

# Branch on the set comparison; the strings themselves differ in word order.
if set1 == set2:
    print("Strings contain the same words : ", string1, " & ", string2)
else:
    print("Strings do not contain the same words")

Question 8

I'll offer several solutions, and you can choose the one that meets your needs:

1) If you are concerned only with the characters, i.e., the same characters with the same frequency in both strings, then use:

''.join(sorted(string1)).strip() == ''.join(sorted(string2)).strip()

2) If you are also concerned with the number of spaces (whitespace characters) in both strings, then simply use the following snippet:

sorted(string1) == sorted(string2)

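3) If you care about the words but not their order, and want to check whether both strings have the same word frequencies regardless of where the words occur, then you can use: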

sorted(string1.split()) == sorted(string2.split())

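4) Extending the above, if you are not concerned with frequency counts and only need to make sure both strings contain the same set of words, then you can use: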

set(string1.split()) == set(string2.split())
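
To see how the four checks diverge, here is a small sketch that runs all of them on the same pair of strings. The function name compare_all and the dictionary keys are made up for illustration.

```python
def compare_all(string1, string2):
    """Run the four comparisons above and return their results by name."""
    return {
        'chars_ignoring_spaces':
            ''.join(sorted(string1)).strip() == ''.join(sorted(string2)).strip(),
        'chars_with_spaces': sorted(string1) == sorted(string2),
        'word_counts': sorted(string1.split()) == sorted(string2.split()),
        'word_set': set(string1.split()) == set(string2.split()),
    }

print(compare_all('abc def ghi', 'def ghi abc'))  # all four are True
print(compare_all('abc abc def', 'def def abc'))  # only 'word_set' is True
print(compare_all('abc def', 'abcdef'))           # only 'chars_ignoring_spaces' is True
```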

Question 9

If you just need to check whether the two strings are exactly the same,

text1 = 'apple'

text2 = 'apple'

text1 == text2

The result will be

True

If you need the matching percentage,

import difflib

text1 = 'Since 1958.'

text2 = 'Since 1958'

output = str(int(difflib.SequenceMatcher(None, text1, text2).ratio()*100))

The matching percentage output will be

'95'

Question 10

I think difflib is a good library for this job

>>> import difflib
>>> diff = difflib.Differ()
>>> a = 'he is going home'
>>> b = 'he is goes home'
>>> list(diff.compare(a, b))
['  h', '  e', '   ', '  i', '  s', '   ', '  g', '  o', '+ e', '+ s', '- i', '- n', '- g', '   ', '  h', '  o', '  m', '  e']
>>> list(diff.compare(a.split(), b.split()))
['  he', '  is', '- going', '+ goes', '  home']

Question 11

Open both files, then compare them by splitting their contents into words:

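log_file_A = 'file_A.txt'
log_file_B = 'file_B.txt'

# Read each file in full; the with-block closes the file afterwards.
with open(log_file_A, 'r') as f:
    read_A = f.read()
print(read_A)

with open(log_file_B, 'r') as f:
    read_B = f.read()
print(read_B)

# Build one word set per file (split on single spaces) and compare them.
File_A_set = set(read_A.split(' '))
File_B_set = set(read_B.split(' '))
print(File_A_set == File_B_set)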

Question 12

If you want a very simple answer:

s_1 = "abc def ghi"
s_2 = "def ghi abc"
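# Note: this only checks that every character of s_1 occurs somewhere in s_2;
# it does not compare order, counts, or the characters of s_2.
flag = 0
for i in s_1:
    if i not in s_2:
        flag = 1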
if flag == 0:
    print("a == b")
else:
    print("a != b")

Question 13

Try converting both strings to upper case or lower case. Then you can use the == comparison operator.
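
As a sketch of that suggestion: lower() is enough for plain ASCII text, while str.casefold() (Python 3.3+) is the more aggressive option intended for caseless matching. The helper name equal_ignoring_case is made up for illustration.

```python
def equal_ignoring_case(a, b):
    """Compare two strings without regard to letter case."""
    return a.casefold() == b.casefold()

print("Hello World" == "hello world")                  # False
print("Hello World".lower() == "hello world".lower())  # True
print(equal_ignoring_case("Straße", "STRASSE"))        # True
print("Straße".lower() == "STRASSE".lower())           # False
```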

Question 14

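This is a very basic example, but after the logical comparison (==) or string1.lower() == string2.lower(), it may be useful to try some basic metrics of the distance between the two strings.

You can find examples for these and other metrics all over the place; you can also try the fuzzywuzzy package (https://github.com/seatgeek/fuzzywuzzy).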

import Levenshtein  # third-party package, not part of the standard library
import difflib

print(Levenshtein.ratio('String1', 'String2'))
print(difflib.SequenceMatcher(None, 'String1', 'String2').ratio())
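
If installing a third-party package is not an option, the underlying edit distance can be sketched in pure Python. Note that this computes the raw Levenshtein distance (a count of edits), not the normalized ratio printed above; the function name levenshtein_distance is an assumption for illustration.

```python
def levenshtein_distance(a, b):
    """Number of single-character insertions, deletions, or substitutions
    needed to turn a into b (classic dynamic-programming formulation)."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

print(levenshtein_distance('String1', 'String2'))  # 1
print(levenshtein_distance('kitten', 'sitting'))   # 3
```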

Question 15

You can use a simple loop to check whether two strings are equal. Ideally, though, you would just use something like return s1 == s2

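def strings_equal(s1, s2):
    # Strings of different lengths can never be equal.
    if len(s1) != len(s2):
        return False
    # Compare character by character; stop at the first mismatch.
    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            return False
    return True

s1 = 'hello'
s2 = 'hello'
print(strings_equal(s1, s2))  # True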