How to compare two JSON objects with the same elements in a different order as equal?

99

How can I test whether two JSON objects are equal in Python, disregarding the order of lists?

For example ...

JSON document a:

{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}

JSON document b:

{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}

a and b should compare equal, even though the order of the "errors" lists is different.

— 佩特·弗里伯格

2

Duplicate of stackoverflow.com/questions/11141644/...

— user2085282

1

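Why not just decode them and compare? Or do you mean that the order of the "array" or list elements doesn't matter either?

— mgilson, 2014

@user2085282 That question is about a different problem.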

— user193661

2

Please forgive my naivety, but why? List elements have a specific order for a reason.

— ATOzTOA

1

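As noted in this answer, a JSON array is ordered, so strictly speaking these objects, which contain arrays with different sort orders, would not be equal. stackoverflow.com/a/7214312/18891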

— Eric Ness

141

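If you want two objects with the same elements but in a different order to compare equal, then the obvious thing to do is compare sorted copies of them; for instance, for the dictionaries represented by your JSON strings a and b: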

import json

a = json.loads("""
{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}
""")

b = json.loads("""
{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}
""")

>>> sorted(a.items()) == sorted(b.items())
False

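... but that doesn't work, because in each case the "errors" item of the top-level dict is a list with the same elements in a different order, and sorted() doesn't try to sort anything except the "top" level of an iterable.

To fix that, we can define an ordered function which will recursively sort any lists it finds (and convert dictionaries to lists of (key, value) pairs so that they're orderable):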

def ordered(obj):
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj

If we apply this function to a and b, the results compare equal:

>>> ordered(a) == ordered(b)
True
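The recursive sort above relies on list elements being mutually orderable, which fails in Python 3 when a list mixes types (for example a string next to a dict). One possible workaround, sketched here as an assumption rather than as part of the original answer, is to sort each list by a string key produced with json.dumps(..., sort_keys=True). The names canonical and json_equal_unordered are hypothetical.

```python
import json


def canonical(obj):
    """Return a copy of a JSON-like value with every list sorted recursively."""
    if isinstance(obj, dict):
        # Dict equality already ignores key order, so only the values are normalised.
        return {key: canonical(value) for key, value in obj.items()}
    if isinstance(obj, list):
        # Sort by the serialised form so that mixed types (str, dict, int, ...)
        # never have to be compared with each other directly.
        return sorted(
            (canonical(item) for item in obj),
            key=lambda item: json.dumps(item, sort_keys=True),
        )
    return obj


def json_equal_unordered(left, right):
    """Compare two decoded JSON values, ignoring the order of list elements."""
    return canonical(left) == canonical(right)


print(json_equal_unordered(["astr", {"adict": "something"}],
                           [{"adict": "something"}, "astr"]))
```

This treats every list as an unordered collection with duplicates counted, which is only appropriate when list order carries no meaning in your data. Values that are equal in Python but serialise differently, such as 1 and 1.0, may need extra care.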

— Zero Piraeus

1

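Thank you so much, Zero Piraeus. This is exactly the general solution I need. The only problem is that the code works only for Python 2.x, not Python 3. I get the following error: TypeError: unorderable types: dict() < dict(). Anyway, the solution is now clear. I will try to make it work for Python 3. Thanks a lot.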

1

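@HoussamHsm I meant to fix this to work with Python 3.x when you first mentioned the unorderable dicts problem, but somehow it got away from me. It now works in both 2.x and 3.x :-)

— Zero Piraeus, 2015

When there's a list like ['astr', {'adict': 'something'}], I get a TypeError when trying to sort it.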

— 震孝浩

1

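@Blairg23 You've misunderstood the question, which is about comparing JSON objects as equal when they contain lists whose elements are the same but in a different order, not about any supposed order of dictionaries.

— Zero Piraeus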

1

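@Blairg23 I agree that the question could be written more clearly (although if you look at the edit history, it's better than it started out). Re: dictionaries and order: yes, I know ;-)

— Zero Piraeus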

44

Another way could be to use the json.dumps(X, sort_keys=True) option:

import json
a, b = json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)
a == b # a normal string comparison

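This works for nested dictionaries and lists: keys are sorted at every nesting level, including in dictionaries inside lists, while the order of list elements is still compared as-is.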
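To make the scope of this approach concrete, here is a small sketch using only json.dumps from the standard library. The helper name same_serialization is hypothetical and is not part of the original answer.

```python
import json


def same_serialization(left, right):
    """Compare two JSON-like values by their key-sorted serialised form."""
    return json.dumps(left, sort_keys=True) == json.dumps(right, sort_keys=True)


# Key order is ignored at every nesting level, including inside lists.
nested_a = {"a": 1, "b": {"c": 2, "d": [{"x": 1, "y": 2}]}}
nested_b = {"b": {"d": [{"y": 2, "x": 1}], "c": 2}, "a": 1}
print(same_serialization(nested_a, nested_b))

# The order of list elements is still significant.
print(same_serialization({"foo": [3, 1, 2]}, {"foo": [2, 1, 3]}))
```

The first comparison should report equality and the second should not, so this fits cases where list order is meaningful rather than the reordered "errors" list in the question.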

— stpk

For {"error":"a"}, {"error":"b"} vs. {"error":"b"}, {"error":"a"}, it won't be able to sort the latter case into the former.

— 归

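@Blairg23 But what would you do if there are lists nested in the dict? You can't just compare the top-level dict and call it a day; that's not what this question is about.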

— stpk

3

This does not work if you have lists inside. For example, json.dumps({'foo': [3, 1, 2]}, sort_keys=True) == json.dumps({'foo': [2, 1, 3]}, sort_keys=True) evaluates to False.

— Danil '18

6

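@Danil And probably it shouldn't. Lists are an ordered structure, and if they differ only in order, they should be considered different. Maybe for your use case the order doesn't matter, but we shouldn't assume that.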

— stpk

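Because lists are ordered by index, they won't be re-sorted. In most situations, [0, 1] shouldn't equal [1, 0]. So this is a good solution for the normal case, but not for the question above. Still +1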

— 哈里森

18

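Decode them and compare them, as mgilson's comment suggests.

Order does not matter for dictionaries as long as the keys and values match. (Dictionary equality in Python does not depend on key order.)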

>>> {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
True

But order matters in lists; sorting will solve the problem for the lists.

>>> [1, 2] == [2, 1]
False
>>> [1, 2] == sorted([2, 1])
True

>>> import json
>>> a = '{"errors": [{"error": "invalid", "field": "email"}, {"error": "required", "field": "name"}], "success": false}'
>>> b = '{"errors": [{"error": "required", "field": "name"}, {"error": "invalid", "field": "email"}], "success": false}'
>>> a, b = json.loads(a), json.loads(b)
>>> # dicts are not orderable in Python 3, so sort by each dict's sorted items
>>> a['errors'].sort(key=lambda e: sorted(e.items()))
>>> b['errors'].sort(key=lambda e: sorted(e.items()))
>>> a == b
True

The above example will work for the JSON in the question. For a general solution, see Zero Piraeus's answer.

— falsetru

2

For the following two dicts, 'dictWithListsInValue' and 'reorderedDictWithReorderedListsInValue', which are simply reordered versions of each other

dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(sorted(a.items()) == sorted(b.items()))  # gives false

gave me the wrong result, i.e. False.

So I created my own custom ObjectComparator like this:

def my_list_cmp(list1, list2):
    if len(list1) != len(list2):
        return False

    for l in list1:
        found = False
        for m in list2:
            res = my_obj_cmp(l, m)
            if (res):
                found = True
                break

        if (not found):
            return False

    return True


def my_obj_cmp(obj1, obj2):
    if isinstance(obj1, list):
        if not isinstance(obj2, list):
            return False
        return my_list_cmp(obj1, obj2)
    elif isinstance(obj1, dict):
        if not isinstance(obj2, dict):
            return False
        if set(obj1.keys()) != set(obj2.keys()):
            return False
        # Recurse on every value; my_obj_cmp dispatches on the value's type,
        # so a list paired with a non-list compares unequal instead of raising.
        for k in obj1:
            if not my_obj_cmp(obj1[k], obj2[k]):
                return False
        return True
    else:
        return obj1 == obj2


dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(my_obj_cmp(a, b))  # gives true

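This gave me the correct expected output!

The logic is pretty simple:

If the objects are of type 'list', compare each item of the first list with the items of the second list until a match is found; if no match is found after going through the whole second list, 'found' stays False and the lists are reported as not equal.

Else, if the objects to be compared are of type 'dict', compare the values present for all the respective keys in both objects (a recursive comparison is performed).

Else, simply call obj1 == obj2. By default it works fine for strings and numbers, and for objects with __eq__() defined appropriately.

(Note that the algorithm can be further improved by removing the items already found in object2, so that the next item of object1 won't compare itself with items of object2 that have already been matched.)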
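The improvement mentioned in the note could look roughly like the sketch below. It is a separate, hypothetical function (unordered_equal), not the code from this answer: each matched item is removed from a working copy of the second list, so it cannot be matched twice and lists with different duplicate counts are not reported as equal.

```python
def unordered_equal(x, y):
    """Compare JSON-like values, ignoring list order but counting duplicates."""
    if isinstance(x, dict) and isinstance(y, dict):
        # Same key set, and every pair of values matches recursively.
        return x.keys() == y.keys() and all(unordered_equal(x[k], y[k]) for k in x)
    if isinstance(x, list) and isinstance(y, list):
        if len(x) != len(y):
            return False
        remaining = list(y)  # working copy, so the caller's list is not mutated
        for item in x:
            for i, candidate in enumerate(remaining):
                if unordered_equal(item, candidate):
                    del remaining[i]  # consume the match so it is not reused
                    break
            else:
                return False  # no unmatched candidate equals this item
        return True
    # Scalars, or a container paired with a different type.
    return x == y


print(unordered_equal([1, 1, 2], [1, 2, 2]))
print(unordered_equal({"A": [{"Y": 2}, {"X": [1, 2]}]}, {"A": [{"X": [2, 1]}, {"Y": 2}]}))
```

The list comparison is still quadratic in the worst case, which is usually acceptable for small JSON payloads.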

— NiksVij

Can you fix the indentation of the code?

— colidyre

@colidyre Is the indentation fine now?

— NiksVij

No, the issue is still there. After the function header, the block has to be indented too.

— colidyre

Yes. I re-edited it once more. I copy-pasted it into the IDE, and it works now.

— NiksVij

1

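You can write your own equals function:

Dictionaries are equal if: 1) all keys are equal, 2) all values are equal
Lists are equal if: all items are equal and in the same order
Primitives are equal if a == b

Because you're dealing with JSON, you'll have standard Python types: dict, list, etc., so you can do hard type checking, if type(obj) == dict:, etc.

Rough example (not tested):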

def json_equals(jsonA, jsonB):
    if type(jsonA) != type(jsonB):
        # not equal
        return False
    if type(jsonA) == dict:
        if len(jsonA) != len(jsonB):
            return False
        for keyA in jsonA:
            if keyA not in jsonB or not json_equals(jsonA[keyA], jsonB[keyA]):
                return False
        return True  # every key and value matched
    elif type(jsonA) == list:
        if len(jsonA) != len(jsonB):
            return False
        for itemA, itemB in zip(jsonA, jsonB):
            if not json_equals(itemA, itemB):
                return False
        return True  # every item matched, in the same order
    else:
        return jsonA == jsonB

— 高登豆

0

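For others who'd like to debug two JSON objects (usually there is a reference and a target), here is a solution you may use. It lists the "path" of each value in the reference that is different or missing in the target.

The level option selects how deep you would like to look.

The show_variables option can be turned on to show the relevant values.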

def compareJson(example_json, target_json, level=-1, show_variables=False):
  _different_variables = _parseJSON(example_json, target_json, level=level, show_variables=show_variables)
  return len(_different_variables) == 0, _different_variables

def _parseJSON(reference, target, path=[], level=-1, show_variables=False):  
  if level > 0 and len(path) == level:
    return []
  
  _different_variables = list()
  # the case that the inputs is a dict (i.e. json dict)  
  if isinstance(reference, dict):
    for _key in reference:      
      _path = path+[_key]
      try:
        _different_variables += _parseJSON(reference[_key], target[_key], _path, level, show_variables)
      except KeyError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(reference[_key])
        _different_variables.append(_record)
  # the case that the inputs is a list/tuple
  elif isinstance(reference, list) or isinstance(reference, tuple):
    for index, v in enumerate(reference):
      _path = path+[index]
      try:
        _target_v = target[index]
        _different_variables += _parseJSON(v, _target_v, _path, level, show_variables)
      except IndexError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(v)
        _different_variables.append(_record)
  # the actual comparison about the value, if they are not the same, record it
  elif reference != target:
    _record = ''.join(['[%s]'%str(p) for p in path])
    if show_variables:
      _record += ': %s <--> %s'%(str(reference), str(target))
    _different_variables.append(_record)

  return _different_variables

— 陈杰一