Comparing two collections for equality irrespective of the order of items in them

162

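I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.

I've read the other thread about Enumerable.SequenceEqual, but it's not exactly what I'm looking for.

In my case, two collections would be equal if they both contain the same items (no matter the order).

Example: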

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1 == collection2; // true

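What I usually do is loop through each item of one collection and see if it exists in the other collection, then loop through each item of the other collection and see if it exists in the first collection. (I start by comparing the lengths.)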

if (collection1.Count != collection2.Count)
    return false; // the collections are not equal

foreach (Item item in collection1)
{
    if (!collection2.Contains(item))
        return false; // the collections are not equal
}

foreach (Item item in collection2)
{
    if (!collection1.Contains(item))
        return false; // the collections are not equal
}

return true; // the collections are equal

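However, this is not entirely correct, and it's probably not the most efficient way to compare two collections for equality.

An example I can think of where it would give the wrong result: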

collection1 = {1, 2, 3, 3, 4}
collection2 = {1, 2, 2, 3, 4}

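These would be considered equal by my implementation. Should I just count the number of times each item is found and make sure the counts are equal in both collections?

The examples are in some sort of C# (let's call it pseudo-C#), but give your answer in whatever language you wish; it does not matter.

Note: I used integers in the examples for simplicity, but I want to be able to use reference-type objects too (they do not behave correctly as keys because only the reference of the object is compared, not the content).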
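The note about reference types can be illustrated with a minimal sketch. It assumes a hypothetical `Person` class and a hypothetical `PersonNameComparer` (neither appears in the thread) and shows that a dictionary compares such keys by reference by default, but by content once an `IEqualityComparer<T>` is supplied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference type, used only for illustration.
public sealed class Person
{
    public string Name { get; }
    public Person(string name) { Name = name; }
}

// Hypothetical comparer: two Person objects are equal when their names match.
public sealed class PersonNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name;
    }

    public int GetHashCode(Person p) => p.Name == null ? 0 : p.Name.GetHashCode();
}

public static class KeyDemo
{
    // Lookup with a content-based comparer.
    public static bool ContainsByContent()
    {
        var counts = new Dictionary<Person, int>(new PersonNameComparer());
        counts[new Person("Ann")] = 1;
        // A different instance with the same name finds the same entry.
        return counts.ContainsKey(new Person("Ann"));
    }

    // Lookup with the default comparer.
    public static bool ContainsByReference()
    {
        var counts = new Dictionary<Person, int>();
        counts[new Person("Ann")] = 1;
        // Person does not override Equals/GetHashCode, so only references are compared.
        return counts.ContainsKey(new Person("Ann"));
    }
}
```

Overriding `Equals` and `GetHashCode` on the type itself would be an alternative to passing a comparer.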

— mbillard

1

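What about algorithms? All the answers compare something, generic lists, LINQ and so on. Really, did we promise someone that we would never use an algorithm like old-fashioned programmers?

— Nuri YILMAZ, 2012

You are not checking for equality, you are checking for equivalence. It's nitpicky but an important distinction. And from a long time ago. This is a good Q+A.

— CAD bloke

You might be interested in this post, which discusses a tuned version of the dictionary-based approach described below. One issue with the simplest dictionary approaches is that they don't handle nulls properly, because .NET's Dictionary class doesn't permit null keys.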

— ChaseMedallion '16

112

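It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent

Remarks

Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.

Using Reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than the existing answers, since it takes nulls into account, implements IEqualityComparer, and has some efficiency and edge-case checks. Plus, it's Microsoft :)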

public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    private readonly IEqualityComparer<T> m_comparer;
    public MultiSetComparer(IEqualityComparer<T> comparer = null)
    {
        m_comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (first == null)
            return second == null;

        if (second == null)
            return false;

        if (ReferenceEquals(first, second))
            return true;

        if (first is ICollection<T> firstCollection && second is ICollection<T> secondCollection)
        {
            if (firstCollection.Count != secondCollection.Count)
                return false;

            if (firstCollection.Count == 0)
                return true;
        }

        return !HaveMismatchedElement(first, second);
    }

    private bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
    {
        int firstNullCount;
        int secondNullCount;

        var firstElementCounts = GetElementCounts(first, out firstNullCount);
        var secondElementCounts = GetElementCounts(second, out secondNullCount);

        if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
            return true;

        foreach (var kvp in firstElementCounts)
        {
            var firstElementCount = kvp.Value;
            int secondElementCount;
            secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);

            if (firstElementCount != secondElementCount)
                return true;
        }

        return false;
    }

    private Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
    {
        var dictionary = new Dictionary<T, int>(m_comparer);
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

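    public int GetHashCode(IEnumerable<T> enumerable)
    {
        if (enumerable == null) throw new ArgumentNullException(nameof(enumerable));

        // Sum the element hashes so the result does not depend on order and
        // stays consistent with m_comparer (OrderBy would require T to be comparable).
        int hash = 17;

        foreach (T val in enumerable)
            hash = unchecked(hash + (val == null ? 42 : m_comparer.GetHashCode(val)));

        return hash;
    }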
}

Sample usage:

var set = new HashSet<IEnumerable<int>>(new[] {new[]{1,2,3}}, new MultiSetComparer<int>());
Console.WriteLine(set.Contains(new [] {3,2,1})); //true
Console.WriteLine(set.Contains(new [] {1, 2, 3, 3})); //false

Or, if you just want to compare two collections directly:

var comp = new MultiSetComparer<string>();
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","c","b"})); //true
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","b"})); //false

Finally, you can use an equality comparer of your choice:

var strcomp = new MultiSetComparer<string>(StringComparer.OrdinalIgnoreCase);
Console.WriteLine(strcomp.Equals(new[] {"a", "b"}, new []{"B", "A"})); //true

— Ohad Schneider

7

I'm not 100% sure, but I think your answer violates Microsoft's terms of use against reverse engineering.

— Ian Dallas

1

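Hello Ohad, please read the following long debate on the topic, stackoverflow.com/questions/371328/… If you change an object's hash code while it is in a hash set, it will break the hash set's proper operation and may cause an exception. The rule is as follows: if two objects are equal, they must have the same hash code. If two objects have the same hash code, they do not have to be equal. The hash code must stay the same for the object's entire lifetime! That is why you implement ICompareable and IEqualrity.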

— James Roeiter

2

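@JamesRoeiter Perhaps my comment was misleading. When a dictionary encounters a hash code it already contains, it checks for actual equality with an EqualityComparer (either the one you supplied or EqualityComparer.Default; you can check Reflector or the reference source to verify this). True, if objects change (and specifically their hash code changes) while this method is running, the results are unexpected, but that just means this method is not thread safe in this context.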

— Ohad Schneider

1

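@JamesRoeiter Suppose x and y are two objects we want to compare. If they have different hash codes, we know they are different (because equal items have equal hash codes), and the above implementation is correct. If they have the same hash code, the dictionary implementation will check for actual equality using the specified EqualityComparer (or EqualityComparer.Default if none was specified), and again the implementation is correct.

— Ohad Schneider, 2013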

1

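@CADbloke The method has to be named Equals because of the IEqualityComparer<T> interface. What you should be looking at is the name of the comparer itself. In this case it is MultiSetComparer, which makes sense.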

— Ohad Schneider

98

A simple and fairly efficient solution is to sort both collections and then compare them for equality:

bool equal = collection1.OrderBy(i => i).SequenceEqual(
                 collection2.OrderBy(i => i));

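This algorithm is O(N*logN), while your solution above is O(N^2).

If the collections have certain properties, you may be able to implement a faster solution. For example, if both of your collections are hash sets, they cannot contain duplicates. Also, checking whether a hash set contains some element is very fast. In that case an algorithm similar to yours would likely be fastest.

— Sani Singh Huttunen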

1

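You just need to add a using System.Linq; first to make it work.

— Junior Mayhé, 2010

If this code is inside a loop and collection1 gets updated while collection2 remains untouched, note that even when both collections have the same objects, the debugger will show false for this "equal" variable.

— Junior Mayhé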

5

@Chaulky - I believe the OrderBy is needed. See: dotnetfiddle.net/jA8iwE

— Brett

Which is the other answer referred to as "above"? Possibly stackoverflow.com/a/50465/3195477?

— UuDdLrLrSs

32

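Create a dictionary "dict" and then, for each member in the first collection, do dict[member]++;

Then loop over the second collection in the same way, but for each member do dict[member]--.

At the end, loop over all of the members in the dictionary and check that every count is zero: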

    private bool SetEqual (List<int> left, List<int> right) {

        if (left.Count != right.Count)
            return false;

        Dictionary<int, int> dict = new Dictionary<int, int>();

        foreach (int member in left) {
            if (dict.ContainsKey(member) == false)
                dict[member] = 1;
            else
                dict[member]++;
        }

        foreach (int member in right) {
            if (dict.ContainsKey(member) == false)
                return false;
            else
                dict[member]--;
        }

        foreach (KeyValuePair<int, int> kvp in dict) {
            if (kvp.Value != 0)
                return false;
        }

        return true;

    }

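Edit: As far as I can tell, this is of the same order as the most efficient algorithm. This algorithm is O(N), assuming that the Dictionary uses O(1) lookups.

— Daniel Jennings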

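This is almost what I want. However, I would like to be able to do this even if I'm not using integers. I would like to use reference objects, but they don't behave properly as keys in dictionaries.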

— mbillard

Mono, your question is moot if your items are not comparable. If they cannot be used as keys in a Dictionary, there is no solution available.

— Skolima

1

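I think Mono meant that the keys are not sortable. But Daniel's solution is clearly intended to be implemented with a hash table, not a tree, and will work as long as there is an equivalence test and a hash function.

— erickson

Certainly helpful, but not accepted since it is missing an important point (which I cover in my answer).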

— mbillard

1

FWIW, you can simplify the last foreach loop and return statement with this: return dict.All(kvp => kvp.Value == 0);

— Tyson Williams

18

Here is my (heavily influenced by D. Jennings) generic implementation of the comparison method (in C#):

/// <summary>
/// Represents a service used to compare two collections for equality.
/// </summary>
/// <typeparam name="T">The type of the items in the collections.</typeparam>
public class CollectionComparer<T>
{
    /// <summary>
    /// Compares the content of two collections for equality.
    /// </summary>
    /// <param name="foo">The first collection.</param>
    /// <param name="bar">The second collection.</param>
    /// <returns>True if both collections have the same content, false otherwise.</returns>
    public bool Execute(ICollection<T> foo, ICollection<T> bar)
    {
        // Declare a dictionary to count the occurrence of the items in the collection
        Dictionary<T, int> itemCounts = new Dictionary<T,int>();

        // Increase the count for each occurrence of the item in the first collection
        foreach (T item in foo)
        {
            if (itemCounts.ContainsKey(item))
            {
                itemCounts[item]++;
            }
            else
            {
                itemCounts[item] = 1;
            }
        }

        // Wrap the keys in a searchable list
        List<T> keys = new List<T>(itemCounts.Keys);

        // Decrease the count for each occurrence of the item in the second collection
        foreach (T item in bar)
        {
            // Try to find a key for the item
            // The keys of a dictionary are compared by reference, so we have to
            // find the original key that is equivalent to the "item"
            // You may want to override ".Equals" to define what it means for
            // two "T" objects to be equal
            T key = keys.Find(
                delegate(T listKey)
                {
                    return listKey.Equals(item);
                });

            // Check if a key was found
            if(key != null)
            {
                itemCounts[key]--;
            }
            else
            {
                // There was no occurrence of this item in the first collection, thus the collections are not equal
                return false;
            }
        }

        // The count of each item should be 0 if the contents of the collections are equal
        foreach (int value in itemCounts.Values)
        {
            if (value != 0)
            {
                return false;
            }
        }

        // The collections are equal
        return true;
    }
}

— mbillard

12

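Nice job, but note: 1. In contrast to Daniel Jennings's solution, this is not O(N) but rather O(N^2), because of the Find call inside the foreach loop over the bar collection; 2. You can generalize the method to accept IEnumerable<T> instead of ICollection<T> with no further modification to the code.

— Ohad Schneider, 2010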
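Both points in the comment above can be sketched as follows. This is not the answerer's code: `CountingComparison` and `HaveSameItems` are hypothetical names, the list scan is replaced by dictionary lookups through an optional `IEqualityComparer<T>`, and the sketch assumes neither sequence contains null elements (Dictionary rejects null keys).

```csharp
using System;
using System.Collections.Generic;

public static class CountingComparison
{
    // Assumption: no null elements in either sequence.
    public static bool HaveSameItems<T>(
        IEnumerable<T> foo, IEnumerable<T> bar, IEqualityComparer<T> comparer = null)
    {
        var itemCounts = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);

        // Count each item of the first sequence.
        foreach (T item in foo)
        {
            itemCounts.TryGetValue(item, out int count); // count is 0 when the key is absent
            itemCounts[item] = count + 1;
        }

        // Consume the counts with the second sequence; one hash lookup per item.
        foreach (T item in bar)
        {
            if (!itemCounts.TryGetValue(item, out int count) || count == 0)
                return false; // item missing from, or over-represented relative to, the first sequence
            itemCounts[item] = count - 1;
        }

        // Any leftover count means the first sequence had extra items.
        foreach (int remaining in itemCounts.Values)
            if (remaining != 0) return false;

        return true;
    }
}
```

With hash lookups assumed to be O(1), the whole comparison is expected to be linear in the total number of items.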

"The keys of a dictionary are compared by reference, so we have to find the original key that is equivalent to the "item"" - This is not true. The algorithm is based on a wrong assumption and, while it works, it is terribly inefficient.

— Antonín Lejsek

10

You could use a HashSet. Look at the SetEquals method.
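A minimal sketch of that suggestion, assuming a hypothetical helper named `SameDistinctItems`. Note that `HashSet<T>.SetEquals` ignores both order and duplicates, so this only answers the question for collections without repeated items.

```csharp
using System;
using System.Collections.Generic;

public static class SetEqualsDemo
{
    public static bool SameDistinctItems<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Building the set drops duplicates from 'first';
        // SetEquals ignores order and duplicates in 'second'.
        return new HashSet<T>(first).SetEquals(second);
    }
}
```

For example, `SameDistinctItems(new[] { 1, 1, 2 }, new[] { 2, 2, 1 })` is true even though the item counts differ.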

— Joel Gauvreau

2

Of course, using a HashSet assumes no duplicates, but if that is the case, HashSet is the best way to go.

— Mark Cidade'Oct

7

If you use Shouldly, you can use ShouldAllBe with Contains.

collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};

collection1.ShouldAllBe(item=>collection2.Contains(item)); // true

Finally, you can write an extension.

public static class ShouldlyIEnumerableExtensions
{
    public static void ShouldEquivalentTo<T>(this IEnumerable<T> list, IEnumerable<T> equivalent)
    {
        list.ShouldAllBe(l => equivalent.Contains(l));
    }
}

Update

There is an optional parameter on the ShouldBe method.

collection1.ShouldBe(collection2, ignoreOrder: true); // true

— Pier-Lionel Sgard

1

I just found that the latest version has a bool ignoreOrder parameter on the ShouldBe method.

— Pier-Lionel Sgard

5

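Edit: I realized as soon as I posted that this really only works for sets; it will not properly deal with collections that have duplicate items. For example, { 1, 1, 2 } and { 2, 2, 1 } will be considered equal from this algorithm's perspective. If your collections are sets (or their equality can be measured that way), however, I hope you find the below useful.

The solution I use is: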

return c1.Count == c2.Count && c1.Intersect(c2).Count() == c1.Count;

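LINQ does the dictionary work under the covers, so this is also O(N). (Note that it is O(1) if the collections are not the same size.)

I did a sanity check using the "SetEqual" method suggested by Daniel, the OrderBy/SequenceEquals method suggested by Igor, and my suggestion. The results are below, showing O(N*LogN) for Igor's and O(N) for mine and Daniel's.

I think the simplicity of the LINQ Intersect code makes it the preferable solution.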

__Test Latency(ms)__
N, SetEquals, OrderBy, Intersect    
1024, 0, 0, 0    
2048, 0, 0, 0    
4096, 31.2468, 0, 0    
8192, 62.4936, 0, 0    
16384, 156.234, 15.6234, 0    
32768, 312.468, 15.6234, 46.8702    
65536, 640.5594, 46.8702, 31.2468    
131072, 1312.3656, 93.7404, 203.1042    
262144, 3765.2394, 187.4808, 187.4808    
524288, 5718.1644, 374.9616, 406.2084    
1048576, 11420.7054, 734.2998, 718.6764    
2097152, 35090.1564, 1515.4698, 1484.223

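The only issue with this code is that it only works when comparing value types or comparing the pointers to reference types. I could have two different instances of the same object in the collections, so I need to be able to specify how to compare each. Can you pass a comparison delegate to the Intersect method?

— mbillard, 2009

Sure, you can pass a comparer delegate. But note the limitation regarding sets that I added above, which puts a significant limit on its applicability.

The Intersect method returns a distinct collection. Given a = {1,1,2} and b = {2,2,1}, a.Intersect(b).Count() != a.Count, which causes your expression to correctly return false: {1,2}.Count != {1,1,2}.Count. See link. (Note that both sides are made distinct before the comparison.)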

— Griffin
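The behaviour described in the comment above can be checked with a short sketch that wraps the answer's expression in a hypothetical `IntersectEquals` method. Because `Intersect` yields distinct items, the expression returns false whenever the first collection contains a duplicate, including when both collections are identical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // The expression from the answer, unchanged apart from the wrapper method.
    public static bool IntersectEquals(ICollection<int> c1, ICollection<int> c2)
    {
        return c1.Count == c2.Count && c1.Intersect(c2).Count() == c1.Count;
    }
}
```

So the expression is a reasonable fit only when the inputs are known to be duplicate-free.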

5

In the case of no repeats and no order, the following EqualityComparer can be used to allow collections as dictionary keys:

public class SetComparer<T> : IEqualityComparer<IEnumerable<T>> 
where T:IComparable<T>
{
    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (first == second)
            return true;
        if ((first == null) || (second == null))
            return false;
        return first.ToHashSet().SetEquals(second);
    }

    public int GetHashCode(IEnumerable<T> enumerable)
    {
        int hash = 17;

        foreach (T val in enumerable.OrderBy(x => x))
            hash = hash * 23 + val.GetHashCode();

        return hash;
    }
}

Here is the ToHashSet() implementation I used. The hash code algorithm comes from Effective Java (by way of Jon Skeet).

— Ohad Schneider

What is the point of Serializable on the comparer class? :o Also, you could change the input to ISet<T> to express that it is meant for sets (i.e. no duplicates).

— nawfal

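@nawfal Thanks, I don't know what I was thinking when I marked it Serializable... As for ISet, the idea here was to treat the IEnumerable as a set (because you got an IEnumerable to begin with), though considering the 0 upvotes in over 5 years, that may not have been the sharpest idea :P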

— Ohad Schneider

4

static bool SetsContainSameElements<T>(IEnumerable<T> set1, IEnumerable<T> set2) {
    var setXOR = new HashSet<T>(set1);
    setXOR.SymmetricExceptWith(set2);
    return (setXOR.Count == 0);
}

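This solution requires .NET 3.5 and the System.Collections.Generic namespace. According to Microsoft, SymmetricExceptWith is an O(n + m) operation, where n is the number of elements in the first set and m is the number of elements in the second. You could always add an equality comparer to this function if necessary.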

— palswim

3

Why not use .Except()

// Create the IEnumerable data sources.
string[] names1 = System.IO.File.ReadAllLines(@"../../../names1.txt");
string[] names2 = System.IO.File.ReadAllLines(@"../../../names2.txt");
// Create the query. Note that method syntax must be used here.
IEnumerable<string> differenceQuery =   names1.Except(names2);
// Execute the query.
Console.WriteLine("The following lines are in names1.txt but not names2.txt");
foreach (string s in differenceQuery)
     Console.WriteLine(s);

http://msdn.microsoft.com/en-us/library/bb397894.aspx

— Korayem

2

Except won't work for counting duplicate items. It will return true for the sets {1,2,2} and {1,1,2}.

— Cristian Diaconescu

@CristiDiaconescu You could do a ".Distinct()" first to remove any duplicates.

— Korayem

The OP is asking for [1, 1, 2] != [1, 2, 2]. Using Distinct would make them look equal.

— Cristian Diaconescu
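The exchange above can be reproduced with a small sketch. `ExceptSaysEqual` is a hypothetical helper, not code from the answer; it treats two sequences as equal when neither has an item the other lacks, which is what an Except-based check amounts to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Except is a set difference, so duplicate counts are not taken into account.
    public static bool ExceptSaysEqual<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return !a.Except(b).Any() && !b.Except(a).Any();
    }
}
```

For `{1,2,2}` and `{1,1,2}` both differences are empty, so the check reports equality even though the item counts differ.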

2

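Somewhat of a duplicate post, but check out my solution for comparing collections. It's pretty simple:

This will perform an equality comparison regardless of order: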

var list1 = new[] { "Bill", "Bob", "Sally" };
var list2 = new[] { "Bob", "Bill", "Sally" };
bool isequal = list1.Compare(list2).IsSame;

This will check to see if items were added / removed:

var list1 = new[] { "Billy", "Bob" };
var list2 = new[] { "Bob", "Sally" };
var diff = list1.Compare(list2);
var onlyinlist1 = diff.Removed; //Billy
var onlyinlist2 = diff.Added;   //Sally
var inbothlists = diff.Equal;   //Bob

This will see which items in the dictionary changed:

var original = new Dictionary<int, string>() { { 1, "a" }, { 2, "b" } };
var changed = new Dictionary<int, string>() { { 1, "aaa" }, { 2, "b" } };
var diff = original.Compare(changed, (x, y) => x.Value == y.Value, (x, y) => x.Value == y.Value);
foreach (var item in diff.Different)
  Console.Write("{0} changed to {1}", item.Key.Value, item.Value.Value);
//Will output: a changed to aaa

Original post here.

— 用户名

1

erickson is almost right: since you want to match on counts of duplicates, you want a Bag. In Java, this looks something like:

(new HashBag(collection1)).equals(new HashBag(collection2))

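I'm sure C# has a built-in Set implementation. I would use that first; if performance is a problem, you could always use a different Set implementation, but use the same Set interface.

— James A. Rosen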

1

In case it is useful to anyone, here is my extension-method variant of ohadsc's answer

static public class EnumerableExtensions 
{
    static public bool IsEquivalentTo<T>(this IEnumerable<T> first, IEnumerable<T> second)
    {
        if ((first == null) != (second == null))
            return false;

        if (!object.ReferenceEquals(first, second) && (first != null))
        {
            if (first.Count() != second.Count())
                return false;

            if ((first.Count() != 0) && HaveMismatchedElement<T>(first, second))
                return false;
        }

        return true;
    }

    private static bool HaveMismatchedElement<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        int firstCount;
        int secondCount;

        var firstElementCounts = GetElementCounts<T>(first, out firstCount);
        var secondElementCounts = GetElementCounts<T>(second, out secondCount);

        if (firstCount != secondCount)
            return true;

        foreach (var kvp in firstElementCounts)
        {
            firstCount = kvp.Value;
            secondElementCounts.TryGetValue(kvp.Key, out secondCount);

            if (firstCount != secondCount)
                return true;
        }

        return false;
    }

    private static Dictionary<T, int> GetElementCounts<T>(IEnumerable<T> enumerable, out int nullCount)
    {
        var dictionary = new Dictionary<T, int>();
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

    static private int GetHashCode<T>(IEnumerable<T> enumerable)
    {
        int hash = 17;

        foreach (T val in enumerable.OrderBy(x => x))
            hash = hash * 23 + val.GetHashCode();

        return hash;
    }
}

— Eric J.

Any idea how well this performs?

— nawfal

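I only use this for small collections, so I have not thought about Big-O complexity or done any benchmarking. HaveMismatchedElements alone is O(M*N), so it may not perform well for large collections.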

— Eric J.

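If the IEnumerable<T>s are queries, then calling Count() is not a good idea. The approach in Ohad's original answer, checking whether they are ICollection<T>, is the better idea.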

— nawfal

1

Here is a solution which is an improvement over this one.

public static bool HasSameElementsAs<T>(
        this IEnumerable<T> first, 
        IEnumerable<T> second, 
        IEqualityComparer<T> comparer = null)
    {
        var firstMap = first
            .GroupBy(x => x, comparer)
            .ToDictionary(x => x.Key, x => x.Count(), comparer);

        var secondMap = second
            .GroupBy(x => x, comparer)
            .ToDictionary(x => x.Key, x => x.Count(), comparer);

        if (firstMap.Keys.Count != secondMap.Keys.Count)
            return false;

        if (firstMap.Keys.Any(k1 => !secondMap.ContainsKey(k1)))
            return false;

        return firstMap.Keys.All(x => firstMap[x] == secondMap[x]);
    }

— N73k

0

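There are many solutions to this problem. If you don't care about duplicates, you don't have to sort both. First make sure that they have the same number of items. After that, sort one of the collections. Then binary-search each item from the second collection in the sorted collection. If you don't find a given item, stop and return false.

The complexity of this: sorting the first collection is N*Log(N), and searching each item from the second in the first is N*Log(N), so you end up with 2*N*Log(N), assuming that they match and you look up everything. This is similar to the complexity of sorting both. It also gives you the benefit of stopping earlier if there is a difference. However, keep in mind that if both are already sorted before you step into this comparison and you try sorting with something like qsort, the sorting will be more expensive. There are optimizations for this.

Another alternative, which is great for small collections where you know the range of the elements, is to use a bitmask index. This will give you O(n) performance. Another alternative is to use a hash and look it up. For small collections it is usually a lot better to do the sorting or the bitmask index. Hash tables have the disadvantage of worse locality, so keep that in mind.

Again, that is only if you don't care about duplicates. If you want to account for duplicates, go with sorting both.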
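The sort-one-then-binary-search idea described above might look like the following sketch. `SortSearchComparison` and `SameItemsIgnoringDuplicates` are hypothetical names, and the sketch assumes T has a default ordering (`List<T>.Sort` throws otherwise). As the answer says, duplicates are not accounted for.

```csharp
using System;
using System.Collections.Generic;

public static class SortSearchComparison
{
    public static bool SameItemsIgnoringDuplicates<T>(ICollection<T> first, ICollection<T> second)
    {
        // Cheap length check first.
        if (first.Count != second.Count) return false;

        // Sort a copy of the first collection: N log N, using Comparer<T>.Default.
        var sorted = new List<T>(first);
        sorted.Sort();

        // Binary-search every item of the second collection: log N per lookup.
        foreach (T item in second)
        {
            if (sorted.BinarySearch(item) < 0)
                return false; // stop at the first item that is not found
        }

        return true;
    }
}
```

Because only membership is checked, `{1, 1, 2}` and `{2, 2, 1}` compare as equal here.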

0

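In many cases the only suitable answer is the one by Igor Ostrovsky; the other answers are based on the objects' hash codes. But when you generate a hash code for an object, you do so based only on its IMMUTABLE fields, such as an object Id field (in the case of a database entity) - Why is it important to override GetHashCode when Equals method is overridden?

This means that if you compare two collections, the result of the compare method might be true even though the fields of the different items are not equal. To deep-compare collections, you need to use Igor's method and implement IEqualirity.

Please read my and Mr. Schneider's comments on his most-voted post.

James

— James Roeiter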

0

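Allowing for duplicates in the IEnumerable<T> (if sets are not desirable\possible) and "ignoring order", you should be able to use .GroupBy().

I'm not an expert on complexity measurements, but my rudimentary understanding is that this should be O(n). I understand O(n^2) as coming from performing an O(n) operation inside another O(n) operation, as in ListA.Where(a => ListB.Contains(a)).ToList(): every item in ListB is evaluated for equality against each item in ListA.

Like I said, my understanding of complexity is limited, so correct me on this if I'm wrong.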
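To make the complexity point concrete, here is a hedged sketch contrasting the nested scan mentioned above with a hash-based lookup. `CommonItemsQuadratic` and `CommonItemsLinear` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComplexityDemo
{
    // List<T>.Contains scans listB for every item of listA: roughly n * m comparisons.
    public static List<T> CommonItemsQuadratic<T>(List<T> listA, List<T> listB)
    {
        return listA.Where(a => listB.Contains(a)).ToList();
    }

    // A HashSet lookup is expected O(1), so the whole pass is roughly n + m operations.
    public static List<T> CommonItemsLinear<T>(List<T> listA, List<T> listB)
    {
        var lookup = new HashSet<T>(listB);
        return listA.Where(a => lookup.Contains(a)).ToList();
    }
}
```

Both methods return the same items; only the cost of each membership test differs.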

public static bool IsSameAs<T, TKey>(this IEnumerable<T> source, IEnumerable<T> target, Expression<Func<T, TKey>> keySelectorExpression)
    {
        // check the object
        if (source == null && target == null) return true;
        if (source == null || target == null) return false;

        var sourceList = source.ToList();
        var targetList = target.ToList();

        // check the list count :: { 1,1,1 } != { 1,1,1,1 }
        if (sourceList.Count != targetList.Count) return false;

        var keySelector = keySelectorExpression.Compile();
        var groupedSourceList = sourceList.GroupBy(keySelector).ToList();
        var groupedTargetList = targetList.GroupBy(keySelector).ToList();

        // check that the number of groupings match :: { 1,1,2,3,4 } != { 1,1,2,3,4,5 }
        var groupCountIsSame = groupedSourceList.Count == groupedTargetList.Count;
        if (!groupCountIsSame) return false;

        // check that the count of each group in source has the same count in target :: for values { 1,1,2,3,4 } & { 1,1,1,2,3,4 }
        // key:count
        // { 1:2, 2:1, 3:1, 4:1 } != { 1:3, 2:1, 3:1, 4:1 }
        var countsMissmatch = groupedSourceList.Any(sourceGroup =>
                                                        {
                                                            var targetGroup = groupedTargetList.Single(y => y.Key.Equals(sourceGroup.Key));
                                                            return sourceGroup.Count() != targetGroup.Count();
                                                        });
        return !countsMissmatch;
    }

— Josh Gust

0

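This simple solution forces the IEnumerable's generic type to implement IComparable, because of OrderBy's definition.

If you don't want to make such an assumption but still want to use this solution, you can use the following piece of code: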

bool equal = collection1.OrderBy(i => i?.GetHashCode())
   .SequenceEqual(collection2.OrderBy(i => i?.GetHashCode()));
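The snippet above orders by hash code, so it relies on distinct items having distinct hash codes. The following sketch, which uses a hypothetical `Colliding` type whose instances all share one hash code, shows what can happen otherwise: `OrderBy` is a stable sort, so items with the same key keep their original relative order and `SequenceEqual` then compares them position by position.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: equality is by Id, but every instance has the same hash code.
public sealed class Colliding
{
    public int Id { get; }
    public Colliding(int id) { Id = id; }
    public override bool Equals(object obj) => obj is Colliding other && other.Id == Id;
    public override int GetHashCode() => 1; // deliberate collision
}

public static class HashOrderDemo
{
    // Same expression as in the answer, specialised to the hypothetical type.
    public static bool EqualByHashOrder(IEnumerable<Colliding> a, IEnumerable<Colliding> b)
    {
        return a.OrderBy(i => i?.GetHashCode())
                .SequenceEqual(b.OrderBy(i => i?.GetHashCode()));
    }
}
```

With colliding hash codes, two collections holding the same items in a different order are reported as unequal.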

— Jo Ham

0

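If you are comparing for the purpose of unit-test assertions, it may make sense to throw some efficiency out the window and simply convert each list to a string representation (csv) before doing the comparison. That way, the default test assertion message will display the differences within the error message.

Usage: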

using Microsoft.VisualStudio.TestTools.UnitTesting;

// define collection1, collection2, ...

Assert.AreEqual(collection1.OrderBy(c=>c).ToCsv(), collection2.OrderBy(c=>c).ToCsv());

Helper extension method:

public static string ToCsv<T>(
    this IEnumerable<T> values,
    Func<T, string> selector = null,
    string joinSeparator = ",")
{
    if (selector == null)
    {
        if (typeof(T) == typeof(Int16) ||
            typeof(T) == typeof(Int32) ||
            typeof(T) == typeof(Int64))
        {
            selector = (v) => Convert.ToInt64(v).ToString(CultureInfo.InvariantCulture);
        }
        else if (typeof(T) == typeof(decimal))
        {
            selector = (v) => Convert.ToDecimal(v).ToString(CultureInfo.InvariantCulture);
        }
        else if (typeof(T) == typeof(float) ||
                typeof(T) == typeof(double))
        {
            selector = (v) => Convert.ToDouble(v).ToString(CultureInfo.InvariantCulture);
        }
        else
        {
            selector = (v) => v.ToString();
        }
    }

    return String.Join(joinSeparator, values.Select(v => selector(v)));
}

— crokusek