How to retrieve the actual item from a HashSet<T>?


85

I've read this question about why it is not possible, but I haven't found a solution to the problem.

I would like to retrieve an item from a .NET HashSet<T>. I'm looking for a method with this signature:

/// <summary>
/// Determines if this set contains an item equal to <paramref name="item"/>, 
/// according to the comparison mechanism that was used when the set was created. 
/// The set is not changed. If the set does contain an item equal to 
/// <paramref name="item"/>, then the item from the set is returned.
/// </summary>
bool TryGetItem<T>(T item, out T foundItem);

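Searching the set for an item with such a method would be O(1). Currently, the only way to retrieve an item from a HashSet<T> is to enumerate all of its items, which is O(n).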

I haven't found any workaround for this problem other than writing my own HashSet<T> or using a Dictionary<K, V>. Any other ideas?

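Note:
I don't want to check whether the HashSet<T> contains the item. I want to get a reference to the item stored in the HashSet<T> because I need to update it (without replacing it with another instance). The item I would pass to TryGetItem would be equal (according to the comparison mechanism I passed to the constructor), but it would not be the same reference.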


1
Why not use Contains and return the item you passed as input?
Mathias


2
If you need to look up an object based on a key value, then Dictionary<K, V> may be the more appropriate collection to store it in.
ThatBlairGuy

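@ThatBlairGuy: You're right. I think I'll implement my own Set collection, using a Dictionary internally to store my items. The key will be the HashCode of the item. I'll get roughly the same performance as a HashSet, and I won't have to provide a key each time I need to add/remove/get an item from my collection.
Francois C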

2
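@mathias Because the hash set might contain an item that equals the input but is not actually the same instance. For example, you may want a hash set of a reference type but compare by content rather than by reference equality.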
NounVerber

Answers:


25

What you're asking for was added to .NET Core a year ago, and was recently added to .NET 4.7.2 as well:

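In .NET Framework 4.7.2 we have added a few APIs to the standard Collection types that will enable new functionality as follows.
- 'TryGetValue' is added to SortedSet and HashSet to match the Try pattern used in other collection types.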

The signature is as follows (found in .NET 4.7.2 and above):

    //
    // Summary:
    //     Searches the set for a given value and returns the equal value it finds, if any.
    //
    // Parameters:
    //   equalValue:
    //     The value to search for.
    //
    //   actualValue:
    //     The value from the set that the search found, or the default value of T when
    //     the search yielded no match.
    //
    // Returns:
    //     A value indicating whether the search was successful.
    public bool TryGetValue(T equalValue, out T actualValue);

P.S.: In case you're interested, there is a related function they're adding in the future: HashSet.GetOrAdd(T).
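
As a rough illustration of how the built-in method covers the case in the question, the sketch below finds the stored instance through an equal-but-distinct probe object and updates it in place. The `Person` class, `SsnComparer`, `FindStored`, and all values are hypothetical names made up for this example; it assumes .NET Framework 4.7.2+ or .NET Core 2.0+.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type: equality is based on Ssn only.
public sealed class Person
{
    public string Ssn { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer, passed to the HashSet constructor.
public sealed class SsnComparer : IEqualityComparer<Person>
{
    public bool Equals(Person a, Person b) =>
        ReferenceEquals(a, b) || (a != null && b != null && a.Ssn == b.Ssn);

    public int GetHashCode(Person p) => p?.Ssn?.GetHashCode() ?? 0;
}

public static class TryGetValueDemo
{
    // Returns the instance stored in the set (not the probe), or null if absent.
    public static Person FindStored(HashSet<Person> set, string ssn)
    {
        // Equal to the stored item under SsnComparer, but a different reference.
        var probe = new Person { Ssn = ssn };
        return set.TryGetValue(probe, out Person stored) ? stored : null;
    }

    public static void Main()
    {
        var original = new Person { Ssn = "123", Name = "Alice" };
        var set = new HashSet<Person>(new SsnComparer()) { original };

        Person stored = FindStored(set, "123");
        stored.Name = "Alice Smith"; // mutates the instance held by the set

        Console.WriteLine(ReferenceEquals(stored, original));
        Console.WriteLine(original.Name);
    }
}
```

The lookup goes through the set's hash table, so it avoids the O(n) enumeration mentioned in the question.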


65

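This is actually a huge omission in the set of collections. You would need either a Dictionary of keys only or a HashSet that allows retrieval of object references. So many people have asked for it; why it hasn't been fixed is beyond me.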

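Without third-party libraries, the best workaround is to use a Dictionary<T, T> whose keys are identical to its values, since Dictionary stores its entries in a hash table. Performance-wise it is the same as HashSet, but of course it wastes memory (the size of one pointer per entry).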

Dictionary<T, T> myHashedCollection;
...
if(myHashedCollection.ContainsKey(item))
    item = myHashedCollection[item]; //replace duplicate
else
    myHashedCollection.Add(item, item); //add previously unknown item
...
//work with unique item
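
A self-contained variant of the same idea might look like the sketch below. The helper name `GetOrAdd` and the class `CanonicalDemo` are made up for this illustration, and it uses `Dictionary<TKey, TValue>.TryGetValue` so that the hit path needs only one hash lookup instead of `ContainsKey` followed by the indexer.

```csharp
using System;
using System.Collections.Generic;

public static class CanonicalDemo
{
    // Returns the instance already stored under an equal key,
    // or stores 'item' as both key and value and returns it.
    public static T GetOrAdd<T>(Dictionary<T, T> pool, T item)
    {
        if (pool.TryGetValue(item, out T existing))
            return existing;   // previously stored, equal instance
        pool.Add(item, item);  // previously unknown item
        return item;
    }

    public static void Main()
    {
        // The comparer plays the role of the HashSet's equality comparer.
        var pool = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        string first = GetOrAdd(pool, "Hello");
        string second = GetOrAdd(pool, "HELLO"); // equal under the comparer, different instance

        Console.WriteLine(ReferenceEquals(first, second));
        Console.WriteLine(second);
    }
}
```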

1
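I would suggest that the key of his dictionary should be whatever he currently uses in the EqualityComparer for the hash set. I feel it is dirty to use an EqualityComparer when you are not really saying that the items are equal (otherwise you could just use the item you created for the comparison). I'd make a class/struct that represents the key. This comes at the cost of more memory, of course.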
Ed T

1
Since the key is stored inside the value, I suggest using a collection inherited from KeyedCollection instead of Dictionary. msdn.microsoft.com/zh-cn/library/ms132438(v=vs.110).aspx
Access Denied
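
To make the KeyedCollection suggestion concrete, here is a minimal sketch. The `Customer` type, its `Id` key, and `CustomerCollection` are hypothetical; the only requirement from `KeyedCollection<TKey, TItem>` is overriding `GetKeyForItem` so the collection can derive the key from the item itself.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type whose key (Id) lives inside the item.
public sealed class Customer
{
    public string Id { get; set; }
    public int Visits { get; set; }
}

// The collection extracts the key from each item, so callers never
// have to supply a separate key when adding.
public sealed class CustomerCollection : KeyedCollection<string, Customer>
{
    protected override string GetKeyForItem(Customer item) => item.Id;
}

public static class KeyedDemo
{
    public static void Main()
    {
        var customers = new CustomerCollection
        {
            new Customer { Id = "c1", Visits = 1 }
        };

        if (customers.Contains("c1"))
        {
            Customer stored = customers["c1"]; // reference to the stored instance
            stored.Visits++;                   // update in place
        }

        Console.WriteLine(customers["c1"].Visits);
    }
}
```

Note that this approach looks items up by an explicit key rather than by an equal item, which is closer to what the first comment on this answer recommends.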

11

This method was added to .NET Framework 4.7.2 (and to .NET Core 2.0 before it); see HashSet<T>.TryGetValue. Citing the source:

/// <summary>
/// Searches the set for a given value and returns the equal value it finds, if any.
/// </summary>
/// <param name="equalValue">The value to search for.
/// </param>
/// <param name="actualValue">
/// The value from the set that the search found, or the default value
/// of <typeparamref name="T"/> when the search yielded no match.</param>
/// <returns>A value indicating whether the search was successful.</returns>
/// <remarks>
/// This can be useful when you want to reuse a previously stored reference instead of 
/// a newly constructed one (so that more sharing of references can occur) or to look up
/// a value that has more complete data than the value you currently have, although their
/// comparer functions indicate they are equal.
/// </remarks>
public bool TryGetValue(T equalValue, out T actualValue)

1
The same goes for SortedSet.
nawfal
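
For completeness, a small sketch of the SortedSet counterpart mentioned in the comment above. The class name `SortedSetDemo`, the helper `Lookup`, and the sample strings are made up for illustration; `SortedSet<T>.TryGetValue` has the same shape as the HashSet method but relies on the set's `IComparer<T>` rather than an equality comparer.

```csharp
using System;
using System.Collections.Generic;

public static class SortedSetDemo
{
    // Returns the stored value that compares equal to 'probe', or null if none.
    public static string Lookup(SortedSet<string> set, string probe)
    {
        return set.TryGetValue(probe, out string stored) ? stored : null;
    }

    public static void Main()
    {
        // Case-insensitive ordering: "ALPHA" and "Alpha" compare as equal.
        var set = new SortedSet<string>(StringComparer.OrdinalIgnoreCase) { "Alpha", "Beta" };

        Console.WriteLine(Lookup(set, "ALPHA")); // the stored spelling, not the probe
    }
}
```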

4

How about overloading the string equality comparer:

class StringEqualityComparer : IEqualityComparer<String>
{
    public string val1;
    public bool Equals(String s1, String s2)
    {
        if (!s1.Equals(s2)) return false;
        val1 = s1;
        return true;
    }

    public int GetHashCode(String s)
    {
        return s.GetHashCode();
    }
}
public static class HashSetExtension
{
    public static bool TryGetValue(this HashSet<string> hs, string value, out string valout)
    {
        if (hs.Contains(value))
        {
            valout=(hs.Comparer as StringEqualityComparer).val1;
            return true;
        }
        else
        {
            valout = null;
            return false;
        }
    }
}

Then declare the HashSet as:

HashSet<string> hs = new HashSet<string>(new StringEqualityComparer());

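This is all about memory management: returning the actual item that is in the hash set rather than an identical copy. So in the code above we find the string with the same contents and then return a reference to it. For strings, this is similar to what interning does.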
mp666

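@zumalifeguard @mp666 This is not guaranteed to work as-is. It requires whoever instantiates the HashSet to provide the specific comparer. An optimal solution would be for TryGetValue to pass in a new instance of the specialized StringEqualityComparer (otherwise the `as StringEqualityComparer` could yield null, causing the .val1 access to throw). That way, StringEqualityComparer could become a nested private class within HashSetExtension. Furthermore, in case the equality comparer has been overridden, StringEqualityComparer should call into the default.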
Graeme Wicksted '16

You need to declare your HashSet as: HashSet<string> valueCash = new HashSet<string>(new StringEqualityComparer())
mp666 '16

1
Dirty hack. I know how it works, but it's a lazy, just-make-it-work kind of solution.
M.kazem Akhgary '17

2

OK, so you can do it like this:

YourObject x = yourHashSet.Where(w => w.Name.Contains("strin")).FirstOrDefault();

This gets a new instance of the selected object. To update your object, you should use:

yourHashSet.Where(w => w.Name.Contains("strin")).FirstOrDefault().MyProperty = "something";

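This is an interesting approach; you just need to wrap the second one in a try, because if you search for something that is not in the set you will get a NullReferenceException. But is it a step in the right direction?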
Piotr Kula

11
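LINQ traverses the collection in a foreach loop, i.e. O(n) lookup time. While it is a solution to the problem, it defeats the purpose of using a HashSet in the first place.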
Niklas Ekman


2

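Another trick is to use reflection to access HashSet's internal function InternalIndexOf. Keep in mind that the member names are hardcoded, so if they change in upcoming .NET versions this will break.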

Note: If you use Mono, you should change the field name from m_slots to _slots.

using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

internal static class HashSetExtensions<T>
{
    public delegate bool GetValue(HashSet<T> source, T equalValue, out T actualValue);

    public static GetValue TryGetValue { get; }

    static HashSetExtensions() {
        var targetExp = Expression.Parameter(typeof(HashSet<T>), "target");
        var itemExp   = Expression.Parameter(typeof(T), "item");
        var actualValueExp = Expression.Parameter(typeof(T).MakeByRefType(), "actualValueExp");

        var indexVar = Expression.Variable(typeof(int), "index");
        // ReSharper disable once AssignNullToNotNullAttribute
        var indexExp = Expression.Call(targetExp, typeof(HashSet<T>).GetMethod("InternalIndexOf", BindingFlags.NonPublic | BindingFlags.Instance), itemExp);

        var truePart = Expression.Block(
            Expression.Assign(
                actualValueExp, Expression.Field(
                    Expression.ArrayAccess(
                        // ReSharper disable once AssignNullToNotNullAttribute
                        Expression.Field(targetExp, typeof(HashSet<T>).GetField("m_slots", BindingFlags.NonPublic | BindingFlags.Instance)), indexVar),
                    "value")),
            Expression.Constant(true));

        var falsePart = Expression.Constant(false);

        var block = Expression.Block(
            new[] { indexVar },
            Expression.Assign(indexVar, indexExp),
            Expression.Condition(
                Expression.GreaterThanOrEqual(indexVar, Expression.Constant(0)),
                truePart,
                falsePart));

        TryGetValue = Expression.Lambda<GetValue>(block, targetExp, itemExp, actualValueExp).Compile();
    }
}

public static class Extensions
{
    public static bool TryGetValue2<T>(this HashSet<T> source, T equalValue,  out T actualValue) {
        if (source.Count > 0) {
            if (HashSetExtensions<T>.TryGetValue(source, equalValue, out actualValue)) {
                return true;
            }
        }
        actualValue = default;
        return false;
    }
}

Test:

var x = new HashSet<int> { 1, 2, 3 };
if (x.TryGetValue2(1, out var value)) {
    Console.WriteLine(value);
}


1

A modified implementation of @mp666's answer, so that it can be used for any type of HashSet and allows overriding the default equality comparer.

public interface IRetainingComparer<T> : IEqualityComparer<T>
{
    T Key { get; }
    void ClearKeyCache();
}

/// <summary>
/// An <see cref="IEqualityComparer{T}"/> that retains the last key that successfully passed <see cref="IEqualityComparer{T}.Equals(T,T)"/>.
/// This class relies on the fact that <see cref="HashSet{T}"/> calls the <see cref="IEqualityComparer{T}.Equals(T,T)"/> with the first parameter
/// being an existing element and the second parameter being the one passed to the initiating call to <see cref="HashSet{T}"/> (eg. <see cref="HashSet{T}.Contains(T)"/>).
/// </summary>
/// <typeparam name="T">The type of object being compared.</typeparam>
/// <remarks>This class is thread-safe but should not be used with any sort of parallel access (PLINQ).</remarks>
public class RetainingEqualityComparerObject<T> : IRetainingComparer<T> where T : class
{
    private readonly IEqualityComparer<T> _comparer;

    [ThreadStatic]
    private static WeakReference<T> _retained;

    public RetainingEqualityComparerObject(IEqualityComparer<T> comparer)
    {
        _comparer = comparer;
    }

    /// <summary>
    /// The retained instance on side 'a' of the <see cref="Equals"/> call which successfully met the equality requirement against side 'b'.
    /// </summary>
    /// <remarks>Uses a <see cref="WeakReference{T}"/> so unintended memory leaks are not encountered.</remarks>
    public T Key
    {
        get
        {
            T retained;
            return _retained == null ? null : _retained.TryGetTarget(out retained) ? retained : null;
        }
    }


    /// <summary>
    /// Sets the retained <see cref="Key"/> to the default value.
    /// </summary>
    /// <remarks>This should be called prior to performing an operation that calls <see cref="Equals"/>.</remarks>
    public void ClearKeyCache()
    {
        _retained = _retained ?? new WeakReference<T>(null);
        _retained.SetTarget(null);
    }

    /// <summary>
    /// Test two objects of type <see cref="T"/> for equality retaining the object if successful.
    /// </summary>
    /// <param name="a">An instance of <see cref="T"/>.</param>
    /// <param name="b">A second instance of <see cref="T"/> to compare against <paramref name="a"/>.</param>
    /// <returns>True if <paramref name="a"/> and <paramref name="b"/> are equal, false otherwise.</returns>
    public bool Equals(T a, T b)
    {
        if (!_comparer.Equals(a, b))
        {
            return false;
        }

        _retained = _retained ?? new WeakReference<T>(null);
        _retained.SetTarget(a);
        return true;
    }

    /// <summary>
    /// Gets the hash code value of an instance of <see cref="T"/>.
    /// </summary>
    /// <param name="o">The instance of <see cref="T"/> to obtain a hash code from.</param>
    /// <returns>The hash code value from <paramref name="o"/>.</returns>
    public int GetHashCode(T o)
    {
        return _comparer.GetHashCode(o);
    }
}

/// <summary>
/// An <see cref="IEqualityComparer{T}"/> that retains the last key that successfully passed <see cref="IEqualityComparer{T}.Equals(T,T)"/>.
/// This class relies on the fact that <see cref="HashSet{T}"/> calls the <see cref="IEqualityComparer{T}.Equals(T,T)"/> with the first parameter
/// being an existing element and the second parameter being the one passed to the initiating call to <see cref="HashSet{T}"/> (eg. <see cref="HashSet{T}.Contains(T)"/>).
/// </summary>
/// <typeparam name="T">The type of object being compared.</typeparam>
/// <remarks>This class is thread-safe but should not be used with any sort of parallel access (PLINQ).</remarks>
public class RetainingEqualityComparerStruct<T> : IRetainingComparer<T> where T : struct 
{
    private readonly IEqualityComparer<T> _comparer;

    [ThreadStatic]
    private static T _retained;

    public RetainingEqualityComparerStruct(IEqualityComparer<T> comparer)
    {
        _comparer = comparer;
    }

    /// <summary>
    /// The retained instance on side 'a' of the <see cref="Equals"/> call which successfully met the equality requirement against side 'b'.
    /// </summary>
    public T Key => _retained;


    /// <summary>
    /// Sets the retained <see cref="Key"/> to the default value.
    /// </summary>
    /// <remarks>This should be called prior to performing an operation that calls <see cref="Equals"/>.</remarks>
    public void ClearKeyCache()
    {
        _retained = default(T);
    }

    /// <summary>
    /// Test two objects of type <see cref="T"/> for equality retaining the object if successful.
    /// </summary>
    /// <param name="a">An instance of <see cref="T"/>.</param>
    /// <param name="b">A second instance of <see cref="T"/> to compare against <paramref name="a"/>.</param>
    /// <returns>True if <paramref name="a"/> and <paramref name="b"/> are equal, false otherwise.</returns>
    public bool Equals(T a, T b)
    {
        if (!_comparer.Equals(a, b))
        {
            return false;
        }

        _retained = a;
        return true;
    }

    /// <summary>
    /// Gets the hash code value of an instance of <see cref="T"/>.
    /// </summary>
    /// <param name="o">The instance of <see cref="T"/> to obtain a hash code from.</param>
    /// <returns>The hash code value from <paramref name="o"/>.</returns>
    public int GetHashCode(T o)
    {
        return _comparer.GetHashCode(o);
    }
}

/// <summary>
/// Provides TryGetValue{T} functionality similar to that of <see cref="IDictionary{TKey,TValue}"/>'s implementation.
/// </summary>
public class ExtendedHashSet<T> : HashSet<T>
{
    /// <summary>
    /// This class is guaranteed to wrap the <see cref="IEqualityComparer{T}"/> with one of the <see cref="IRetainingComparer{T}"/>
    /// implementations so this property gives convenient access to the interfaced comparer.
    /// </summary>
    private IRetainingComparer<T> RetainingComparer => (IRetainingComparer<T>)Comparer;

    /// <summary>
    /// Creates either a <see cref="RetainingEqualityComparerStruct{T}"/> or <see cref="RetainingEqualityComparerObject{T}"/>
    /// depending on if <see cref="T"/> is a reference type or a value type.
    /// </summary>
    /// <param name="comparer">(optional) The <see cref="IEqualityComparer{T}"/> to wrap. This will be set to <see cref="EqualityComparer{T}.Default"/> if none provided.</param>
    /// <returns>An instance of <see cref="IRetainingComparer{T}"/>.</returns>
    private static IRetainingComparer<T> Create(IEqualityComparer<T> comparer = null)
    {
        return (IRetainingComparer<T>) (typeof(T).IsValueType ? 
            Activator.CreateInstance(typeof(RetainingEqualityComparerStruct<>)
                .MakeGenericType(typeof(T)), comparer ?? EqualityComparer<T>.Default)
            :
            Activator.CreateInstance(typeof(RetainingEqualityComparerObject<>)
                .MakeGenericType(typeof(T)), comparer ?? EqualityComparer<T>.Default));
    }

    public ExtendedHashSet() : base(Create())
    {
    }

    public ExtendedHashSet(IEqualityComparer<T> comparer) : base(Create(comparer))
    {
    }

    public ExtendedHashSet(IEnumerable<T> collection) : base(collection, Create())
    {
    }

    public ExtendedHashSet(IEnumerable<T> collection, IEqualityComparer<T> comparer) : base(collection, Create(comparer))
    {
    }

    /// <summary>
    /// Attempts to find a key in the <see cref="HashSet{T}"/> and, if found, places the instance in <paramref name="original"/>.
    /// </summary>
    /// <param name="value">The key used to search the <see cref="HashSet{T}"/>.</param>
    /// <param name="original">
    /// The matched instance from the <see cref="HashSet{T}"/> which is not necessarily the same as <paramref name="value"/>.
    /// This will be set to null for reference types or default(T) for value types when no match found.
    /// </param>
    /// <returns>True if a key in the <see cref="HashSet{T}"/> matched <paramref name="value"/>, False if no match found.</returns>
    public bool TryGetValue(T value, out T original)
    {
        var comparer = RetainingComparer;
        comparer.ClearKeyCache();

        if (Contains(value))
        {
            original = comparer.Key;
            return true;
        }

        original = default(T);
        return false;
    }
}

public static class HashSetExtensions
{
    /// <summary>
    /// Attempts to find a key in the <see cref="HashSet{T}"/> and, if found, places the instance in <paramref name="original"/>.
    /// </summary>
    /// <param name="hashSet">The instance of <see cref="HashSet{T}"/> extended.</param>
    /// <param name="value">The key used to search the <see cref="HashSet{T}"/>.</param>
    /// <param name="original">
    /// The matched instance from the <see cref="HashSet{T}"/> which is not necessarily the same as <paramref name="value"/>.
    /// This will be set to null for reference types or default(T) for value types when no match found.
    /// </param>
    /// <returns>True if a key in the <see cref="HashSet{T}"/> matched <paramref name="value"/>, False if no match found.</returns>
    /// <exception cref="ArgumentNullException">If <paramref name="hashSet"/> is null.</exception>
    /// <exception cref="ArgumentException">
    /// If <paramref name="hashSet"/> does not have a <see cref="HashSet{T}.Comparer"/> of type <see cref="IRetainingComparer{T}"/>.
    /// </exception>
    public static bool TryGetValue<T>(this HashSet<T> hashSet, T value, out T original)
    {
        if (hashSet == null)
        {
            throw new ArgumentNullException(nameof(hashSet));
        }

        // Throw when the set's comparer is NOT a retaining comparer; otherwise the cast below would fail.
        if (!(hashSet.Comparer is IRetainingComparer<T>))
        {
            throw new ArgumentException($"HashSet must have an equality comparer of type '{nameof(IRetainingComparer<T>)}' to use this functionality", nameof(hashSet));
        }

        var comparer = (IRetainingComparer<T>)hashSet.Comparer;
        comparer.ClearKeyCache();

        if (hashSet.Contains(value))
        {
            original = comparer.Key;
            return true;
        }

        original = default(T);
        return false;
    }
}

1
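Since you're using the LINQ extension method Enumerable.Contains, it will enumerate all elements of the set and compare them, losing any benefit the hash implementation of the set provides. You might as well just write set.SingleOrDefault(e => set.Comparer.Equals(e, obj)), which has the same behavior and performance characteristics as your solution.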
Daniel AA Pelsmaeker

@Virtlink Good catch, you're absolutely right. I'll modify my answer.
Graeme Wicksted '16

However, if you were to wrap a HashSet that uses your comparer internally, it would work. Like this: Utillib/ExtHashSet
Daniel AA Pelsmaeker

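@Virtlink Thank you! I ended up wrapping HashSet as one option, but also providing the comparers and an extension method for additional versatility. It's now thread-safe and will not leak memory... but it's quite a bit more code than I had hoped!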
Graeme Wicksted '16

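@Francois Writing the code above was more an exercise in figuring out an "optimal" time/memory solution; however, I don't suggest you go with this method. Using a Dictionary<T,T> with a custom IEqualityComparer is much more straightforward and future-proof!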
Graeme Wicksted '16

-2

HashSet has a Contains(T) method.

You can specify an IEqualityComparer if you need a custom comparison method (e.g., store a person object, but use the SSN for equality comparison).


-11

You can also use the ToList() method and apply an indexer to the result.

HashSet<string> mySet = new HashSet<string>();
mySet.Add("mykey");
string key = mySet.ToList()[0]; // needs using System.Linq; copies the whole set, so O(n)

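I'm not sure why I got downvoted for applying this logic. I needed to extract values from a structure starting with Dictionary<string, ISet<String>>, where the ISet contained x number of values. The most direct way to get those values was to loop through the dictionary, pulling out the key and the ISet value. Then I looped through the ISet to display the individual values. It isn't elegant, but it worked.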
j.hull
Licensed under cc by-sa 3.0 with attribution required.