序列化和反序列化.NET对象的最快方法


87

我正在寻找序列化和反序列化.NET对象的最快方法。这是我到目前为止的内容:

public class TD
{
    public List<CT> CTs { get; set; }
    public List<TE> TEs { get; set; }
    public string Code { get; set; }
    public string Message { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }

    public static string Serialize(List<TD> tData)
    {
        var serializer = new XmlSerializer(typeof(List<TD>));

        TextWriter writer = new StringWriter();
        serializer.Serialize(writer, tData);

        return writer.ToString();
    }

    public static List<TD> Deserialize(string tData)
    {
        var serializer = new XmlSerializer(typeof(List<TD>));

        TextReader reader = new StringReader(tData);

        return (List<TD>)serializer.Deserialize(reader);
    }        
}

2
性能还是代码足迹?
ulrichb 2010年

您是在问我是否需要性能数据或代码?
阿龙

3
他在问,以“最快的方式”是指性能还是代码占用量。BinaryFormatter在代码和实现方面的速度非常快,但是像Marc这样的解决方案在基准测试中性能会更快。
科迪·格雷

好吧,我明白了,我的意思是性能……
aron

有很多链接。一个这样的:blogs.msdn.com/b/youssefm/archive/2009/07/10/...
nawfal

Answers:


57

这是您使用protobuf-net的模型(已发明CTTE)(但仍保留了使用的能力,这可能是有用的-特别是对于迁移);我谦虚地提出(如果需要的话,有很多证据)这.NET中最快的(或者肯定是最快的)通用串行器之一。XmlSerializer

如果需要字符串,只需将base-64编码即可。

[XmlType]
public class CT {
    [XmlElement(Order = 1)]
    public int Foo { get; set; }
}
[XmlType]
public class TE {
    [XmlElement(Order = 1)]
    public int Bar { get; set; }
}
[XmlType]
public class TD {
    [XmlElement(Order=1)]
    public List<CT> CTs { get; set; }
    [XmlElement(Order=2)]
    public List<TE> TEs { get; set; }
    [XmlElement(Order = 3)]
    public string Code { get; set; }
    [XmlElement(Order = 4)]
    public string Message { get; set; }
    [XmlElement(Order = 5)]
    public DateTime StartDate { get; set; }
    [XmlElement(Order = 6)]
    public DateTime EndDate { get; set; }

    public static byte[] Serialize(List<TD> tData) {
        using (var ms = new MemoryStream()) {
            ProtoBuf.Serializer.Serialize(ms, tData);
            return ms.ToArray();
        }            
    }

    public static List<TD> Deserialize(byte[] tData) {
        using (var ms = new MemoryStream(tData)) {
            return ProtoBuf.Serializer.Deserialize<List<TD>>(ms);
        }
    }
}

2
G'day Marc,热爱您完成的协议缓冲区工作,我知道这篇文章已有5年的历史了,但此处答案(Binoj)中引用的netserializer具有衡量指标,表明您的实现不是最快的。这是一个公平的声明/广告,还是需要权衡取舍?谢谢
杰里米·汤普森

好的,我现在看到的是,NetSerialization仅适用于与我正在寻找版本容忍序列化的版本相同的版本
Jeremy Thompson

1
任何认为这很快的人都必须吸烟,它在很多情况下可能足够快,并且可能比那里的许多其他序列化要快,但是与手动解析相比,它实际上快吗?天哪
BjarkeCK

@BjarkeCK序列化程序本质上涉及更多点,因为它们需要做很多事情来防止人们将自己的脚踢掉(特别是在迭代版本时);大多数人不想花他们的生活调试序列化代码,所以:一个好的串行器-而无疑慢于一个完全实现的版本不耐受手工执行-通常是大多数人一个很好的妥协
马克·Gravell

1
@BjarkeCK我非常不同意;这还没有远程对于大多数人来说是有用的。接下来是什么-每天编写我们自己的收藏集?不:即使做得很好,也很难。当然,如果您实际上需要最快的输出:您将不得不动手-但是对于大多数人而言,这样做将是对他们时间的真正浪费。充其量,这将花费更长的时间。与使用可用库相比,它们的代码更有可能出现bug,不可靠并且运行速度较慢。大多数人应该专注于他们的应用程序需要的东西,而不是这个细节。
马克·格雷韦尔

33

5
那不是速度。太慢了 在链接的文章中说“越小越好”。
Timur Nuriyasov'2

2
@TimurNuriyasov,这是时间采取的运做
马克西姆

2
所以你说二进制最慢?我不这么认为!我猜它正确地指的是速度,而不是时间。
贾维德

2
二进制是最慢的。试试看 但是我会说这是最简单的,因为它不需要任何自定义解析内容即可正确处理多态对象(接口等)
Kamarey 16'Aug

1
@Kamarey看看下面......我的二进制测试的方式比别人快。
杰里米·霍洛瓦克斯

19

对此感兴趣,我决定用我能做的最接近的“苹果对苹果”测试来测试建议的方法。我用以下代码编写了一个控制台应用程序:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Threading.Tasks;

namespace SerializationTests
{
    class Program
    {
        static void Main(string[] args)
        {
            var count = 100000;
            var rnd = new Random(DateTime.UtcNow.GetHashCode());
            Console.WriteLine("Generating {0} arrays of data...", count);
            var arrays = new List<int[]>();
            for (int i = 0; i < count; i++)
            {
                var elements = rnd.Next(1, 100);
                var array = new int[elements];
                for (int j = 0; j < elements; j++)
                {
                    array[j] = rnd.Next();
                }   
                arrays.Add(array);
            }
            Console.WriteLine("Test data generated.");
            var stopWatch = new Stopwatch();

            Console.WriteLine("Testing BinarySerializer...");
            var binarySerializer = new BinarySerializer();
            var binarySerialized = new List<byte[]>();
            var binaryDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var array in arrays)
            {
                binarySerialized.Add(binarySerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("BinaryFormatter: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in binarySerialized)
            {
                binaryDeserialized.Add(binarySerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("BinaryFormatter: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);


            Console.WriteLine();
            Console.WriteLine("Testing ProtoBuf serializer...");
            var protobufSerializer = new ProtoBufSerializer();
            var protobufSerialized = new List<byte[]>();
            var protobufDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var array in arrays)
            {
                protobufSerialized.Add(protobufSerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("ProtoBuf: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in protobufSerialized)
            {
                protobufDeserialized.Add(protobufSerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("ProtoBuf: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            Console.WriteLine();
            Console.WriteLine("Testing NetSerializer serializer...");
            var netSerializerSerializer = new ProtoBufSerializer();
            var netSerializerSerialized = new List<byte[]>();
            var netSerializerDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var array in arrays)
            {
                netSerializerSerialized.Add(netSerializerSerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("NetSerializer: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in netSerializerSerialized)
            {
                netSerializerDeserialized.Add(netSerializerSerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("NetSerializer: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            Console.WriteLine("Press any key to end.");
            Console.ReadKey();
        }

        public class BinarySerializer
        {
            private static readonly BinaryFormatter Formatter = new BinaryFormatter();

            public byte[] Serialize(object toSerialize)
            {
                using (var stream = new MemoryStream())
                {
                    Formatter.Serialize(stream, toSerialize);
                    return stream.ToArray();
                }
            }

            public T Deserialize<T>(byte[] serialized)
            {
                using (var stream = new MemoryStream(serialized))
                {
                    var result = (T)Formatter.Deserialize(stream);
                    return result;
                }
            }
        }

        public class ProtoBufSerializer
        {
            public byte[] Serialize(object toSerialize)
            {
                using (var stream = new MemoryStream())
                {
                    ProtoBuf.Serializer.Serialize(stream, toSerialize);
                    return stream.ToArray();
                }
            }

            public T Deserialize<T>(byte[] serialized)
            {
                using (var stream = new MemoryStream(serialized))
                {
                    var result = ProtoBuf.Serializer.Deserialize<T>(stream);
                    return result;
                }
            }
        }

        public class NetSerializer
        {
            private static readonly NetSerializer Serializer = new NetSerializer();
            public byte[] Serialize(object toSerialize)
            {
                return Serializer.Serialize(toSerialize);
            }

            public T Deserialize<T>(byte[] serialized)
            {
                return Serializer.Deserialize<T>(serialized);
            }
        }
    }
}

结果令我惊讶;在多次运行时它们是一致的:

Generating 100000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 336.8392ms.
BinaryFormatter: Deserializing took 208.7527ms.

Testing ProtoBuf serializer...
ProtoBuf: Serializing took 2284.3827ms.
ProtoBuf: Deserializing took 2201.8072ms.

Testing NetSerializer serializer...
NetSerializer: Serializing took 2139.5424ms.
NetSerializer: Deserializing took 2113.7296ms.
Press any key to end.

收集这些结果后,我决定查看ProtoBuf或NetSerializer在较大的对象上是否表现更好。我将集合数更改为10,000个对象,但将数组的大小增加到1-10,000,而不是1-100。结果似乎更加确定:

Generating 10000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 285.8356ms.
BinaryFormatter: Deserializing took 206.0906ms.

Testing ProtoBuf serializer...
ProtoBuf: Serializing took 10693.3848ms.
ProtoBuf: Deserializing took 5988.5993ms.

Testing NetSerializer serializer...
NetSerializer: Serializing took 9017.5785ms.
NetSerializer: Deserializing took 5978.7203ms.
Press any key to end.

因此,我的结论是:在某些情况下,ProtoBuf和NetSerializer非常适合,但是就至少相对简单的对象而言,其原始性能方面……BinaryFormatter的性能至少提高了一个数量级。

YMMV。


1
也许BinaryFormatter使用数组真的非常快。
Behrooz

4
可能...但是在上述条件下,结果是惊人的。这里的教训可能是,不要相信一种方法在所有情况下都是最有效的。测试和基准测试总是有启发性的。
杰里米·霍洛瓦奇

在C ++中,对象序列化的速度快约100倍!
Mario M

很有意思!每个人都声称protobuf是最快的,但这清楚地表明它很慢。我说我BinaronSerializer这里的混合dotnetfiddle.net/gOqQ7p -这几乎快一倍的BinaryFormatter,这已经是非常快使用数组。
扎克锯

16

Protobuf非常快。

有关该系统性能和实现的详细信息,请参见http://code.google.com/p/protobuf-net/wiki/Performance


使用Protobuf有什么缺点吗?
罗伯特·杰普森

11
您必须注释您的对象。Protobuf不会像序列化程序那样存储字段名称和类型,而是从您的实际类型中获取它们。这是目标文件要小得多的原因之一。该文档解释了所有这些。我已经使用了一段时间了,如果您需要快速的(反序列化)和小的目标文件,protobuf确实是可行的方法。
Pieter van Ginkel,2010年

在C#中使用Protobut来添加答案的任何完整源代码示例吗?
Kiquenet

它并没有那么快...实际上,与非常非常非常快的序列化程序相比,它相当慢:dotnetfiddle.net/gOqQ7p
Zach

@ZachSaw如果您只处理整数数组(您的示例),则速度并不快,但是很少有人只序列化整数。当您开始处理具有很多成员的嵌套复杂类型时,您会看到速度上的好处(或者至少是我做的)。
matt.rothmeyer

15

另一个声称是超快速的串行器netserializer

他们网站上提供的数据显示,其性能是protobuf2到4倍,我自己还没有尝试过,但是如果您正在评估各种选择,也可以尝试一下


3
我只是在我的应用程序中尝试过NetSerializer,它的工作原理令人惊奇。值得一试。
Galen 2015年

netserializer不适合序列化“用户”对象,而该库不知道该对象以什么类型开头,甚至不可以强制用户将其对象标记为可序列化。
扎克锯

6

.net包含的二进制序列化程序应比XmlSerializer更快。或另一个用于protobuf,json,...的序列化器

但是对于其中一些,您需要添加属性,或通过其他方式添加元数据。例如,ProtoBuf在内部使用数字属性ID,并且需要通过其他机制以某种方式保存映射。版本控制对于任何序列化器来说都不是小事。


是的,它确实非常快,并且处理的案例/类型比Xml的要多得多。
嬉皮

1

我删除了上面代码中的错误并获得了以下结果:同样,我不确定NetSerializer如何要求您注册序列化的类型,可能产生的兼容性或性能差异。

Generating 100000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 508.9773ms.
BinaryFormatter: Deserializing took 371.8499ms.

Testing ProtoBuf serializer...
ProtoBuf: Serializing took 3280.9185ms.
ProtoBuf: Deserializing took 3190.7899ms.

Testing NetSerializer serializer...
NetSerializer: Serializing took 427.1241ms.
NetSerializer: Deserializing took 78.954ms.
Press any key to end.

修改后的代码

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Threading.Tasks;

namespace SerializationTests
{
    class Program
    {
        static void Main(string[] args)
        {
            var count = 100000;
            var rnd = new Random((int)DateTime.UtcNow.Ticks & 0xFF);
            Console.WriteLine("Generating {0} arrays of data...", count);
            var arrays = new List<int[]>();
            for (int i = 0; i < count; i++)
            {
                var elements = rnd.Next(1, 100);
                var array = new int[elements];
                for (int j = 0; j < elements; j++)
                {
                    array[j] = rnd.Next();
                }
                arrays.Add(array);
            }
            Console.WriteLine("Test data generated.");
            var stopWatch = new Stopwatch();

            Console.WriteLine("Testing BinarySerializer...");
            var binarySerializer = new BinarySerializer();
            var binarySerialized = new List<byte[]>();
            var binaryDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var array in arrays)
            {
                binarySerialized.Add(binarySerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("BinaryFormatter: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in binarySerialized)
            {
                binaryDeserialized.Add(binarySerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("BinaryFormatter: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);


            Console.WriteLine();
            Console.WriteLine("Testing ProtoBuf serializer...");
            var protobufSerializer = new ProtoBufSerializer();
            var protobufSerialized = new List<byte[]>();
            var protobufDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var array in arrays)
            {
                protobufSerialized.Add(protobufSerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("ProtoBuf: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in protobufSerialized)
            {
                protobufDeserialized.Add(protobufSerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("ProtoBuf: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            Console.WriteLine();
            Console.WriteLine("Testing NetSerializer serializer...");
            var netSerializerSerialized = new List<byte[]>();
            var netSerializerDeserialized = new List<int[]>();

            stopWatch.Reset();
            stopWatch.Start();
            var netSerializerSerializer = new NS();
            foreach (var array in arrays)
            {
                netSerializerSerialized.Add(netSerializerSerializer.Serialize(array));
            }
            stopWatch.Stop();
            Console.WriteLine("NetSerializer: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            stopWatch.Reset();
            stopWatch.Start();
            foreach (var serialized in netSerializerSerialized)
            {
                netSerializerDeserialized.Add(netSerializerSerializer.Deserialize<int[]>(serialized));
            }
            stopWatch.Stop();
            Console.WriteLine("NetSerializer: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);

            Console.WriteLine("Press any key to end.");
            Console.ReadKey();
        }

        public class BinarySerializer
        {
            private static readonly BinaryFormatter Formatter = new BinaryFormatter();

            public byte[] Serialize(object toSerialize)
            {
                using (var stream = new MemoryStream())
                {
                    Formatter.Serialize(stream, toSerialize);
                    return stream.ToArray();
                }
            }

            public T Deserialize<T>(byte[] serialized)
            {
                using (var stream = new MemoryStream(serialized))
                {
                    var result = (T)Formatter.Deserialize(stream);
                    return result;
                }
            }
        }

        public class ProtoBufSerializer
        {
            public byte[] Serialize(object toSerialize)
            {
                using (var stream = new MemoryStream())
                {
                    ProtoBuf.Serializer.Serialize(stream, toSerialize);
                    return stream.ToArray();
                }
            }

            public T Deserialize<T>(byte[] serialized)
            {
                using (var stream = new MemoryStream(serialized))
                {
                    var result = ProtoBuf.Serializer.Deserialize<T>(stream);
                    return result;
                }
            }
        }

        public class NS
        {
            NetSerializer.Serializer Serializer = new NetSerializer.Serializer(new Type[] { typeof(int), typeof(int[]) });

            public byte[] Serialize(object toSerialize)
            {
                using (var stream = new MemoryStream())
                {
                    Serializer.Serialize(stream, toSerialize);
                    return stream.ToArray();
                }
            }

            public T Deserialize<T>(byte[] serialized)
            {
                using (var stream = new MemoryStream(serialized))
                {
                    Serializer.Deserialize(stream, out var result);
                    return (T)result;
                }
            }
        }
    }
}

1
您指的是什么错误?
杰里米·霍洛瓦兹


0

我冒昧地将您的课程输入CGbR生成器由于它DateTime尚处于早期阶段,尚不支持,因此我只是将它替换为long。 生成的序列化代码如下所示:

public int Size
{
    get 
    { 
        var size = 24;
        // Add size for collections and strings
        size += Cts == null ? 0 : Cts.Count * 4;
        size += Tes == null ? 0 : Tes.Count * 4;
        size += Code == null ? 0 : Code.Length;
        size += Message == null ? 0 : Message.Length;

        return size;              
    }
}

public byte[] ToBytes(byte[] bytes, ref int index)
{
    if (index + Size > bytes.Length)
        throw new ArgumentOutOfRangeException("index", "Object does not fit in array");

    // Convert Cts
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Cts == null ? 0 : Cts.Count), bytes, ref index);
    if (Cts != null)
    {
        for(var i = 0; i < Cts.Count; i++)
        {
            var value = Cts[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Tes
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Tes == null ? 0 : Tes.Count), bytes, ref index);
    if (Tes != null)
    {
        for(var i = 0; i < Tes.Count; i++)
        {
            var value = Tes[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Code
    GeneratorByteConverter.Include(Code, bytes, ref index);
    // Convert Message
    GeneratorByteConverter.Include(Message, bytes, ref index);
    // Convert StartDate
    GeneratorByteConverter.Include(StartDate.ToBinary(), bytes, ref index);
    // Convert EndDate
    GeneratorByteConverter.Include(EndDate.ToBinary(), bytes, ref index);
    return bytes;
}

public Td FromBytes(byte[] bytes, ref int index)
{
    // Read Cts
    var ctsLength = GeneratorByteConverter.ToUInt16(bytes, ref index);
    var tempCts = new List<Ct>(ctsLength);
    for (var i = 0; i < ctsLength; i++)
    {
        var value = new Ct().FromBytes(bytes, ref index);
        tempCts.Add(value);
    }
    Cts = tempCts;
    // Read Tes
    var tesLength = GeneratorByteConverter.ToUInt16(bytes, ref index);
    var tempTes = new List<Te>(tesLength);
    for (var i = 0; i < tesLength; i++)
    {
        var value = new Te().FromBytes(bytes, ref index);
        tempTes.Add(value);
    }
    Tes = tempTes;
    // Read Code
    Code = GeneratorByteConverter.GetString(bytes, ref index);
    // Read Message
    Message = GeneratorByteConverter.GetString(bytes, ref index);
    // Read StartDate
    StartDate = DateTime.FromBinary(GeneratorByteConverter.ToInt64(bytes, ref index));
    // Read EndDate
    EndDate = DateTime.FromBinary(GeneratorByteConverter.ToInt64(bytes, ref index));

    return this;
}

我创建了一个示例对象列表,如下所示:

var objects = new List<Td>();
for (int i = 0; i < 1000; i++)
{
    var obj = new Td
    {
        Message = "Hello my friend",
        Code = "Some code that can be put here",
        StartDate = DateTime.Now.AddDays(-7),
        EndDate = DateTime.Now.AddDays(2),
        Cts = new List<Ct>(),
        Tes = new List<Te>()
    };
    for (int j = 0; j < 10; j++)
    {
        obj.Cts.Add(new Ct { Foo = i * j });
        obj.Tes.Add(new Te { Bar = i + j });
    }
    objects.Add(obj);
}

我的机器在Release构建中的结果:

var watch = new Stopwatch();
watch.Start();
var bytes = BinarySerializer.SerializeMany(objects);
watch.Stop();

大小: 149000字节

时间: 2.059毫秒 3.13毫秒

编辑:从CGbR 0.4.3开始,二进制序列化程序支持DateTime。不幸的是,该DateTime.ToBinary方法非常缓慢。我很快将用更快的速度代替它。

Edit2:DateTime通过调用使用UTC时,ToUniversalTime()性能得以恢复,并且时钟为1.669ms

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.