The difference between Encoding.Unicode and Encoding.UTF8 in C#

2022-06-24 10:33:29

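Today on the cnblogs home page I came across a post, "简单聊下Unicode和UTF-8" (A brief chat about Unicode and UTF-8), and learned from it that UTF-8 is one way of implementing Unicode: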

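Unicode only assigns every character in the world a single, unified binary number (its code point); it does not specify how a program should store or parse that number.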

You could say that UTF-8 is one of the ways of implementing Unicode...
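To make the quoted point concrete, here is a minimal sketch of "one number, several storage formats". The class and method names (CodePointDemo, CodePointOf, HexBytes) are made up for illustration and do not come from the quoted post; the character is written as the escape \u4E2D so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

public static class CodePointDemo
{
    // Hypothetical helper: formats the code point of the first character, e.g. "U+4E2D".
    public static string CodePointOf(string s) =>
        "U+" + char.ConvertToUtf32(s, 0).ToString("X4");

    // Hypothetical helper: encodes s with the given encoding and returns the bytes as hex.
    public static string HexBytes(Encoding encoding, string s) =>
        BitConverter.ToString(encoding.GetBytes(s));

    public static void Main()
    {
        const string s = "\u4E2D"; // the character 中

        // The code point is the same no matter how the character is stored.
        Console.WriteLine(CodePointOf(s));                // U+4E2D

        // Different encodings store that same number as different bytes.
        Console.WriteLine(HexBytes(Encoding.UTF8, s));    // E4-B8-AD
        Console.WriteLine(HexBytes(Encoding.UTF32, s));   // 2D-4E-00-00
    }
}
```

The code point stays U+4E2D throughout; only the byte sequence changes with the encoding.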

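When I noted this down in 闪存 (the cnblogs microblog), @飞鸟_Asuka raised a very good question in a reply: "Then why are unicode and utf8 offered as two separate options when you choose an encoding?"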

In C#, System.Text.Encoding.Unicode and System.Text.Encoding.UTF8 are two distinct encodings. If UTF-8 is one way of implementing Unicode, why does C# expose Encoding.Unicode as an encoding that sits alongside UTF8?

Later I found the answer on *:

Windows handles so-called "Unicode" strings as UTF-16 strings, while most UNIXes default to UTF-8 these days.

So the default Unicode implementation on Windows is UTF-16, which is why Encoding.Unicode in C# is UTF-16.

The documentation comment on System.Text.Encoding.Unicode confirms this:

//
// Summary:
//     Gets an encoding for the UTF-16 format using the little endian byte order.
//
// Returns:
//     An encoding for the UTF-16 format using the little endian byte order.
public static Encoding Unicode { get; }

In C#, Encoding.Unicode = UTF-16 (little-endian byte order).
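This is easy to check at runtime. The sketch below compares the bytes produced by Encoding.Unicode, Encoding.BigEndianUnicode and Encoding.UTF8 for the same string; the class name EncodingUnicodeDemo and the helper HexBytes are made up for this example.

```csharp
using System;
using System.Text;

public static class EncodingUnicodeDemo
{
    // Hypothetical helper: encodes s with the given encoding and returns the bytes as hex.
    public static string HexBytes(Encoding encoding, string s) =>
        BitConverter.ToString(encoding.GetBytes(s));

    public static void Main()
    {
        const string s = "A\u4E2D"; // "A" followed by the character 中

        // Encoding.Unicode reports itself as UTF-16.
        Console.WriteLine(Encoding.Unicode.WebName);                // utf-16

        // Little-endian UTF-16: the low byte of each 16-bit unit comes first.
        Console.WriteLine(HexBytes(Encoding.Unicode, s));           // 41-00-2D-4E

        // Big-endian UTF-16 swaps the two bytes of each unit.
        Console.WriteLine(HexBytes(Encoding.BigEndianUnicode, s));  // 00-41-4E-2D

        // UTF-8 uses 1 byte for "A" and 3 bytes for 中.
        Console.WriteLine(HexBytes(Encoding.UTF8, s));              // 41-E4-B8-AD

        // An explicitly constructed little-endian UTF-16 encoding yields the same bytes.
        var utf16Le = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        Console.WriteLine(HexBytes(utf16Le, s) == HexBytes(Encoding.Unicode, s)); // True
    }
}
```

If a file or network peer expects UTF-8, passing Encoding.Unicode instead produces a different byte sequence, so the two options are not interchangeable even though both encode Unicode text.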
