Java一个汉字占几个字节(详解与原理)

1、先说重点:

不同的编码格式占字节数是不同的,UTF-8编码下一个中文所占字节也是不确定的,可能是2个、3个、4个字节;

2、以下是源码:

  @Test
    public void test1() throws UnsupportedEncodingException {
        String a = "名";
        System.out.println("UTF-8编码长度:"+a.getBytes("UTF-8").length);
        System.out.println("GBK编码长度:"+a.getBytes("GBK").length);
        System.out.println("GB2312编码长度:"+a.getBytes("GB2312").length);
        System.out.println("==========================================");

        String c = "0x20001";
        System.out.println("UTF-8编码长度:"+c.getBytes("UTF-8").length);
        System.out.println("GBK编码长度:"+c.getBytes("GBK").length);
        System.out.println("GB2312编码长度:"+c.getBytes("GB2312").length);
        System.out.println("==========================================");

        char[] arr = Character.toChars(0x20001);
        String s = new String(arr);
        System.out.println("char array length:" + arr.length);
        System.out.println("content:|  " + s + " |");
        System.out.println("String length:" + s.length());
        System.out.println("UTF-8编码长度:"+s.getBytes("UTF-8").length);
        System.out.println("GBK编码长度:"+s.getBytes("GBK").length);
        System.out.println("GB2312编码长度:"+s.getBytes("GB2312").length);
        System.out.println("==========================================");
    }

3、运行结果

UTF-8编码长度:3
GBK编码长度:2
GB2312编码长度:2
==========================================
UTF-8编码长度:4
GBK编码长度:1
GB2312编码长度:1
==========================================
char array length:2
content:|  												



	
	
上一篇:[vue]js模块导入导出export default


下一篇:Java一个汉字占几个字节(详解与原理)(转载)