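1. The key point first:
Different encodings use a different number of bytes per character. Even within UTF-8 the size of a Chinese character is not fixed: characters in the Basic Multilingual Plane (such as 名, U+540D) take 3 bytes, while supplementary-plane characters (such as U+20001) take 4 bytes. No Chinese character falls in the range that UTF-8 encodes in 2 bytes (U+0080 to U+07FF).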
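To make the byte counts concrete, here is a minimal sketch that measures how many bytes UTF-8 needs for a single code point. The class name `Utf8Width` and the helper `utf8Length` are made up for illustration and are not part of the original test; the sample code points are arbitrary picks from each width range.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Width {
    // Number of bytes UTF-8 uses for one (non-surrogate) Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII letter, Latin letter with accent, BMP Chinese character, supplementary CJK character
        int[] samples = {0x41, 0xE9, 0x540D, 0x20001};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %d byte(s)%n", cp, utf8Length(cp));
        }
    }
}
```

Using `StandardCharsets.UTF_8` instead of the charset name `"UTF-8"` avoids the checked `UnsupportedEncodingException`.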
2. The source code:
// Requires: import java.io.UnsupportedEncodingException; import org.junit.Test;
@Test
public void test1() throws UnsupportedEncodingException {
    // A Chinese character in the Basic Multilingual Plane (U+540D)
    String a = "名";
    System.out.println("UTF-8 encoded length:" + a.getBytes("UTF-8").length);
    System.out.println("GBK encoded length:" + a.getBytes("GBK").length);
    System.out.println("GB2312 encoded length:" + a.getBytes("GB2312").length);
    System.out.println("==========================================");

    // The supplementary character U+20001, written as its surrogate pair.
    // (The literal text "0x20001" would be seven ASCII characters, not this character.)
    String c = "\uD840\uDC01";
    System.out.println("UTF-8 encoded length:" + c.getBytes("UTF-8").length);
    System.out.println("GBK encoded length:" + c.getBytes("GBK").length);
    System.out.println("GB2312 encoded length:" + c.getBytes("GB2312").length);
    System.out.println("==========================================");

    // The same character built from its code point
    char[] arr = Character.toChars(0x20001);
    String s = new String(arr);
    System.out.println("char array length:" + arr.length);
    System.out.println("content:| " + s + " |");
    System.out.println("String length:" + s.length());
    System.out.println("UTF-8 encoded length:" + s.getBytes("UTF-8").length);
    System.out.println("GBK encoded length:" + s.getBytes("GBK").length);
    System.out.println("GB2312 encoded length:" + s.getBytes("GB2312").length);
    System.out.println("==========================================");
}
3. Output:
UTF-8 encoded length:3
GBK encoded length:2
GB2312 encoded length:2
==========================================
UTF-8 encoded length:4
GBK encoded length:1
GB2312 encoded length:1
==========================================
char array length:2
content:|
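The 1-byte GBK and GB2312 results for U+20001 do not mean the character fits in one byte. A likely explanation is that these charsets cannot represent it at all, so `getBytes` substitutes a replacement byte (typically `?`). The sketch below checks this with `CharsetEncoder.canEncode`; the class name `GbkFallback` and its helpers are hypothetical, and it assumes the running JDK includes the GBK charset.

```java
import java.nio.charset.Charset;

final class GbkFallback {
    // Assumption: the JDK in use ships GBK (part of the extended charsets).
    static final Charset GBK = Charset.forName("GBK");

    // True if every character of s has a mapping in GBK.
    static boolean canEncode(String s) {
        return GBK.newEncoder().canEncode(s);
    }

    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's default replacement bytes instead of throwing.
    static byte[] encode(String s) {
        return s.getBytes(GBK);
    }

    public static void main(String[] args) {
        String bmp = "\u540D"; // the character used in the test above
        String supplementary = new String(Character.toChars(0x20001));

        System.out.println("U+540D encodable in GBK: " + canEncode(bmp));
        System.out.println("U+20001 encodable in GBK: " + canEncode(supplementary));
        System.out.println("U+20001 GBK byte count: " + encode(supplementary).length);
        System.out.println("U+20001 GBK first byte as char: " + (char) encode(supplementary)[0]);
    }
}
```

When the byte count matters for real data, checking `canEncode` first is safer than trusting the length of a lossy encoding.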