My Take on Java: String
1. java.io.Serializable
2. Comparable<String>
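3. CharSequence: defines a uniform, read-only access contract for many different kinds of char sequences
* Since JDK 9 the string contents are stored in a byte[]; JDK 1.8 and earlier used a char[]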
private final byte[] value;
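* coder identifies the encoding used for this string: coder=0 means LATIN1, coder=1 means UTF16
* LATIN1 is an 8-bit character set that defines 256 characters. The first 128 are identical to ASCII, so it is a superset of ASCII
* UTF16 is a variable-length encoding: each character takes one or two 16-bit code units
* A separate utility class implements the String operations for each encoding: StringLatin1 for LATIN1 and StringUTF16 for UTF16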
private final byte coder;
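The choice between the two coders can be illustrated from outside the class. The following is a minimal sketch, not JDK code: `CoderSketch` and `fitsLatin1` are hypothetical names, and the check only mirrors the idea that a string can be stored one byte per char when every char is at most 0xFF.

```java
public class CoderSketch {
    // Hypothetical helper: true when every char fits in one byte (<= 0xFF),
    // i.e. when a compact LATIN1 representation would be possible.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("hello"));        // ASCII only: true
        System.out.println(fitsLatin1("caf\u00e9"));    // U+00E9 is within Latin-1: true
        System.out.println(fitsLatin1("\u4e2d\u6587")); // CJK chars need UTF16: false
    }
}
```

A single char above 0xFF is enough to force the whole string into the two-bytes-per-char form.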
/** Cache the hash code for the string */
private int hash; // Default to 0
/** use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = -6849794470754667710L;
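* If compaction is disabled, the bytes of every string are encoded as UTF16
* The following comment explains, from a JIT-optimization angle, why the value of COMPACT_STRINGS is not initialized directly at its declaration: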
* The instance field value is generally opaque to optimizing JIT
* compilers. Therefore, in performance-sensitive place, an explicit
* check of the static boolean {@code COMPACT_STRINGS} is done first
* before checking the {@code coder} field since the static boolean
* {@code COMPACT_STRINGS} would be constant folded away by an
* optimizing JIT compiler. The idioms for these cases are as follows.
* For code such as:
* if (coder == LATIN1) { ... }
* can be written more optimally as
* if (coder() == LATIN1) { ... }
* or:
* if (COMPACT_STRINGS && coder == LATIN1) { ... }
* An optimizing JIT compiler can fold the above conditional as:
* COMPACT_STRINGS == true => if (coder == LATIN1) { ... }
* COMPACT_STRINGS == false => if (false) { ... }
* @implNote
* The actual value for this field is injected by JVM. The static
* initialization block is used to set the value here to communicate
* that this static final field is not statically foldable, and to
* avoid any possible circular dependency during vm initialization.
static final boolean COMPACT_STRINGS;
static { COMPACT_STRINGS = true; }
* Class String is special cased within the Serialization Stream Protocol.
* A String instance is written into an ObjectOutputStream according to
* <a href="{@docRoot}/../specs/serialization/protocol.html#stream-elements">
* Object Serialization Specification, Section 6.2, "Stream Elements"</a>
private static final ObjectStreamField[] serialPersistentFields =
new ObjectStreamField[0];
@Native static final byte LATIN1 = 0;
@Native static final byte UTF16 = 1;
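The idiom described in the COMPACT_STRINGS comment can be tried out in a stand-alone class. This is only a sketch under stated assumptions: `CoderIdiomSketch` is a hypothetical class, its flag is an ordinary constant rather than a value injected by the JVM, and the two methods are modeled on the checks quoted in the comment.

```java
public class CoderIdiomSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Assigned in a static initializer, mirroring the field listing above.
    // Assumption: a plain constant stands in for the JVM-injected value.
    static final boolean COMPACT_STRINGS;
    static {
        COMPACT_STRINGS = true;
    }

    private final byte coder;

    CoderIdiomSketch(byte coder) {
        this.coder = coder;
    }

    // With compaction off, every string counts as UTF16 whatever the field says.
    byte coder() {
        return COMPACT_STRINGS ? coder : UTF16;
    }

    // Testing the static flag first lets an optimizing JIT remove the whole
    // branch when COMPACT_STRINGS is false.
    boolean isLatin1() {
        return COMPACT_STRINGS && coder == LATIN1;
    }

    public static void main(String[] args) {
        System.out.println(new CoderIdiomSketch(LATIN1).isLatin1()); // true
        System.out.println(new CoderIdiomSketch(UTF16).isLatin1());  // false
    }
}
```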
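1. getBytes() and its overloads
* getBytes() encodes this string into a byte array using the platform default charset (Charset.defaultCharset()), not a file-system encoding
* getBytes(charset) encodes this string into a byte array using the given charset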
public byte[] getBytes(String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null) throw new NullPointerException();
    return StringCoding.encode(charsetName, coder(), value);
}
public byte[] getBytes(Charset charset) {
    if (charset == null) throw new NullPointerException();
    return StringCoding.encode(charset, coder(), value);
}
public byte[] getBytes() {
    return StringCoding.encode(coder(), value);
}
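A short usage sketch of the three overloads follows. `GetBytesSketch` and its helper methods are hypothetical names introduced for illustration; the point is that the byte count depends on the target charset, not on how the string is stored internally.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GetBytesSketch {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies when encoded as ISO-8859-1 (Latin-1).
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String s = "caf\u00e9"; // "cafe" with an acute accent on the e

        System.out.println(utf8Length(s));   // prints 5: the accented char takes 2 bytes
        System.out.println(latin1Length(s)); // prints 4: one byte per char

        // The String-name overload throws the checked UnsupportedEncodingException
        // for unknown names, so the Charset overload is usually more convenient.
        System.out.println(s.getBytes("UTF-8").length); // prints 5

        // The no-arg overload depends on the platform default, so its result
        // can differ between machines.
        System.out.println(Charset.defaultCharset());
        System.out.println(s.getBytes().length);
    }
}
```

Passing an explicit charset keeps the result independent of the machine the code runs on.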
2. length()
* Returns the length of this string. For a LATIN1 string the length equals the length of the byte array; otherwise it is value.length >> 1, i.e. half the byte count
public int length() {
    return value.length >> coder();
}
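The shift can be checked by hand with a small sketch. `LengthSketch` and its `length` helper are hypothetical stand-ins for the private fields of String, and UTF_16BE is used only to obtain two bytes per char without a byte-order mark; it is an assumption of the example, not a claim about the byte order the JDK uses internally.

```java
import java.nio.charset.StandardCharsets;

public class LengthSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Hypothetical stand-in for String.length(): byte count shifted right by
    // the coder, so LATIN1 keeps the count and UTF16 halves it.
    static int length(byte[] value, byte coder) {
        return value.length >> coder;
    }

    public static void main(String[] args) {
        byte[] latin1 = "abc".getBytes(StandardCharsets.ISO_8859_1); // 3 bytes
        byte[] utf16 = "abc".getBytes(StandardCharsets.UTF_16BE);    // 6 bytes

        System.out.println(length(latin1, LATIN1)); // prints 3
        System.out.println(length(utf16, UTF16));   // prints 3

        // length() counts UTF-16 code units, so one supplementary character
        // (a surrogate pair) has length 2 but is a single code point.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.length());                          // prints 2
        System.out.println(emoji.codePointCount(0, emoji.length())); // prints 1
    }
}
```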
3. native intern()
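* When intern() is called and the string pool already contains a string equal to this String (as determined by equals), the string from the pool is returned.
* Otherwise, this String object is added to the pool and a reference to it is returned
* For any two strings a and b, a.intern() == b.intern() is true if and only if a.equals(b) is true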
public native String intern();
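The pooling behaviour can be observed with reference comparison. This is a small sketch; `InternSketch` and `sameAfterIntern` are hypothetical names used only for the illustration.

```java
public class InternSketch {
    // True when both strings resolve to the same pooled instance.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "hello";          // string literals are interned
        String copy = new String("hello"); // a distinct object on the heap

        System.out.println(literal == copy);                  // false: different objects
        System.out.println(literal.equals(copy));             // true: same contents
        System.out.println(literal == copy.intern());         // true: pooled instance
        System.out.println(sameAfterIntern(copy, "hel" + "lo")); // true
    }
}
```

Because `==` on interned strings agrees with `equals`, interning trades a pool lookup for cheap identity comparison afterwards.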
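1. StringLatin1: common operations for strings stored in the compact Latin1 encoding, such as indexOf, hashCode, replace, trim, strip, and compare
2. StringUTF16: the same common operations (indexOf, hashCode, replace, trim, strip, compare, and so on) for strings encoded as UTF16
3. StringCoding: the decode and encode operations used by String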