JVM Memory and Garbage Collection, Part 4: Strings

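String definition

Overview

  1. The String class is declared final: public final class String
  2. The value array field of String is also final: private final char value[]; in JDK 8, private final byte[] value; in JDK 9+
  3. A final class cannot be subclassed
  4. A final field can be assigned only once; for value this fixes the array reference, and String exposes no method that modifies the array contents
  5. JDK 8 stores characters in private final char value[]; JDK 9+ stores them in private final byte[] value; together with private final byte coder;
  6. Reason for the JDK 9+ design: most String instances contain only Latin-1 characters (Latin-1 is an alias of ISO-8859-1), which need 1 byte each, while a char takes 2 bytes, so storing them in a char[] wastes space
  7. In JDK 9+, a string whose characters are all Latin-1 is stored with 1 byte per character; otherwise the whole string is stored with 2 bytes per character (coder = LATIN1 or UTF16)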

JDK 8

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

JDK 9

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

    /**
     * The value is used for character storage.
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     *
     * Additionally, it is marked with {@link Stable} to trust the contents
     * of the array. No other facility in JDK provides this functionality (yet).
     * {@link Stable} is safe here, because value is never null.
     */
    @Stable
    private final byte[] value;

    /**
     * The identifier of the encoding used to encode the bytes in
     * {@code value}. The supported values in this implementation are
     *
     * LATIN1
     * UTF16
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     */
    private final byte coder;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;
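
The storage rule described above can be sketched as follows. This is only an illustration: fitsLatin1 and estimatedValueBytes are hypothetical helpers, not JDK APIs, and the estimate covers the backing array's payload only (no object or array headers).

```java
class CompactStringSketch {
    // Hypothetical helper (not a JDK API): a string can use the LATIN1 coder
    // only if every char fits in one byte, i.e. is <= 0xFF.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Estimated payload of the backing array. Without compact strings (JDK 8)
    // every char takes 2 bytes; with them (JDK 9+) Latin-1 content takes 1 byte.
    static int estimatedValueBytes(String s, boolean compactStrings) {
        int bytesPerChar = (compactStrings && fitsLatin1(s)) ? 1 : 2;
        return s.length() * bytesPerChar;
    }

    public static void main(String[] args) {
        String latin = "xcrj";
        // One non-Latin-1 character switches the whole string to 2 bytes per char.
        String mixed = "xcrj\u5b57";
        System.out.println(estimatedValueBytes(latin, false)); // 8
        System.out.println(estimatedValueBytes(latin, true));  // 4
        System.out.println(estimatedValueBytes(mixed, true));  // 10
    }
}
```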

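String properties

String immutability

Overview

  1. The String class is declared final: public final class String
  2. The value array field of String is final: private final char value[]; in JDK 8, private final byte[] value; in JDK 9+
  3. Java passes a reference variable by value (the reference is copied). Reassigning the parameter inside the method does not affect the caller's variable; modifying the fields of the object it refers to is visible to the caller

Code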

package xcrj;

/*
 * String immutability
 * Result:
 * the hash codes printed at 1, 2 and 3 are equal
 * */
public class VMStringFinal {
    public static void main(String[] args) {
        String str = "xcrj";
        System.out.println("1》" + str);
        System.out.println("1》" + str.hashCode());
        VMStringFinal vmsp = new VMStringFinal();
        vmsp.refTrans(str);
        // The reference was passed by value, so str is unchanged and still refers to "xcrj"
        System.out.println("2》" + str);
        System.out.println("2》" + str.hashCode());
    }

    public void refTrans(String str) {
        System.out.println("3》" + str);
        System.out.println("3》" + str.hashCode());
        str = "xcrj2";
        System.out.println("4》" + str);
        System.out.println("4》" + str.hashCode());
    }
}

Output

1》xcrj
1》3673699
3》xcrj
3》3673699
4》xcrj2
4》113884719
2》xcrj
2》3673699
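
The second half of point 3 (changes to the referenced object's state are visible to the caller) cannot be shown with String, because String has no mutating methods. A minimal sketch using StringBuilder instead; the class and method names here are made up for illustration:

```java
class RefPassingSketch {
    // Rebinds only the local copy of the reference; the caller's variable is untouched.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // Modifies the object that both the caller and the parameter refer to.
    static void mutate(StringBuilder sb) {
        sb.append("2");
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("xcrj");
        reassign(sb);
        System.out.println(sb); // xcrj
        mutate(sb);
        System.out.println(sb); // xcrj2
    }
}
```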

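String constant pool structure

Overview

  • The string constant pool (StringTable) is a fixed-size Hashtable, similar to the "open hashing (separate chaining)" structure from data-structure courses
  • Default size (number of buckets): 1009 in JDK 6 and early JDK 7 builds; 60013 from JDK 7u40 onward, including JDK 8
  • If the size is set too small, more strings hash to the same bucket, the chains grow longer, and intern() gets slower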

Figure: Hashtable using open hashing (separate chaining); image not preserved
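
To make the effect of the bucket count concrete, here is a toy chained hash table. It is a sketch only: ChainedPoolSketch is a hypothetical class and does not reflect how HotSpot implements its StringTable; it just shows that fewer buckets mean longer chains to scan on every lookup.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedPoolSketch {
    // Fixed number of buckets; each bucket is a chain of strings.
    private final List<LinkedList<String>> buckets = new ArrayList<>();

    ChainedPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Returns the pooled string equal to s, adding s to its chain if absent.
    String intern(String s) {
        LinkedList<String> chain = buckets.get(Math.floorMod(s.hashCode(), buckets.size()));
        for (String existing : chain) {
            if (existing.equals(s)) {
                return existing;
            }
        }
        chain.add(s);
        return s;
    }

    // Length of the longest chain, i.e. the worst-case scan for one lookup.
    int longestChain() {
        int max = 0;
        for (LinkedList<String> chain : buckets) {
            max = Math.max(max, chain.size());
        }
        return max;
    }

    public static void main(String[] args) {
        ChainedPoolSketch small = new ChainedPoolSketch(16);
        ChainedPoolSketch large = new ChainedPoolSketch(4096);
        for (int i = 0; i < 1000; i++) {
            small.intern("key" + i);
            large.intern("key" + i);
        }
        // With 1000 strings in 16 buckets, at least one chain holds 63 or more entries.
        System.out.println("16 buckets, longest chain: " + small.longestChain());
        System.out.println("4096 buckets, longest chain: " + large.longestChain());
    }
}
```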
Code

package xcrj;

/*
 * Run with -XX:StringTableSize=1009 first
 * Note: 1009 is the minimum accepted value
 * */
public class VMStringPoolSize {
    public static void main(String[] args) {
        try {
            Thread.sleep(1000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

Commands

# Compile
javac -d D:\workspace\idea\JVMDemo\blog\target\classes\ -encoding UTF-8 VMStringPoolSize.java
# Run
java -XX:StringTableSize=1009 -classpath D:\workspace\idea\JVMDemo\blog\target\classes\ xcrj.VMStringPoolSize

Result (screenshot not preserved)

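String memory

JDK 6

The string constant pool lives in the method area (permanent generation).

JDK 7+

The string constant pool lives in the heap.
Reason for the move: the method area (permanent generation) is garbage-collected infrequently, yet string constants are used very heavily, so unused strings lingered there; in the heap they are reclaimed sooner.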

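String creation

Literals

Code

String str = "xcrj";

Explanation

  • An anonymous String object is created in the string constant pool (unless an equal one is already there), and its reference is assigned to str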

new String

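Code

String str = new String("xcrj");

Explanation

  1. labelA: an anonymous String object is created in the string constant pool (unless an equal one is already there); it holds a value array
  2. labelB: a separate String object is created in the heap, with its own value field
  3. The heap object's value field is assigned the reference to the pooled object's value array, so both objects share one array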
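
The two objects described in the steps above can be told apart by comparing references. A minimal sketch (the class name and the compare helper are made up for illustration):

```java
import java.util.Arrays;

class NewStringSketch {
    // Returns { literal == created, literal.equals(created), literal == created.intern() }.
    static boolean[] compare() {
        String literal = "xcrj";             // the pooled object
        String created = new String("xcrj"); // a distinct heap object with equal content
        return new boolean[] {
            literal == created,
            literal.equals(created),
            literal == created.intern()
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, true, true]
    }
}
```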

Example

Code

package xcrj;

/*
 * Inspect string constants in the disassembled class file
 * */
public class VMStringConstPool {
    public static void main(String[] args) {
        String str = "xcrj";
        String str1 = "xcrj1";
        String str2 = "xcrj2";
    }
}

Commands

# Compile
javac -d D:\workspace\idea\JVMDemo\blog\target\classes\ VMStringConstPool.java
# Disassemble
javap -classpath D:\workspace\idea\JVMDemo\blog\target\classes\ -v xcrj.VMStringConstPool

Result (screenshots not preserved)

String operations

toString()

Overview

  • Each class implements toString() differently
  • String.toString() simply does return this;
  • StringBuilder.toString() does return new String(value, 0, count); so each call creates a new String
  • Object.toString() does return getClass().getName() + "@" + Integer.toHexString(hashCode()); this is a concatenation of non-constant operands, so each call builds a new String in the heap

Code

package xcrj;

public class VMStringToString {
    public static void main(String[] args) {
        Object obj = new Object();
        VMStringToString vmsts = new VMStringToString();
        vmsts.toStr(obj);
    }

    public void toStr(Object obj) {
        String str = obj.toString();
        String str1 = obj.toString();
        // false
        System.out.println(str == str1);
        String strIntern = str.intern();
        String str1Intern = str1.intern();
        // true
        System.out.println(strIntern == str1Intern);
    }
}

Commands

# Compile
javac -d D:\workspace\idea\JVMDemo\blog\target\classes\ -encoding UTF-8 VMStringToString.java
# Disassemble
javap -classpath D:\workspace\idea\JVMDemo\blog\target\classes\ -v xcrj.VMStringToString

Disassembly (screenshot not preserved)

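Concatenation

Overview

  • A string literal is a constant
  • String literals are placed in the string constant pool
  • The string constant pool never holds two strings with the same content
  • Concatenating constants with constants yields a result in the string constant pool (handled by compile-time constant folding)
  • If at least one operand is a variable, the result is a new object in the heap (built by StringBuilder)

Note: objects involved when a concatenation has at least one variable operand

  • The pooled objects for the string literals
  • Any new String() objects (none if the expression contains no new String())
  • One StringBuilder object
  • The String created by StringBuilder.toString() via new String(value, 0, count); (called once the appends are done)

Code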

package xcrj;

/*
 * String concatenation
 * */
public class VMStringBuilder {
    public static void main(String[] args) {
        // Concatenation of literal constants
        String str = "xc" + "rj";
        // Concatenation of final variables, which are also compile-time constants
        final String str1 = "xc";
        final String str2 = "rj";
        String str3 = str1 + str2;
        // true
        System.out.println(str == str3);
        // Concatenation involving a non-final variable
        String str4 = "xc";
        String str5 = str4 + "rj";
        // false
        System.out.println(str == str5);
    }
}

Commands

# Compile
javac -d D:\workspace\idea\JVMDemo\blog\target\classes\ -encoding UTF-8 VMStringBuilder.java
# Disassemble
javap -classpath D:\workspace\idea\JVMDemo\blog\target\classes\ -v xcrj.VMStringBuilder

Disassembly (screenshot not preserved)

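Analysis

  • A concatenation whose operands are all constants is folded at compile time, so both assignments load the same pooled constant: 0: ldc #2 // String xcrj and 3: ldc #2 // String xcrj
  • A concatenation with at least one variable operand is carried out by a StringBuilder object (this is what javac emits up to JDK 8; when targeting JDK 9+ it emits an invokedynamic call to StringConcatFactory instead)
  • Once the appends are done, the StringBuilder's toString() method is called
  • StringBuilder.toString() creates one new String object, whose reference is stored in the local variable table (LV) by an astore instruction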
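
Written out by hand, the StringBuilder sequence from the analysis looks roughly like this. It is a sketch of the equivalent source code under the JDK 8 assumption, not the exact bytecode; the class and method names are made up.

```java
class ConcatDesugarSketch {
    // Roughly what a JDK 8 javac generates for: left + "rj"
    static String concat(String left) {
        return new StringBuilder().append(left).append("rj").toString();
    }

    public static void main(String[] args) {
        String constant = "xc" + "rj"; // folded to "xcrj" at compile time, lives in the pool
        String built = concat("xc");   // new heap object created by toString()
        System.out.println(constant == built);          // false
        System.out.println(constant.equals(built));     // true
        System.out.println(constant == built.intern()); // true
    }
}
```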

intern()

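Overview

  • Ensures that only one copy of a given string content is kept in the pool
  • Returns the reference to the string held in the string constant pool
  • Pseudocode
if (the pool already contains a string equal to this one) {
	return the reference to the pooled string
} else {
	add this string to the pool
	return the reference to the pooled string
}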
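
The pseudocode maps almost directly onto Map.putIfAbsent. The following is a sketch only: InternSketch is a hypothetical user-level pool backed by a HashMap and says nothing about how the JVM's native implementation works.

```java
import java.util.HashMap;
import java.util.Map;

class InternSketch {
    // Maps each distinct content to the single instance kept for it.
    private final Map<String, String> pool = new HashMap<>();

    String intern(String s) {
        // putIfAbsent returns the existing value, or null if s was just added.
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        InternSketch pool = new InternSketch();
        String first = new String("xcrj");
        String second = new String("xcrj");
        System.out.println(pool.intern(first) == first);  // true: first was added to the pool
        System.out.println(pool.intern(second) == first); // true: the pooled instance is returned
    }
}
```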

Code

package xcrj;

public class VMStringIntern {
    public static void main(String[] args) {
        String str = "xcrj";
        String strIntern = str.intern();
        // true
        System.out.println(str == strIntern);

        String str1 = new String("xcrj1");
        String str1Const = "xcrj1";
        String str1Intern = str1.intern();
        // false
        System.out.println(str1 == str1Intern);
        // true
        System.out.println(str1Const == str1Intern);

        Object obj = new Object();
        String str2 = obj.toString();
        String str2Intern = str2.intern();
        // true on JDK 7+: no equal string is in the pool yet, so intern() records and returns str2 itself
        System.out.println(str2 == str2Intern);

        StringBuilder sb = new StringBuilder();
        sb.append("xcrj2");
        String str3 = sb.toString();
        String str3Intern = str3.intern();
        // false: the literal "xcrj2" is already in the pool, so intern() returns that object, not str3
        System.out.println(str3 == str3Intern);
    }
}

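Tuning

Java 8 / StringBuilder

Overview

  • Concatenation with at least one variable operand is handled by StringBuilder
  • When constructing a StringBuilder, use StringBuilder builder = new StringBuilder(capacity) with a suitable initial capacity (the size of the internal char[]; the no-arg constructor uses 16)
  • This avoids repeated expansion (new capacity = old capacity * 2 + 2), each of which allocates a new char[], copies the contents, and discards the old char[]
  • StringBuilder.toString() does new String(value, 0, count); which copies the characters once more

Source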

abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;

    /**
     * The count is the number of characters used.
     */
    int count;
	
	/**
     * For positive values of {@code minimumCapacity}, this method
     * behaves like {@code ensureCapacity}, however it is never
     * synchronized.
     * If {@code minimumCapacity} is non positive due to numeric
     * overflow, this method throws {@code OutOfMemoryError}.
     */
    private void ensureCapacityInternal(int minimumCapacity) {
        // overflow-conscious code
        if (minimumCapacity - value.length > 0) {
            value = Arrays.copyOf(value,
                    newCapacity(minimumCapacity));
        }
    }	
	
	/**
     * Returns a capacity at least as large as the given minimum capacity.
     * Returns the current capacity increased by the same amount + 2 if
     * that suffices.
     * Will not return a capacity greater than {@code MAX_ARRAY_SIZE}
     * unless the given minimum capacity is greater than that.
     *
     * @param  minCapacity the desired minimum capacity
     * @throws OutOfMemoryError if minCapacity is less than zero or
     *         greater than Integer.MAX_VALUE
     */
    private int newCapacity(int minCapacity) {
        // overflow-conscious code
        int newCapacity = (value.length << 1) + 2;
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
            ? hugeCapacity(minCapacity)
            : newCapacity;
    }
    
    @Override
    public String toString() {
        // Create a copy, don't share the array
        return new String(value, 0, count);
    }
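
The growth rule in newCapacity can be observed through the public capacity() method. A small sketch (class and helper names are made up); the values in the comments follow from the (value.length << 1) + 2 rule shown above and may differ on a JDK whose implementation uses another growth policy.

```java
class BuilderCapacitySketch {
    // Appends n characters one at a time and returns the resulting capacity.
    static int capacityAfterAppends(StringBuilder sb, int n) {
        for (int i = 0; i < n; i++) {
            sb.append('x');
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity()); // 16, the default
        // The 17th append exceeds 16, so the array grows to (16 << 1) + 2 = 34.
        System.out.println(capacityAfterAppends(new StringBuilder(), 17));   // 34
        // A builder sized up front never has to grow for the same input.
        System.out.println(capacityAfterAppends(new StringBuilder(64), 17)); // 64
    }
}
```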

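The final modifier

Overview

  • Declare string variables final where possible: a final variable initialized with a constant expression is itself a compile-time constant, so concatenations that use it are folded at compile time

intern() optimization for concatenated strings

Overview

  • JDK 6: the string constant pool is in the method area (permanent generation)
  • JDK 7+: the string constant pool is in the heap
  • JDK 7+: calling intern() on a concatenation result is optimized. Object A, created by the final StringBuilder.toString() call, is in the heap, and the pool is in the heap too, so when the pool holds no equal string, intern() records and returns the reference to A instead of copying it
    Code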
package xcrj;

public class VMStringInternjdk {
    public static void main(String[] args) {
        String str = new String("xcrj");
        String strIntern = str.intern();
        String str1 = "xcrj";
        /*
         * JDK 6: the string constant pool is in the method area (permanent generation)
         * JDK 7+: the string constant pool is in the heap
         * false on JDK 6/7/8
         * */
        System.out.println(str == str1);
        /*
         * JDK 6: the string constant pool is in the method area (permanent generation)
         * JDK 7+: the string constant pool is in the heap
         * true on JDK 6/7/8
         * */
        System.out.println(strIntern == str1);

        String strss = new String("xcrjxcrj");
        String strss1 = new String("xcrjxcrj");
        // false
        System.out.println(strss == strss1);
        String strssIntern = strss.intern();
        String strss1Intern = strss1.intern();
        // false
        System.out.println(strss == strss1);
        // true
        System.out.println(strssIntern == strss1Intern);

        String str2 = new String("xcrj2") + new String("xcrj2");
        str2.intern();
        String str3 = "xcrj2xcrj2";
        /*
         * JDK 6: the string constant pool is in the method area (permanent generation)
         * JDK 7+: the string constant pool is in the heap
         * JDK 6: false
         * JDK 7/8: true
         * */
        System.out.println(str2 == str3);

        String str4 = new String("xcrj4") + "xcrj4";
        str4.intern();
        String str5 = "xcrj4xcrj4";
        /*
         * JDK 6: the string constant pool is in the method area (permanent generation)
         * JDK 7+: the string constant pool is in the heap
         * JDK 6: false
         * JDK 7/8: true
         * */
        System.out.println(str4 == str5);
    }
}

Parameters

Category | Parameter | Purpose | Recommendation
String | -XX:StringTableSize | Size of the string constant pool hash table; default 1009 in JDK 6, 60013 in JDK 7+ | -

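Interview questions

  1. How many objects does new String("xcrj") create?
    Answer: 2 objects (assuming "xcrj" is not already in the string constant pool)
  2. How many objects does new String("xc") + new String("rj") create?
    Answer: 6 objects (assuming neither literal is already in the pool and the concatenation is compiled to StringBuilder, as on JDK 8)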
package xcrj;

/*
 * Interview questions
 * */
public class VMStringFace {
    public static void main(String[] args) {
        VMStringFace vmsf = new VMStringFace();
        vmsf.face1();
        vmsf.face2();
    }

    /*
     * How many objects does new String("xcrj") create?
     * 2 objects in total:
     * 1 anonymous object in the string constant pool
     * 1 object in the heap
     * */
    public void face1() {
        String str = new String("xcrj");
    }

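    /*
     * How many objects does new String("xc") + new String("rj") create?
     * 6 objects in total:
     * 2 anonymous objects in the string constant pool
     * 2 String objects in the heap
     * 1 StringBuilder object
     * 1 String object created in the heap by toString() once StringBuilder has finished appending
     * */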
    public void face2() {
        String str = new String("xc") + new String("rj");
    }
}

Disassembly (screenshot not preserved)
