Java类型简介

1 java基本数据类型

1.1 基本数据类型

java的基础数据类型有多少个,每个在内存的分配是多少呢?

类型 分配字节
byte 1
short 2
char 2
int 4
long 8
reference 4/8
array 4/8

引用类型分配的大小与平台类型相关,如32位系统为4字节,64位为8字节

基本上在java所有的类型都是在这6个类型的基础上衍化出来的,这是java世界的基石。

jvm会将内存划分为栈与堆区,栈为线程独有,堆为线程共享。如果在方法块内定义了基本数据类型,那这些类型都会被分配在栈中,因为它们的大小是固定的,而且在方法结束的时候被立刻回收,很方便内存管理。

1.2 数组类型

数组类型可以看作基本数据的组合,数组类型会被分配在栈中吗,可以作一个简单的测试:

public class HelloWorld {

    public static void main(String[] args) {
int[] a = new int[6];
int[] b = new int[10]; a=b;
System.out.println(a);
}
}// 输出 [I@15db9742

这里a起码分配了64字节,b分配了104字节,而将b赋值给a之后,将没有运行出错,说明数组不可能是分配在栈区,因为只有固定大小的变量才能分配在栈当中。数组可以看作引用类型的衍化类型。

数组在堆并不会只分配元素大小*数组size大小的字节,从常理看起码还需要包含元素类型的描述以及size的记录这两个元数据的空间。

2 java复合类型

由上一节,提到java所有类型都有基本类型衍化生成,这里将以Object,Integer,String三种类型来展示衍化的工作。

2.1 Object

来看一下Object的源码

public class Object {

    private static native void registerNatives();
static {
registerNatives();
} public final native Class<?> getClass(); public native int hashCode(); public boolean equals(Object obj) {
return (this == obj);
} protected native Object clone() throws CloneNotSupportedException; public String toString() {
return getClass().getName() + "@" + Integer.toHexString(hashCode());
} public final native void notify(); public final native void notifyAll(); public final native void wait(long timeout) throws InterruptedException; public final void wait(long timeout, int nanos) throws InterruptedException {
if (timeout < 0) {
throw new IllegalArgumentException("timeout value is negative");
} if (nanos < 0 || nanos > 999999) {
throw new IllegalArgumentException(
"nanosecond timeout value out of range");
} if (nanos > 0) {
timeout++;
} wait(timeout);
} public final void wait() throws InterruptedException {
wait(0);
} protected void finalize() throws Throwable { }
}

Object类里面,没有成员变量,只定义对应的几个方法,以及类初始化时的一个方法块。对于每个方法的具体作用,先按下不表。当Object 被new出一个实例时,可以简单地看作在栈区生成一个引用,那在堆区内会为实例分配出多少字节呢?理论上,Object没有成员变量,在堆应该是0字节,但是栈的引用又指向了堆了,表明堆肯定为此实例分配了内存。

答案是2个引用size的字节,即32位平台下分配8个字节,64位平台下分配16个字节,里面有两个引用,一个指向当前Object的类对象,另一个对向klass对象。这两个对象先按下不表,具体可以参见文章:new一个Object对象占用多少内存?

2.2 Integer类型

先看源码

import java.lang.annotation.Native;

public final class Integer extends Number implements Comparable<Integer> {
/**
* A constant holding the minimum value an {@code int} can
* have, -2<sup>31</sup>.
*/
@Native public static final int MIN_VALUE = 0x80000000; /**
* A constant holding the maximum value an {@code int} can
* have, 2<sup>31</sup>-1.
*/
@Native public static final int MAX_VALUE = 0x7fffffff; /**
* Parses the string argument as a signed integer in the radix
* specified by the second argument. The characters in the string
* must all be digits of the specified radix (as determined by
* whether {@link java.lang.Character#digit(char, int)} returns a
* nonnegative value), except that the first character may be an
* ASCII minus sign {@code '-'} ({@code '\u005Cu002D'}) to
* indicate a negative value or an ASCII plus sign {@code '+'}
* ({@code '\u005Cu002B'}) to indicate a positive value. The
* resulting integer value is returned.
*
* <p>An exception of type {@code NumberFormatException} is
* thrown if any of the following situations occurs:
* <ul>
* <li>The first argument is {@code null} or is a string of
* length zero.
*
* <li>The radix is either smaller than
* {@link java.lang.Character#MIN_RADIX} or
* larger than {@link java.lang.Character#MAX_RADIX}.
*
* <li>Any character of the string is not a digit of the specified
* radix, except that the first character may be a minus sign
* {@code '-'} ({@code '\u005Cu002D'}) or plus sign
* {@code '+'} ({@code '\u005Cu002B'}) provided that the
* string is longer than length 1.
*
* <li>The value represented by the string is not a value of type
* {@code int}.
* </ul>
*
* <p>Examples:
* <blockquote><pre>
* parseInt("0", 10) returns 0
* parseInt("473", 10) returns 473
* parseInt("+42", 10) returns 42
* parseInt("-0", 10) returns 0
* parseInt("-FF", 16) returns -255
* parseInt("1100110", 2) returns 102
* parseInt("2147483647", 10) returns 2147483647
* parseInt("-2147483648", 10) returns -2147483648
* parseInt("2147483648", 10) throws a NumberFormatException
* parseInt("99", 8) throws a NumberFormatException
* parseInt("Kona", 10) throws a NumberFormatException
* parseInt("Kona", 27) returns 411787
* </pre></blockquote>
*
* @param s the {@code String} containing the integer
* representation to be parsed
* @param radix the radix to be used while parsing {@code s}.
* @return the integer represented by the string argument in the
* specified radix.
* @exception NumberFormatException if the {@code String}
* does not contain a parsable {@code int}.
*/
public static int parseInt(String s, int radix)
throws NumberFormatException
{
/*
* WARNING: This method may be invoked early during VM initialization
* before IntegerCache is initialized. Care must be taken to not use
* the valueOf method.
*/ if (s == null) {
throw new NumberFormatException("null");
} if (radix < Character.MIN_RADIX) {
throw new NumberFormatException("radix " + radix +
" less than Character.MIN_RADIX");
} if (radix > Character.MAX_RADIX) {
throw new NumberFormatException("radix " + radix +
" greater than Character.MAX_RADIX");
} int result = 0;
boolean negative = false;
int i = 0, len = s.length();
int limit = -Integer.MAX_VALUE;
int multmin;
int digit; if (len > 0) {
char firstChar = s.charAt(0);
if (firstChar < '0') { // Possible leading "+" or "-"
if (firstChar == '-') {
negative = true;
limit = Integer.MIN_VALUE;
} else if (firstChar != '+')
throw NumberFormatException.forInputString(s); if (len == 1) // Cannot have lone "+" or "-"
throw NumberFormatException.forInputString(s);
i++;
}
multmin = limit / radix;
while (i < len) {
// Accumulating negatively avoids surprises near MAX_VALUE
digit = Character.digit(s.charAt(i++),radix);
if (digit < 0) {
throw NumberFormatException.forInputString(s);
}
if (result < multmin) {
throw NumberFormatException.forInputString(s);
}
result *= radix;
if (result < limit + digit) {
throw NumberFormatException.forInputString(s);
}
result -= digit;
}
} else {
throw NumberFormatException.forInputString(s);
}
return negative ? result : -result;
} /**
* Cache to support the object identity semantics of autoboxing for values between
* -128 and 127 (inclusive) as required by JLS.
*
* The cache is initialized on first usage. The size of the cache
* may be controlled by the {@code -XX:AutoBoxCacheMax=<size>} option.
* During VM initialization, java.lang.Integer.IntegerCache.high property
* may be set and saved in the private system properties in the
* sun.misc.VM class.
*/ private static class IntegerCache {
static final int low = -128;
static final int high;
static final Integer cache[]; static {
// high value may be configured by property
int h = 127;
String integerCacheHighPropValue =
sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
if (integerCacheHighPropValue != null) {
try {
int i = parseInt(integerCacheHighPropValue);
i = Math.max(i, 127);
// Maximum array size is Integer.MAX_VALUE
h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
} catch( NumberFormatException nfe) {
// If the property cannot be parsed into an int, ignore it.
}
}
high = h; cache = new Integer[(high - low) + 1];
int j = low;
for(int k = 0; k < cache.length; k++)
cache[k] = new Integer(j++); // range [-128, 127] must be interned (JLS7 5.1.7)
assert IntegerCache.high >= 127;
} private IntegerCache() {}
} /**
* Returns an {@code Integer} instance representing the specified
* {@code int} value. If a new {@code Integer} instance is not
* required, this method should generally be used in preference to
* the constructor {@link #Integer(int)}, as this method is likely
* to yield significantly better space and time performance by
* caching frequently requested values.
*
* This method will always cache values in the range -128 to 127,
* inclusive, and may cache other values outside of this range.
*
* @param i an {@code int} value.
* @return an {@code Integer} instance representing {@code i}.
* @since 1.5
*/
public static Integer valueOf(int i) {
if (i >= IntegerCache.low && i <= IntegerCache.high)
return IntegerCache.cache[i + (-IntegerCache.low)];
return new Integer(i);
} /**
* The value of the {@code Integer}.
*
* @serial
*/
private final int value; /**
* Constructs a newly allocated {@code Integer} object that
* represents the specified {@code int} value.
*
* @param value the value to be represented by the
* {@code Integer} object.
*/
public Integer(int value) {
this.value = value;
} }

Integer是一个非常简单的复合类型,只有一个不可变成员变量value,还有一个IntegerCache的内部静态类,其缓存[-128, 127]范围的Integer数组,使用valueOf生成的Integer实例的时候如果在此范围内使用缓存实例,减少内存的浪费。因为Integer类无法修改value变量(声明为final),并且本身也声明为final,防止了子类修改value的可能,因此可以放心使用缓存来节省内存。

2.3 String类型

接下来来啃一个硬骨头————String类。String经常被我们所使用,它自带了两种初始化方法,直接赋值字符串的分配在栈区,而使用new的将被分配在堆区,这里来看一下分配在堆区的类。

import java.util.Comparator;

/**
* The {@code String} class represents character strings. All
* string literals in Java programs, such as {@code "abc"}, are
* implemented as instances of this class.
* <p>
* Strings are constant; their values cannot be changed after they
* are created. String buffers support mutable strings.
* Because String objects are immutable they can be shared. For example:
* <blockquote><pre>
* String str = "abc";
* </pre></blockquote><p>
* is equivalent to:
* <blockquote><pre>
* char data[] = {'a', 'b', 'c'};
* String str = new String(data);
* </pre></blockquote><p>
* Here are some more examples of how strings can be used:
* <blockquote><pre>
* System.out.println("abc");
* String cde = "cde";
* System.out.println("abc" + cde);
* String c = "abc".substring(2,3);
* String d = cde.substring(1, 2);
* </pre></blockquote>
* <p>
* The class {@code String} includes methods for examining
* individual characters of the sequence, for comparing strings, for
* searching strings, for extracting substrings, and for creating a
* copy of a string with all characters translated to uppercase or to
* lowercase. Case mapping is based on the Unicode Standard version
* specified by the {@link java.lang.Character Character} class.
* <p>
* The Java language provides special support for the string
* concatenation operator (&nbsp;+&nbsp;), and for conversion of
* other objects to strings. String concatenation is implemented
* through the {@code StringBuilder}(or {@code StringBuffer})
* class and its {@code append} method.
* String conversions are implemented through the method
* {@code toString}, defined by {@code Object} and
* inherited by all classes in Java. For additional information on
* string concatenation and conversion, see Gosling, Joy, and Steele,
* <i>The Java Language Specification</i>.
*
* <p> Unless otherwise noted, passing a <tt>null</tt> argument to a constructor
* or method in this class will cause a {@link NullPointerException} to be
* thrown.
*
* <p>A {@code String} represents a string in the UTF-16 format
* in which <em>supplementary characters</em> are represented by <em>surrogate
* pairs</em> (see the section <a href="Character.html#unicode">Unicode
* Character Representations</a> in the {@code Character} class for
* more information).
* Index values refer to {@code char} code units, so a supplementary
* character uses two positions in a {@code String}.
* <p>The {@code String} class provides methods for dealing with
* Unicode code points (i.e., characters), in addition to those for
* dealing with Unicode code units (i.e., {@code char} values).
*
* @author Lee Boynton
* @author Arthur van Hoff
* @author Martin Buchholz
* @author Ulf Zibis
* @see java.lang.Object#toString()
* @see java.lang.StringBuffer
* @see java.lang.StringBuilder
* @see java.nio.charset.Charset
* @since JDK1.0
*/ public final class String
implements java.io.Serializable, Comparable<String>, CharSequence {
/**
* The value is used for character storage.
*/
private final char value[]; /**
* Initializes a newly created {@code String} object so that it represents
* an empty character sequence. Note that use of this constructor is
* unnecessary since Strings are immutable.
*/
public String() {
this.value = "".value;
} /**
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param original A {@code String}
*/
public String(String original) {
this.value = original.value;
} /**
* Returns the length of this string.
* The length is equal to the number of <a href="Character.html#unicode">Unicode
* code units</a> in the string.
*
* @return the length of the sequence of characters represented by this
* object.
*/
public int length() {
return value.length;
} /**
* Returns {@code true} if, and only if, {@link #length()} is {@code 0}.
*
* @return {@code true} if {@link #length()} is {@code 0}, otherwise
* {@code false}
* @since 1.6
*/
public boolean isEmpty() {
return value.length == 0;
} /**
* Returns the {@code char} value at the
* specified index. An index ranges from {@code 0} to
* {@code length() - 1}. The first {@code char} value of the sequence
* is at index {@code 0}, the next at index {@code 1},
* and so on, as for array indexing.
*
* <p>If the {@code char} value specified by the index is a
* <a href="Character.html#unicode">surrogate</a>, the surrogate
* value is returned.
*
* @param index the index of the {@code char} value.
* @return the {@code char} value at the specified index of this string.
* The first {@code char} value is at index {@code 0}.
* @throws IndexOutOfBoundsException if the {@code index}
* argument is negative or not less than the length of this
* string.
*/
public char charAt(int index) {
if ((index < 0) || (index >= value.length)) {
throw new StringIndexOutOfBoundsException(index);
}
return value[index];
} public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
if (anObject instanceof String) {
String anotherString = (String) anObject;
int n = value.length;
if (n == anotherString.value.length) {
char v1[] = value;
char v2[] = anotherString.value;
int i = 0;
while (n-- != 0) {
if (v1[i] != v2[i])
return false;
i++;
}
return true;
}
}
return false;
} /**
* A Comparator that orders {@code String} objects as by
* {@code compareToIgnoreCase}. This comparator is serializable.
* <p>
* Note that this Comparator does <em>not</em> take locale into account,
* and will result in an unsatisfactory ordering for certain locales.
* The java.text package provides <em>Collators</em> to allow
* locale-sensitive ordering.
*
* @see java.text.Collator#compare(String, String)
* @since 1.2
*/
public static final Comparator<String> CASE_INSENSITIVE_ORDER
= new CaseInsensitiveComparator(); private static class CaseInsensitiveComparator
implements Comparator<String>, java.io.Serializable {
// use serialVersionUID from JDK 1.2.2 for interoperability
private static final long serialVersionUID = 8575799808933029326L; public int compare(String s1, String s2) {
int n1 = s1.length();
int n2 = s2.length();
int min = Math.min(n1, n2);
for (int i = 0; i < min; i++) {
char c1 = s1.charAt(i);
char c2 = s2.charAt(i);
if (c1 != c2) {
c1 = Character.toUpperCase(c1);
c2 = Character.toUpperCase(c2);
if (c1 != c2) {
c1 = Character.toLowerCase(c1);
c2 = Character.toLowerCase(c2);
if (c1 != c2) {
// No overflow because of numeric promotion
return c1 - c2;
}
}
}
}
return n1 - n2;
} /**
* Replaces the de-serialized object.
*/
private Object readResolve() {
return CASE_INSENSITIVE_ORDER;
}
} /**
* Returns a canonical representation for the string object.
* <p>
* A pool of strings, initially empty, is maintained privately by the
* class {@code String}.
* <p>
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
* <p>
* It follows that for any two strings {@code s} and {@code t},
* {@code s.intern() == t.intern()} is {@code true}
* if and only if {@code s.equals(t)} is {@code true}.
* <p>
* All literal strings and string-valued constant expressions are
* interned. String literals are defined in section 3.10.5 of the
* <cite>The Java&trade; Language Specification</cite>.
*
* @return a string that has the same contents as this string, but is
* guaranteed to be from a pool of unique strings.
*/
public native String intern();
}

String的内存分配实质就是char[] ,并且是不可变的char[],因为value变量声明为final,并且没有对外set方法,而且String还被声明为final,这样杜绝了初始化之后再被修改的可能性。这种不可变性的好处之后再表,这里先按下。

从String自带的几个方法里面,最为有意思的是intern()方法,先看一下源码注释。

/**
* Returns a canonical representation for the string object.
* <p>
* A pool of strings, initially empty, is maintained privately by the
* class {@code String}.
* <p>
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
**/

简单地来看,就是一个string的缓存池,跟Integer的IntegerCache是一个意思,不过由于String的可变性比Integer大很多,实现方式也由java实现改为native实现。(不知道怎么吐槽好,java无法直接调用底层,也非自举,如果需要进一步了解native方法的实现,还得看c++的源码,真是一言难尽)

上一篇:分布式文件系统MooseFS安装步骤


下一篇:JavaScript字符串数组拼接的性能测试及优化方法