Changes to String in Java (from 1.7.0_06)


Before 1.7.0_06, String had four non-static fields:

  • char[] value
  • int offset
  • int count
  • int hash

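String.substring creates a String that shares the original String’s internal char[] value and only sets a different offset and count. This saves memory and makes String.substring run in constant time ($O(1)$). However, it can also cause a memory leak [1]: a short substring keeps the whole original char[] reachable, even after the original String is no longer needed.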

http://hg.openjdk.java.net/jdk6/jdk6/jdk/file/8deef18bb749/src/share/classes/java/lang/String.java

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence
{
    /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    // ...

    // Package private constructor which shares value array for speed.
    String(int offset, int count, char value[]) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // ...

    /**
     * Returns a new string that is a substring of this string. The
     * substring begins at the specified <code>beginIndex</code> and
     * extends to the character at index <code>endIndex - 1</code>.
     * Thus the length of the substring is <code>endIndex-beginIndex</code>.
     * <p>
     * Examples:
     * <blockquote><pre>
     * "hamburger".substring(4, 8) returns "urge"
     * "smiles".substring(1, 5) returns "mile"
     * </pre></blockquote>
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @param      endIndex     the ending index, exclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if the
     *             <code>beginIndex</code> is negative, or
     *             <code>endIndex</code> is larger than the length of
     *             this <code>String</code> object, or
     *             <code>beginIndex</code> is larger than
     *             <code>endIndex</code>.
     */
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > count) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        if (beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
        }
        return ((beginIndex == 0) && (endIndex == count)) ? this :
            new String(offset + beginIndex, endIndex - beginIndex, value);
    }

    // ...
}
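
To make the sharing behaviour in the listing above easier to see, here is a minimal sketch of a toy class that mimics the old field layout. The names SharedSubstringSketch, SharedString and trimmedCopy are made up for illustration and are not part of the JDK; the sketch only models how a tiny substring can keep a large backing array reachable.

```java
import java.util.Arrays;

class SharedSubstringSketch {
    /** Hypothetical toy model of the pre-1.7.0_06 String layout; NOT the JDK class. */
    static final class SharedString {
        final char[] value; // backing storage, possibly shared with other instances
        final int offset;   // first index of value that belongs to this string
        final int count;    // number of characters in this string

        SharedString(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        /** O(1): reuses the same backing array, like the old String.substring. */
        SharedString substring(int beginIndex, int endIndex) {
            return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
        }

        /** O(count): copies only the visible characters, so the big array is no longer referenced. */
        SharedString trimmedCopy() {
            return new SharedString(Arrays.copyOfRange(value, offset, offset + count), 0, count);
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        SharedString big = new SharedString(new char[1_000_000], 0, 1_000_000);
        SharedString small = big.substring(0, 4);

        // The 4-character view still holds the whole 1,000,000-char array.
        System.out.println(small.value.length);               // 1000000
        // After trimming, only 4 chars are retained.
        System.out.println(small.trimmedCopy().value.length); // 4
    }
}
```

On those older JDKs, a commonly cited workaround for the leak was to wrap the result, as in new String(s.substring(a, b)), which plays roughly the same role as trimmedCopy in this sketch.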

Since Java 1.7.0_06, the offset and count fields have been removed. String.substring now copies the requested range of value into a new array, so the memory leak goes away, but the running time becomes $O(N)$, where $N$ is the length of the substring.

http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/String.java

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    // ...

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     *
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

    // ...

    /**
     * Returns a string that is a substring of this string. The
     * substring begins at the specified {@code beginIndex} and
     * extends to the character at index {@code endIndex - 1}.
     * Thus the length of the substring is {@code endIndex-beginIndex}.
     * <p>
     * Examples:
     * <blockquote><pre>
     * "hamburger".substring(4, 8) returns "urge"
     * "smiles".substring(1, 5) returns "mile"
     * </pre></blockquote>
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @param      endIndex     the ending index, exclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if the
     *             {@code beginIndex} is negative, or
     *             {@code endIndex} is larger than the length of
     *             this {@code String} object, or
     *             {@code beginIndex} is larger than
     *             {@code endIndex}.
     */
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

    // ...

}
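
One practical consequence of the copying constructor above is that code which repeatedly chops the front off a string now pays for a copy on every step. The following is a minimal sketch, not a benchmark; the class and method names are made up for illustration. For a string of length $N$, the chopping loop copies $(N-1) + (N-2) + \dots + 0 = N(N-1)/2$ characters in total, while the index-based loop copies none.

```java
class SubstringCostSketch {
    /**
     * Counts occurrences of c by repeatedly dropping the first character.
     * With the copying substring (1.7.0_06 and later), each step copies
     * the remaining characters, so the whole loop is O(N^2).
     */
    static int countByChopping(String s, char c) {
        int n = 0;
        while (!s.isEmpty()) {
            if (s.charAt(0) == c) {
                n++;
            }
            s = s.substring(1); // allocates and copies the rest of the string
        }
        return n;
    }

    /** Same result with a moving index: no copies, O(N) overall. */
    static int countByIndex(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "hamburger";
        System.out.println(countByChopping(s, 'r')); // 2
        System.out.println(countByIndex(s, 'r'));    // 2
    }
}
```

Both methods return the same answer; the difference is only in how much copying happens along the way, which is why index-based scanning tends to be the safer default after this change.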

The author’s comment [2].