Strings, StringBuilder, StringBuffer
Why strings are immutable, the most useful String methods, and when to reach for a builder.
You have used String since your very first program, the moment you printed a line of text. It has felt like just another kind of value, sitting in a variable like an int does. This session looks underneath that comfort, because a Java string behaves in one surprising way that catches almost everyone, and understanding it changes how you write text-handling code for good.
In this session you will learn to:
- Explain why a String is immutable, and predict why a method like toUpperCase() does not change the original.
- Use length, charAt, substring, indexOf, contains, replace, trim, and the case methods.
- Compare strings with equals, and say exactly why == is the wrong tool for text.
- Recognise when to build text with StringBuilder instead.
- Say what the StringBuffer sibling adds, and roughly when you would ever want it.

Start with a question that sounds like it has an obvious answer. We have a string in lower case and we want it in upper case. So we call toUpperCase() on it. What is the original string after that? Read the program, predict the output, then run it.
public class Main {
    public static void main(String[] args) {
        String greeting = "hello";
        greeting.toUpperCase();
        System.out.println(greeting);
    }
}

Output:
hello
It still prints hello. The call to toUpperCase() did happen, but the upper-case text it produced went nowhere, because we never caught it. A String method does not reach into the original text and rewrite it. It cannot. Once a Java string exists, the characters inside it can never be altered. That property has a name: a String is immutable, which simply means “not changeable”.
So what does toUpperCase() give back? A brand new string, separate from the original. If you want it, you have to store it. That is the whole fix.
public class Main {
    public static void main(String[] args) {
        String greeting = "hello";
        String shout = greeting.toUpperCase();
        System.out.println(greeting);
        System.out.println(shout);
    }
}

Output:
hello
HELLO
Now you can see both. greeting is untouched at hello. shout holds the new string HELLO that toUpperCase() built and returned. There are two strings in memory, not one string that got edited.
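A related point that is easy to miss: reassigning a variable is not the same as changing a string. The sketch below is illustrative only (the class name ReassignSketch, the helper shout, and the variable alias are assumptions, not part of the lesson). It keeps a second reference to the original text, then points greeting at the upper-case result.

```java
class ReassignSketch {
    // Hypothetical helper: returns an upper-case copy; the argument is never modified.
    static String shout(String text) {
        return text.toUpperCase();
    }

    public static void main(String[] args) {
        String greeting = "hello";
        String alias = greeting;          // second reference to the same "hello" object

        // Capture the result by pointing the variable at the new string.
        greeting = shout(greeting);

        System.out.println(greeting);     // HELLO
        System.out.println(alias);        // hello
    }
}
```

The variable greeting now refers to a different string, but the original hello is still intact, which is why alias still prints it.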
Every String method that “changes” a string actually returns a new string and leaves the original exactly as it was. If you do not capture the return value, the work is thrown away.

In Python, "hi".upper() also returns a new string and leaves the original alone; strings are immutable there too. The trap is muscle memory from lists. A Python list can be sorted in place; a string never can. Java draws the same line, just more strictly, and the compiler will not warn you when you forget to keep the result.

A string carries a small toolkit of methods. None of them change the string; each one reads it and returns either a number, a character, a boolean, or a new string. Here are the ones that earn their keep. Walk through each program once and confirm each printed line against the output shown.
length() tells you how many characters a string has. charAt(i) hands you the single character at position i, counting from zero, exactly like an array index. The first character is at 0; the last is at length() - 1.
public class Main {
    public static void main(String[] args) {
        String word = "Hello, World";
        System.out.println(word.length());
        System.out.println(word.charAt(0));
        System.out.println(word.charAt(4));
        System.out.println(word.charAt(word.length() - 1));
    }
}

Output:
12
H
o
d
That prints 12, then H, then o, then d. Note that the comma and the space each count as one character toward the length of 12. A space is a real character, not nothing.
Calling word.charAt(12) on a 12-character string throws a StringIndexOutOfBoundsException at run time, because the last valid index is 11. This is the same off-by-one trap as walking off the end of an array. When you loop over a string, the condition is i < word.length(), never i <= word.length().

substring(start, end) returns the slice of text from index start up to but not including end. The character at end is left out. With a single argument, substring(start) goes from start all the way to the finish.
public class Main {
    public static void main(String[] args) {
        String word = "abcdef";
        System.out.println(word.substring(2, 5));
        System.out.println(word.substring(3));
    }
}

Output:
cde
def

The first line prints cde: the characters at indices 2, 3, and 4, stopping just before index 5. The second prints def, from index 3 onward. The “up to but not including the end” rule is the same half-open style you have already met, and it has a nice property: substring(a, b) always has length b - a.
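The half-open rule also makes it easy to take a prefix safely. Here is a minimal sketch, assuming a hypothetical helper named firstN (not a built-in String method) and assuming n is never negative. It clamps the end index so that substring is never asked to go past the end of the text.

```java
class SubstringSketch {
    // Hypothetical helper: returns at most the first n characters of text.
    // Assumes n >= 0; a negative n would still make substring throw.
    static String firstN(String text, int n) {
        int end = Math.min(n, text.length()); // clamp so end never exceeds length()
        return text.substring(0, end);        // indices 0 up to but not including end
    }

    public static void main(String[] args) {
        System.out.println(firstN("abcdef", 3));                // abc
        System.out.println(firstN("ab", 5));                    // ab
        System.out.println("abcdef".substring(2, 5).length());  // 3, which is 5 - 2
    }
}
```

Without the clamp, "ab".substring(0, 5) would throw, for the same reason charAt does when the index is out of range.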
indexOf(text) finds where a smaller piece of text first appears and returns its starting index, or -1 if it is not there at all. contains(text) answers the simpler yes-or-no question and returns a boolean.
public class Main {
    public static void main(String[] args) {
        String sentence = "Hello, World";
        System.out.println(sentence.indexOf("World"));
        System.out.println(sentence.indexOf("zebra"));
        System.out.println(sentence.contains("World"));
        System.out.println(sentence.contains("zebra"));
    }
}

Output:
7
-1
true
false

This prints 7, then -1, then true, then false. World begins at index 7. zebra is absent, so indexOf reports the sentinel value -1 and contains reports false.
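Because indexOf can come back as -1, code that feeds its result into substring should test for the sentinel first. The following is a sketch under assumptions: the helper name domainOf and the sample addresses are made up for illustration, and returning an empty string for "not found" is just one reasonable choice.

```java
class IndexOfSketch {
    // Hypothetical helper: returns the text after the first '@',
    // or an empty string when there is no '@' at all.
    static String domainOf(String address) {
        int at = address.indexOf("@");
        if (at == -1) {
            return "";                     // sentinel seen: nothing to slice
        }
        return address.substring(at + 1);  // everything after the '@'
    }

    public static void main(String[] args) {
        System.out.println(domainOf("ada@example.com"));       // example.com
        System.out.println(domainOf("no-at-sign").isEmpty());  // true
    }
}
```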
Watch for -1. The value is a deliberate code for “not found”. The usual pattern is to check it: if (sentence.indexOf("World") != -1) means “if the text is present”. Forgetting this and feeding a -1 straight into charAt or substring is a common way to crash a program.

toUpperCase() and toLowerCase() return a new string in that case. trim() returns a copy with leading and trailing whitespace removed, which is invaluable for text a user typed. replace(a, b) returns a copy with every occurrence of a swapped for b. Every one of these returns a new string and leaves the original untouched, just as the first section promised.
public class Main {
    public static void main(String[] args) {
        String messy = " Hello ";
        System.out.println(messy.trim());
        System.out.println(messy.toUpperCase());
        System.out.println("banana".replace("a", "o"));
        System.out.println(messy);
    }
}

Output:
Hello
 HELLO 
bonono
 Hello 
It prints Hello with the padding gone, then HELLO in upper case with the spaces still there, then bonono. The final line prints the original Hello , padding and all, proving once more that messy was never modified.
Method                          Returns
length()                        int, the number of characters
charAt(i)                       char at index i
substring(a, b)                 String, indices a up to but not b
indexOf(text)                   int, the start index or -1
contains(text)                  boolean, present or not
trim()                          String with edge spaces removed
replace(a, b)                   String with every a turned into b
toUpperCase() / toLowerCase()   String in that case

Real string work is usually a loop plus charAt. Here we count how many times the letter a appears in a word, the kind of small task you will write constantly.
public class Main {
    public static void main(String[] args) {
        String word = "banana";
        int count = 0;
        for (int i = 0; i < word.length(); i++) {
            if (word.charAt(i) == 'a') {
                count++;
            }
        }
        System.out.println("a appears " + count + " times");
    }
}

Output:
a appears 3 times

Step through it and watch count climb. banana has an a at indices 1, 3, and 5, so the final line is a appears 3 times. Notice we compared single characters with == and a char in single quotes. That is fine, because char is a number-like primitive. The next section is about why == is the wrong tool the moment you compare whole strings instead of single characters.
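The same loop-plus-charAt pattern handles other small text tasks. As one more illustration (the class and method names here are assumptions, not part of the lesson), this sketch checks whether a word reads the same in both directions by comparing characters from the two ends inward.

```java
class PalindromeSketch {
    // Hypothetical helper: true when word reads the same forwards and backwards.
    static boolean isPalindrome(String word) {
        int left = 0;
        int right = word.length() - 1;   // last valid index
        while (left < right) {
            // Comparing two chars with != is fine: char is a primitive.
            if (word.charAt(left) != word.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("level"));   // true
        System.out.println(isPalindrome("banana"));  // false
    }
}
```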
Here is the single most common Java mistake involving strings, and it is worth getting permanently right. To ask whether two strings hold the same text, you use equals, not ==.
public class Main {
    public static void main(String[] args) {
        String a = "cat";
        String b = "Cat";
        System.out.println(a.equals(b));
        System.out.println(a.equals("cat"));
        System.out.println(a.equalsIgnoreCase(b));
    }
}

Output:
false
true
true
This prints false, then true, then true. equals compares the actual characters, and it is case-sensitive: cat and Cat differ in the first letter, so it is false. When you genuinely do not care about case, equalsIgnoreCase compares as if both were lower-cased first, so cat and Cat come out equal.

So why not ==? Because for objects, and a String is an object, == does not ask “same text?”. It asks “same object?”. Recall the reference picture: a String variable holds the address of some text living in memory, not the text itself. == compares those addresses. Two strings can hold identical characters yet live at different addresses, and then == answers false even though the text matches perfectly.

== sometimes happens to give the right answer, so it can pass every test you tried and then fail in production on a string that arrived a different way (from a file, the keyboard, or a calculation). Do not rely on luck. For string content, the rule has no exceptions: use a.equals(b), or a.equalsIgnoreCase(b) when case should not matter.

The interpreter in these notes stores small strings simply and would print true for a casual ==, which hides the danger. So the program below is a read-along showing what real Java does. Trace it with the addresses-versus-characters idea in mind.
public class Main {
    public static void main(String[] args) {
        String a = "cat";
        String b = "cat";
        // Both came from the same literal, so Java reuses one object.
        System.out.println(a == b);      // true, same address (by luck)

        // 'new' forces a brand new object with its own address.
        String c = new String("cat");
        System.out.println(a == c);      // false, different address

        // equals looks at the characters, so it is always right.
        System.out.println(a.equals(c)); // true, same text
    }
}

Output:
true
false
true
Look closely at the middle line. c spells the same word as a, yet a == c is false, because new String(...) deliberately makes a separate object at a separate address. The text is identical; the objects are not. equals ignores all of that and compares the characters, which is exactly why it is the answer you want.

== asks “is it the same object?”. equals asks “is it the same text?”. For strings you almost always mean the second, so reach for equals every time.
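One practical habit follows from this. When comparing against a known piece of text, calling the method on the literal means a missing (null) string simply compares as not equal instead of crashing. The sketch below assumes a hypothetical helper isYes; the name and the sample inputs are illustrative only.

```java
class EqualsSketch {
    // Hypothetical helper: true when the answer is "yes" in any mix of upper and lower case.
    // Calling the method on the literal means a null answer returns false
    // rather than throwing a NullPointerException.
    static boolean isYes(String answer) {
        return "yes".equalsIgnoreCase(answer);
    }

    public static void main(String[] args) {
        String typed = new String("YES");   // a separate object, as if it came from the keyboard
        System.out.println(isYes(typed));   // true
        System.out.println(isYes("no"));    // false
        System.out.println(isYes(null));    // false
    }
}
```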
Sometimes you do not want equal-or-not but before-or-after, for sorting words into dictionary order. compareTo returns a negative number if this string comes before the other, zero if they are equal, and a positive number if it comes after. You read the sign of the result, not its exact magnitude.
public class Main {
    public static void main(String[] args) {
        System.out.println("apple".compareTo("banana"));
        System.out.println("banana".compareTo("apple"));
        System.out.println("apple".compareTo("apple"));
    }
}

Output:
-1
1
0

This prints a negative number, then a positive number, then 0. apple comes before banana alphabetically, so the first result is negative. Reverse the two and the sign flips to positive. Identical strings give 0. The habit to build is to test the sign: if (x.compareTo(y) < 0) reads as “if x comes before y”.
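Testing the sign is all a simple "which word comes first" search needs. This is a minimal sketch, assuming a hypothetical helper named earliest and a non-empty array of lower-case words. compareTo compares character codes, so mixed-case input would put upper-case letters ahead of lower-case ones.

```java
class CompareToSketch {
    // Hypothetical helper: returns the word that comes first in compareTo order.
    // Assumes words contains at least one element.
    static String earliest(String[] words) {
        String best = words[0];
        for (int i = 1; i < words.length; i++) {
            if (words[i].compareTo(best) < 0) {   // words[i] comes before best
                best = words[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] fruit = { "pear", "apple", "fig" };
        System.out.println(earliest(fruit));   // apple
    }
}
```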
Immutability can feel like an inconvenience the first time it surprises you. So it is worth seeing that it is a deliberate gift, not an accident. Because a string can never change, it is completely safe to share. If two parts of a program point at the same text, neither one can rewrite it under the other's feet. There is a whole class of bugs that simply cannot happen.
Immutability has one real cost, and it shows up when you build a long string piece by piece in a loop. Suppose you join a hundred fragments with +=. Because the string cannot change, every single += must create a whole new string, copy everything built so far into it, then add the next bit. Do that a hundred times and you have built and thrown away ninety-nine intermediate strings. For a few pieces it does not matter. For thousands, in a tight loop, it is genuinely slow.
The interpreter in these notes does not model StringBuilder mutation, so the builder examples below are honest read-alongs with their real Java output shown as a caption. Type them into a real Java playground to watch them run.

public class Main {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3; i++) {
            sb.append("ab");
        }
        String result = sb.toString();
        System.out.println(result);
    }
}

Output:
ababab
The loop appends ab three times into the same buffer, never copying the earlier text. After the loop, toString() snapshots the buffer into a finished String, ababab, which we print. Compare the shape to the wasteful version: the only change is a buffer you mutate instead of a string you keep rebuilding, but the work done underneath is far smaller.
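A typical place this pattern appears is joining a list of pieces with a separator. The sketch below is illustrative (the helper name joinWithCommas is an assumption, not from the lesson): one builder, one append per piece, and a single toString() at the end.

```java
class JoinSketch {
    // Hypothetical helper: joins the parts with ", " between them.
    static String joinWithCommas(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", ");     // separator goes before every part except the first
            }
            sb.append(parts[i]);
        }
        return sb.toString();        // one finished, immutable String
    }

    public static void main(String[] args) {
        String[] names = { "Ada", "Grace", "Linus" };
        System.out.println(joinWithCommas(names));   // Ada, Grace, Linus
    }
}
```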
A StringBuilder brings a few more moves beyond append. They modify the buffer in place and return the same builder, so you can read them as commands rather than as new-string-producers.
public class Main {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        sb.insert(1, "XY");          // put "XY" starting at index 1
        System.out.println(sb);      // aXYbc

        StringBuilder other = new StringBuilder("abc");
        other.reverse();             // flip the buffer in place
        System.out.println(other);   // cba

        System.out.println("length is " + "abc".length());
    }
}

Output:
aXYbc
cba
length is 3
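Because each of these methods returns the same builder, the calls can also be chained. Here is a small sketch of that style; the helper name reversedInBrackets is made up for illustration.

```java
class ChainSketch {
    // Hypothetical helper: reverses the text and wraps it in square brackets.
    static String reversedInBrackets(String text) {
        return new StringBuilder(text)
                .reverse()          // flips the buffer in place
                .insert(0, "[")     // puts "[" at the front of the same buffer
                .append("]")        // adds "]" at the end
                .toString();        // snapshot into a finished String
    }

    public static void main(String[] args) {
        System.out.println(reversedInBrackets("abc"));   // [cba]
    }
}
```

Every step works on the one buffer; only the final toString() produces a String.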
- String: every “change” produces a new string.
- StringBuilder: append in a loop does almost no copying.
- Call toString() to get the finished String.

Use String for finished, settled text. Use StringBuilder while you are still assembling it piece by piece, especially inside a loop, then hand a plain String back out at the end.
There is a near-twin of StringBuilder called StringBuffer. Its methods are the same: append, insert, reverse, toString. The one difference is that StringBuffer is thread-safe, which means it is built to be used by more than one line of execution running at the same time without its buffer getting scrambled.

When two threads append to one unguarded buffer at the same moment, their writes can collide. StringBuffer guards its buffer against exactly that. We unpack what “at the same time” really means in the introduction to threads and why shared data needs guarding in thread synchronization. For now, the one-line summary is below.

The practical rule is simple. By default, build with StringBuilder. It is the faster of the two and almost all of your code, for a long time, runs on a single thread where the extra protection buys you nothing. Reach for StringBuffer only when several threads really do share one buffer, a situation you will recognise once the next sessions have taught you what a thread is.
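For readers who want a preview, here is a hedged sketch of the one situation where StringBuffer earns its place. It uses Thread, which this session has not introduced, so treat it as a read-along; the class name, the helper appendFromTwoThreads, and the count of 1000 are all assumptions chosen for illustration.

```java
class SharedBufferSketch {
    // Hypothetical helper: two threads each append perThread characters
    // to one shared StringBuffer; returns the final length.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('x');   // append is synchronized, so no write is lost
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();             // wait for both threads to finish
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(1000));   // 2000
    }
}
```

With a StringBuilder in the same role the result would not be guaranteed, which is the gap the thread-safe sibling closes.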
Try each one yourself first, then open the answer.
1. After String s = "quiet"; s.toUpperCase(); runs, what does System.out.println(s) print, and why?
2. Given String w = "programming";, what do w.length() and w.charAt(0) and w.substring(0, 4) each give?
3. Two strings a and b both contain the text java, but b was created with new String("java"). What does a == b give in real Java, and what does a.equals(b) give? Explain the gap.
4. What does "cat".compareTo("dog") tell you, just from its sign? And "dog".compareTo("dog")?
5. Inside a loop that runs thousands of times, why is String result = ""; result += piece; a poor choice, and what should you use instead?

Take these away. They continue exactly what we just did.
1. Write a method countVowels(String text) that returns how many vowels (a e i o u, lower case) the text contains. Loop with charAt. Test it on a few words and confirm the counts by hand.
2. Using only length, charAt, and a loop, write code that builds the reverse of a word. Use a StringBuilder and append the characters from the last index down to 0, then print toString(). Run it on "hello" and confirm you get olleh.
3. Write a program that starts from a name with stray spaces, such as " Ada ", and prints a greeting with the spaces cleaned off and the name in a tidy form. Use trim and one case method. Explain in a sentence why the original string is unchanged after both calls.
4. Explain why Java reusing one object for every "cat" literal is safe, and why that same reuse would be dangerous if strings could be changed.