Search and Replace

Splitting Strings

Splitting a string with a regular expression is more flexible than splitting on a fixed delimiter. The String.split() method takes a regular expression as its parameter. Consider the following code:

java
"a b c".split("\\s"); // { "a", "b", "c" }
"a b  c".split("\\s"); // { "a", "b", "", "c" }
"a, b ;; c".split("[\\,\\;\\s]+"); // { "a", "b", "c" }

Suppose we ask the user to enter a set of tags and then need to extract them. A suitable regular expression lets us tolerate sloppy input: runs of spaces and any mix of commas and semicolons are all treated as separators, so we get clean tags directly.
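As a minimal sketch of the tag scenario, the helper below (parseTags is a hypothetical name, not part of any library) strips leading separators before splitting. This step is an assumption about what "clean" means here: String.split() keeps an empty leading element when the input starts with a separator, while trailing empty elements are dropped automatically.

```java
import java.util.Arrays;

class TagSplitDemo {
    // Hypothetical helper: turn raw user input into an array of clean tags.
    static String[] parseTags(String input) {
        // Remove leading separators so split() does not return an empty first element.
        String cleaned = input.replaceFirst("^[,;\\s]+", "");
        if (cleaned.isEmpty()) {
            // "".split(...) would return a one-element array containing "".
            return new String[0];
        }
        // One or more commas, semicolons, or whitespace characters form one separator.
        return cleaned.split("[,;\\s]+");
    }

    public static void main(String[] args) {
        String[] tags = parseTags("  java, regex ;; split   string ");
        System.out.println(Arrays.toString(tags)); // [java, regex, split, string]
    }
}
```

Inside a character class, the comma and semicolon need no backslash, so [,;\\s]+ behaves the same as [\\,\\;\\s]+.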

Searching Strings

Regular expressions can also be used to search for strings. Here’s an example:

java
import java.util.regex.*;

public class Main {
    public static void main(String[] args) {
        String s = "the quick brown fox jumps over the lazy dog.";
        Pattern p = Pattern.compile("\\wo\\w");
        Matcher m = p.matcher(s);
        while (m.find()) {
            String sub = s.substring(m.start(), m.end());
            System.out.println(sub);
        }
    }
}

After obtaining the Matcher object, we do not call matches(), which tests whether the entire string matches the pattern and would return false here. Instead, we call find() repeatedly to locate each substring that matches the pattern \wo\w (written "\\wo\\w" in a Java string literal) and print it. This is much more flexible than String.indexOf(), because the pattern describes a rule rather than a fixed string: three characters, the middle one being o, with a word character ([A-Za-z0-9_]) on each side.
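The difference between a whole-string match and a repeated search can be sketched as follows. The findAll helper is a hypothetical name introduced for illustration; it uses Matcher.group(), which returns the same text as s.substring(m.start(), m.end()).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: collect every substring of input that matches regex.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // group() is the text of the current match.
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = "the quick brown fox jumps over the lazy dog.";
        // find() scans for matches anywhere in the string.
        System.out.println(findAll("\\wo\\w", s)); // [row, fox, dog]
        // matches() requires the whole string to match the pattern.
        System.out.println(Pattern.matches("\\wo\\w", s)); // false
    }
}
```

Note that "over" produces no match: its o is preceded by a space, which is not a word character.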

Replacing Strings

To replace strings using regular expressions, you can call String.replaceAll(), with the first parameter being the regular expression and the second parameter being the replacement string. Here’s another example:

java
// regex
public class Main {
    public static void main(String[] args) {
        String s = "The     quick\t\t brown   fox  jumps   over the  lazy dog.";
        String r = s.replaceAll("\\s+", " ");
        System.out.println(r); // "The quick brown fox jumps over the lazy dog."
    }
}

The code above converts a sentence with irregular consecutive spaces into a standardized sentence. As we can see, using regular expressions flexibly can significantly reduce the amount of code.
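String.replaceAll(regex, replacement) is documented as equivalent to Pattern.compile(regex).matcher(str).replaceAll(replacement), so the pattern is compiled on every call. When the same replacement runs many times, one option is to compile the pattern once and reuse it. The sketch below assumes a hypothetical helper class named WhitespaceNormalizer.

```java
import java.util.regex.Pattern;

class WhitespaceNormalizer {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Hypothetical helper: collapse each run of whitespace into a single space.
    static String normalize(String input) {
        return WHITESPACE_RUN.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String s = "The     quick\t\t brown   fox  jumps   over the  lazy dog.";
        System.out.println(normalize(s)); // The quick brown fox jumps over the lazy dog.
    }
}
```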

Backreferences

If we want to replace a specified string according to certain rules, for example, adding <b>xxxx</b> around it, we can use backreferences in the second parameter of replaceAll(). For instance:

java
// regex
public class Main {
    public static void main(String[] args) {
        String s = "the quick brown fox jumps over the lazy dog.";
        String r = s.replaceAll("\\s([a-z]{4})\\s", " <b>$1</b> ");
        System.out.println(r);
    }
}

The output of the above code is:

the quick brown fox jumps <b>over</b> the <b>lazy</b> dog.

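It wraps each 4-letter lowercase word that has whitespace on both sides in <b>xxxx</b>. The key to this replacement is " <b>$1</b> ": in the replacement string, $1 is a backreference that is replaced with the substring captured by the first group, ([a-z]{4}). Because the pattern also matches the surrounding whitespace, the replacement string puts a space back on each side.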
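Because the pattern \s([a-z]{4})\s consumes the whitespace on both sides, it skips a 4-letter word at the start or end of the string, and also the second of two adjacent 4-letter words. One possible variation, shown here as a sketch rather than as the original author's approach, uses the zero-width word boundary \b instead. The class and method names are hypothetical.

```java
class BoldFourLetterWords {
    // Original pattern from the text: whitespace is part of the match.
    static String wrapWithSpaces(String input) {
        return input.replaceAll("\\s([a-z]{4})\\s", " <b>$1</b> ");
    }

    // Variation: \b matches a position, not a character, so no whitespace is consumed.
    static String wrapWithBoundaries(String input) {
        return input.replaceAll("\\b([a-z]{4})\\b", "<b>$1</b>");
    }

    public static void main(String[] args) {
        String s = "jump over lazy dogs";
        System.out.println(wrapWithSpaces(s));     // jump <b>over</b> lazy dogs
        System.out.println(wrapWithBoundaries(s)); // <b>jump</b> <b>over</b> <b>lazy</b> <b>dogs</b>
    }
}
```

On the sentence used in the text, both versions produce the same output, since "over" and "lazy" are each surrounded by spaces and are not adjacent.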

Exercise

A template engine defines a string as a template:

Hello, ${name}! You are learning ${lang}!

Here, ${key} represents a variable, meaning the content to be replaced.

When a Map<String, String> is passed to the template, the corresponding keys need to be replaced with the values from the map.

For example, if the passed map is:

java
{
    "name": "Bob",
    "lang": "Java"
}

Then ${name} should be replaced with the map's corresponding value Bob, and ${lang} should be replaced with the map's corresponding value Java, resulting in the final output:

Hello, Bob! You are learning Java!

Please write a simple template engine that uses regular expressions to implement this functionality.

Hint: Refer to the Matcher.appendReplacement() method.
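To illustrate the hinted API without solving the exercise, here is a minimal sketch on an unrelated task: doubling every number in a string. The class and method names are hypothetical, and the code assumes the numbers are small enough to fit in an int. appendReplacement() copies the text between the previous match and the current one, followed by the computed replacement; appendTail() copies whatever remains after the last match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {
    // Hypothetical helper: replace each run of digits with twice its value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // quoteReplacement() escapes $ and \ so the value is inserted literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(doubled)));
        }
        // Copy the text that follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // 6 apples and 24 pears
    }
}
```

Unlike String.replaceAll(), this loop lets each replacement be computed from the matched text, which is the capability the exercise relies on.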

Summary

Using regular expressions allows you to:

  • Split strings: String.split()
  • Search for substrings: Matcher.find()
  • Replace strings: String.replaceAll()