
Regular Expressions

Before understanding regular expressions, let's look at some very common problems:

  1. How to determine if a string is a valid phone number? For example: 010-1234567, 123ABC456, 13510001000, etc.
  2. How to determine if a string is a valid email address? For example: test@example.com, test#example, etc.
  3. How to determine if a string is a valid time? For example: 12:34, 09:60, 99:99, etc.

The intuitive approach is to write code for each check: work out the rules for each case, then implement them. Below is code to validate a mobile number:

java
boolean isValidMobileNumber(String s) {
    // Is it 11 digits?
    if (s.length() != 11) {
        return false;
    }
    // Are all characters between 0 and 9?
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c < '0' || c > '9') {
            return false;
        }
    }
    return true;
}

The code above is only a rough check. It ignores finer rules, such as requiring that the first digit not be 0.

Besides validating mobile numbers, we also need to validate email addresses, phone numbers, zip codes, etc.:

java
boolean isValidMobileNumber(String s) { ... }
boolean isValidEmail(String s) { ... }
boolean isValidPhoneNumber(String s) { ... }
boolean isValidZipCode(String s) { ... }
...

Writing separate code for every validation rule is cumbersome. Is there a simpler method?

Yes! Use regular expressions!

Regular expressions allow you to describe rules using strings and use them to match other strings. For example, to validate a mobile number, we can use the regular expression \d{11}:

java
boolean isValidMobileNumber(String s) {
    return s.matches("\\d{11}");
}
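
One detail worth noting: String.matches succeeds only when the whole string fits the pattern, so no separate length check is needed. The sketch below runs the one-line validator against sample inputs from the start of this section, plus one 12-digit string. The class name MobileNumberDemo is made up for illustration.

```java
class MobileNumberDemo {
    // Same one-line check as above: exactly 11 digits and nothing else.
    static boolean isValidMobileNumber(String s) {
        return s.matches("\\d{11}");
    }

    public static void main(String[] args) {
        System.out.println(isValidMobileNumber("13510001000"));  // true
        System.out.println(isValidMobileNumber("010-1234567"));  // false: contains '-'
        System.out.println(isValidMobileNumber("123ABC456"));    // false: letters, only 9 characters
        System.out.println(isValidMobileNumber("135100010001")); // false: 12 digits
    }
}
```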

Benefits of Using Regular Expressions

What are the benefits of using regular expressions? A regular expression is a string that describes a rule, so by writing the correct rule, we can let the regular expression engine determine whether the target string conforms to the rule.

Regular expression syntax is broadly the same across programming languages, although details differ from one engine to another. The Java standard library includes a regular expression engine in the java.util.regex package, making it very simple to use regular expressions in Java programs.
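
As a rough illustration of using the package directly, the sketch below compiles a pattern once with Pattern.compile and reuses it for several inputs. String.matches compiles its pattern on every call, so precompiling may help when the same rule is applied many times. The class name PrecompiledPatternDemo and the constant ELEVEN_DIGITS are hypothetical.

```java
import java.util.regex.Pattern;

class PrecompiledPatternDemo {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern ELEVEN_DIGITS = Pattern.compile("\\d{11}");

    static boolean isValidMobileNumber(String s) {
        // Matcher.matches() requires the entire input to fit the pattern.
        return ELEVEN_DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"13510001000", "010-1234567", "123ABC456"}) {
            System.out.println(s + " -> " + isValidMobileNumber(s));
        }
    }
}
```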

Example

For instance, to determine whether a user-input year is in the format 20##, we first outline the rule as follows:

  • There are four characters: 2, 0, any digit from 0 to 9, and any digit from 0 to 9.

The corresponding regular expression is 20\d\d, where \d represents any single digit.

Written as a Java string literal, the regular expression becomes "20\\d\\d". In a Java string literal the backslash is itself an escape character, so each \ in the regular expression must be written as \\.
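
To see the escaping at work, the small sketch below prints the string and its length. The compiler turns each \\ in the source into a single backslash, so the regular expression engine receives the six characters 20\d\d. The class name EscapeDemo and the constant REGEX are made up for this example.

```java
class EscapeDemo {
    // Source-code form: each backslash of the regex is escaped as \\.
    static final String REGEX = "20\\d\\d";

    public static void main(String[] args) {
        System.out.println(REGEX);          // prints 20\d\d
        System.out.println(REGEX.length()); // prints 6
    }
}
```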

Finally, the code to match a string against the regular expression is as follows:

java
// regex
public class Main {
    public static void main(String[] args) {
        String regex = "20\\d\\d";
        System.out.println("2019".matches(regex)); // true
        System.out.println("2100".matches(regex)); // false
    }
}

As seen above, using regular expressions eliminates the need to write complex validation code. You simply provide a string that describes the rule, and the regular expression engine handles the matching.
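
The time-validation question from the start of this section can be handled in the same way. The sketch below is one possible rule for a 24-hour HH:MM time, not the only one. It relies on two pieces of syntax that this section has not introduced: character ranges such as [0-5] and alternation with |. The class and method names are hypothetical.

```java
class TimeCheckDemo {
    // Assumed rule: hours 00-23, a colon, minutes 00-59.
    // [01]\d covers 00-19, 2[0-3] covers 20-23, [0-5]\d covers 00-59.
    static boolean isValidTime(String s) {
        return s.matches("([01]\\d|2[0-3]):[0-5]\\d");
    }

    public static void main(String[] args) {
        System.out.println(isValidTime("12:34")); // true
        System.out.println(isValidTime("09:60")); // false: minutes out of range
        System.out.println(isValidTime("99:99")); // false: hours out of range
    }
}
```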

Summary

  • Regular Expressions: A regular expression is a string that describes a matching rule. It lets you quickly determine whether a given string matches that rule.

  • Java Support: The Java standard library's java.util.regex package includes a built-in regular expression engine, making it easy to use regular expressions in Java programs.

By leveraging regular expressions, you can efficiently handle complex string matching and validation scenarios without the need for extensive and repetitive code.
