The java.util.regex.Pattern.compile() methods have a significant performance cost and should therefore be used sparingly. They are also the only mechanism available to create instances of the Pattern class, which are necessary for any pattern matching with regular expressions. Unfortunately, this compilation can be hidden behind convenience methods such as String.matches() or String.split().
It is therefore easy to inadvertently compile the same regular expression repeatedly, at a significant performance cost and for no valid reason.
This rule raises an issue when:
- A Pattern is compiled from a String literal or constant and is not stored in a static final reference.
- String.matches, String.split, String.replaceAll or String.replaceFirst is invoked with a String literal or constant. In that case, the code should be refactored to use a java.util.regex.Pattern while respecting the previous rule.
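The noncompliant example in this rule covers Pattern.compile and String.matches only. For the other convenience methods, a minimal sketch of the same refactoring might look like the following. The class name CsvFields, its method names, and the two regular expressions are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the refactoring.
final class CsvFields {
    // Each pattern is compiled once, when the class is initialized,
    // and then reused by every call.
    private static final Pattern SEPARATOR = Pattern.compile("\\s*,\\s*");
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    private CsvFields() {
        // Static helpers only; not meant to be instantiated.
    }

    // Same result as line.split("\\s*,\\s*"), without recompiling the regex.
    static String[] fields(String line) {
        return SEPARATOR.split(line);
    }

    // Same result as text.replaceAll("\\s+", " "), without recompiling the regex.
    static String collapseWhitespace(String text) {
        return WHITESPACE_RUN.matcher(text).replaceAll(" ");
    }
}
```

Pattern instances are immutable and safe to share between threads, which is why a static final field works here; Matcher instances are not, so a new one is created on each call.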
Noncompliant code example
public void doingSomething(String stringToMatch) {
  Pattern regex = Pattern.compile("myRegex"); // Noncompliant
  Matcher matcher = regex.matcher("s");
  // ...
  if (stringToMatch.matches("myRegex2")) { // Noncompliant
    // ...
  }
}
Compliant solution
private static final Pattern myRegex = Pattern.compile("myRegex");
private static final Pattern myRegex2 = Pattern.compile("myRegex2");

public void doingSomething(String stringToMatch) {
  Matcher matcher = myRegex.matcher("s");
  // ...
  if (myRegex2.matcher(stringToMatch).matches()) {
    // ...
  }
}
Exceptions
String.split doesn’t create a regex when the string passed as argument meets either of these two conditions:
- It is a one-char String and this character is not one of the regex metacharacters ".$|()[{^?*+\"
- It is a two-char String whose first char is a backslash and whose second char is not an ASCII digit or ASCII letter.
In these cases, no issue is raised.
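The two conditions above can be sketched as a small predicate. This is only an illustration of the rule text: the class SplitFastPath and the method avoidsRegex are hypothetical names, and the code is neither the JDK's implementation of String.split nor the analyzer's own check, either of which may apply further conditions.

```java
// Hypothetical helper mirroring the two exception conditions described above.
final class SplitFastPath {
    // The metacharacters listed in the first condition.
    private static final String META_CHARS = ".$|()[{^?*+\\";

    private SplitFastPath() {
        // Static helper only; not meant to be instantiated.
    }

    // Returns true when the separator meets one of the two conditions,
    // that is, when String.split is described as not creating a regex.
    static boolean avoidsRegex(String separator) {
        if (separator.length() == 1) {
            // Condition 1: a single char that is not a regex metacharacter.
            return META_CHARS.indexOf(separator.charAt(0)) == -1;
        }
        if (separator.length() == 2 && separator.charAt(0) == '\\') {
            // Condition 2: a backslash followed by a char that is
            // neither an ASCII digit nor an ASCII letter.
            char c = separator.charAt(1);
            boolean asciiDigit = c >= '0' && c <= '9';
            boolean asciiLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            return !asciiDigit && !asciiLetter;
        }
        return false;
    }
}
```

Under this sketch, separators such as "," and "\\." (an escaped dot) fall under the exception, while ".", "\\d" and ", " do not, so only the latter would be candidates for a precompiled Pattern.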