By default, case insensitivity only affects letters in the ASCII range. This can be changed either by passing Pattern.UNICODE_CASE or Pattern.UNICODE_CHARACTER_CLASS as an argument to Pattern.compile, or by using (?u) or (?U) within the regex.
Otherwise, regular expressions that contain non-ASCII letters will still match those letters case-sensitively, even when case-insensitive matching is enabled.
Noncompliant code example
Pattern.compile("söme pättern", Pattern.CASE_INSENSITIVE);
str.matches("(?i)söme pättern");
str.matches("(?i:söme) pättern");
Compliant solution
Pattern.compile("söme pättern", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
str.matches("(?iu)söme pättern");
str.matches("(?iu:söme) pättern");
// UNICODE_CHARACTER_CLASS implies UNICODE_CASE
Pattern.compile("söme pättern", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS);
str.matches("(?iU)söme pättern");
str.matches("(?iU:söme) pättern");
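The difference between the two variants can be checked with a small program. The following is a minimal sketch, not part of the rule itself: the class name UnicodeCaseDemo, the helper matchesWith, and the sample input are hypothetical and chosen only for illustration. The non-ASCII letters are written as Unicode escapes so that the file compiles regardless of the source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample strings are illustrative only.
final class UnicodeCaseDemo {
    // The regex from the examples above, with o-umlaut and a-umlaut as escapes.
    static final String REGEX = "s\u00f6me p\u00e4ttern";
    // The same text in upper case (capital O-umlaut and A-umlaut).
    static final String UPPER = "S\u00d6ME P\u00c4TTERN";

    // Compiles REGEX with the given flags and tests the whole input against it.
    static boolean matchesWith(int flags, String input) {
        return Pattern.compile(REGEX, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // ASCII-only case folding: the umlauts are compared case-sensitively.
        System.out.println(matchesWith(Pattern.CASE_INSENSITIVE, UPPER)); // false

        // Unicode-aware case folding via UNICODE_CASE.
        System.out.println(matchesWith(
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, UPPER)); // true

        // UNICODE_CHARACTER_CLASS implies UNICODE_CASE.
        System.out.println(matchesWith(
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS, UPPER)); // true

        // The embedded flag form behaves like the UNICODE_CASE variant.
        System.out.println(UPPER.matches("(?iu)" + REGEX)); // true
    }
}
```

With CASE_INSENSITIVE alone, the ASCII letters in the input still match regardless of case; only the non-ASCII letters cause the mismatch, which makes this kind of bug easy to miss in tests that use ASCII-only data.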