Overly complicated regular expressions are hard to read and maintain, and can easily cause hard-to-find bugs. If a regex is too complicated,
consider replacing it, or parts of it, with regular code, or at least splitting it into multiple patterns.
The complexity of a regular expression is determined as follows:
Each of the following operators increases the complexity by an amount equal to the current nesting level and also increases the current nesting
level by one for its arguments:
- | (when multiple | operators are used together, the subsequent ones only increase the complexity by 1)
- Quantifiers (*, +, ?, {n,m}, {n,} or {n})
- Non-capturing groups that set flags (such as (?i:some_pattern) or (?i)some_pattern)
- Lookahead and lookbehind assertions
Additionally, each use of the following features increases the complexity by 1 regardless of nesting:
- character classes
- back references
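As an illustration of the rules above, the following sketch tallies the complexity of a small pattern by hand. It assumes that the nesting level starts at 1 at the top of the pattern (the text above does not state the starting value) and that a plain non-capturing group without flags adds nothing by itself. The pattern and the tally list are made up for this example; under these assumptions the total comes to 6.

```python
# Pattern being scored: (?:a|b|c)*[0-9]+
# Each entry is (construct, increment), assuming the top nesting level is 1.
tally = [
    ("* quantifier at nesting level 1", 1),
    ("first | inside the quantified group, nesting level 2", 2),
    ("second | in the same alternation", 1),
    ("+ quantifier at nesting level 1", 1),
    ("character class [0-9], flat cost regardless of nesting", 1),
]

# The complexity is the sum of all increments.
complexity = sum(increment for _, increment in tally)
print(complexity)
```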
Noncompliant code example
p = re.compile(r"^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]))\1|(?:(?:29|30)(\/|-|\.)(?:0?[13-9]|1[0-2])\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)0?2\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9])|(?:1[0-2]))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$")
if p.match(dateString):
    handleDate(dateString)
Compliant solution
# Raw string, so that \d and the back reference \1 reach the regex engine unchanged
p = re.compile(r"^\d{1,2}([-/.])\d{1,2}\1\d{1,4}$")
if p.match(dateString):
    dateParts = re.split(r"[-/.]", dateString)
    day = int(dateParts[0])
    month = int(dateParts[1])
    year = int(dateParts[2])
    # Put logic to validate and process the date based on its integer parts here
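One possible way to fill in the validation step is to let the standard library check the calendar rules. The sketch below assumes a day-month-year order, as in the noncompliant pattern, and delegates the range and leap-year checks to datetime.date. The names DATE_SHAPE and parse_date are hypothetical, and the sketch does not reproduce every restriction of the original pattern (for instance, its limits on which years are accepted).

```python
import re
from datetime import date

# Simple shape check: day, separator, month, same separator, year
DATE_SHAPE = re.compile(r"^\d{1,2}([-/.])\d{1,2}\1\d{1,4}$")

def parse_date(date_string):
    """Return a datetime.date for a day-month-year string, or None if it is invalid."""
    if not DATE_SHAPE.match(date_string):
        return None
    day, month, year = (int(part) for part in re.split(r"[-/.]", date_string))
    try:
        # date() raises ValueError for impossible dates such as 31 April or 29 February 2021
        return date(year, month, day)
    except ValueError:
        return None
```

With this split, the regex only decides whether the string has the right shape, and the calendar logic lives in code that can be read and tested on its own.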