Python static code analysis

Unique rules to find Bugs, Vulnerabilities, Security Hotspots, and Code Smells in your PYTHON code

Filtered: 26 rules found

regex

Octal escape sequences should not be used in regular expressions (Code Smell)
Character classes in regular expressions should not contain only one character (Code Smell)
Superfluous curly brace quantifiers should be avoided (Code Smell)
Non-capturing groups without quantifier should not be used (Code Smell)
Regular expression quantifiers and character classes should be used concisely (Code Smell)
Regular expressions should not contain empty groups (Code Smell)
Replacement strings should reference existing regular expression groups (Bug)
Regular expressions should not contain multiple spaces (Code Smell)
Alternation in regular expressions should not contain empty alternatives (Bug)
Single-character alternations in regular expressions should be replaced with character classes (Code Smell)
Reluctant quantifiers in regular expressions should be followed by an expression that can't match the empty string (Code Smell)
Regex lookahead assertions should not be contradictory (Bug)
Back references in regular expressions should only refer to capturing groups that are matched before the reference (Bug)
Regex boundaries should not be used in a way that can never be matched (Bug)
Regex patterns following a possessive quantifier should not always fail (Bug)
Character classes in regular expressions should not contain the same character twice (Code Smell)
Unicode Grapheme Clusters should be avoided inside regex character classes (Bug)
Names of regular expressions named groups should be used (Code Smell)
Character classes should be preferred over reluctant quantifiers in regular expressions (Code Smell)
Regular expressions should be syntactically valid (Bug)
Regex alternatives should not be redundant (Bug)
Using slow regular expressions is security-sensitive (Security Hotspot)
Alternatives in regular expressions should be grouped when used with anchors (Bug)
Regular expressions should not be too complicated (Code Smell)
Repeated patterns in regular expressions should not match the empty string (Bug)
`str.replace` should be preferred to `re.sub` (Code Smell)

Unicode Grapheme Clusters should be avoided inside regex character classes

intentionality - logical

reliability

Bug

regex

Why is this an issue?

Placing Unicode Grapheme Clusters (characters that must be encoded as multiple code points) inside a character class of a regular expression will likely lead to unintended behavior.

For instance, the grapheme cluster c̈ requires two code points: one for 'c', followed by one for the combining umlaut modifier '\u{0308}'. If placed within a character class, such as [c̈], the regex treats the class as the enumeration [c\u{0308}] instead. It will therefore match every 'c' and every combining umlaut (that is, every umlaut not encoded as part of a single precomposed code point), which is extremely unlikely to be the intended behavior.

This rule raises an issue every time Unicode Grapheme Clusters are used within a character class of a regular expression.
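
To make the explanation above concrete, here is a minimal sketch that inspects the code points of the cluster and shows how a character class splits them. The cluster is written with an explicit `\u0308` escape so that the two code points are visible; the variable names are illustrative only.

```python
import re
import unicodedata

# 'c' followed by COMBINING DIAERESIS; rendered as a single glyph.
cluster = "c\u0308"

# The cluster is two code points, even though it displays as one character.
code_points = [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in cluster]
print(len(cluster))  # 2
print(code_points)

# Inside a character class, each code point becomes a separate alternative,
# so a plain 'c' and a lone combining mark both match on their own.
char_class = re.compile("[" + cluster + "]")
matches = char_class.findall("c" + cluster)
print(len(matches))  # 3: 'c', 'c', and the combining mark
```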

Noncompliant code example

import re
re.sub(r"[c̈d̈]", "X", "cc̈d̈d")  # Noncompliant: returns "XXXXXX" instead of the expected "cXXd"

Compliant solution

import re
re.sub(r"c̈|d̈", "X", "cc̈d̈d")  # Returns "cXXd"
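
When several grapheme clusters need to be matched, the alternation can be built programmatically. The following is only a sketch of one possible approach: `alternation_of` is a hypothetical helper, not part of the `re` module or of the rule itself.

```python
import re

def alternation_of(clusters):
    """Build an alternation that matches each given string as a whole.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # Try longer strings first so a cluster wins over any shorter prefix of it.
    ordered = sorted(clusters, key=len, reverse=True)
    # re.escape guards against regex metacharacters inside the inputs.
    return "|".join(re.escape(c) for c in ordered)

# Same clusters as in the compliant solution, written with explicit escapes.
pattern = re.compile(alternation_of(["c\u0308", "d\u0308"]))
result = pattern.sub("X", "cc\u0308d\u0308d")
print(result)  # cXXd
```

Unlike the character class, this pattern leaves a bare 'c' or 'd' untouched, because each alternative only matches the base letter together with its combining mark.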

Available In:

Catch issues on the fly, in your IDE

Detect issues in your GitHub, Azure DevOps Services, Bitbucket Cloud, GitLab repositories

Analyze code in your on-premise CI
Available since 9.2

Developer Edition
Available Since
9.2