This rule is part of MISRA C++:2023.
Usage of this content is governed by Sonar’s terms and conditions. Redistribution is
prohibited.
Rule 5.13.2 - Octal escape sequences, hexadecimal escape sequences and universal character names shall be terminated
[lex.charset]
[lex.icon] Implementation 2, 3
Category: Required
Analysis: Decidable, Single Translation Unit
Amplification
An octal escape sequence, hexadecimal escape sequence or universal character name shall be terminated by either:
- The start of another escape sequence or universal character name; or
- The end of the character constant or the end of a string literal.
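The two termination conditions above can be sketched as a small check over the source spelling of a literal. This is an illustrative sketch only: the function name escapes_are_terminated is hypothetical, it is not part of MISRA or of any analysis tool, and it assumes a well-formed, non-raw C++17 literal body (the text between the quotes), ignoring the delimited escape forms added in later language revisions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if every octal escape, hexadecimal escape
// and universal character name in `body` is followed either by the start of
// another escape sequence or by the end of the literal.
// Assumption: `body` is the well-formed source text between the quotes.
bool escapes_are_terminated(std::string_view body) {
    auto is_hex = [](char c) { return std::isxdigit(static_cast<unsigned char>(c)) != 0; };
    auto is_oct = [](char c) { return c >= '0' && c <= '7'; };

    std::size_t i = 0;
    while (i < body.size()) {
        if (body[i] != '\\') { ++i; continue; }  // ordinary character
        ++i;                                       // skip the backslash
        if (i == body.size()) return true;         // malformed input, out of scope here
        const char kind = body[i];
        bool needs_termination = true;
        if (is_oct(kind)) {
            // Octal escape: one to three octal digits.
            for (std::size_t n = 0; n < 3 && i < body.size() && is_oct(body[i]); ++n) ++i;
        } else if (kind == 'x') {
            // Hexadecimal escape: consumes every hex digit that follows.
            ++i;
            while (i < body.size() && is_hex(body[i])) ++i;
        } else if (kind == 'u' || kind == 'U') {
            // Universal character name: exactly four (\u) or eight (\U) hex digits.
            const std::size_t digits = (kind == 'u') ? 4 : 8;
            ++i;
            for (std::size_t n = 0; n < digits && i < body.size() && is_hex(body[i]); ++n) ++i;
        } else {
            // Simple escape such as \n or \\: not covered by this rule.
            ++i;
            needs_termination = false;
        }
        // Terminated only by the end of the literal or by another backslash.
        if (needs_termination && i < body.size() && body[i] != '\\') return false;
    }
    return true;
}
```

Passing the spelling as a raw string keeps the backslashes intact, for example escapes_are_terminated(R"(\x41g)"). Adjacent literals such as "\x41" "g" would be checked one body at a time, which is why splitting the literal satisfies the rule.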
Rationale
There is potential for confusion if an octal escape sequence, hexadecimal escape sequence or universal character name is followed by other
characters. For example, the string literal "\x1f" is a single-character, zero-terminated string, whereas "\x1g" includes
the two characters '\x1' and 'g'. The potential for confusion is reduced if every octal escape sequence, hexadecimal escape
sequence or universal character name in a character constant or string literal is terminated.
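The difference between "\x1f" and "\x1g" can be checked at compile time. The following is a minimal sketch; the helper names literal_length and unterminated_escape_splits are hypothetical and introduced only for illustration, and the second literal deliberately violates this rule to show the effect.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of characters in a narrow string literal,
// not counting the terminating null character.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

// Both hex digits belong to the escape, so the literal holds one character.
static_assert(literal_length("\x1f") == 1, "one character: 0x1F");

// 'g' is not a hex digit, so the escape ends after '1' (deliberately
// non-compliant with this rule): two characters, 0x01 and 'g'.
static_assert(literal_length("\x1g") == 2, "two characters: 0x01 and 'g'");

// Runtime view of the same split: the escape yields 0x01, then 'g' follows.
inline bool unterminated_escape_splits() {
    const char* s = "\x1g";
    return s[0] == '\x01' && s[1] == 'g' && s[2] == '\0';
}
```

The two spellings differ by a single source character, yet the resulting strings differ in both length and content, which is the confusion the rule aims to prevent.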
Example
const char * s1 = "\1234"; // Non-compliant - \123 is not terminated
In the following, the strings pointed to by s2, s3 and s4 are equivalent to "Ag".
const char * s2 = "\x41g"; // Non-compliant
const char * s3 = "\x41" "g"; // Compliant - terminated by end of literal
const char * s4 = "\x41\x67"; // Compliant - terminated by another escape
In the following, s5 contains a universal character name consisting of four hex digits (\u), whilst s6
contains a universal character name consisting of eight hex digits (\U).
const char * s5 = "\u0001F600"; // Non-compliant - \u0001 is not terminated
const char * s6 = "\U0001F600"; // Compliant - terminated by end of literal
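The effect of \u versus \U can be made visible by counting code points. This sketch swaps the narrow literals of the example for char32_t (UTF-32) literals, an assumption made here so that one array element corresponds to one code point regardless of the execution character set. The helper name code_points is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of code points in a UTF-32 string literal,
// not counting the terminating null character.
template <std::size_t N>
constexpr std::size_t code_points(const char32_t (&)[N]) {
    return N - 1;
}

// \u takes exactly four hex digits, so this spelling (non-compliant with
// this rule) is U+0001 followed by the characters 'F', '6', '0', '0'.
constexpr std::size_t short_form = code_points(U"\u0001F600");
constexpr char32_t short_form_first = U"\u0001F600"[0];

// \U takes exactly eight hex digits, giving the single code point U+1F600.
constexpr std::size_t long_form = code_points(U"\U0001F600");
constexpr char32_t long_form_first = U"\U0001F600"[0];

static_assert(short_form == 5, "U+0001 plus four ordinary characters");
static_assert(short_form_first == 0x0001, "first code point is U+0001");
static_assert(long_form == 1, "a single code point");
static_assert(long_form_first == 0x1F600, "the code point is U+1F600");
```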
Copyright The MISRA Consortium Limited © 2023