This rule is part of MISRA C++:2023.
Usage of this content is governed by Sonar’s terms and conditions. Redistribution is
prohibited.
Rule 5.13.7 - String literals with different encoding prefixes shall not be concatenated
[lex.string] Implementation 13
Category: Required
Analysis: Decidable, Single Translation Unit
Amplification
The encoding prefixes are:
- L: wide string literal;
- u8: UTF-8 string literal;
- u: char16_t string literal;
- U: char32_t string literal.
For the purposes of this rule, an empty encoding-prefix is considered to be different to a non-empty encoding-prefix, even when they have the same
meaning.
Note: the R prefix is not an encoding-prefix.
Rationale
Concatenation of string literals with different encoding prefixes is either ill-formed or conditionally-supported with
implementation-defined behaviour. The behaviour related to the concatenation of string literals with and without encoding prefixes has
changed as the C++ Standard has evolved. Concatenations of these forms are not permitted to ensure that the behaviour is as expected, especially in
the presence of legacy code.
When a string literal with an encoding prefix is concatenated with one that has no prefix, the behaviour is as if both had the same
encoding prefix. For example, the concatenation u8"" "\u00fc" is equivalent to u8"\u00fc" (0xc3 0xbc) and not "\u00fc" (0xfc, for some
character set), which may not meet developer expectations. This rule is therefore stricter than the
C++ Standard, and considers an empty encoding-prefix to be different to a non-empty encoding-prefix.
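The prefix inheritance described above can be checked at compile time. The following is a minimal sketch, not part of the rule text: the helper same_bytes is a hypothetical name introduced for illustration, and the mixed concatenation appears only to demonstrate the behaviour (it is itself non-compliant with this rule).

```cpp
#include <cstddef>

// Hypothetical helper (not from the rule text): compares two character
// arrays element by element, including the terminating null.
template <typename T, std::size_t N, std::size_t M>
constexpr bool same_bytes(const T (&a)[N], const T (&b)[M]) {
    if (N != M) { return false; }
    for (std::size_t i = 0; i < N; ++i) {
        if (a[i] != b[i]) { return false; }
    }
    return true;
}

// Non-compliant concatenation, shown for demonstration only: the
// unprefixed literal is treated as if it carried the u8 prefix.
static_assert(same_bytes(u8"" "\u00fc", u8"\u00fc"),
              "the unprefixed literal takes on the u8 encoding");

// U+00FC occupies two UTF-8 code units (0xc3 0xbc) plus the null.
static_assert(sizeof(u8"" "\u00fc") == 3,
              "two UTF-8 code units and a terminating null");
static_assert(static_cast<unsigned char>(u8"" "\u00fc"[0]) == 0xc3,
              "first UTF-8 code unit");
static_assert(static_cast<unsigned char>(u8"" "\u00fc"[1]) == 0xbc,
              "second UTF-8 code unit");
```

The element type is deduced, so the sketch should compile whether u8 literals are arrays of char (C++17) or of char8_t (C++20 and later).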
Note: concatenation of string literals with different encoding prefixes is likely to become ill-formed in a future
version of the C++ Standard.
Example
const char * s0 = "Hello" "World"; // Compliant
const wchar_t * s1 = L"Hello" L"World"; // Compliant
const wchar_t * s2 = "Hello" L"World"; // Non-compliant
const wchar_t * s3 = u"Hello" L"World"; // Non-compliant - may not compile
// u8"Hello" L"World"; // Ill-formed
const char * s4 = u8R"#(Hello)#" u8"World"; // Compliant
const char * s5 = u8R"#(Hello)#" "World"; // Non-compliant
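A non-compliant concatenation is usually repaired by giving every literal in the sequence the same encoding prefix. The sketch below assumes that a wide string was intended for s2 and a UTF-8 string for s5. The names s2_fixed, s5_fixed and length are hypothetical and do not come from the rule text.

```cpp
#include <cstddef>

// Hypothetical helper: counts code units up to the terminating null.
template <typename T>
constexpr std::size_t length(const T* s) {
    std::size_t n = 0;
    while (s[n] != T{}) { ++n; }
    return n;
}

// Compliant: both literals carry the L prefix.
constexpr const wchar_t* s2_fixed = L"Hello" L"World";

// Compliant: both literals carry the u8 prefix (R is not an encoding
// prefix). 'auto' absorbs the char (C++17) versus char8_t (C++20)
// difference in the element type of u8 literals.
constexpr const auto* s5_fixed = u8R"#(Hello)#" u8"World";

static_assert(length(s2_fixed) == 10, "HelloWorld has 10 code units");
static_assert(length(s5_fixed) == 10, "HelloWorld has 10 code units");
```

Whether L or u8 is the right prefix depends on what the surrounding code expects. The point is only that the choice is made explicitly on every literal.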
Copyright The MISRA Consortium Limited © 2023