Why is this an issue?
XPath injections occur in an application when the application retrieves untrusted data and inserts it into an XML Path (XPath) query without
sanitizing it first.
What is the potential impact?
In the context of a web application vulnerable to XPath injection:
After discovering the injection point, attackers insert data into the
vulnerable field to execute malicious commands in the affected XML documents.
The impact of this vulnerability depends on the importance of XML structures in the enterprise.
In cases where organizations rely on XML
structures for business-critical operations or where XML is used only for innocuous data transport, the severity of an attack ranges from critical to
harmless.
Below are some real-world scenarios that illustrate some impacts of an attacker exploiting the vulnerability.
Data Leaks
A malicious XPath query allows direct data leakage from one or more databases. Although XML is not as widely used as it once was, this possibility
still exists with configuration files, for example.
Data deletion and denial of service
The malicious query allows the attacker to delete data in the affected XML documents.
This threat is particularly insidious if the attacked
organization does not maintain a disaster recovery plan (DRP) and if XML structures are considered important, as missing critical data can disrupt the
regular operations of an organization.
How to fix it in Java SE
Code examples
The following noncompliant code is vulnerable to XPath injections because untrusted data is concatenated to an XPath query without prior
validation.
Noncompliant code example
public boolean authenticate(HttpServletRequest req, XPath xpath, Document doc) throws XPathExpressionException {
String user = request.getParameter("user");
String pass = request.getParameter("pass");
String expression = "/users/user[@name='" + user + "' and @pass='" + pass + "']";
return (boolean)xpath.evaluate(expression, doc, XPathConstants.BOOLEAN);
}
Compliant solution
public boolean authenticate(HttpServletRequest req, XPath xpath, Document doc) throws XPathExpressionException {
String user = request.getParameter("user");
String pass = request.getParameter("pass");
String expression = "/users/user[@name=$user and @pass=$pass]";
xpath.setXPathVariableResolver(v -> {
switch (v.getLocalPart()) {
case "user":
return user;
case "pass":
return pass;
default:
throw new IllegalArgumentException();
}
});
return (boolean)xpath.evaluate(expression, doc, XPathConstants.BOOLEAN);
}
How does this work?
As a rule of thumb, the best approach to protect against injections is to systematically ensure that untrusted data cannot break out of the
initially intended logic.
Parameterized Queries
For XPath injections, the cleanest way to do so is to use parameterized queries.
XPath allows for the usage of variables inside expressions in the form of $variable
. XPath variables can be used to construct an XPath
query without needing to concatenate user arguments to the query at runtime. Here is an example of an XPath query with variables:
/users/user[@user=$user and @pass=$pass]
When the XPath query is executed, the user input is passed alongside it. During execution, when the values of the variables need to be known, a
resolver will return the correct user input for each variable. The contents of the variables are not considered application logic by the XPath
executor, and thus injection is not possible.
In the example, a parameterized XPath query is created, and an XPathVariableResolver
is used to securely insert untrusted data into
the query, similar to parameterized SQL queries.
Validation
In case XPath parameterized queries are not available, the most secure way to protect against injections is to validate the input before using it
in an XPath query.
Important: The application must do this validation server-side. Validating this client-side is insecure.
Input can be validated in multiple ways:
- By checking against a list of authorized and secure strings that the application is allowed to use in a query.
- By ensuring user input is restricted to a specific range of characters (e.g., the regex
/^[a-zA-Z0-9]*$/
only allows alphanumeric
characters.)
- By ensuring user input does not include any XPath metacharacters, such as
"
, '
, /
, @
,
=
, *
, [
, ]
, (
and )
.
If user input is not considered valid, it should be rejected as it is unsafe.
For Java, OWASP’s Enterprise Security API offers encodeForXPath
which sanitizes metacharacters automatically.
Resources
Articles & blog posts
Standards