XML vulnerabilities are still attractive targets for attackers

April 10, 2018 by


XML is widely used in software systems for persisting data, exchanging data between a web service and its clients, and in configuration files. A misconfigured XML parser can leave a critical flaw in an application. Processing untrusted XML streams can result in a range of exploits, including remote code execution and disclosure of sensitive data. This tutorial explains the fundamentals of XML and XML external entity (XXE) injection to information security specialists and programmers, and it reviews the major XXE issues found on Google and Facebook servers. It also provides a hands-on lab for identifying and exploiting XXE vulnerabilities, along with practical guidance on how to secure code that parses XML input.



An XML to JSON converter tool


  • What is XML?
  • What is an XML external entity (XXE)?
  • What could be more fun than exploiting this vulnerability to access Google and Facebook servers?
  • How to set up your vulnerable environment
  • How to identify/detect XXE vulnerability?
  • How to exploit the XXE injection for fun and profit
  • How to mitigate security risks of XML external entity processing

What is XML?

XML, which stands for Extensible Markup Language, is a markup language designed to store and transport data. It is widely adopted by software systems: it is used for configuration files, and in web services it carries the data exchanged between a consumer and a service provider.
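To make the idea of XML as a data carrier concrete, here is a minimal sketch that parses a small record and prints it as JSON, loosely in the spirit of the converter used in the lab later on. The element names (contact, name, phone) and the helper element_to_dict are assumptions chosen for illustration, and the sketch uses only the Python standard library rather than the lab's actual code.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample record; the element names are illustrative only.
CONTACT_XML = """<contact>
  <name>Alice</name>
  <phone>555-0100</phone>
</contact>"""


def element_to_dict(element):
    """Map a leaf element to its text, or a parent to a dict of its children.

    Simplification: attributes are ignored and repeated child tags
    overwrite each other.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return {child.tag: element_to_dict(child) for child in children}


root = ET.fromstring(CONTACT_XML)
print(json.dumps({root.tag: element_to_dict(root)}, indent=2))
```

The same tree of elements could just as well come from a configuration file or a web service response; only the source of the text changes.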

What is XML External Entity (XXE)?

XML external entities allow a document to include data dynamically from a given resource (local or remote) at parse time. Attackers can abuse this feature to pull in malicious data from external URIs or confidential data residing on the local system. If an XML parser is not configured to prevent or limit external entities, it will access whatever resource the URI specifies.

<?xml version="1.0"?>
<!DOCTYPE myFile [
<!ELEMENT myFile ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<myFile>&xxe;</myFile>


This is a well-formed XML document. During parsing, the parser will replace the external entity “&xxe;” with the content of the system file “/etc/passwd”, which contains confidential information and might be disclosed. Another example: if the URI ‘file:///etc/passwd’ is replaced by a link to a malicious server that never responds, the parser might end up waiting, thus causing delays in the subsequent processes.
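As a rough illustration of what the parser is being asked to do, the following sketch lists the external entities that a document declares without ever opening the referenced resources. It relies on the standard library's expat bindings; the function name list_external_entities is hypothetical, and the sample document simply follows the shape described above.

```python
import xml.parsers.expat

# A document in the shape described above; this script never reads the file.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE myFile [
<!ELEMENT myFile ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<myFile>&xxe;</myFile>"""


def list_external_entities(xml_text):
    """Return (entity name, system URI) pairs declared in the document's DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation_name):
        # Internal entities carry a literal value; external ones carry a URI.
        if value is None and system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is installed, so expat reports the
    # declaration but does not fetch the resource behind it.
    parser.Parse(xml_text, True)
    return found


print(list_external_entities(DOCUMENT))
```

A parser that does resolve external entities would substitute the content behind each listed URI wherever the entity is referenced, which is exactly the behaviour an attacker relies on.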

Successful exploitation of this vulnerability may result in disclosure of sensitive data, denial of service, or an attacker gaining unauthorized access to the system resources. If an XML parser does not block external entity expansion and is able to access the referred content, one user may be able to gain unauthorized access to the data of other users, leading to a breach of confidentiality.

Below, I have detailed the characteristics of XXE according to the OWASP Top 10 in 2017:

Attack Vectors: Attackers can exploit vulnerable XML processors if they can upload XML or include hostile content in an XML document, exploiting vulnerable code, dependencies or integrations.

Security Weakness: By default, many older XML processors allow specification of an external entity, a URI that is dereferenced and evaluated during XML processing. SAST tools can discover this issue by inspecting dependencies and configuration. DAST tools require additional manual steps to detect and exploit this issue. Manual testers need to be trained in how to test for XXE, as it is not commonly tested, as of 2017.

Impacts: These flaws can be used to extract data, execute a remote request from the server, scan internal systems, perform a denial-of-service attack, and execute other attacks. The business impact depends on the protection needs of all affected applications and data.

What could be more fun than exploiting this vulnerability to access Google and Facebook servers?

Security researchers were able to exploit an XXE vulnerability in the Google Toolbar button gallery product. This product allows users to customize their toolbar buttons, and programmers can style a button by editing and uploading an XML file. It turned out that the XML parser interpreted the DTD blindly. The researchers crafted a malicious XML file and uploaded it to the Google production server, which let them read sensitive files. Google awarded a bug bounty of $10,000 for this finding. A security expert managed to attack Facebook servers with an MS Word document, which allowed him to achieve remote code execution. The bug specifically affected OpenID. Facebook awarded a bounty of $6,000 for alerting them to this bug.

How do we identify/detect an XXE vulnerability?

To answer this question, you may set up an experimentation lab. We take a real-world scenario of an XML to JSON converter similar to

The following installation and configuration were tested on a Debian 9 machine.

The tool is built on the Flask framework and uses simplejson. First, install the dependencies:

$ pip install flask
$ pip install simplejson

Next, run the application:

$ python

At this level, the application is up and running and you can start your experiment.

First of all, in general cases you should verify that the application accepts XML as input (directly or via upload).

Source code analysis tools can help detect XXE in source code, although manual code review is the best alternative when the source code is provided. Dynamic analysis testing tools require additional manual steps to detect and exploit this vulnerability.

The second step is to check whether processing of document type definitions (DTDs) is enabled in the parser.

To do this, replace the phone value with &xxe; and click convert. You should get an error message:

Entity 'xxe' not defined, line 5, column 14 (line 5)

This means that the parser tried to resolve the entity reference instead of rejecting or ignoring it, so the application processes XML entities and is therefore likely vulnerable to XXE.

Attack scenario

Once you have discovered the vulnerability, you can forge and provide malicious XML input.

The first attack scenario is to attempt to extract data from the server.

Let’s read /etc/passwd from the server by prepending a definition of the xxe entity to the document:

<!DOCTYPE infosecinstitute [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>

Now you should see the content of the /etc/passwd file in the output, which shows that you can read sensitive files from the server.

An additional attack scenario can probe the server’s private network or attempt a denial of service attack.
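For experiments in your own lab, the variants mentioned above differ only in the URI that the entity points to. The sketch below builds such test documents as strings; the function name build_xxe_document, the phone field and the internal address are assumptions made for illustration (192.0.2.10 belongs to a range reserved for documentation), and nothing here sends any request.

```python
def build_xxe_document(target_uri, root="infosecinstitute", field="phone"):
    """Return an XML document whose `field` element references an external entity."""
    return (
        f"<!DOCTYPE {root} [\n"
        f'<!ENTITY xxe SYSTEM "{target_uri}">\n'
        "]>\n"
        f"<{root}><{field}>&xxe;</{field}></{root}>"
    )


# Local file read, as in the first attack scenario.
file_probe = build_xxe_document("file:///etc/passwd")

# Hypothetical internal service: a vulnerable parser would issue this request
# from the server's side of the network, revealing whether the host answers.
network_probe = build_xxe_document("http://192.0.2.10:8080/")

print(network_probe)
```

Pointing the entity at a host that accepts the connection but never answers corresponds to the delay scenario: the parser may block until its timeout expires.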

Mitigate security risks of XML external entity processing

Because a software system that relies on a vulnerable parser inherits its vulnerabilities, we recommend that developers pay special attention to preventing such attacks when they adopt a third-party XML parser, even if it is provided by a high-profile vendor such as Oracle or Microsoft. To block XXE attacks, software developers should gain a full understanding of the XML parser they are considering and avoid its insecure features (for example, by using XML Schema instead of DTDs). If external entity references are required, they should refer to trusted sources only. Known vulnerabilities of the parser and their fixes should be investigated, and input should be sanitized before any XML content is parsed. Adequate security testing of the parser should also be performed.

Recommendations for Parser Developers: Developers of XML parsers need to be fully aware of all potential XML-based attacks and should provide countermeasures wherever possible. We observed during our experiment that some vulnerabilities can be exploited because of features that are enabled in the default configurations of XML parsers. Parser developers should ship secure default configurations and raise alerts whenever a potentially insecure feature is enabled by changing those defaults. They should perform security testing of their parsers and provide better documentation, including the potential risks of enabling each feature. This would guide software developers toward secure use of their parsers.

Implement positive ("whitelisting") server-side input validation, filtering, or sanitization to prevent hostile data within XML documents, headers, or nodes.

Verify that XML or XSL file upload functionality validates incoming XML using XSD validation or similar.
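One way to approach such server-side validation is to refuse any document that carries a DOCTYPE before it ever reaches the main parser, since applications like the converter in this lab have no need for DTDs. The following is a minimal sketch under that assumption; reject_doctype and DoctypeNotAllowed are hypothetical names, and the check uses the standard library's expat bindings rather than any particular framework.

```python
import xml.parsers.expat


class DoctypeNotAllowed(ValueError):
    """Raised when untrusted XML carries a document type declaration."""


def reject_doctype(xml_text):
    """Return xml_text unchanged, or raise DoctypeNotAllowed if it declares a DTD.

    Malformed input raises xml.parsers.expat.ExpatError instead.
    """

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Stop at the first sign of a DTD; entities are never expanded.
        raise DoctypeNotAllowed(f"DOCTYPE '{name}' is not accepted")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return xml_text


try:
    reject_doctype(
        '<!DOCTYPE a [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><a>&xxe;</a>'
    )
except DoctypeNotAllowed as error:
    print("rejected:", error)
```

A check like this complements, rather than replaces, a securely configured parser: it narrows what the application accepts, while the parser settings decide what happens if something slips through.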

Back to the lab: to fix the bug, create an lxml XMLParser that does not resolve entities and pass it as the parser argument, which the various parse functions of the lxml API accept.

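from lxml import etree

# Do not expand entity references, so &xxe; is never replaced by file content.
parser = etree.XMLParser(resolve_entities=False)

# Pass the hardened parser explicitly; xml holds the untrusted input.
etree.fromstring(xml, parser=parser)
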

In this article, we studied a major type of XML-based attack, XML external entity (XXE) injection, which can undermine today’s XML parsers and the systems that use them. We proposed a hands-on lab to learn how to identify, detect, exploit, and mitigate an XXE vulnerability in an application built on a vulnerable XML parser. We showed the impact of this type of vulnerability on well-known services, which should serve as a warning for software developers to take appropriate security measures before using such parsers in their projects. Parser developers need to fix the problems and/or provide better documentation to help developers configure such parsers securely.
