Penetration testing

Explore Python for MITRE ATT&CK email collection and clipboard data

Howard Poston
August 27, 2021 by
Howard Poston

The MITRE ATT&CK framework breaks the lifecycle of a cyberattack in a set of objectives (or tactics) that the attacker may need to accomplish to reach their final goal. For each of these tactics, MITRE ATT&CK describes a number of techniques by which they could be accomplished.

Most companies have a great deal of valuable data lying around their networks, including customer data, intellectual property and more. The collection tactic in the MITRE ATT&CK framework is focused on the various locations where this data resides and the methods by which an attacker could gain access to it.

Earn two pentesting certifications at once!

Earn two pentesting certifications at once!

Enroll in one boot camp to earn both your Certified Ethical Hacker (CEH) and CompTIA PenTest+ certifications — backed with an Exam Pass Guarantee.

Introduction to clipboard data

Everyone is familiar with the clipboard on a computer. The ability to copy-paste data from one location to another saves tediously retyping it and minimizing the probability of an expensive typo.

Clipboard data

When searching for valuable information on a target system, an attacker may be overwhelmed by choices. Most organizations have vast amounts of data, and it can be difficult to weed through it all to find valuable nuggets. Doing so efficiently requires developing a system for effectively identifying data of value.

One option is to look at the data that the user of the system is actively using. While not all data copy-pasted and stored on the clipboard is valuable, this tool is commonly used when someone wants to make sure that they enter the correct data into a particular document, website or other area. Often, this important data is also valuable data.

Modifying clipboard data with Python

The system clipboard is designed to be accessible to any application running on a computer. This includes scripts written in Python, which have the ability to monitor and modify the contents of the clipboard.

The code sample above (available on Github) goes beyond simply collecting data to maliciously modifying it. The goal of this code is to identify when the clipboard contains an email address, and, when it does, replace the stored value with the attacker’s email address. This may result in sensitive data being sent to the attacker instead of the intended recipient.

The sample code accomplishes this goal using the win32clipboard package and regular expressions. The email regex variable describes the format of a valid email address. Within the infinite loop, the code opens the clipboard, compares its contents to the regex and, if it matches, sets the clipboard’s text to the attacker’s email.

This sample code is relatively crude and likely to be detected in real life. However, if the attacker had registered a domain that looks like the company’s domain, this code could be used to send emails there instead with a relatively low chance of detection.

Introduction to email collection

Emails are a valuable source of information for an attacker. Emails may contain intellectual property, customer data and other valuable data. Additionally, access to email communications can be invaluable for setting up and performing a spearphishing attack.

Local email collection

Many organizations are switching to webmail systems like G Suite or O365. However, many email clients cache emails locally on a computer, making them potentially accessible to an attacker.

Microsoft Outlook is an example of a widely-used desktop email client that performs local caching of emails. Since these emails are stored as files on the computer (and need to be accessible to users), reading emails may only require knowledge of the format of the email backups.

Accessing local email repos with Python

In Python, the libratom package makes it simple to process Microsoft Outlook email archives using PffArchive.

The code sample above (available here) demonstrates how to do this. After opening the email backup file as a PffArchive, it is possible to iterate through the folders that it contains and the messages in each folder. Each email within the archive is parsed, making it possible to easily access the sender address, subject, email body and more.

This access to local email repositories can be invaluable for data collection during a penetration testing engagement. By looking at cached emails, the attacker can steal sensitive data or can learn subjects, tone of voice, and other useful data for building a spearphishing email.

Using Python for email collection and clipboard data 

Data collection is being incorporated into a growing number of cyberattack campaigns. Now, even ransomware malware commonly steals data before encryption to set up double extortion. 

Using Python, it is possible to collect data from a number of different sources, including the system clipboard and local repositories of emails.

What should you learn next?

What should you learn next?

From SOC Analyst to Secure Coder to Security Manager — our team of experts has 12 free training plans to help you hit your goals. Get your free copy now.


Howard Poston
Howard Poston

Howard Poston is a copywriter, author, and course developer with experience in cybersecurity and blockchain security, cryptography, and malware analysis. He has an MS in Cyber Operations, a decade of experience in cybersecurity, and over five years of experience as a freelance consultant providing training and content creation for cyber and blockchain security. He is also the creator of over a dozen cybersecurity courses, has authored two books, and has spoken at numerous cybersecurity conferences. He can be reached by email at or via his website at