Fun with email headers
Email is still, to this day, the most used method of online communication. Even though many people predicted that email would eventually be replaced by instant messaging or video chat software, the fact remains that email is simple to use, works everywhere thanks to the standardization of SMTP, and remains the first thing most people check when they get online. As the protocols evolved, messages stopped carrying just text and now carry full web pages, along with images and attachments. An email client from 10 years ago typically looks nothing like one from today: the user interface looks a lot better, and you have access to all sorts of features such as signatures, graphics, videos, and sharing functions.
One thing most email users don't realize is that while the visible portion of an email has evolved, so has the hidden part of it. Email clients show just the visible message unless you dig down to see the headers. Quite often this list of headers is longer than the message itself, but it can give you a lot of information about what this email is, where it went, which servers handled it, which software sent it, whether it's considered spam and who the real sender is. How much you learn depends on who sent the message and which software handled it along the way. Here I'll go through the most useful headers that you can look at.
First, in order to have fun with headers, you need to know how to get to them. If you use an email client such as Microsoft Outlook or Live Mail, then you will need to go to the menu and select the Show headers function. Newer versions require you to right click on a message and select Properties. If you use a web mail client, then the headers are placed elsewhere, but all of the popular web clients have an option somewhere to see the headers. For example, in Gmail simply click on the arrow on the right of the message and select Show original.
All of these functions have the same result, and what you end up seeing is not only the visible message, but also all of the headers on top of it. By default, most email apps will show you only three or four headers, namely the To, From, Subject and CC/BCC lines. In reality, a typical message can have 20 header lines or more. So now let's see which ones are useful, and what they can tell you.
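If you save the raw text from a "Show original" view, you can also list the headers with a few lines of Python. This is only a minimal sketch using the standard library email package; the message below, including its addresses and the ExampleMailer name, is made up for illustration.

```python
from email import message_from_string

# Hypothetical raw message, as it might be copied from a "Show original" view.
RAW_MESSAGE = """\
Received: by example.com (Postfix, from userid 48) id 0E7A06FE84A; Sun, 6 Jan 2013 16:39:34 -0500 (EST)
From: Alice <alice@example.com>
To: bob@example.org
Subject: Hello
Message-ID: <1234@example.com>
X-Mailer: ExampleMailer 1.0

Body text goes here.
"""


def list_headers(raw):
    """Return (name, value) pairs for every header, in the order they appear."""
    msg = message_from_string(raw)
    return msg.items()


headers = list_headers(RAW_MESSAGE)
for name, value in headers:
    print(f"{name}: {value}")
```

The parser stops reading headers at the first blank line, which is how the header block is separated from the body in a raw message.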
If you aren't used to looking at email headers, or network data in general, they can be daunting, so we will look at them one at a time and see what each one tells you. Let's start with the biggest and hardest to decipher, the Received header. Unlike most other headers, it usually appears several times, and together these lines tell us how the message traveled through the net. Every mail server that handles the message adds its own Received header on top of the existing ones. The important thing to know is that you need to read them in reverse order: the one at the bottom was added first, so it comes from the server closest to the sender. Here is an example:
Received: from mail-qa0-f48.google.com (mail-qa0-f48.google.com [209.85.216.48]) by mx.google.com with ESMTPS id en4si28895990qab.97.2013.01.03.12.08.57 (version=TLSv1/SSLv3 cipher=OTHER);
The main information you typically want from that is who sent it, and who received it. In this case, one of Google's Gmail servers received the message from another Google server. If you go down to the next header, you will see which server handled it prior to that. This is very useful to see the path used by a message if you want to trace it back. Note that each server adds the information it wants in this header. Some don't add much, and it can be hard to find a lot of information about the sender. However, some servers do add a lot, such as the original IP of the computer which sent the message, the version of the server software, and even the actual computer name. This is an example of a header which has a lot of information:
Received: from HP-Envy.videotron.ca ([98.254.112.254]) by VL-VM-MR003.ip.videotron.ca (Oracle Communications Messaging Exchange Server 7u4-22.01 64bit (built Apr 21 2011)) with ESMTPA id <0MFM00FEN2R2ZN80@VL-VM-MR003.ip.videotron.ca> for dendory@gmail.com; Tue, 25 Dec 2012 19:01:51 -0500 (EST)
Since this header appeared as the very last Received in the message, this tells you the original machine that sent it, including the IP and the computer name, HP-Envy in this case. This next example appears to contain a lot less, but it still has everything you need to fight spam or other abuse email you may receive:
Received: by example.com (Postfix, from userid 48) id 0E7A06FE84A; Sun, 6 Jan 2013 16:39:34 -0500 (EST)
In this case, you know which server sent it, and if you pass this header to the administrator, they can look at their logs to find out which user has the ID 48. But if the Received headers don't go all the way back to the original sender, which happens, there are other headers you can look at to find the sender's IP. One is called X-Originating-IP and, as the name suggests, contains the IP address that sent the message. It is a non-standard header that only some services add, so don't expect to always find it. Remember that by default, email has no facility to verify the sender, and a From address can contain anything you want, so you can't rely on it to tell you anything useful about the message.
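Reading the Received headers bottom-up and pulling out the bracketed IP addresses is easy to automate. The following is a rough sketch, not a complete parser: the host names and addresses are hypothetical, the regular expression only handles IPv4 addresses written in square brackets, and the helper names received_path and originating_ip are invented for this example.

```python
import re
from email import message_from_string

# Hypothetical message with two hops; the newest Received line is on top.
RAW_MESSAGE = """\
Received: from relay.example.org (relay.example.org [203.0.113.25]) by mx.example.com with ESMTPS id abc123; Tue, 25 Dec 2012 19:02:03 -0500 (EST)
Received: from laptop.example.net ([198.51.100.7]) by relay.example.org with ESMTPA id def456; Tue, 25 Dec 2012 19:01:51 -0500 (EST)
From: someone@example.net
Subject: Test

Body
"""

# Assumption: we only look for IPv4 addresses written as [a.b.c.d].
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_path(raw):
    """Return (header text, [IPs]) for each hop, oldest hop first."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    path = []
    for hop in reversed(hops):          # bottom header = first server
        flat = " ".join(hop.split())    # collapse folded lines into one
        path.append((flat, IPV4_IN_BRACKETS.findall(flat)))
    return path


def originating_ip(raw):
    """Prefer X-Originating-IP if present, else the oldest hop's first IP."""
    msg = message_from_string(raw)
    value = msg.get("X-Originating-IP")
    if value:
        return value.strip().strip("[]")
    for _, ips in received_path(raw):
        if ips:
            return ips[0]
    return None


for text, ips in received_path(RAW_MESSAGE):
    print(ips, text[:60])
print("Likely origin:", originating_ip(RAW_MESSAGE))
```

Keep in mind that only the Received lines added by servers you trust are reliable; anything below them was supplied by the sending side and can be forged.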
Knowing the originating IP is useful, especially to report abuse, but because the From address can't be trusted, a lot of automated systems have also been put in place to detect spam. Here are some headers that relate to this topic:
Received-SPF: pass (google.com: domain of z718700358rysthl=tznvy.pbz@postmaster.twitter.com designates 199.59.148.232 as permitted sender) client-ip=199.59.148.232;
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=simple/simple; d=twitter.com; s=dkim-201205;bh=+k976hABa1wdkT0G6zc2loFP8UUQF6mHJYgkerhrFoo=;b=JeNLvfiJI7/LH0CIG2pHj70vwnMBIMRR2BI3R1eKZ5Mv0SpzjPgKI2sBVTGC5I/AMPZDkTDLWEWbrg2EK/gqP6se/Y4e9KASW1G7f1asH3LCLXynqDsKuoTHMnamGccTs6KsTUdUOdzkRWkYiLa8554l5Z/DUZJPESGiG+2zPWc=;
These cover two important features of modern mail servers. The first is Sender Policy Framework (SPF), which is a system to verify senders. Basically, an administrator publishes in DNS the list of IP addresses that are allowed to send email on behalf of their domain. In this case, Twitter listed the IP above as a valid mail server for messages sent from the postmaster.twitter.com domain. Note that SPF checks the envelope sender used during the SMTP conversation, which is not necessarily the From address you see in your client. The second feature is called DomainKeys Identified Mail (DKIM) and lives in the DKIM-Signature header. The sending server signs the body and a chosen set of headers with a private key, and the header carries that signature (b=), a hash of the body (bh=), the signing domain (d=) and a selector (s=). The receiving server uses the domain and selector to retrieve the matching public key from DNS, then checks the signature, which shows that the message was authorized by that domain and was not modified along the way. Both of these systems are heavily used by all the popular mail services to reduce spam.
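To make the two headers less opaque, here is a small sketch that splits a DKIM-Signature value into its tags and reads the verdict from a Received-SPF value. It does not verify anything and performs no DNS lookup; it only shows where the pieces are. The function names are invented for this example, and the b= value is shortened to a placeholder.

```python
# Values taken from the headers shown above; the b= signature is shortened
# to a placeholder because its content does not matter for this sketch.
DKIM_VALUE = (
    "v=1; a=rsa-sha256; q=dns/txt; c=simple/simple; d=twitter.com; "
    "s=dkim-201205;bh=+k976hABa1wdkT0G6zc2loFP8UUQF6mHJYgkerhrFoo=;"
    "b=PLACEHOLDERSIGNATURE=;"
)
SPF_VALUE = (
    "pass (google.com: domain of "
    "z718700358rysthl=tznvy.pbz@postmaster.twitter.com designates "
    "199.59.148.232 as permitted sender) client-ip=199.59.148.232;"
)


def parse_dkim_tags(value):
    """Split 'name=value; name=value' pairs into a dictionary."""
    tags = {}
    for part in value.split(";"):
        if "=" not in part:
            continue
        # Split on the first '=' only: base64 values may end with '='.
        name, _, tag_value = part.partition("=")
        tags[name.strip()] = "".join(tag_value.split())
    return tags


def dkim_key_location(tags):
    """DNS name where the public key is published: selector._domainkey.domain."""
    return f"{tags['s']}._domainkey.{tags['d']}"


def spf_result(value):
    """The verdict is the first word of the header (pass, fail, softfail...)."""
    words = value.split()
    return words[0].lower() if words else None


tags = parse_dkim_tags(DKIM_VALUE)
print("Signing domain:", tags["d"])
print("Public key is published at:", dkim_key_location(tags))
print("SPF verdict:", spf_result(SPF_VALUE))
```

Actually checking the signature requires fetching the TXT record at that DNS name and redoing the hash and signature computation, which is best left to a dedicated DKIM library or to your mail server.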
Viruses and malware are another big issue with email, as a lot of the messages we receive not only try to sell things to us, but also try to infect our computers. There are a few headers dedicated to scanning, such as X-Virus-Scanned and X-Antivirus-Status, which are added by scanning software to indicate that the message was checked for malware.
Then there are a lot more headers that deal with the content in some other way. One example is priority. In Outlook and some other clients, you have the ability to set a priority for your message. Many clients ignore it, so it's of dubious use, but when you do use this feature, all it does is add an X-Priority header to your email. Another interesting header is X-Mailer, which tells you the name of the client used to send the message. If you see the name of a script there, you can easily guess that the message was sent automatically rather than typed by a person. Message-ID is almost always there and holds a unique identifier for the message. It isn't all that interesting to read, but mail software uses it to detect duplicate copies of the same email and to link replies into threads. Also, if you use Gmail, you may have noticed that some messages, especially newsletters, have a "via" mention next to the sender's name at the top.
This generally means the email was sent by a service on behalf of someone else, which is what the Sender header is for. Bulk sending services are supposed to add this header to tell you that they are sending the message in the name of someone else. So what you see is both the From address, which could be anything but usually indicates the person who composed the message, and the Sender address, which is typically the service that sent the message to a mailing list.
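The content-related headers discussed above can be pulled out of a message in one pass. This is a minimal sketch under a few assumptions: the sample message and its addresses are hypothetical, the summarize function is a name made up for this example, and "sent on behalf" is simply taken to mean that a Sender header exists and differs from the From address.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical newsletter sent by a bulk mailing service for someone else.
RAW_MESSAGE = """\
From: Shop News <news@shop.example.com>
Sender: bounce@bulkmailer.example.net
Subject: Weekly deals
Message-ID: <20130106.0001@bulkmailer.example.net>
X-Mailer: ExampleBulkScript 2.1
X-Priority: 3

Hello!
"""


def summarize(raw):
    """Collect the sender-related and client-related headers of a message."""
    msg = message_from_string(raw)
    # parseaddr returns (display name, address); we only keep the address.
    from_addr = parseaddr(msg.get("From", ""))[1]
    sender_addr = parseaddr(msg.get("Sender", ""))[1]
    return {
        "from": from_addr,
        "sender": sender_addr or None,
        "sent_on_behalf": bool(sender_addr)
        and sender_addr.lower() != from_addr.lower(),
        "mailer": msg.get("X-Mailer"),
        "priority": msg.get("X-Priority"),
        "message_id": msg.get("Message-ID"),
    }


summary = summarize(RAW_MESSAGE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Headers that are missing simply come back as None, which is common for X-Mailer and X-Priority since they are optional.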
Finally, while most of the useful headers have been covered, remember that the message itself can have interesting content which may be displayed, but not in the way you would expect. Most messages these days are sent in HTML, not plain text, which means the email can contain a lot of things. You will typically see a Content-Type header which tells you whether this is truly an HTML message, and you can also see the attachments, if there are any. By looking at the actual HTML tags, you can see what really goes on in the message. One technique a lot of companies use when they send bulk messages is to embed a tiny image, usually 1 pixel by 1 pixel, which is loaded from their server when you open the message and so tells them whether you actually looked at it. This tracking pixel can't be seen because it is too small, but when you look at the original code, you can see whether it is there.
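Spotting such a pixel by eye in a long HTML message is tedious, so here is a rough sketch that looks for it. It assumes the pixel is an img tag whose width and height attributes are both 0 or 1, which is a common pattern but not the only one (the size can also be set through CSS, which this sketch ignores). The message, URLs and names below are hypothetical.

```python
from email import message_from_string
from html.parser import HTMLParser

# Hypothetical HTML newsletter containing one visible image and one pixel.
RAW_HTML_MESSAGE = """\
From: news@example.com
Subject: Newsletter
MIME-Version: 1.0
Content-Type: text/html; charset="utf-8"

<html><body><p>Hello!</p>
<img src="https://example.com/logo.png" width="200" height="50">
<img src="https://tracker.example.com/open?id=123" width="1" height="1">
</body></html>
"""


class TinyImageFinder(HTMLParser):
    """Collect the src of every <img> declared as 0 or 1 pixel in size."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        tiny = ("0", "1")
        if attributes.get("width") in tiny and attributes.get("height") in tiny:
            self.suspects.append(attributes.get("src"))


def find_tracking_pixels(raw):
    """Scan every text/html part of the message for tiny images."""
    msg = message_from_string(raw)
    finder = TinyImageFinder()
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            payload = part.get_payload(decode=True)  # bytes, transfer-decoded
            charset = part.get_content_charset() or "utf-8"
            finder.feed(payload.decode(charset, errors="replace"))
    return finder.suspects


print(find_tracking_pixels(RAW_HTML_MESSAGE))
```

This is also why many email clients block remote images by default: if the image is never fetched, the sender never learns that the message was opened.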
Email headers have evolved a lot over time, and many so-called experimental headers have become quite common, such as those for sender validation, anti-virus scanning and more. It can be fun to look at these, and to remind ourselves how much information is leaked by our email clients, such as the actual IP of the machine we're on, the computer name, the software we use, and so on. This can be useful to track spam and scammers, and to see how much work goes on behind the scenes while sending and receiving email.