Malware analysis

Using Base64 for malware obfuscation

September 8, 2020 by Nitesh Malviya

What is Malware?

Malware is short for malicious software. Software, in simple terms, is a program written in some programming language. A program that is intentionally written to damage a computer or server, or to gain unauthorized access to a system, is called malware.

What is Obfuscation?

Obfuscation is a commonly used technique for concealing the original code written by the programmer. It makes the executable code difficult to read and understand while preserving its functionality.

Malware obfuscation techniques 

Malware writers use many obfuscation techniques, such as Base64, Exclusive OR (XOR), ROT13, dead code insertion, instruction changes and packers.

In this post, we will focus on the Base64 obfuscation technique.

Base64 obfuscation 

Base64 is a simple malware obfuscation technique. Base64 encoding is used because it converts binary data into an ASCII string. Attackers can therefore encode data in Base64 and send it over the HTTP protocol. Base64 uses only 64 characters for encoding, hence the name. The characters are the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, and the symbols + and /.


“=” is used for padding.
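To make the binary-to-ASCII point concrete, here is a minimal sketch that encodes a few arbitrary bytes and checks that the result contains only the 64 alphabet characters plus “=”. The names `BASE64_ALPHABET` and `uses_only_base64_chars` are illustrative helpers made up for this example, not part of any library.

```python
import base64
import string

# The 64 characters of the standard Base64 alphabet, in index order.
BASE64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def uses_only_base64_chars(text):
    """Return True if text contains only Base64 alphabet characters or '=' padding."""
    allowed = set(BASE64_ALPHABET + "=")
    return all(ch in allowed for ch in text)

# Arbitrary non-printable bytes, standing in for binary data.
raw = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F])
encoded = base64.b64encode(raw).decode("ascii")

print(encoded)
print(uses_only_base64_chars(encoded))
```

Whatever bytes go in, the encoded form is plain text, which is why it travels easily inside text-based channels.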

Base64 encoding method

Refer to the Base64 table below when converting normal strings to Base64. As per the table, 0 corresponds to the letter ‘A’, 45 corresponds to the letter ‘t’, 63 corresponds to ‘/’ and so on.

Char. Dec.   Char. Dec.   Char. Dec.
A     0      W     22     s     44
B     1      X     23     t     45
C     2      Y     24     u     46
D     3      Z     25     v     47
E     4      a     26     w     48
F     5      b     27     x     49
G     6      c     28     y     50
H     7      d     29     z     51
I     8      e     30     0     52
J     9      f     31     1     53
K     10     g     32     2     54
L     11     h     33     3     55
M     12     i     34     4     56
N     13     j     35     5     57
O     14     k     36     6     58
P     15     l     37     7     59
Q     16     m     38     8     60
R     17     n     39     9     61
S     18     o     40     +     62
T     19     p     41     /     63
U     20     q     42
V     21     r     43     =     pad
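The table lookup can be sketched in Python as follows. The input is taken three bytes at a time, the 24 bits are split into four 6-bit values, and each value is used as an index into the table. This is a minimal illustration of the standard scheme, not a replacement for the `base64` module; `TABLE` and `manual_b64encode` are names invented for this example.

```python
import base64
import string

# Index-to-character table: 0 -> 'A', 26 -> 'a', 52 -> '0', 62 -> '+', 63 -> '/'.
TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def manual_b64encode(data):
    """Encode bytes to a Base64 string by hand, using the lookup table."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to 3 bytes into a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Split the 24 bits into four 6-bit indexes (each in the range 0..63).
        indexes = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 real characters; the rest is '=' padding.
        keep = len(chunk) + 1
        out.append("".join(TABLE[ix] for ix in indexes[:keep]) + "=" * (4 - keep))
    return "".join(out)

text = b"InfosecInstitute"
print(manual_b64encode(text))  # SW5mb3NlY0luc3RpdHV0ZQ==
```

The last chunk of “InfosecInstitute” is the single byte “e”, so it produces two table characters followed by two “=” signs.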


Encoding and decoding Base64

There are many tools and online websites available to encode and decode base64 strings.

One can use following URL to encode and decode base64 string – and

In our case, we will use Python to encode and decode Base64 strings. For example, let’s try encoding and decoding “InfosecInstitute”.

Encoding: Here is how encoding is done using Python. Open a Python 3 terminal and run the following commands:

>>> import base64
>>> plain_text = "InfosecInstitute"
>>> # b64encode works on bytes, so encode the text first (Python 3)
>>> encoded = base64.b64encode(plain_text.encode("ascii"))
>>> print(encoded.decode("ascii"))
SW5mb3NlY0luc3RpdHV0ZQ==

This is how simple it is to encode a string to Base64.

Decoding: Here is how decoding is done using Python. Open a Python 3 terminal and run the following commands:

>>> import base64
>>> encoded = "SW5mb3NlY0luc3RpdHV0ZQ=="
>>> decoded = base64.b64decode(encoded)
>>> # b64decode returns bytes; decode them to get the text back (Python 3)
>>> print(decoded.decode("ascii"))
InfosecInstitute

This is how simple it is to decode a Base64 string.

Identifying Base64

It is not difficult to identify Base64 strings in a binary or in network traffic. Base64-encoded data usually appears as a long string made up of the Base64 character set (alphanumeric characters, + and /). If you come across such a long string, chances are high that it is Base64 encoded. Another simple technique is to check whether the long string ends with = or ==.

Example - SW5mb3NlYw==

The above string ends with ==. Base64 strings often end with = or ==, because = is added as padding whenever the input length is not a multiple of 3 bytes.
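A trailing == is a useful hint but not a guarantee, because the amount of padding depends on the input length. The short sketch below shows this with three inputs of 7, 8 and 9 bytes; `padding_count` is a helper name made up for this illustration.

```python
import base64

def padding_count(data):
    """Return how many '=' characters end the Base64 encoding of data."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))

# Inputs of 7, 8 and 9 bytes leave 1, 2 and 0 bytes in the final 3-byte group.
for word in (b"Infosec", b"Infosec1", b"Infosec12"):
    encoded = base64.b64encode(word).decode("ascii")
    print(len(word), encoded, padding_count(word))
```

The 7-byte input ends with ==, the 8-byte input ends with a single =, and the 9-byte input has no padding at all, so a detector that looks only for == will miss roughly two thirds of encoded strings.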

Another method to identify Base64 is to use a YARA rule. Here is a sample YARA rule to identify Base64 encoded strings:

rule base64
{
    strings:
        // the definitions of $a and $b are missing from this listing
    condition:
        $a or $b
}
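The same kind of check can also be approximated in Python. The sketch below is not a translation of the YARA rule above; it is a separate heuristic that looks for long runs of Base64 characters in a byte blob and keeps only the runs that actually decode. The function name `find_base64_candidates`, the minimum run length of 16 and the sample data are all assumptions chosen for this example.

```python
import base64
import binascii
import re

# Runs of at least 16 Base64 alphabet characters, optionally followed by padding.
# The minimum length of 16 is an arbitrary threshold chosen for this sketch.
CANDIDATE = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")

def find_base64_candidates(blob):
    """Return (offset, decoded_bytes) pairs for runs in blob that decode as Base64."""
    hits = []
    for match in CANDIDATE.finditer(blob):
        token = match.group()
        # Padded Base64 always has a length that is a multiple of 4.
        if len(token) % 4 != 0:
            continue
        try:
            decoded = base64.b64decode(token, validate=True)
        except binascii.Error:
            continue
        hits.append((match.start(), decoded))
    return hits

# Made-up sample: an HTTP request line carrying a Base64 value.
sample = b"\x00\x01GET /?q=SW5mb3NlY0luc3RpdHV0ZQ== HTTP/1.1\r\n"
for offset, decoded in find_base64_candidates(sample):
    print(offset, decoded)
```

This is only a heuristic. Any long alphanumeric word whose length happens to be a multiple of 4 will also decode, and an encoded string that sits directly next to other alphanumeric characters may be skipped, so the hits should be treated as leads for manual review.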


Nitesh Malviya

Nitesh Malviya is a Security Consultant. He has prior experience in Web Appsec, Mobile Appsec and VAPT. At present he works on IoT, Radio and Cloud Security and is open to exploring various domains of cybersecurity. He can be reached on his personal blog - and LinkedIn -