Penetration Testing: Intelligence Gathering

June 16, 2016 by Dimitar Kostadinov

Introduction: Intelligence Gathering & Its Relationship to the Penetration Testing Process

Penetration testing simulates real cyber-attacks, either directly or indirectly, to circumvent security systems and gain access to a company's information assets. The whole process, however, involves more than running automated tools, writing up a report, submitting it and collecting the check.

The Penetration Testing Execution Standard (PTES) is a standard adopted by leading members of the security community to establish a set of fundamental principles for conducting a penetration test. Seven phases form the foundation of this standard: Pre-engagement Interactions, Intelligence Gathering, Threat Modeling, Vulnerability Analysis, Exploitation, Post Exploitation, and Reporting.

Intelligence gathering is the first stage in which direct actions against the target are taken. One of the most important abilities a pen tester should possess is knowing how to learn as much as possible about a targeted organization before the test has even begun – for instance, how the organization operates and conducts its day-to-day business. Most of all, he should make every reasonable effort to learn about its security posture and, by extension, how the organization can be attacked effectively. Every piece of information that a pen tester gathers can provide insight into the essential characteristics of the security systems in place.

What is the Difference Between Active and Passive Information Gathering?

Information gathering is at times referred to as Open Source Intelligence (OSINT). It may come in three different forms:

Active Information Gathering – under this method, the targeted organization may become aware of the ongoing reconnaissance process, since the pentester is actively engaging with the target. During this phase, he maps the network infrastructure, enumerates and/or scans the open services for vulnerabilities, and searches for unpublished directories, files and servers. Other similar activities include OS fingerprinting, banner grabbing, and web server application scanning.

Active information gathering requires more preparation from the person who performs it because it leaves traces, which are likely to alert the target or produce evidence against him in the course of a possible digital investigation. According to the predominant opinion of experts in the information security sector, however, the information gathering process is based to a great extent on the notion of passive reconnaissance whose goal is to collect information about the target via publicly available resources only. Therefore, the other two forms are considered typical of what is actually information gathering.

Semi-passive Information Gathering – with this technique, the target is profiled through methods that mimic regular Internet traffic and behavior. That rules out in-depth reverse lookups, brute-force DNS requests, and searches for "unpublished" servers or directories. Nevertheless, variations of these techniques are permitted so long as they are applied with a feather-light touch. Other telltale acts that a pentester should refrain from are running network-level port scans or crawlers. What is allowed, then? In a nutshell, querying only published name servers for relevant information and looking at metadata in published documents and files. As with passive information gathering, it all comes down to not drawing attention to any pentest activities whatsoever. Presumably, the target may still discover the activity after the fact, but only up to a point that leads to a dead end.

Passive Information Gathering – this option comes into play when there is an explicit requirement that the gathering activities not be detected by the target. In this case, the pentester cannot use tools that send traffic to the targeted company, either from his own host or from an "anonymous" one across the Internet. Not only is this technically burdensome, but the person who performs the pentest also has to substantiate his findings with whatever he can dig out of archived or stored information, which is at times out of date or incorrect because it is limited to what third parties have collected.

Passive reconnaissance activities may include (but are not limited to): identifying IP addresses and sub-domains, identifying external/third-party sites, identifying people, identifying technologies, identifying content of interest, and identifying vulnerabilities. Once again, none of these techniques involves intrusive scanning or probing of a given website. Instead, all of this information is gathered from the public domain, using techniques and tools readily available to anyone. It may all start, in fact, with manual research into the company's website for useful information such as:

Company contact names, phone numbers and email addresses

Company locations and branches

Other companies with which the target company partners or deals

News, such as mergers or acquisitions

Links to other company-related sites

Company privacy policies, which may help identify the types of security mechanisms in place
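Part of this manual review can be automated once a page has been saved as text. The following is a minimal sketch, assuming the page content is already available as a string; the function name extract_emails and the sample text are made up for illustration, and the pattern is deliberately simple rather than a complete address parser.

```python
import re

# Deliberately simple pattern: it will miss obfuscated addresses
# (for example "name [at] example.com") and is not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return the unique e-mail addresses found in page_text, lowercased and sorted."""
    return sorted({match.lower() for match in EMAIL_RE.findall(page_text)})


if __name__ == "__main__":
    # Hypothetical sample text standing in for a saved "Contact us" page.
    sample = "Contact sales@example.com or Support@Example.com. Press: press@example.org"
    for address in extract_emails(sample):
        print(address)
```

Because the function works on text that has already been collected, running it sends no traffic to the target.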

Other resources that may provide information about the targeted organization:

The SEC's EDGAR database if the company is publicly traded

Job boards, either internal to the company or external sites

Disgruntled employee blogs and Web sites

Trade press

So, in summary, active reconnaissance relies on traffic being sent to the targeted machine, whereas a pentester performing passive reconnaissance has to make do with whatever information he can accumulate from the Internet. The latter type is de facto undetectable, while the former is a head-on confrontation of sorts that the target machine may notice.

Remember that every element of the penetration testing process should be executed within the scope of the sanctioned pentest. In this connection, passive reconnaissance may be the only method allowed in some cases, since it is fairly unobtrusive.
The key point here is that exploitation is certainly important, but performing a thorough recon could prove very helpful at a later stage and also make the entire pentest go easier, faster and stealthier.

Social Engineering in the Context of Intelligence Gathering

Although social engineering is typically considered a form of passive or semi-passive information gathering, some forms of social engineering may fall into the "active reconnaissance" category.

Social engineering is deemed one of the most widespread avenues for gathering information on a particular individual or a firm. A lot of information is out there – just check the popular social media websites. Also, websites like Pipl, PeekYou, and Spokeo may come in handy as they will provide access to email addresses, locations, phone numbers, and even family tree information.

Clusters of seemingly unrelated information, such as products, services, business partners, suppliers, and analysis of information shared on corporate websites, can prove valuable for understanding the targeted organization better. Once you become acquainted with the "internal affairs" of the organization in question, you can try to come up with crafty social engineering schemes to help you replenish your reconnaissance well of information. A telephone call to the company's help desk (so-called vishing) may yield all kinds of information, even privileged information, if you are inventive enough. If the company is advertising a job, why not give them a call or write an email to ask questions about, inter alia, the organization?

While the term 'dumpster diving' literally means rummaging through someone else's trash, in cybersecurity it has the connotation of searching for all kinds of files which may divulge sensitive information, like passwords and access codes written down on sticky notes 'in perfect Spanish scrawl.' Seemingly innocuous information, such as an organizational chart, calendar, and phone list, may assist a pentester in creating vicious social engineering scams to gain access to the targeted company's network.

Eavesdropping – finding key places to stay without attracting attention is important. Ideally, you can play with your phone in the middle of a room full of people that share inside information among themselves which you cannot help but overhear. Such hot-spots are a café, bar or restaurant across the street from your target site. Timing is the key – eavesdropping is being in the right place at the right time.

Shoulder Surfing – this is a variation of eavesdropping; instead of straining his ears to hear something intriguing, the person who performs intelligence gathering attempts to obtain useful information, e.g., passwords, PINs, security codes, and similar data, by looking over someone's shoulder.

Establishing behavioral patterns (access paths, dress code, key locations, persons of interest, etc.) is part of social engineering. With the help of services such as a touchgraph (i.e., a visual representation of social interactions between people) and a Hoovers profile (which compiles various data on companies and produces a simplified view of the business), you may be able to crunch large amounts of data and set the stage for a successful social engineering scenario. Maltego is another excellent tool that allows you to assemble and logically arrange research results that lead to the profiling of individuals.

Google Hacking Overview

A pentester can use search tips called Google Hacks or Google Dorks to learn what Google knows about the targeted website. Google Hacking is a term used to describe the effective use of search operators that may reveal security vulnerabilities or misconfigurations in websites. A Google Dork query is a search string that makes use of advanced search operators to unearth information that is not immediately available. As a result, pentesters frequently use Google Hacking to find vulnerabilities, sensitive information and access pages on websites indexed by Google's search engine. Google Hacking can uncover the following information:

Source: Google Dorks: An Easy Way of Hacking by fr4nc1stein

Google Hacking Database created by Offensive Security is a very good source for passive Google-based vulnerability discovery. This website possesses a great number of Google hacks whose purpose is to mark specific vulnerabilities based on published advisories.

As a rule of thumb, a penetration test should begin with a passive reconnaissance phase. Public search engines have amassed enormous quantities of information on virtually every website on the Internet. Therefore, one should always give Google Hacking a go. You may be surprised to find pieces of data so revealing that you cannot help but wonder how they were left in the open. By way of illustration, if the target has placed sensitive data in publicly accessible folders on its web server or in the web root (www or public_html), then Google and every other search engine can crawl it. On top of that, most of these directories are not password protected. All this information is publicly available, albeit without the website owner's consent.

Network mapping is an essential part of information gathering, and Google Hacking can also be used to locate the subdomains of the target website.
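As a rough illustration of how such queries are put together, the sketch below assembles dork strings from a few widely documented operators (site:, -site:, filetype:, intitle:, inurl:). The helper names build_dork and search_url are hypothetical, example.com is a placeholder, and the code only builds strings; it does not send any request.

```python
from urllib.parse import urlencode


def build_dork(domain, filetype=None, intitle=None, inurl=None, exclude_hosts=()):
    """Assemble a search string restricted to one domain (illustrative helper)."""
    parts = [f"site:{domain}"]
    # Excluding already known hosts helps other subdomains surface in the results.
    for host in exclude_hosts:
        parts.append(f"-site:{host}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def search_url(query):
    """Return a search URL for the query; nothing is fetched here."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Subdomain discovery: everything on the domain except the main web host.
    print(build_dork("example.com", exclude_hosts=["www.example.com"]))
    # Open directory listings and published PDF files.
    print(build_dork("example.com", intitle="index of"))
    print(search_url(build_dork("example.com", filetype="pdf")))
```

Keeping the query builder separate from any fetching makes it easy to review each search string, and to confirm it is within the scope of the engagement, before it is used.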

DNS Analysis

The Domain Name System translates domain names, which are easy for humans to memorize, into the numerical IP addresses needed to locate and identify computer devices and services via the underlying network protocols. Misconfigured DNS nameservers, however, may lead to security vulnerabilities that cause, among other things, leakage of information about the domain. The name-to-address mappings are kept in a local cache or in a zone file on the server.

To put it in simple terms, a forward DNS lookup uses a domain name to find an IP address, and a reverse DNS lookup works the other way around. The forward DNS lookup is the more common option. The process starts as soon as the user enters a web address (formally called a URL) into his browser: the host name is passed to a DNS resolver, which queries the responsible name servers and returns the matching IP address.
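Both directions can be tried from Python's standard library. This is a minimal sketch that relies on the system resolver; the function names are chosen for illustration, and 192.0.2.10 is a documentation address rather than a real host.

```python
import ipaddress
import socket


def forward_lookup(hostname):
    """Resolve a host name to an IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


def reverse_lookup(ip):
    """Resolve an IP address back to a host name, or return None if that fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


def reverse_query_name(ip):
    """Return the special domain name that a reverse lookup asks about."""
    return ipaddress.ip_address(ip).reverse_pointer


# A reverse lookup is an ordinary DNS query for a PTR record under in-addr.arpa.
print(reverse_query_name("192.0.2.10"))
```

Calling forward_lookup with a real domain name, or reverse_lookup with one of its addresses, sends queries to the configured DNS resolver rather than to the target's web server, but the authoritative name servers of the domain may still log them.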

A set of information is linked to each domain from the moment it is created: IP addresses, registration/creation date, owner of the domain, name servers, domain availability, etc.

One can obtain this information and perform a DNS lookup in several ways:

Online tools available for DNS lookup:

DNS Stuff

Domain Tools

Open Directory Web Tools

DNS Watch

Into DNS

Network Tools

Security Space WhoIs Gateway

MX Toolbox

Nslookup – a tool that runs on both Linux and Windows. It can be used to perform forward and reverse DNS lookups and to query a DNS server for information about a host. Open the command prompt on your Windows machine and type 'nslookup' followed by the domain name.

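Adding the '-type=mx' parameter to the nslookup command returns the mail exchanger (MX) records of the domain, i.e., the mail servers that accept email for it.

Example: nslookup -type=mx followed by the domain name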

IP Config Command – on Windows, ipconfig /displaydns displays the contents of the local DNS cache (Record Name, Record Type, PTR Record, A-Host Record, Time to Live, Data Length, Section), which shows the host names a machine has resolved since the cache was last cleared.

Host Command – a Linux DNS lookup tool that reveals the IP address for a domain or host name. Syntax: host followed by the domain name.

Dig is another handy Windows and Linux-based DNS lookup tool.
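The command-line tools above can also be driven from a script. The sketch below builds the equivalent MX queries for nslookup, host and dig, and runs whichever tool is installed; the function names and the default record type are assumptions made for this illustration.

```python
import shutil
import subprocess


def dns_commands(domain, record_type="MX"):
    """Return equivalent argument lists for the three lookup tools."""
    rtype = record_type.lower()
    return {
        "nslookup": ["nslookup", f"-type={rtype}", domain],
        "host": ["host", "-t", rtype, domain],
        "dig": ["dig", "+short", domain, record_type.upper()],
    }


def run_first_available(domain, record_type="MX", timeout=10):
    """Run the first of the tools found on PATH and return (tool, output)."""
    for tool, argv in dns_commands(domain, record_type).items():
        if shutil.which(tool):
            result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
            return tool, result.stdout
    return None, ""


# Show the commands without running them (example.com is a placeholder).
for argv in dns_commands("example.com").values():
    print(" ".join(argv))
```

Passing the arguments as a list rather than a single shell string avoids shell injection if the domain name comes from untrusted input.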

As mentioned above, the reverse DNS lookup reverses the process: you enter the IP address to obtain the domain/host name.

Source: Reverse DNS by

Source: How to Perform a DNS Lookup by Tech-FAQ

Websites for reverse DNS Lookup:

WHOIS lookup

WHOIS is a searchable database that contains information about domain owners. The following information can be obtained from a WHOIS search: registrar, WHOIS server, nameservers, registration date, expiration date, registrant name, email address, IP address, telephone number. The Internet Corporation for Assigned Names and Numbers (ICANN) requires that domains have valid WHOIS information.

You can retrieve this information via various WHOIS domain lookup systems, but it would perhaps be best to start with the database on ICANN's website. Linux or Mac users can perform a WHOIS search in a shell by typing whois followed by the domain name.

Sometimes WHOIS information is not available because some organizations specialize in offering private WHOIS registration. With this method, the domain owner's information is replaced with that of the privacy service.
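Under the hood, a WHOIS search is a very simple exchange: the client opens a TCP connection to port 43, sends the query followed by a line break, and reads text until the server closes the connection. The following is a minimal sketch of that exchange plus a loose parser. The default server whois.iana.org, the function names and the sample response are assumptions for illustration, and real responses vary from registry to registry.

```python
import socket


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys lowercased)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            fields.setdefault(key.strip().lower(), []).append(value)
    return fields


# Made-up response used to show the parser without touching the network.
SAMPLE = """% sample comment line
Registrar: Example Registrar
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
Creation Date: 2001-01-01T00:00:00Z
"""
print(parse_whois(SAMPLE)["name server"])
```

The query goes to the WHOIS server, not to the target organization, so this kind of lookup fits within passive reconnaissance. Since the response format is not standardized, the parser should be treated as a starting point only.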

Intelligence Gathering Tools (Examples)

Netcraft – a free online tool specializing in gathering information on webservers, which covers both the server and client side technologies. Available at (type the domain name).

MetaGoofil (Python-based) – a metadata collection tool that searches the Internet for documents related to your target and extracts their metadata. It is available on Kali Linux and is also compatible with Windows. Similar tools for extracting metadata from a file (Word/PDF/image) and displaying the results in formats such as HTML, XML, JSON, or a GUI: FOCA (GUI-based), meta-extractor, ExifTool (Perl-based).

Threatagent – another web-based tool, for which you need to sign up and then type in the domain name you want to investigate. In the end, the drone extracts all the information you have requested and submits it to you in the form of a thorough report, which comprises IP address range, email address, the point of contact, etc.
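To give an idea of what the metadata tools mentioned above look for, the sketch below reads the author fields of an Office Open XML document (.docx, .xlsx, .pptx), which is a ZIP archive containing a docProps/core.xml file. It is not how MetaGoofil, FOCA or ExifTool are implemented; the function names are hypothetical, and the sample document is generated in memory so that the example runs without a real file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def ooxml_core_metadata(file_obj):
    """Return selected core properties from a path or file-like object."""
    with zipfile.ZipFile(file_obj) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text_of(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "creator": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
        "title": text_of("dc:title"),
    }


def make_sample_document():
    """Build a tiny stand-in archive with made-up metadata (not a valid .docx)."""
    core = (
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}">'
        "<dc:creator>jdoe</dc:creator>"
        "<cp:lastModifiedBy>asmith</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    buffer.seek(0)
    return buffer


print(ooxml_core_metadata(make_sample_document()))
```

User names recovered this way often follow the organization's account naming convention, which is why published documents are a useful source during passive reconnaissance.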


As you can see, penetration testers have many methods and resources at their disposal for carrying out intelligence gathering, one of the most significant phases of the penetration testing process as a whole. Whoever attaches some importance to his cybersecurity, therefore, should at least know what information is publicly available about him and his business. Supposedly, when you know what may be used against you, the probability of negative events happening because of this information decreases considerably. In the end, that is the main goal of penetration testing.

Reference List

Acunetix. What is Google Hacking? Available at (12/06/2016)

Czumak, M. (2014). Passive Reconnaissance. Available at (12/06/2016)

fr4nc1stein (2015). Google Dorks: An Easy Way of Hacking. Available at (12/06/2016)

Hack Cave. The Basics of Penetration Testing. Available at (12/06/2016)

Gianchandani, P. (2011). DNS Hacking (Beginner to Advanced). Available at (12/06/2016)

Google Hacker (2015). Using Google as a website vulnerability scanner. Available at (12/06/2016)

Gupta, T. (2010). 5 penetration test tools to secure your network. Available at (12/06/2016)

n00bs. Intelligence Gathering. Available at (12/06/2016)

Octogence Technologies Pvt Ltd. Importance of Reconnaissance in Pentesting. Available at (12/06/2016)

Rouse, M. (2005). Forward DNS lookup. Available at (12/06/2016)

Rumy, S. (2016). Enumerating DNS records with DNSenum Tool in Kali Linux. Available at (12/06/2016)

Tech-FAQ (2016). Reverse DNS. Available at (12/06/2016)

Tech-FAQ (2016). How to Perform a DNS Lookup. Available at (12/06/2016)

True Demon (2015). The Hacker Ethos. Available at (12/06/2016)

Wing (2014). 15 Penetration Testing Tools-Open Source. Available at (12/06/2016)

Vines, R. (2016). Penetration testing reconnaissance -- Footprinting, scanning and enumerating. Available at (12/06/2016)

Webster (2016). Google Dorks: How to Use Google for Hacking. Available at (12/06/2016)

(2013). What is a WHOIS Search. Available at (12/06/2016)

Dimitar Kostadinov

Dimitar Kostadinov applied for a 6-year Master’s program in Bulgarian and European Law at the University of Ruse, and was enrolled in 2002 following high school. He obtained a Master’s degree in 2009. From 2008 to 2012, Dimitar worked in data entry and research for the American company Law Seminars International and its Bulgarian-Slovenian business partner DATA LAB. In 2011, he was admitted to the Law and Politics of International Security program at Vrije Universiteit Amsterdam, the Netherlands, graduating in August of 2012. Dimitar also holds an LL.M. diploma in Intellectual Property Rights & ICT Law from KU Leuven (Brussels, Belgium). Besides legal studies, he is particularly interested in Internet of Things, Big Data, privacy & data protection, electronic contracts, electronic business, electronic media, telecoms, and cybercrime. Dimitar attended the 6th Annual Internet of Things European summit organized by Forum Europe in Brussels.