Log analysis

Log analysis plays a major role in an investigation, and this article provides a gentle introduction to it. Log analysis is exactly what it sounds like: examining log files to extract the information they contain.

A log file can record who is accessing a company’s assets, how they are accessing them, when, and how many times. The analysis differs depending on the type of attack and the source of the logs.

This article specifically discusses analysis of logs taken from a web server.

Following is a sample log entry from a webserver’s access logs.

127.0.0.1 - - [31/Jan/2021:16:09:05 -0400] "GET /admin/ HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)"

Need for log analysis

Logging simply stores records of events on the server. Those records become useful only once they are analyzed. Later in this article, we will see an example of how to analyze logs to determine whether attacks are being attempted against a website.

Logs taken from a place like a busy web server or a proxy server with huge traffic will usually contain a large number of log entries to go through.

Web applications are one of the easiest entry points to gain access to an organization. Most logs associated with access to the web application will be stored in web servers, application servers, database servers, proxy servers and any other device involved in serving the application content. In addition to the attacks that originate from an external source, logs play a crucial role in providing insights into insider attacks, such as exfiltrating proprietary information.

Preventive measures such as a web application firewall (WAF) may not always be possible. If the company maintains logs on every device involved, a detailed analysis of a user's actions becomes possible. Log files have their limits, however: default web server logs contain only limited information about the HTTP request and response, and additional logging or configuration changes are needed to capture more detail.

Understanding access logs

As mentioned earlier, we will focus on analyzing web server logs, so it is worth understanding what a simple log entry contains. Let's break down the log entry shown at the beginning of this article.

127.0.0.1 - - [31/Jan/2021:16:09:05 -0400] "GET /admin/ HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)"

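127.0.0.1: The IP address of the client.
- -: The client identity and the authenticated user name. A hyphen means the value is not available.
[31/Jan/2021:16:09:05 -0400]: The time the server received the request, including the time zone offset.
GET /admin/ HTTP/1.1: The request line sent by the client, made up of the method, the requested resource and the protocol.
200: The status code that the server sends back to the client.
"-": The Referer HTTP request header. A hyphen means none was sent.
Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322): The User-Agent HTTP request header.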
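
The field breakdown above can be sketched in code. The following Python snippet is a minimal illustration, not a complete parser: the names LOG_PATTERN and parse_entry are hypothetical, and the regular expression assumes the layout of the sample entry, treating the response size as optional because the sample omits it.

```python
import re

# Assumed layout, based on the sample entry in this article.
# The response size is optional because the sample entry has none.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '            # client IP, identity, user name
    r'\[(?P<time>[^\]]+)\] '           # timestamp in square brackets
    r'"(?P<request>[^"]*)" '           # request line
    r'(?P<status>\d{3})'               # HTTP status code
    r'(?: (?P<size>\d+|-))?'           # optional response size
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'  # optional headers
)

def parse_entry(line):
    """Return the fields of one access log line as a dict, or None."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = ('127.0.0.1 - - [31/Jan/2021:16:09:05 -0400] "GET /admin/ HTTP/1.1" '
          '200 "-" "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; '
          '.NET CLR 1.1.4322)"')

entry = parse_entry(sample)
print(entry["ip"], entry["status"], entry["request"])
```

Servers can be configured with custom log formats, so the pattern would need adjusting to match the format actually in use.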

Analyzing a web application attack

Suppose an alert has been raised about an SQL injection attempt on a web application. Let us examine some logs and investigate. The first thing to obtain is the access log from the web server. The server being investigated is Debian-based, and the default location of Apache server logs on Debian systems is /var/log/apache2/access.log.

If the logs are small, or if we are looking for a specific keyword, we can inspect them manually with simple tools like grep. The following excerpt searches for all requests that contain the keyword “union” in the URL.

$ cat access.log | grep 'union'

192.168.1.85 - - [31/Jan/2021:22:58:14 +0800] "GET /site/?id=1 union+select+1%2C2%2C3-- HTTP/1.1" 200 379 "http://192.168.1.99/site/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36"

192.168.1.85 - - [31/Jan/2021:22:58:21 +0800] "GET /site/?id=1 union+select+1%2C2%2C3%2C4-- HTTP/1.1" 200 379 "http://192.168.1.99/site/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36"

192.168.1.85 - - [31/Jan/2021:22:58:29 +0800] "GET /site/?id=1 union+select+1%2C2%2C3%2C4%2C5-- HTTP/1.1" 200 379 "http://192.168.1.99/site/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36"

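As the preceding excerpt shows, entries containing the word union are present in the access logs, and a client at IP address 192.168.1.85 has attempted SQL injection. The number of selected columns grows from three to five across the requests, a typical pattern when an attacker is probing for the column count of the original query.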
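
The same keyword search can be sketched in Python, which makes it easier to decode the URL before matching and to count hits per client. This is a minimal sketch under stated assumptions: the names SQLI_PATTERN, REQUEST_PATTERN and suspicious_ips are hypothetical, the pattern covers only the union/select case seen above, the log lines are truncated after the response size for brevity, and the last line is an invented benign request added for contrast.

```python
import re
from collections import Counter
from urllib.parse import unquote_plus

# Illustrative pattern only; real detection needs a broader rule set.
SQLI_PATTERN = re.compile(r"\bunion\b.*\bselect\b", re.IGNORECASE)
# Captures the client IP and the first quoted field (the request line).
REQUEST_PATTERN = re.compile(r'^(?P<ip>\S+) .*?"(?P<request>[^"]*)"')

def suspicious_ips(lines):
    """Count requests per client IP whose decoded request matches SQLI_PATTERN."""
    hits = Counter()
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if not match:
            continue
        # Decode %XX escapes and '+' so encoded payloads are matched too.
        decoded = unquote_plus(match.group("request"))
        if SQLI_PATTERN.search(decoded):
            hits[match.group("ip")] += 1
    return hits

sample_lines = [
    '192.168.1.85 - - [31/Jan/2021:22:58:14 +0800] "GET /site/?id=1 union+select+1%2C2%2C3-- HTTP/1.1" 200 379',
    '192.168.1.85 - - [31/Jan/2021:22:58:21 +0800] "GET /site/?id=1 union+select+1%2C2%2C3%2C4-- HTTP/1.1" 200 379',
    '192.168.1.85 - - [31/Jan/2021:22:58:29 +0800] "GET /site/?id=1 union+select+1%2C2%2C3%2C4%2C5-- HTTP/1.1" 200 379',
    # Hypothetical benign request, not taken from the investigated logs.
    '192.168.1.20 - - [31/Jan/2021:22:57:02 +0800] "GET /site/?id=1 HTTP/1.1" 200 379',
]

for ip, count in suspicious_ips(sample_lines).items():
    print(ip, count)
```

Decoding before matching matters because a payload such as union%20select would slip past a pattern that expects a literal space or plus sign.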

To check whether this query was executed by the database, the database logs must be investigated. In this case, the database is MySQL, and query logging must be enabled for executed queries to be recorded. Let us assume that query logging is enabled, so all executed queries are stored in a file or a database table, depending on the database configuration.

In the current scenario, these logs are stored in a log file, which has been obtained and saved offline for investigation. The following entries from the MySQL log show that a union query found earlier in the access logs was indeed executed by the database.

$ tail -n 3 mysql-logs.log

2021-01-31T15:06:32.689985Z 23 Connect root@localhost on infosec using Socket

2021-01-31T15:06:32.690109Z 23 Query SELECT * FROM products where ID=1 union select 1,2,3,4,5--

2021-01-31T15:06:32.690579Z 23 Quit

This log confirms not only that an unauthorized query was executed on the database, but also that the web application is vulnerable to SQL injection.
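
Tying an access log entry to a database log entry means comparing timestamps that are written in different formats and time zones. The sketch below normalizes both to UTC and computes the gap between them. The helper names are hypothetical, and the code assumes the MySQL timestamp is in UTC, as its trailing Z suggests.

```python
from datetime import datetime, timezone

def parse_access_time(text):
    """Parse an access log timestamp such as '31/Jan/2021:22:58:29 +0800'."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")

def parse_mysql_time(text):
    """Parse a MySQL log timestamp such as '2021-01-31T15:06:32.690109Z'."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    # Assumption: the trailing 'Z' means the timestamp is in UTC.
    return parsed.replace(tzinfo=timezone.utc)

# Timestamps taken from the last union request and the logged query.
access_time = parse_access_time("31/Jan/2021:22:58:29 +0800")
query_time = parse_mysql_time("2021-01-31T15:06:32.690109Z")

print(access_time.astimezone(timezone.utc).isoformat())
print(query_time - access_time)
```

For the sample entries, the two timestamps are roughly eight minutes apart once converted. In a real investigation, a gap like this is a reason to check the clock settings on both hosts before treating two entries as the same request.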

Similarly, we can search for other attacks using keywords commonly seen in web attacks. The following excerpt searches for requests that try to read the /etc/passwd file. Such entries indicate a local file inclusion attempt. A sample may look like the following.

$ cat access.log | grep 'passwd'

192.168.1.85 - - [31/Jan/2021:22:59:54 +0800] "GET /site/?id=..%2F..%2F..%2F..%2F..%2Fetc%2Fpasswd HTTP/1.1" 200 380 "http://192.168.1.99/site/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36"

The string ..%2F..%2F..%2F..%2F..%2Fetc%2Fpasswd in the log is the URL-encoded form of ../../../../../etc/passwd, a common payload used to check whether an application is vulnerable to local file reads.
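
The decoding step can be reproduced with Python's standard library. The following is a minimal sketch: decode_fully and looks_like_traversal are hypothetical helper names, and the check for ../ sequences is a simple heuristic rather than a complete detection rule.

```python
from urllib.parse import unquote

def decode_fully(value, max_rounds=3):
    """URL-decode repeatedly so double-encoded payloads are also revealed."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value

def looks_like_traversal(value):
    """Heuristic: report whether the decoded value contains a ../ or ..\\ sequence."""
    decoded = decode_fully(value)
    return "../" in decoded or "..\\" in decoded

payload = "..%2F..%2F..%2F..%2F..%2Fetc%2Fpasswd"
print(decode_fully(payload))
print(looks_like_traversal(payload))
```

Decoding in a loop is a design choice: attackers sometimes encode a payload twice (for example %252F for a slash) to slip past filters that decode only once.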

The presence of this payload in an access log is a strong indication of an attack attempt. The HTTP response code 200 in this case doesn’t necessarily mean that the file was successfully read; this requires further investigation.

Conclusion

This article has shown examples of log analysis using simple web attacks, but log analysis can be much more challenging when investigating real attacks. Investigators often receive huge volumes of logs, and log correlation becomes a necessity to ease the analysis.

In addition, attackers use stealthy techniques, so the more complex the attack being investigated, the harder the analysis becomes.

Posted: January 18, 2021

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds an Offensive Security Certified Professional (OSCP) Certification.