And Again, About Storing Passwords
When the question of storing passwords arises, the first idea is to simply keep them in clear text in the corresponding database table. However, in 2018 cybercriminals are very good at getting access to such passwords: SQL injection is well known, and there are many other potential vulnerabilities. It is generally accepted practice to assume the worst-case scenario and prepare an action plan. Let's assume that an attacker has found a loophole in the web application and, one way or another, can download the table with the names and passwords of users. In general, his further actions may be as follows:
- Performing illegal actions on behalf of users with their credentials on the vulnerable website. For example, if a bank card is attached to the account, the attacker can now use it.
- Trying the password on other websites. Some users tend to use the same password for different services.
- Trying to work out the rule by which passwords are created and applying it to other websites. Users often have different passwords on different websites but build them by the same rule, and such rules can be identified.
- Elevating privileges. The same table may store the administrator password, which the attacker can use to get full control over the server.
Encryption vs. Hashing
So, keeping passwords in plaintext is not a great idea. What should we do? It would be nice to store passwords in a form that is protected in some way. Even if they get retrieved, attackers will not be able to read and use them or, at least, will spend too much time restoring them.
And we come to the point where we must choose between the two approaches: encrypt passwords or use hashing. Let's compare these options.
1. Labor intensity. Encryption takes more time. Irrespective of the crypto algorithm we choose, we will have to encrypt and decrypt data with each password check. Hash functions, by contrast, are designed to be fast to compute.
2. The length of the output values. The result of data encryption has a variable length, while the result of hashing always has the same length. Uniformly sized data is very convenient to store in the database. And let's not forget that the length of an encrypted password gives clues about the length of the original password. The fixed length of hashes, however, leads to the possibility of collisions.
3. Key management. Encryption requires a key. Keys must be stored somewhere, in the hope that no one will find them. Key generation and management is a big deal: keys should be strong, they need to be changed regularly, and so on.
4. Possibility of collisions. With encryption under a given key, different inputs always produce different outputs. With hashing, this is not always the case. The fixed hash length means that the set of output values of the hash function is limited, which leads to the possibility of collisions.
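The fixed output length from point 2 is easy to check. Below is a minimal sketch in Python using the standard hashlib module; the sample passwords and the helper name digest_lengths are made up for illustration.

```python
import hashlib


def digest_lengths(passwords):
    """Return the length of the SHA-256 hex digest for each password."""
    return [len(hashlib.sha256(p.encode("utf-8")).hexdigest()) for p in passwords]


# Hypothetical sample passwords of very different lengths.
samples = ["a", "correct horse battery staple", "x" * 1000]

# SHA-256 always yields 32 bytes, i.e. 64 hexadecimal characters,
# no matter how long the input is.
print(digest_lengths(samples))  # [64, 64, 64]
```

Since every input, however long, maps to one of a limited set of 64-character values, distinct inputs that share a hash must exist in principle. That is the collision possibility from point 4.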
Thus, in most cases, hashing wins. However, let's go on and look at hashing in more detail.
Attacks on hashed passwords
So, the attacker managed to obtain our table with the names of users and their passwords. The passwords are now hashed, but this does not stop our attacker: he is very serious and intends to restore them. Here are his possible further actions:
- Dictionary attacks. If the master password of administrators does not work, attackers will turn to a list of the most popular passwords and try their luck by comparing the hashes of those passwords with the stolen ones.
- Rainbow tables. Attackers may not even need to launch dictionary or brute-force attacks. They may use rainbow tables, which are widely available on the Internet. Rainbow tables contain hash values and the corresponding input data already calculated by someone else. It is important to note that, due to collisions, the password that the rainbow table offers is not necessarily the one the user created. Pre-calculated values are already available for MD5, SHA-1, SHA-256, SHA-512, as well as for their modifications.
- Exhaustive search. If the above methods do not help, the attacker will have to resort to brute force and search through all possible passwords until the hashes finally match.
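To make the first two attacks concrete, here is a minimal sketch in Python. It assumes unsalted SHA-256 hashes, and the word list and function names are hypothetical. Note that the precomputed dictionary below is a plain lookup table; a real rainbow table uses hash chains to save space, but the idea of paying the computation cost once and reusing it is the same.

```python
import hashlib


def sha256_hex(password):
    """Hash a password with plain, unsalted SHA-256."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, candidates):
    """Hash each candidate in turn and return the one that matches, if any."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


def build_lookup_table(candidates):
    """Precompute hash -> password once, so later lookups cost nothing."""
    return {sha256_hex(candidate): candidate for candidate in candidates}


# Hypothetical list of popular passwords and a hypothetical leaked hash.
common_passwords = ["123456", "password", "qwerty", "letmein"]
leaked_hash = sha256_hex("qwerty")

print(dictionary_attack(leaked_hash, common_passwords))        # qwerty
print(build_lookup_table(common_passwords).get(leaked_hash))   # qwerty
```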
In the most general case, an attacker will have to brute-force passwords, and his success will depend, among other things, on the speed of computing the hash function. For example, Java implementations of hash functions were run one million times each on a 36-character input on 64-bit Windows 10 with one Intel i7 2.60GHz and 16GB of RAM. They delivered the following results:
MD5 - 627 ms
SHA-1 - 604 ms
SHA-256 - 739 ms
SHA-512 - 1056 ms
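A comparable measurement can be sketched in Python with hashlib. This is not the Java benchmark that produced the numbers above: the 36-byte sample input and the reduced round count are assumptions chosen to keep the run short, and absolute timings will differ from machine to machine.

```python
import hashlib
import time


def time_hash(name, data, rounds):
    """Return the seconds needed to hash `data` `rounds` times with algorithm `name`."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    return time.perf_counter() - start


if __name__ == "__main__":
    sample = b"x" * 36   # assumed 36-byte input
    rounds = 100_000     # fewer rounds than the one million used above
    for name in ("md5", "sha1", "sha256", "sha512"):
        elapsed = time_hash(name, sample, rounds)
        print(f"{name}: {elapsed * 1000:.0f} ms")
```

Whatever the exact figures, the point stands: each of these functions computes a hash in about a microsecond or less, which is convenient for the server and just as convenient for the attacker.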
Today, however, brute-force attacks can be parallelized and run many times faster on a GPU (as well as on an APU, DSP, or FPGA). Fortunately, in addition to choosing a slower algorithm and a longer output, you can do something else.
Hashing the hash
To prevent the intruder from using ready-made rainbow tables, there is a technique of hashing the password several times. That is, we compute the hash of the hash of the hash ... and so on, N times. Now a ready-made rainbow table will not be easy to use, and the time for a brute-force attack will noticeably increase. However, it is recommended not to get carried away with N, because the server has to do the same work every time it verifies a user's password.
However, nothing can stop an attacker from generating his own rainbow table using the common passwords lists and knowing the hashing algorithm used.
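The repeated hashing described in this section can be sketched as follows. This is a minimal illustration in Python; the function name iterated_hash and the choice of SHA-256 are assumptions, not a prescribed scheme.

```python
import hashlib


def iterated_hash(password, rounds):
    """Apply SHA-256 `rounds` times: hash the password, then hash each result again."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    digest = password.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


# A single round equals the ordinary hash; every extra round changes the
# result and adds the same cost for both the server and the attacker.
print(iterated_hash("qwerty", 1) == hashlib.sha256(b"qwerty").hexdigest())  # True
print(iterated_hash("qwerty", 1000) == iterated_hash("qwerty", 1))          # False
```

The scheme is still deterministic: the same password with the same N always gives the same result, which is exactly why an attacker who knows the algorithm can build his own table.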
Add some salt
To avoid the above situation, passwords are now hashed together with a salt.
The salt is an additional random string that is added to the password and is hashed along with it. Even knowing the salt and the output hash, the attacker is doomed to brute-force, and no pre-computed tables are likely to help him.
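A minimal sketch of salting in Python might look like this. The 16-byte salt length, the choice of SHA-256, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def hash_with_salt(password, salt):
    """Hash the salt and the password together."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def create_record(password):
    """Generate a fresh random salt and return (salt, hash) for storage."""
    salt = os.urandom(16)  # assumed salt length: 16 random bytes
    return salt, hash_with_salt(password, salt)


def verify(password, salt, expected_hash):
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_with_salt(password, salt), expected_hash)


salt, stored = create_record("qwerty")
print(verify("qwerty", salt, stored))   # True
print(verify("123456", salt, stored))   # False
```

Because the salt is random, a table precomputed without it is of no use: the attacker would need a separate table for every salt value he encounters.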
Taxonomy of "salting" passwords:
1. By the principle of salting:
- Unique salt for each user. Even if the salt becomes known to the attacker, he will have to brute-force each user's password separately. Besides, even if two users think alike and come up with identical passwords, their hashes will still be different.
- Global salt which is the same for all and used for all hashes.
- Both of the above variants.
2. By the salt storage method:
- Database storage. As a rule, individual salts are stored in the same database as the password hashes.
- Code storage. The global salt is usually stored not in the database but, for example, in a config file, so that the attacker has to spend extra time obtaining it.
Let's assume that the individual salts of users are stored in the database, and the global salt is stored in the config file. The attacker got access to the database, and he knows all the hashes and the corresponding salts (the global salt is not stored in the database, and he does not know it.) If he combines all the available methods, then, to get passwords in plaintext, he, being extremely goal-oriented, will face the following obstacles:
- He does not know the global salt, so it will have to be brute-forced.
- He knows the salts of users, but he does not have appropriate and pre-made tables with these salts, so he'll have to brute-force passwords.
- This process will take even more time because it is necessary to hash the hashes N times.
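Putting the three obstacles together, the whole scheme can be sketched in Python as below. Everything specific here is an assumption for illustration: the constant GLOBAL_SALT stands in for a value that would be loaded from the config file, ROUNDS is an arbitrary N, and the order in which the salts and the password are concatenated is just one possible choice.

```python
import hashlib
import hmac
import os

# Hypothetical values: in a real setup the global salt would come from a
# config file outside the database, and N would be tuned to the server.
GLOBAL_SALT = b"example-global-salt"
ROUNDS = 10_000


def protect(password, user_salt, global_salt=GLOBAL_SALT, rounds=ROUNDS):
    """Hash global salt + user salt + password, then re-hash the result (N rounds total)."""
    digest = hashlib.sha256(global_salt + user_salt + password.encode("utf-8")).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def register(password):
    """Create the row that goes into the database: per-user salt plus hash."""
    user_salt = os.urandom(16)
    return {"salt": user_salt, "hash": protect(password, user_salt)}


def check(password, record):
    """Verify a login attempt against a stored row."""
    return hmac.compare_digest(protect(password, record["salt"]), record["hash"])


row = register("qwerty")
print(check("qwerty", row))  # True
print(check("123456", row))  # False
```

An attacker holding only the database rows has the per-user salt and the hash, but without the global salt he cannot even start checking password guesses, and each guess costs him N hash computations.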