In this article, we will learn about hashing in python, hashing algorithms - MD5 (Message-Digest algorithm 5), SHA-1 (Secure Hash Algorithm 1), SHA-256 (Secure Hash Algorithm 256), SHA-512 (Secure Hash Algorithm 512), and the most recent & recommended hashing technologies for password storage is bcrypt, scrypt,and Argon2.
We will also look into what salting is all about, this is definitely going to be a very interesting article so let's dive right in.
What is Salting?
Cybercrime has increased in frequency over recent years, to gain access to passwords, user information, and other sensitive data, hackers are coming up with a variety of strategies. As a result, it is imperative that developers implement security measures to safeguard the data of their users. Salting passwords is one of these safeguards.
Salting is a security measure that prevents hackers from quickly cracking user passwords. Before hashing a user's password, it adds a random string of characters to it, salt is the name for the arbitrary string. The password and salt are combined, after which the resulting string is hashed. Password cracking techniques like dictionary attacks, brute force attacks, and rainbow table attacks are made more challenging by this process.
Reasons for Salting
There are several reasons why you should use salt when hashing passwords. Some of these reasons include:
Protection against insider attacks: Salting makes it harder for insiders to steal passwords since they won't be able to crack them easily.
Preventing brute force attacks: Brute force attacks involve trying all possible combinations of characters until the correct password is found. With salting, even if a hacker manages to crack one password, it won't work for other users with similar passwords.
Preventing dictionary attacks: With salting, even if the same password is used by multiple users, they will end up with different hashes. This makes it harder for hackers to use pre-computed tables like rainbow tables to crack passwords.
Compliance with regulations: Many data protection regulations require that passwords be stored securely. Salting passwords is a good way to ensure that you are compliant with these regulations.
In Python, salting can be done using the built-in hashlib
module. Here is an example of how to salt a password in Python:
import hashlib
import os
password = '1234!3#'
salt = os.urandom(32)
hashed_password = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)
In the example above, we first generate a random salt using the os.urandom()
method. We then use the hashlib.pbkdf2_hmac()
method to hash the password with the salt. The pbkdf2_hmac()
method is a key derivation function that is commonly used for password hashing. It takes four arguments:
The hash algorithm to use (in this case, SHA-256).
The password to hash (encoded as bytes).
The salt to use (also encoded as bytes).
The number of iterations to use (100000 in this case).
The output of the pbkdf2_hmac()
method is the hashed password, which can be stored in a database or used for authentication.
With that understanding, let's hop on to learn about hashing.
What is Hashing
Hashing in Python is the process of converting an input into a fixed-length sequence of bytes, called a hash or message digest. The hash is generated using a mathematical function that maps the input data to a fixed-size array of bytes. This process is irreversible, meaning that it is impossible to reconstruct the input data from the hash.
Hashing applications
Hashing has various applications including:
Data Integrity: Hashing can be used to ensure the integrity of data by generating a hash of the data and comparing it with the hash of the original data. If the hashes match, it indicates that the data has not been tampered with.
Password Storage: Hashing is commonly used to store passwords in a secure way. When a user creates a password, it is hashed and stored in a database. When the user logs in, the password they enter is hashed and compared with the stored hash. If the hashes match, the user is authenticated.
Indexing and Searching: Hashing can be used to create indexes for searching and retrieving data quickly. Hash tables are commonly used in computer science to implement this functionality.
Digital Signatures: Hashing can be used to create digital signatures, which can be used to verify the authenticity of digital documents. A digital signature is created by hashing the document and then encrypting the hash with the private key of the signer. The recipient can then use the signer's public key to decrypt the hash and compare it with the hash of the original document.
Hashing algorithms
Although there are several hashing algorithms in Python, we will only examine a few in this section.
MD5 (Message-Digest algorithm 5)
import hashlib # Create a hash object using MD5 algorithm hash_object = hashlib.md5() # Update the hash object with a message hash_object.update(b"Hello, world!") # Get the hexadecimal digest of the hash hex_dig = hash_object.hexdigest() print(hex_dig)
This code creates an instance of the
hashlib.md5()
class, updates it with the message "Hello, world!", and gets the hexadecimal digest of the hash. MD5 is a widely-used hash function, but it has some weaknesses and should not be used for cryptographic purposes.SHA-1 (Secure Hash Algorithm 1)
import hashlib # Create a hash object using SHA-1 algorithm hash_object = hashlib.sha1() # Update the hash object with a message hash_object.update(b"Hello, world!") # Get the hexadecimal digest of the hash hex_dig = hash_object.hexdigest() print(hex_dig)
This code creates an instance of the
hashlib.sha1()
class, updates it with the message "Hello, world!", and gets the hexadecimal digest of the hash. SHA-1 is also widely used, but it has been found to have some weaknesses and should be used with caution.SHA-256 (Secure Hash Algorithm 256)
import hashlib # Create a hash object using SHA-256 algorithm hash_object = hashlib.sha256() # Update the hash object with a message hash_object.update(b"Hello, world!") # Get the hexadecimal digest of the hash hex_dig = hash_object.hexdigest() print(hex_dig)
This code creates an instance of the
hashlib.sha256()
class, updates it with the message "Hello, world!", and gets the hexadecimal digest of the hash. SHA-256 is a more secure hash function than MD5 or SHA-1, and is commonly used in cryptography.SHA-512 (Secure Hash Algorithm 512)
import hashlib # Create a hash object using SHA-512 algorithm hash_object = hashlib.sha512() # Update the hash object with a message hash_object.update(b"Hello, world!") # Get the hexadecimal digest of the hash hex_dig = hash_object.hexdigest() print(hex_dig)
This code creates an instance of the
hashlib.sha512()
class, updates it with the message "Hello, world!", and gets the hexadecimal digest of the hash. SHA-512 is an even more secure hash function than SHA-256, but it is also slower and may not be necessary for all applications.
Python hashing Technologies
Scrypt: is a password-based key derivation function that is designed to be expensive to compute and hence difficult to attack via brute force methods. It is commonly used for password storage and authentication purposes. The scrypt algorithm is similar to bcrypt but is designed to be more memory-hard, making it more difficult to create specialized hardware for cracking scrypt hashes.
Let's see how it can be implemented.
import scrypt
# Hash a password
password = b"password1234#$"
salt = scrypt.gensalt()
hashed_password = scrypt.hash(password, salt)
# Verify a password
password_to_verify = b"password1234#$"
if scrypt.verify(password_to_verify, hashed_password):
print("Password is correct")
else:
print("Invalid password")
Bcrypt: is a popular password-hashing library used for secure password storage. It uses the Blowfish algorithm to perform the hash function, making it one of the most secure and slowest password-hashing functions available.
Let's see how it can be implemented.
import bcrypt
# Hash a password
password = b"password1234#$".encode()
salt = bcrypt.gensalt()
hashed_password = bcrypt.hashpw(password, salt)
# Verify a password
password_to_verify = b"password1234#$".encode()
if bcrypt.checkpw(password_to_verify, hashed_password):
print("Password is correct")
else:
print("Invalid password")
Argon2: is a newer password hashing algorithm that is currently considered to be one of the strongest and most secure hashing algorithms available. It was the winner of the Password Hashing Competition in 2015 and is now widely used for password storage in many modern applications.
Here's an example of how to use the Argon2 algorithm in Python using the
argon2
library:
import argon2
# Define the password to be hashed
password = b'password1234#$'
# Hash the password using Argon2
hasher = argon2.PasswordHasher()
hashed_password = hasher.hash(password)
# Print the hashed password
print(hashed_password)
# Verify a password against a hashed password
# Note that this will return True if the password matches the hash
# and False if it does not.
new_password = b'password1234#$'
try:
hasher.verify(hashed_password, new_password)
print("Password is valid.")
except argon2.exceptions.VerifyMismatchError:
print("Password is incorrect.")
In this example, we first define the password that we want to hash. We then create an instance of the argon2.PasswordHasher()
class and use it to hash the password. The resulting hashed password is stored in the hashed_password
variable.
We can then verify a password against the hashed password using the verify()
method of the argon2.PasswordHasher()
class. If the password matches the hash, the method will return True
. If the password does not match the hash, the method will raise a argon2.exceptions.VerifyMismatchError
.
Conclusion
Wrapping up, hashing and salting are important techniques used in enhancing the security of passwords and sensitive data in Python. Hashing enables the creation of a fixed size, unique, and irreversible representation of data while salting adds an extra layer of security by adding random strings to the data before hashing. We have discussed the popular hashing algorithms used in Python, including SHA-256, bcrypt, scrypt, and Argon2, with code examples and explanations for each. It is important to choose the right algorithm based on your specific security requirements.
Let's connect on Twitter and on LinkedIn. You can also subscribe to my YouTube channel.
Happy Coding!