One error that you might encounter when running Python code is:
TypeError: Unicode-objects must be encoded before hashing
Or, if you’re using a newer Python version:
TypeError: Strings must be encoded before hashing
Both errors occur when you pass a string to one of the hashlib module’s hashing functions.
The following examples show how you can fix this error in your code.
How to reproduce the error
Suppose you want to create a SHA-256 hash from a string in Python.
You may pass a string variable to the hashlib.sha256() function as follows:
import hashlib
name = "Nathan Sebhastian"
hash = hashlib.sha256(name)
But because hashlib’s hashing functions require a bytes-like object rather than a string, Python responds with an error:
Traceback (most recent call last):
File ...
hash = hashlib.sha256(name)
TypeError: Unicode-objects must be encoded before hashing
If you’re using Python version 3.9 or above, the error message has been changed slightly to:
TypeError: Strings must be encoded before hashing
Because every string (str) in Python 3 is a Unicode string, both messages mean the same thing.
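If you want to see the exact message your Python version produces without stopping the script, a small sketch like the one below catches the error and prints it. The helper name try_hash is made up for illustration and is not part of hashlib.

```python
import hashlib


def try_hash(value):
    """Return the SHA-256 hex digest of value, or None if hashlib rejects it.

    try_hash is a hypothetical helper used only for this demonstration.
    """
    try:
        return hashlib.sha256(value).hexdigest()
    except TypeError as error:
        # A str argument ends up here; the wording depends on the Python version.
        print(f"TypeError: {error}")
        return None


name = "Nathan Sebhastian"
rejected = try_hash(name)           # str input: hashlib raises TypeError
accepted = try_hash(name.encode())  # bytes input: hashing succeeds
```

Here rejected ends up as None, while accepted holds a 64-character hex digest.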
How to fix the error
To fix this error, encode the string to bytes before passing it to the hashing function.
This is easy to do with the str.encode() method:
import hashlib
name = "Nathan Sebhastian"
hash = hashlib.sha256(name.encode("UTF-8"))
Unless you have specific requirements, UTF-8 should be okay because it’s the most common character encoding.
The encode() method uses UTF-8 by default when you don’t pass any argument. I’m just showing you how to pass one if you need it.
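The choice of encoding matters because the hash is computed from the bytes, not from the characters. The sketch below uses a sample name with a non-ASCII character (my own example value) to show that the default matches UTF-8, and that a different encoding gives a different digest.

```python
import hashlib

name = "Zoë"  # sample value with a non-ASCII character, chosen for illustration

# encode() with no argument produces the same bytes as encode("utf-8")
same_bytes = name.encode() == name.encode("utf-8")

# A different encoding yields different bytes, and therefore a different digest
utf8_digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
latin1_digest = hashlib.sha256(name.encode("latin-1")).hexdigest()
digests_differ = utf8_digest != latin1_digest

print(same_bytes, digests_differ)  # True True
```

If two systems need to agree on a hash, they also need to agree on the encoding.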
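If you’re passing a literal string to the hashing function, you can write it as a bytes literal (with the b prefix) instead. Keep in mind that a bytes literal may contain only ASCII characters.
Here’s an example: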
import hashlib
hash = hashlib.sha256(b"Nathan")
print(hash.hexdigest())
# or
hash = hashlib.sha256("Nathan".encode())
print(hash.hexdigest())
# Output:
# 9770c8d8bf2f05a2803df9e56aa2ca3218f65c203b4bf21b348563d8ee7ac66f
# 9770c8d8bf2f05a2803df9e56aa2ca3218f65c203b4bf21b348563d8ee7ac66f
As you can see from the hexdigest() output, the hashing results are identical.
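If your code may receive either a string or bytes, one option is to wrap the check in a small function. This is only a sketch: sha256_hex is a hypothetical helper name, and defaulting to UTF-8 is an assumption you may need to change.

```python
import hashlib


def sha256_hex(value, encoding="utf-8"):
    """Hash str or bytes input; a str is encoded first (hypothetical helper)."""
    if isinstance(value, str):
        value = value.encode(encoding)
    return hashlib.sha256(value).hexdigest()


print(sha256_hex("Nathan") == sha256_hex(b"Nathan"))  # True
```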
You need to encode the string whether you use the sha256, sha512, or md5 hash algorithm.
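To check this yourself, you can loop over the algorithm names with hashlib.new(), which takes the name as a string. This sketch assumes all three algorithms are available in your Python build (md5 can be disabled on some restricted systems).

```python
import hashlib

data = "Nathan".encode()  # encode once, reuse the bytes for every algorithm

digests = {}
for algorithm in ("sha256", "sha512", "md5"):
    # hashlib.new() accepts the algorithm name and the initial bytes to hash
    digests[algorithm] = hashlib.new(algorithm, data).hexdigest()
    print(algorithm, digests[algorithm])
```

Passing the unencoded string "Nathan" to any of these would raise the same TypeError.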
Calling the update() method
Note that you also need to encode the string passed to a hash object’s update() method, like this:
import hashlib
hash = hashlib.sha256()
hash.update("Nathan".encode()) # ✅
If you don’t encode the string, Python raises the same TypeError.
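The update() method is mostly useful when the data arrives in pieces. As a rough sketch, the example below feeds three encoded chunks (sample values of my own) into one hash object and compares the result with hashing the full string in a single call.

```python
import hashlib

parts = ["Nathan", " ", "Sebhastian"]

incremental = hashlib.sha256()
for part in parts:
    incremental.update(part.encode())  # every chunk must be bytes

one_shot = hashlib.sha256("Nathan Sebhastian".encode())

print(incremental.hexdigest() == one_shot.hexdigest())  # True
```

Calling update() repeatedly gives the same digest as hashing the concatenated bytes at once.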
Now you’ve learned how to fix the Python ‘Unicode-objects must be encoded before hashing’ and ‘Strings must be encoded before hashing’ errors.
I hope this article was helpful. See you in other articles! 🍻