One common error that you might encounter in Python is:
SyntaxError: (unicode error) 'unicodeescape' codec
can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
This error occurs when a regular string literal contains a backslash followed by an uppercase U (\U). Python interprets \U as the start of a Unicode escape sequence, which must be followed by exactly eight hexadecimal digits.
This tutorial will show you an example that causes this error and how to fix it.
How to reproduce this error
Suppose you attempt to read a CSV file using pandas as shown below:
import pandas as pd
df = pd.read_csv('C:\Users\nathan\Desktop\example.csv')
print(df)
Output:
File "/Users/nsebhastian/Desktop/DEV/python/main.py", line 3
df = pd.read_csv('C:\Users\nathan\Desktop\example.csv')
^
SyntaxError: (unicode error) 'unicodeescape' codec
can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
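This error occurs because the file path contains \U in C:\Users. Python reads \U as the start of a Unicode escape and expects eight hexadecimal digits after it, but the characters that follow (starting with s) are not hexadecimal digits, so it reports the escape as truncated.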
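For contrast, here is a small sketch of what a complete \U escape looks like. The code point U+1F600 (a grinning face emoji) is chosen only for illustration; any valid eight-digit hexadecimal code point would do:

```python
# \U must be followed by exactly eight hexadecimal digits.
smiley = "\U0001F600"

print(len(smiley))       # 1, the whole escape becomes a single character
print(hex(ord(smiley)))  # 0x1f600
```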
This error has nothing to do with the pandas library or the CSV file, since passing another similar string to Python will trigger the error as well:
file_path = "C:\Users\nathan\Desktop\file.txt" # SyntaxError
The same error occurs when you assign the file path string to the file_path variable, as shown above. Python raises it while parsing the string literal, before any of the code runs.
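If you want to confirm that Python rejects the string while compiling the code, rather than while running it, one possible check is to hand the offending line to the built-in compile() function and catch the exception. The names source and error_message are placeholders for this sketch:

```python
# The offending line is stored in a raw string (the r prefix is explained
# later in this tutorial) so that this example itself can be parsed.
source = r'file_path = "C:\Users\nathan\Desktop\file.txt"'

error_message = None
try:
    # compile() only parses the source; it never executes the assignment.
    compile(source, "<example>", "exec")
except SyntaxError as err:
    error_message = err.msg

print(error_message)
```

The printed message should mention the 'unicodeescape' codec, just like the traceback shown earlier.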
How to fix this error
There are three possible solutions to resolve this error:
- Use the forward slash / for the path separator
- Use the r prefix to create a raw string
- Use two backslashes \\ for the path separator
Let’s see how these three solutions work in practice.
1. Use the forward slash / as path separator
One easy way to resolve this error is to replace the backslashes in the file path with forward slashes /, as shown below:
import pandas as pd
df = pd.read_csv('C:/Users/nathan/Desktop/example.csv') # ✅
print(df)
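With forward slashes instead of backslashes, the string contains no escape sequences, so the error disappears, and Windows still resolves the path correctly.
Additionally, the forward slash is the native path separator on Linux and macOS, so paths written this way are more portable (although a drive letter such as C: is still specific to Windows). Paths written with backslash separators, as in the other two solutions, only work on Windows.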
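As a rough illustration that Windows path rules treat both separators alike, the sketch below compares the two spellings with pathlib.PureWindowsPath from the standard library. This class applies Windows path rules without touching the file system, so the example can run on any operating system. The variable names are illustrative:

```python
from pathlib import PureWindowsPath

with_forward = PureWindowsPath("C:/Users/nathan/Desktop/example.csv")
# Backslashes are doubled here so the literal itself is valid.
with_backward = PureWindowsPath("C:\\Users\\nathan\\Desktop\\example.csv")

print(with_forward == with_backward)  # True
print(with_forward.name)              # example.csv
```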
2. Use the r prefix to create a raw string
Python allows you to create a raw string by prefixing the string literal with the r character.
In a raw string, Python doesn’t process escape sequences, so each backslash is kept as a literal character:
file_path = r"C:\Users\nathan\Desktop\file.txt"
The r prefix is added before the opening quotation mark of the string.
When you run the code, you won’t receive any errors. But keep in mind that a path with backslash separators won’t work on Linux and macOS. If you need a portable path, use forward slashes as separators instead.
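One limitation worth knowing, which the example above doesn’t run into: a raw string literal can’t end with a single backslash, because the parser still treats that backslash as escaping the closing quote. A possible workaround for a path that must end with a separator is sketched below; folder is a hypothetical name:

```python
# r"C:\Users\nathan\" would be a SyntaxError: the final backslash
# prevents the closing quote from ending the string.
folder = r"C:\Users\nathan" + "\\"

print(folder)               # C:\Users\nathan\
print(folder.endswith("\\"))  # True
```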
3. Use two backslashes \\ as path separator
Because the error is caused by the backslash, you can escape the backslash by adding another backslash.
This causes the backslash to be interpreted as is, without any special meaning:
file_path = "C:\\Users\\nathan\\Desktop\\file.txt"
With two backslashes \\ as shown above, Python treats each pair as a single literal backslash with no special meaning.
Note that this solution only works on Windows, as Linux and macOS use the forward slash as the separator.
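To check that the doubled backslashes exist only in the source code and not in the resulting string, you can print the value and count the separators, as in this short sketch:

```python
file_path = "C:\\Users\\nathan\\Desktop\\file.txt"

print(file_path)              # C:\Users\nathan\Desktop\file.txt
print(file_path.count("\\"))  # 4

# The escaped string and the raw string hold exactly the same characters.
print(file_path == r"C:\Users\nathan\Desktop\file.txt")  # True
```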
Conclusion
Now you’ve learned that the SyntaxError: (unicode error) 'unicodeescape' codec error occurs when a string contains \U, the start of a Unicode escape sequence, without eight hexadecimal digits after it.
The backslash character has a special meaning as the escape character in Python strings, as it does in many programming languages. Unfortunately, it’s also used as the path separator in Windows.
To resolve this error, you can replace the backslashes with forward slashes /, remove the special meaning of the backslash using the r prefix, or write each separator as a double backslash \\.
I hope this tutorial is helpful. Happy coding! 👍