Goal
This post explains why yaml.safe_load should be your default in Python.
The short reason is simple: yaml.load with an unsafe loader such as yaml.Loader can construct arbitrary Python objects and call functions named in the YAML input.
The Risk in One Example
import yaml
yaml_text = """
!!python/object/apply:builtins.print
- "Hello from yaml.load"
"""
print("Before load")
yaml.load(yaml_text, Loader=yaml.Loader)
print("After load")
If you run it, the print call named in the YAML document is executed during deserialization.
Expected output:
Before load
Hello from yaml.load
After load
This is exactly the problem. The parser is not only reading data. It is also interpreting Python-specific tags.
Why This Happens
yaml.Loader supports Python object construction tags such as:
!!python/object
!!python/object/new
!!python/object/apply
That feature can be useful in very narrow trusted environments, but it is dangerous for general config/input parsing.
If YAML comes from users, APIs, files you do not fully control, or any mixed-trust boundary, this can become a code execution path.
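To make the point concrete that the input, not the calling code, chooses what gets called, here is a minimal sketch using a harmless callable. It assumes the same yaml.Loader behavior as the example above; builtins.len is only an illustrative choice, and an attacker would name something far less benign.

```python
import yaml

# The document itself names the callable to invoke (here the harmless len).
doc = "!!python/object/apply:builtins.len [[1, 2, 3]]"

# yaml.Loader resolves the tag, looks up the name, and calls it with the arguments.
result = yaml.load(doc, Loader=yaml.Loader)
print(result)  # 3
```

The value returned by yaml.load is the return value of the function, which shows that loading this document is a function call rather than plain parsing.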
Safe Alternative
Use yaml.safe_load:
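import yaml
try:
    data = yaml.safe_load(yaml_text)  # yaml_text is the payload from the first example
    print(data)
except yaml.YAMLError as exc:
    print("Rejected:", exc)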
safe_load only constructs standard YAML types (mapping, list, string, number, bool, null, etc.).
It does not recognize Python object construction tags.
So a malicious payload like !!python/object/apply:... is not executed. safe_load raises yaml.constructor.ConstructorError instead.
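As a rough illustration of the "standard YAML types" point, the sketch below parses a small made-up config and prints the Python type of each value. The document contents and key names are hypothetical and exist only for this example.

```python
import yaml

# Hypothetical config document used only for illustration.
config_text = """
name: demo-service
port: 8080
debug: false
tags:
  - web
  - internal
timeout: null
"""

config = yaml.safe_load(config_text)

# Every value is a built-in Python type; no custom objects are constructed.
for key, value in config.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

The result is an ordinary dict holding str, int, bool, list, and None values, which is usually all a configuration file needs.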
Takeaway
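With an unsafe loader such as yaml.Loader, yaml.load does more than parse data: it can construct objects and call functions named in the input.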
If your goal is to parse configuration safely, yaml.safe_load is the right default.
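One way to make that default hard to bypass is to route all config parsing through a single helper. The function below is a minimal sketch, not a prescribed pattern: the name load_config, the choice to treat an empty document as an empty dict, and the requirement of a top-level mapping are all assumptions made for this example.

```python
import yaml


def load_config(text):
    """Parse YAML text into a dict using safe_load (hypothetical helper)."""
    try:
        data = yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # Covers syntax errors and rejected tags such as !!python/object/apply.
        raise ValueError(f"invalid or unsafe YAML: {exc}") from exc
    if data is None:
        # An empty document parses to None; treat it as an empty config here.
        return {}
    if not isinstance(data, dict):
        raise ValueError("expected a mapping at the top level")
    return data


print(load_config("port: 8080"))  # {'port': 8080}
```

Passing a document tagged with !!python/object/apply to this helper raises ValueError, because safe_load rejects the tag before anything is called.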