The Missing Semester of AI for Organizations #2: Risk of Pickle
Organizations need to be selective about the LLMs they adopt. In the world of generative AI, models are serialized and distributed in many different ways, so knowing how secure a pre-serialized model will be when it runs in your environment, and what risks it may carry, can save your organization from serious trouble at an early stage.
This article compiles research I conducted together with my colleague Tugay Aslan while working at KKB.
Object Serialization
Object serialization is frequently discussed and criticized for its weaknesses, while its actual purpose is overlooked. At the end of the day, programs need to store the state of the objects they create exactly as it is, and serialization achieves this by converting objects into a flat stream of bytes.
This allows complex data to be transferred over a network, within an application, or written to a file or database.
As an example, let's serialize an object and its state:
```python
import pickle
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    roles: list[str]

u = User(id=1, name="Ada", roles=["admin", "editor"])

# 1) Serialize (object -> bytes)
b = pickle.dumps(u, protocol=pickle.HIGHEST_PROTOCOL)
print("Size in bytes:", len(b))

# 2) Write to a file
with open("user.pkl", "wb") as f:
    f.write(b)

# 3) Read from the file + deserialize (bytes -> object)
with open("user.pkl", "rb") as f:
    u2: User = pickle.load(f)
print("Restored:", u2)
```
Today, serialization is most commonly applied to trained models: once training is complete, the weights and hyperparameters held in memory are serialized so the model can be reloaded later without retraining.
Some common model serialization methods:
- scikit-learn → `joblib.dump(model, "m.pkl")`
- PyTorch → `torch.save(model.state_dict(), "m.pt")`
- TensorFlow → `model.save("m.h5")`
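One nuance worth noting before moving on: `torch.save` relies on pickle under the hood, and recent PyTorch releases offer a `weights_only` mode on `torch.load` that restricts unpickling to tensors and simple containers (it has become the default in the newest versions). A minimal round-trip sketch:

```python
import torch

# torch.save serializes objects with pickle under the hood
state = {"linear.weight": torch.randn(4, 2), "linear.bias": torch.zeros(4)}
torch.save(state, "m.pt")

# weights_only=True refuses arbitrary globals and only rebuilds tensors
# and plain containers (available in recent PyTorch releases)
restored = torch.load("m.pt", weights_only=True)
print(restored["linear.weight"].shape)  # torch.Size([4, 2])
```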
How are models serialized?
Machine Learning Model Serialization
| Format | Pros | Cons |
|---|---|---|
| Pickle (.pkl) | Fast and easy | Insecure against untrusted input, Python-specific |
| Joblib | Optimized for large NumPy arrays | Limited to the Python ecosystem |
| JSON | Human-readable | Impractical for complex models |
| HDF5 (.h5) | Suitable for large models | Requires an additional library |
| ONNX | Portable between PyTorch and TensorFlow | Complex format |
| Protobuf (TF SavedModel) | Standard in the TensorFlow ecosystem | Limited support outside TensorFlow |
The table above shows the serialization methods commonly used for machine learning models in the industry.
We previously demonstrated serialization using pickle. Now let's serialize the model.
Pickle
Pickle is Python's built-in module for serializing objects into a byte stream. It is tightly coupled to the environment in which the serialization is performed, so it does not guarantee identical behavior across different Python versions.
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import joblib

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

# Serialize the trained model (joblib uses pickle under the hood)
joblib.dump(model, 'random_forest_model.pkl')

# Deserialize and use it
loaded_model = joblib.load('random_forest_model.pkl')
print("Prediction:", loaded_model.predict(X_test[:3]))
```
Now let's look at an example of the serialization done with Hugging Face transformers: .bin (binary weights).
```python
from transformers import AutoModel

model_name = "bert-base-uncased"
model = AutoModel.from_pretrained(model_name)

# Save the model to a local directory
model.save_pretrained("./my_bert_model")

# Load it back from disk
loaded_model = AutoModel.from_pretrained("./my_bert_model")
print("Hugging Face model loaded.")
```
Insecure Deserialization
I can almost hear you asking, “In what scenario could serialization put us at risk?” Insecure deserialization has been one of the biggest nightmares in the cybersecurity world in recent years, because even when remote code execution (RCE) is not achievable, insecure deserialization can still enable privilege escalation and arbitrary file access.
The vulnerability typically arises when an application deserializes input that an attacker controls.
Malicious payloads can be hidden inside LLM model files; when such a file is deserialized, the attacker's code runs.
```python
# vulnerable_llm_state.py
import pickle

def restore_agent_state(state_blob: bytes):
    # DANGER: pickle.loads executes whatever the blob instructs it to
    state = pickle.loads(state_blob)
    return state

# Example usage (normally this blob could come from an LLM response, S3, a message queue, etc.)
def handle_tool_response(serialized_state_from_model: bytes):
    agent_state = restore_agent_state(serialized_state_from_model)
    # use the state...
    return agent_state
```
In the example above, `pickle.loads` can execute arbitrary code through the `__reduce__` method of the incoming data. Our focus will therefore be on what the `__reduce__` method of the model file we are about to deserialize actually executes.
```python
# make_exploit_pickle.py
import pickle, os, base64

class RCE:
    def __reduce__(self):
        # Unpickling this object calls os.system with the given command
        return (os.system, ("touch /tmp/pwned_from_pickle",))

payload = pickle.dumps(RCE())
print(base64.b64encode(payload).decode())
```
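You can look inside such a payload without ever running it: the standard library's pickletools module disassembles the opcode stream, and the import opcodes (GLOBAL or STACK_GLOBAL, depending on the protocol) followed by REDUCE are exactly what scanners flag. A minimal sketch:

```python
# inspect_pickle.py -- look inside a pickle without executing it
import os, pickle, pickletools

class RCE:
    def __reduce__(self):
        return (os.system, ("touch /tmp/pwned_from_pickle",))

payload = pickle.dumps(RCE())
pickletools.dis(payload)
# The dump shows the global import (e.g., resolving to 'posix system' on Linux)
# followed by a REDUCE opcode -- the call that makes this pickle dangerous.
```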
Hugging Face
Hugging Face is undoubtedly the most popular model hub for organizations and AI communities alike, so it is essential to verify that a downloaded model is authentic and secure.
When you click Models in the top bar, you are greeted by countless models developed by the community. Select any model (e.g., qwen3-vl-4b-Instruct) and open the “Files and versions” tab on its page to see the files in the repository.
If you want to download such a repository to your server or local machine, you should be aware that there are security concerns to address. You may encounter file types such as .safetensors, .pkl, and .bin, possibly for the first time.
Right next to the files, you will see that Hugging Face has labeled them as Safe.
How does Hugging Face determine this label?
Behind it, a series of scanning tools performs the following checks:
- Is the file extension safe (e.g., .safetensors)?
- Does the file contain a pickle protocol header?
- Does it match known virus signatures?
- Is it a directly executable script, such as a .py file?
Safetensors
Safetensors, developed by Hugging Face, is a model format designed to eliminate these concerns about safe model usage.
The file begins with an 8-byte value giving the length N of a JSON header; those N bytes of JSON describe each tensor's data type, shape, and name, and the remainder of the file is raw binary tensor data. Since nothing is ever unpickled, the remote code execution vulnerability is eliminated as well.
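To make that layout concrete, here is a small sketch (assuming the safetensors and torch packages are installed) that writes a file and then parses the 8-byte length prefix and the JSON header by hand:

```python
import json, struct
import torch
from safetensors.torch import save_file

# Write a tiny tensor dict in safetensors format
save_file({"weight": torch.zeros(2, 3)}, "demo.safetensors")

with open("demo.safetensors", "rb") as f:
    n = struct.unpack("<Q", f.read(8))[0]  # 8-byte little-endian header length
    header = json.loads(f.read(n))         # N bytes of JSON metadata
print(header)  # per-tensor dtype, shape, and byte offsets -- no executable code
```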
Choosing the Right Model
Downloading the right, secure model from platforms such as Hugging Face, which list more and more models every day, is critically important. The right choice starts with models distributed in the safetensors format: the primary model files chosen by an organization's AI teams should explicitly be safetensors. Formats that may contain pickle data, such as .bin, should be the last resort, used only when no safetensors alternative exists, and even then they must first pass through certain application-security stages.
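One way to enforce this preference in code is to refuse pickle-based weights at download time. A sketch, assuming a recent transformers release that supports the `use_safetensors` argument:

```python
from transformers import AutoModel

# Raises an error instead of silently falling back to pickle-based
# .bin weights when the repository provides no safetensors files
model = AutoModel.from_pretrained("bert-base-uncased", use_safetensors=True)
```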
Pickle Scanning
For formats other than safetensors, harmful `__reduce__` calls inside pickle files must be detected and blocked before the model files are ever loaded. picklescan, one of the scanning tools Hugging Face itself uses, does exactly this:
```
pip install picklescan
picklescan --huggingface ykilcher/totally-harmless-model
```

```
https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin:archive/data.pkl: global import '__builtin__ eval' FOUND
----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1
```
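Scanning catches known-bad patterns, but you can also restrict what the unpickler is allowed to import in the first place. The Python documentation describes overriding `Unpickler.find_class` with an allow-list; a minimal sketch (the allow-list below is illustrative, not exhaustive):

```python
import io, pickle

ALLOWED = {("builtins", "list"), ("builtins", "dict"), ("builtins", "set")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly allow-listed
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain data round-trips; a payload importing os.system raises UnpicklingError
print(restricted_loads(pickle.dumps({"ok": [1, 2, 3]})))
```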
SDLC and LLM
One of the application-security phases you should incorporate into your internal SDLC is the scanning and evaluation of LLM model artifacts.
- All model binary files (.bin, .pkl, .pt, etc.) must be scanned during the CI process with a tool such as modelscan or picklescan (see the sketch after this list).
- If the scan verdict is “dangerous,” the pipeline should fail and block the artifact; the affected model can be moved to quarantine.
- Trusted sources should be whitelisted: trusted Hugging Face repositories can be allowed through a proxy, or an internal model registry can be used instead.
- Models in safetensors format can also be whitelisted, which lets the SDLC operate more efficiently.
- Perform initial model loads in a sandboxed environment.
- Periodically rescan models in the production environment with picklescan.
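As a sketch of such a CI gate (the paths are hypothetical, and we assume picklescan's documented behavior of returning a non-zero exit code when it finds dangerous globals):

```python
# ci_model_gate.py -- hypothetical CI step; directory names are illustrative
import shutil, subprocess, sys
from pathlib import Path

MODEL_DIR = Path("models/candidate")   # artifacts fetched by the pipeline
QUARANTINE = Path("models/quarantine")

result = subprocess.run(["picklescan", "--path", str(MODEL_DIR)],
                        capture_output=True, text=True)
print(result.stdout)

if result.returncode != 0:  # picklescan flagged dangerous globals
    QUARANTINE.mkdir(parents=True, exist_ok=True)
    shutil.move(str(MODEL_DIR), str(QUARANTINE / MODEL_DIR.name))
    sys.exit("Dangerous pickle detected -- model quarantined, pipeline blocked")
```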


