SIEM Log Generator - Mistral 7B QLoRA
A fine-tuned Mistral-7B model specialized in Security Information and Event Management (SIEM) log analysis and generation. This model has been trained using QLoRA (4-bit quantization) on multiple cybersecurity log sources to understand and generate security-related event data.
Model Description
This model is a specialized variant of Mistral-7B-Instruct fine-tuned for SIEM operations, including:
- Network traffic analysis (DDoS detection, port scanning)
- Authentication event monitoring (credential stuffing, brute force)
- Cloud security events (AWS CloudTrail analysis)
- System log interpretation
- MITRE ATT&CK framework mapping
Training Data Sources
The model was trained on a diverse set of security logs:
- Network Logs: CICIDS2017 dataset (DDoS, PortScan patterns)
- Authentication Logs: Risk-based authentication events
- System Logs: Linux/Unix syslog events
- Cloud Logs: AWS CloudTrail security events
MITRE ATT&CK Coverage
The model recognizes and maps events to MITRE ATT&CK techniques:
- T1499: Endpoint Denial of Service (DDoS)
- T1046: Network Service Discovery (formerly Network Service Scanning)
- T1110: Brute Force
- T1110.004: Credential Stuffing
- T1078.004: Valid Accounts: Cloud Accounts
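As a rough illustration, the coverage list above can be expressed as a lookup table for post-processing. The technique IDs come from this card; the label strings (`DDoS`, `PortScan`, `BruteForce`, and especially `CredentialStuffing` and `CloudAccount`) are assumptions modeled on the `attack=...` values in the usage prompts, since the exact training labels are not documented here.

```python
from typing import Optional

# Label names are hypothetical; technique IDs are the ones listed in this card.
ATTACK_TO_TECHNIQUE = {
    "DDoS": "T1499",
    "PortScan": "T1046",
    "BruteForce": "T1110",
    "CredentialStuffing": "T1110.004",
    "CloudAccount": "T1078.004",
}


def technique_for(attack_label: str) -> Optional[str]:
    """Return the ATT&CK technique ID for a label, or None if it is not covered."""
    return ATTACK_TO_TECHNIQUE.get(attack_label)


print(technique_for("PortScan"))    # T1046
print(technique_for("Ransomware"))  # None
```

Returning `None` for unknown labels keeps uncovered attack types visible to the caller instead of silently mapping them to a default technique.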
Training Details
Training Configuration
- Base Model: mistralai/Mistral-7B-Instruct-v0.2
- Method: QLoRA (4-bit quantization with LoRA adapters)
- LoRA Rank: 8
- LoRA Alpha: 16
- Target Modules: q_proj, v_proj
- Training Samples: ~500 diverse security events
- Batch Size: 8
- Learning Rate: 5e-4
- Precision: bfloat16
- Training Steps: 50
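To put these numbers in context, the sketch below works out what the listed hyperparameters imply. It is a back-of-the-envelope calculation, not the training script. The layer shapes (hidden size 4096, 32 layers, 8 key/value heads of dimension 128) are taken from the published Mistral-7B configuration rather than from this card, and the sample count assumes no gradient accumulation, which the card does not specify.

```python
# Hyperparameters as listed in the training configuration above.
lora_rank = 8
lora_alpha = 16
batch_size = 8
training_steps = 50

# Mistral-7B shapes (assumed from the base model's published config).
hidden_size = 4096
num_layers = 32
kv_dim = 8 * 128  # output width of v_proj under grouped-query attention


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters added to one Linear layer: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# Standard LoRA scales the adapter update by alpha / rank.
scaling = lora_alpha / lora_rank

per_layer = (
    lora_params(hidden_size, hidden_size, lora_rank)  # q_proj
    + lora_params(hidden_size, kv_dim, lora_rank)     # v_proj
)
trainable_params = per_layer * num_layers

# Assumes one optimizer step per batch (no gradient accumulation).
samples_seen = batch_size * training_steps

print(scaling)           # 2.0
print(trainable_params)  # 3407872
print(samples_seen)      # 400
```

Under these assumptions the adapters add about 3.4M trainable parameters, and 50 steps cover roughly 400 samples, which is less than one full pass over the ~500 training events.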
Hardware
- GPU: NVIDIA Tesla T4 (16GB VRAM)
- Platform: Kaggle Notebooks
- Training Time: ~5-10 minutes
Usage
Installation
pip install transformers peft torch bitsandbytes accelerate
Loading the Model
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization.
# Recent transformers releases expect a BitsAndBytesConfig rather than load_in_4bit=True.
base_model = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_repo = "your-username/siem-log-generator-mistral-7b-qlora"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, adapter_repo)
tokenizer = AutoTokenizer.from_pretrained(adapter_repo)

# Generate security event analysis.
# The tokenizer prepends the BOS token (<s>) itself, so it is not repeated in the prompt.
prompt = "[INST] event=network attack=DDoS [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Inference Example
# Analyze a security event
event = "timestamp=2024-01-14T10:30:00Z event=auth user=admin attack=BruteForce"
# The tokenizer prepends the BOS token (<s>) itself, so it is not repeated in the prompt.
prompt = f"[INST] {event} [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a response without tracking gradients
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
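The decoded output typically contains the prompt followed by the generated text, so a small amount of string handling helps on both sides of the call. The helpers below are a minimal sketch, not part of the released model: `build_prompt` and `extract_response` are hypothetical names, and the sketch assumes the `[INST] ... [/INST]` wrapper and space-separated `key=value` fields seen in the examples above. The BOS token is left to the tokenizer.

```python
def build_prompt(fields: dict) -> str:
    """Format key=value fields inside the [INST] ... [/INST] wrapper."""
    event = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"[INST] {event} [/INST]"


def extract_response(decoded: str) -> str:
    """Return the text after the last [/INST] marker, or the whole text if absent."""
    _, found, tail = decoded.rpartition("[/INST]")
    return tail.strip() if found else decoded.strip()


prompt = build_prompt({"event": "auth", "user": "admin", "attack": "BruteForce"})
print(prompt)  # [INST] event=auth user=admin attack=BruteForce [/INST]
```

Passing the `response` string from the inference example through `extract_response` would leave only the model's continuation.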
Use Cases
1. Security Event Classification
Classify incoming logs into attack types or benign traffic.
2. MITRE ATT&CK Mapping
Automatically map security events to MITRE ATT&CK framework techniques.
3. Log Enrichment
Generate additional context and metadata for security events.
4. Threat Intelligence
Analyze patterns and generate threat reports from log data.
5. Training Data Generation
Create synthetic security logs for testing SIEM systems.
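For the mapping and enrichment use cases, one possible post-processing step is to parse the original log line and attach any ATT&CK technique IDs that appear in the model's text. This is a sketch under assumptions: the log format is taken to be space-separated `key=value` pairs as in the usage examples, `parse_event` and `enrich` are hypothetical helpers, and the sample model output is invented for illustration rather than produced by the model.

```python
import re

# ATT&CK technique IDs look like T1110 or, for sub-techniques, T1110.004.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def parse_event(line: str) -> dict:
    """Split a space-separated key=value log line into a dict."""
    return dict(pair.split("=", 1) for pair in line.split() if "=" in pair)


def enrich(line: str, model_text: str) -> dict:
    """Attach technique IDs found in the model's text to the parsed event."""
    event = parse_event(line)
    event["mitre_techniques"] = sorted(set(TECHNIQUE_ID.findall(model_text)))
    return event


# Invented sample output, for illustration only.
sample_text = "Likely T1110 (Brute Force), possibly T1110.004."
log_line = "timestamp=2024-01-14T10:30:00Z event=auth user=admin attack=BruteForce"
print(enrich(log_line, sample_text)["mitre_techniques"])  # ['T1110', 'T1110.004']
```

Extracting IDs with a strict pattern, rather than trusting free-form text, makes it easier to check the result against a known technique list before it reaches an analyst.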
Limitations
- Training Data: Model trained on limited samples (~500) for demonstration
- Domain Specific: Optimized for SIEM/security logs, not general purpose
- Language: English only
- Real-time: Not optimized for ultra-low latency applications
- Accuracy: Should be used as an assistive tool, not sole decision-maker
Ethical Considerations
⚠️ Important Security Notice:
- This model is for defensive cybersecurity purposes only
- Do not use for malicious activities or unauthorized access
- Always comply with applicable laws and regulations
- Validate all model outputs before taking action
- Use in conjunction with human security experts
Model Card Authors
Created by the SIEM Research Team
Citation
If you use this model in your research, please cite:
@misc{siem-log-generator-2025,
author = {Your Name},
title = {SIEM Log Generator - Mistral 7B QLoRA},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/your-username/siem-log-generator-mistral-7b-qlora}
}
License
This model inherits the Apache 2.0 license from Mistral-7B-Instruct-v0.2.
Acknowledgments
- Mistral AI for the base Mistral-7B-Instruct-v0.2 model
- CICIDS2017 dataset contributors
- Hugging Face for the model hosting platform
- QLoRA paper authors for the efficient fine-tuning method
Contact
For questions or issues, please open an issue on the model repository.
Note: This is a research/demonstration model. For production SIEM deployments, additional training on larger, domain-specific datasets is recommended.