Model Card for cisco-ai/SecureBERT2.0-base
SecureBERT 2.0 Base is a domain-specific transformer model optimized for cybersecurity tasks. It extends the ModernBERT architecture with cybersecurity-focused pretraining to produce contextualized embeddings for both technical text and code. SecureBERT 2.0 supports tasks like masked language modeling, semantic search, named entity recognition, vulnerability detection, and code analysis.
Model Details
Model Description
SecureBERT 2.0 Base is designed for deep contextual understanding of cybersecurity language and code. It leverages domain-specific pretraining on a large, heterogeneous corpus covering threat reports, blogs, documentation, and codebases, making it effective for reasoning across natural language and programming syntax.
- Developed by: Cisco AI
- Model type: Transformer (ModernBERT architecture)
- Language: English
- License: Apache 2.0
- Finetuned from model: answerdotai/ModernBERT-base
Model Sources
- Repository: https://huggingface.co/cisco-ai/SecureBERT2.0-base
- Paper: arXiv:2510.00240
Uses
Direct Use
- Masked language modeling for cybersecurity text and code
- Embedding generation for semantic search and retrieval
- Code and text feature extraction for downstream classification or clustering
- Named entity recognition (NER) on security-related entities
- Vulnerability detection in source code
Downstream Use
Fine-tuning for:
- Threat intelligence extraction
- Security question answering
- Incident analysis and summarization
- Automated code review and vulnerability prediction
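As a hedged illustration of these downstream uses, the sketch below fine-tunes the encoder as a binary classifier (e.g., vulnerable vs. benign code). The toy dataset, label names, and hyperparameters are placeholders for illustration only, not the procedure used by the authors.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Hypothetical binary task: 0 = benign, 1 = vulnerable.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tiny in-memory dataset purely for illustration; replace with real labeled data.
train_ds = Dataset.from_dict({
    "text": ["strcpy(dst, user_input);", "printf(\"hello\\n\");"],
    "label": [1, 0],
})
train_ds = train_ds.map(lambda b: tokenizer(b["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="securebert2-vuln-clf",
    learning_rate=2e-5,               # placeholder hyperparameters
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```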
Out-of-Scope Use
- Non-English or non-technical text
- General-purpose conversational AI
- Decision-making in real-time security systems without human oversight
Bias, Risks, and Limitations
The model reflects biases in the cybersecurity sources it was trained on, which may include:
- Overrepresentation of certain threat actors, technologies, or organizations
- Inconsistent code or documentation quality
- Limited exposure to non-public or proprietary data formats
Recommendations
Users should evaluate outputs in their specific context and avoid automated high-stakes decisions without expert validation.
How to Get Started with the Model
```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

text = "The malware exploits a vulnerability in the [MASK] system."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Predict only at the [MASK] position rather than over the whole sequence.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_token_id = outputs.logits[0, mask_index].argmax(-1)
print(tokenizer.decode(predicted_token_id))
```
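The model can also be used as an encoder for the embedding-generation use listed under Direct Use. A minimal sketch, assuming mean pooling over the last hidden states (the pooling strategy is an illustrative choice, not a documented recommendation):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "cisco-ai/SecureBERT2.0-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

sentences = [
    "The ransomware encrypts files and demands payment in cryptocurrency.",
    "A buffer overflow in the parser allows remote code execution.",
]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # [batch, seq_len, hidden]

# Mean-pool over non-padding tokens to obtain one embedding per sentence.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)
```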
Training Details
Training Procedure
Preprocessing
Hybrid tokenization for text and code (natural language + structured syntax).
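A minimal illustration of the idea, assuming the released tokenizer is applied uniformly to prose and code (the exact preprocessing pipeline is described in the paper, not reproduced here):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cisco-ai/SecureBERT2.0-base")

prose = "The exploit chains a use-after-free with a heap spray."
code = "if (len > sizeof(buf)) { memcpy(buf, src, len); }"

# The same tokenizer covers natural language and structured code syntax.
print(tokenizer.tokenize(prose))
print(tokenizer.tokenize(code))
```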
Training Hyperparameters
- Objective: Masked Language Modeling (MLM)
- Masking probability: 0.10
- Optimizer: AdamW
- Learning rate: 5e-5
- Weight decay: 0.01
- Epochs: 20
- Batch size: 16 per GPU × 8 GPUs
- Curriculum: Microannealing (gradual dataset diversification)
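A hedged sketch of how the hyperparameters above map onto a standard Hugging Face MLM setup; the actual training corpus, microannealing curriculum, and distributed configuration are not reproduced here, and the toy corpus below is purely illustrative.

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "answerdotai/ModernBERT-base"  # starting checkpoint per this card
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForMaskedLM.from_pretrained(base_model)

# Toy corpus standing in for the cybersecurity text-and-code pretraining data.
corpus = Dataset.from_dict({"text": [
    "The trojan establishes persistence via a scheduled task.",
    "char buf[64]; strcpy(buf, argv[1]);",
]})
corpus = corpus.map(lambda b: tokenizer(b["text"], truncation=True), batched=True,
                    remove_columns=["text"])

# 0.10 masking probability matches the listed MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.10)

args = TrainingArguments(
    output_dir="securebert2-mlm",
    learning_rate=5e-5,
    weight_decay=0.01,
    num_train_epochs=20,
    per_device_train_batch_size=16,   # x 8 GPUs in the reported setup
)
Trainer(model=model, args=args, train_dataset=corpus, data_collator=collator).train()
```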
Evaluation
Testing Data, Factors & Metrics
Testing Data
Internal held-out subset of cybersecurity and code corpora.
Factors
Evaluated across token categories:
- Objects (nouns)
- Actions (verbs)
- Code tokens
Metrics
Top-n accuracy on masked token prediction.
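A minimal sketch of the metric, assuming top-n accuracy counts a masked position as correct when the ground-truth token appears among the model's n highest-scoring predictions:

```python
import torch

def top_n_accuracy(logits, target_ids, n=5):
    """logits: [num_masked, vocab]; target_ids: [num_masked] ground-truth token ids."""
    top_n = logits.topk(n, dim=-1).indices              # [num_masked, n]
    hits = (top_n == target_ids.unsqueeze(-1)).any(-1)  # [num_masked]
    return hits.float().mean().item()

# Toy check: 2 masked positions, vocabulary of 4 tokens.
logits = torch.tensor([[0.1, 2.0, 0.3, 0.0],
                       [1.5, 0.2, 0.1, 0.0]])
print(top_n_accuracy(logits, torch.tensor([1, 2]), n=2))  # 0.5
```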
Results
| Top-n | Objects (Nouns) | Verbs (Actions) | Code Tokens |
|---|---|---|---|
| 1 | 56.20 % | 45.02 % | 39.27 % |
| 2 | 69.73 % | 60.00 % | 46.90 % |
| 3 | 75.85 % | 66.68 % | 50.87 % |
| 4 | 80.01 % | 71.56 % | 53.36 % |
| 5 | 82.72 % | 74.12 % | 55.41 % |
| 10 | 88.80 % | 81.64 % | 60.03 % |
A comparative study of SecureBERT 2.0, the original SecureBERT, and ModernBERT on the masked language modeling (MLM) task shows that SecureBERT 2.0 outperforms both, with the largest gains on code understanding and domain-specific terminology.

Summary
SecureBERT 2.0 outperforms both the original SecureBERT and ModernBERT on cybersecurity-specific and code-related tasks.
Environmental Impact
- Hardware Type: 8× GPU cluster
- Hours used: [Information Not Available]
- Cloud Provider: [Information Not Available]
- Compute Region: [Information Not Available]
- Carbon Emitted: [Estimate Not Available]
Carbon footprint can be estimated using Lacoste et al. (2019).
Technical Specifications
Model Architecture and Objective
- Architecture: ModernBERT
- Max sequence length: 1024 tokens
- Parameters: 150 M
- Objective: Masked Language Modeling (MLM)
- Tensor type: F32
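These figures can be sanity-checked against the released checkpoint; a small sketch using the standard `transformers` config and parameter-counting idiom (the printed values are expectations based on this card, not guarantees):

```python
from transformers import AutoConfig, AutoModel

model_name = "cisco-ai/SecureBERT2.0-base"
config = AutoConfig.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

print(config.model_type)                             # expected: "modernbert"
print(config.max_position_embeddings)                # maximum sequence length
print(sum(p.numel() for p in model.parameters()))    # total parameter count
```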
Compute Infrastructure
- Framework: Transformers (PyTorch)
- Precision: fp32 (no mixed precision)
- Hardware: 8 GPUs
- Checkpoint Format: Safetensors
Citation
BibTeX:
@article{aghaei2025securebert,
title={SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence},
author={Aghaei, Ehsan and Jain, Sarthak and Arun, Prashanth and Sambamoorthy, Arjun},
journal={arXiv preprint arXiv:2510.00240},
year={2025}
}
APA:
Aghaei, E., Jain, S., Arun, P., & Sambamoorthy, A. (2025). SecureBERT 2.0: Advanced language model for cybersecurity intelligence. arXiv preprint arXiv:2510.00240.
Model Card Authors
Cisco AI
Model Card Contact
For inquiries, please contact ai-threat-intel@cisco.com