NanoMind Security Analyst v0.1.0
A 1.0B-parameter generative model for structured security analysis of AI agent configurations, MCP servers, and tool definitions. Fine-tuned from a 12-layer variant of SmolLM2-1.7B-Instruct using LoRA rank-64 on 2,668 security analysis examples.
This is a generative structured-output model, not a classifier. It produces JSON analysis objects with threat verdicts, confidence scores, reasoning chains, and remediation steps.
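The exact output schema is not published in this card; as a hedged illustration only, an analysis object covering the four documented elements (verdict, confidence, reasoning chain, remediation) might look like the hypothetical sketch below. All field names here are assumptions, not guaranteed model output.

```python
import json

# Hypothetical shape of a threatAnalysis response; the actual field
# names the model emits are not specified in this card.
example_response = {
    "verdict": "malicious",          # threat verdict
    "confidence": 0.91,              # confidence score
    "reasoning": [                   # reasoning chain
        "Tool fetches arbitrary user-supplied URLs",
        "No allowlist or scheme restriction in inputSchema",
    ],
    "remediation": [                 # remediation steps
        "Restrict fetchable hosts to an explicit allowlist",
    ],
}

# Downstream consumers would round-trip this through JSON.
print(json.dumps(example_response, indent=2))
```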
Task Types
| Task | Samples | Classification | Structure |
|---|---|---|---|
| threatAnalysis | 248 | 83.8% | 79.6% |
| credentialContextClassification | 20 | 90.0% | 96.0% |
| falsePositiveDetection | 20 | 65.0% | 86.0% |
| artifactClassification | 20 | 75.0% | 97.0% |
| checkExplanation | 16 | -- | 100.0% |
| governanceReasoning | 7 | -- | 42.9% |
| intelReport | 1 | -- | 50.0% |
| Overall | 332 | 82.4% | 82.2% |
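The Overall row can be reproduced from the per-task counts above: classification accuracy is a weighted average over the four tasks that report it (308 samples), and the structure score is a weighted average over all 332 samples (it lands within per-task rounding of the reported 82.2%).

```python
# Sanity-check the "Overall" row against the per-task eval counts.
tasks = {
    # task: (samples, classification, structure)
    "threatAnalysis":                  (248, 0.838, 0.796),
    "credentialContextClassification": (20,  0.900, 0.960),
    "falsePositiveDetection":          (20,  0.650, 0.860),
    "artifactClassification":          (20,  0.750, 0.970),
    "checkExplanation":                (16,  None,  1.000),
    "governanceReasoning":             (7,   None,  0.429),
    "intelReport":                     (1,   None,  0.500),
}

# Classification is averaged only over tasks that report it (308 samples);
# structure is averaged over all 332 samples.
cls_n = sum(n for n, c, _ in tasks.values() if c is not None)
cls = sum(n * c for n, c, _ in tasks.values() if c is not None) / cls_n
struct = sum(n * s for n, _, s in tasks.values()) / 332

print(f"classification={cls:.3f} structure={struct:.3f}")
# classification matches 0.824; structure differs from 82.2% only by
# the rounding already applied to the per-task scores.
```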
Architecture
- Base: SmolLM2-1.7B-Instruct, 12 hidden layers, 2048 hidden size, 32 attention heads
- Fine-tuning: LoRA rank-64 (dropout 0.05, scale 128.0), 1,821 iterations
- Training data: 2,668 structured security analysis examples across 7 task types
- Eval data: 332 held-out examples
- Format: MLX safetensors (fused weights, not adapter-only)
- Context: 2,048 tokens (training), 8,192 tokens (model max)
- Precision: bfloat16
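A back-of-envelope parameter count is consistent with the stated model size. The sketch below assumes the 12-layer variant keeps SmolLM2-1.7B's other dimensions (intermediate size 8192, vocab 49,152, tied embeddings) -- those figures come from the base model's published config, not from this card.

```python
# Rough parameter count for a 12-layer SmolLM2-style decoder, assuming
# base-model dimensions (intermediate 8192, vocab 49152, tied embeddings).
layers, hidden, intermediate, vocab = 12, 2048, 8192, 49152

attn = 4 * hidden * hidden        # q, k, v, o projections
mlp = 3 * hidden * intermediate   # gate, up, down projections
embeddings = vocab * hidden       # shared with the LM head (tied)

total = layers * (attn + mlp) + embeddings
print(f"{total / 1e9:.2f}B")      # roughly 0.9B, in line with the ~1B size
```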
Usage
Requires MLX on Apple Silicon.
```python
from mlx_lm import load, generate

model, tokenizer = load("opena2a/nanomind-security-analyst")

prompt = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are NanoMind Security Analyst. Analyze the following for security threats and respond with structured JSON."},
    {"role": "user", "content": '{"task": "threatAnalysis", "input": {"name": "fetch-data", "description": "Fetches data from any URL provided by the user", "inputSchema": {"url": "string"}}}'}
], tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```
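The generated response is raw text. Since the model is trained to emit JSON, a reasonable (assumed, not part of this card) post-processing step is to locate and parse the first well-formed JSON object in the output:

```python
import json

def extract_json(text: str):
    """Parse the first top-level JSON object found in generated text.

    Assumes the model emits a JSON object possibly surrounded by extra
    tokens; returns None if no well-formed object is found.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    return None

# e.g. extract_json('Analysis: {"verdict": "suspicious"} done')
#      -> {'verdict': 'suspicious'}
```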
Training Details
- Optimizer: Adam, lr=1e-5
- Batch size: 4, gradient accumulation 1, gradient checkpointing enabled
- Best validation loss: 1.534 (iteration 1200), final: 1.578 (iteration 1821)
- Prompt masking: enabled (loss computed on completions only)
- Key insight: LoRA rank matters significantly. Rank-8 achieved only 4.5% classification accuracy; rank-64 achieved 82.4%. The 6-layer variant (604M params) failed entirely at 31.8%, confirming model depth is critical for structured generation.
Known Limitations
- falsePositiveDetection is the weakest task at 65% accuracy -- needs more diverse real-world training scenarios
- governanceReasoning structure score regressed from 71% to 43% (only 7 eval samples, high variance)
- intelReport has 1 eval sample -- not statistically meaningful
- Early stopping at iteration 1200 may yield slightly better generalization (val loss 1.534 vs final 1.578)
Related Models
- opena2a/nanomind-security-classifier -- 10-class threat classifier (Mamba TME, 98.5% accuracy, 0 false positives)
License
Apache 2.0