NanoMind Security Analyst v0.1.0

A 1.0B-parameter generative model for structured security analysis of AI agent configurations, MCP servers, and tool definitions. Fine-tuned from SmolLM2-1.7B-Instruct (12-layer variant) using LoRA rank-64 on 2,668 security analysis examples.

This is a generative structured-output model, not a classifier. It produces JSON analysis objects with threat verdicts, confidence scores, reasoning chains, and remediation steps.
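The exact output schema isn't documented on this card, but given the fields named above (threat verdict, confidence score, reasoning chain, remediation steps), a response could look like the hypothetical example below. All field names are illustrative assumptions, not the model's confirmed schema.

```python
import json

# Hypothetical JSON analysis object; field names are illustrative
# assumptions based on the description above, not a confirmed schema.
example_response = """
{
  "verdict": "malicious",
  "confidence": 0.91,
  "reasoning": [
    "Tool accepts arbitrary user-supplied URLs",
    "No allowlist or scheme restriction on the url parameter"
  ],
  "remediation": [
    "Restrict fetches to an allowlist of trusted domains"
  ]
}
"""

analysis = json.loads(example_response)
print(analysis["verdict"], analysis["confidence"])
```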

Task Types

Task                              Samples  Classification  Structure
threatAnalysis                        248           83.8%      79.6%
credentialContextClassification        20           90.0%      96.0%
falsePositiveDetection                 20           65.0%      86.0%
artifactClassification                 20           75.0%      97.0%
checkExplanation                       16              --     100.0%
governanceReasoning                     7              --      42.9%
intelReport                             1              --      50.0%
Overall                               332           82.4%      82.2%
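The Overall row is consistent with a sample-weighted average of the per-task rows. The quick check below recomputes it: classification is averaged only over the four tasks that report it (308 samples), and structure comes out at 82.1%, matching the reported 82.2% to within rounding of the per-task inputs.

```python
# Recompute the "Overall" row as a sample-weighted mean of the per-task scores.
# Tuples are (samples, classification, structure); None = not reported.
rows = {
    "threatAnalysis":                  (248, 0.838, 0.796),
    "credentialContextClassification": (20,  0.900, 0.960),
    "falsePositiveDetection":          (20,  0.650, 0.860),
    "artifactClassification":          (20,  0.750, 0.970),
    "checkExplanation":                (16,  None,  1.000),
    "falsePositive_unused":            (0,   None,  0.000),  # placeholder-free: no extra rows
    "governanceReasoning":             (7,   None,  0.429),
    "intelReport":                     (1,   None,  0.500),
}
rows.pop("falsePositive_unused")

cls_n = sum(n for n, c, s in rows.values() if c is not None)
cls = sum(n * c for n, c, s in rows.values() if c is not None) / cls_n
struct = sum(n * s for n, c, s in rows.values()) / sum(n for n, c, s in rows.values())

print(f"classification: {cls:.1%} over {cls_n} samples")  # 82.4% over 308 samples
print(f"structure: {struct:.1%}")                         # 82.1%
```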

Architecture

  • Base: SmolLM2-1.7B-Instruct, 12 hidden layers, 2048 hidden size, 32 attention heads
  • Fine-tuning: LoRA rank-64 (dropout 0.05, scale 128.0), 1,821 iterations
  • Training data: 2,668 structured security analysis examples across 7 task types
  • Eval data: 332 held-out examples
  • Format: MLX safetensors (fused weights, not adapter-only)
  • Context: 2,048 tokens (training), 8,192 tokens (model max)
  • Precision: bfloat16

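As a rough sketch of what "fused weights, not adapter-only" means: a LoRA update W' = W + (scale/rank) * B @ A is folded into the dense base matrix once at export time, so inference runs a plain matmul with no adapter branch. The rank, scale, and hidden size below come from this card; everything else is a generic illustration, not this repo's actual export script.

```python
import numpy as np

rank, scale, hidden = 64, 128.0, 2048  # numbers from the card above

rng = np.random.default_rng(0)
W = rng.standard_normal((hidden, hidden)).astype(np.float32)                  # base weight
A = (rng.standard_normal((rank, hidden)) * 0.01).astype(np.float32)           # LoRA down-projection
B = (rng.standard_normal((hidden, rank)) * 0.01).astype(np.float32)           # LoRA up-projection

# "Fused weights": fold the scaled low-rank update into the dense matrix once.
W_fused = W + (scale / rank) * (B @ A)

# Fused inference and adapter-path inference agree (up to float32 rounding).
x = rng.standard_normal(hidden).astype(np.float32)
fused_out = W_fused @ x
adapter_out = W @ x + (scale / rank) * (B @ (A @ x))
print(np.allclose(fused_out, adapter_out, atol=1e-2))  # True
```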
Usage

Requires MLX on Apple Silicon.

from mlx_lm import load, generate

# Load the fused model and tokenizer from the Hub
model, tokenizer = load("opena2a/nanomind-security-analyst")

# Build a chat-formatted prompt; the user message is a JSON task envelope
prompt = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are NanoMind Security Analyst. Analyze the following for security threats and respond with structured JSON."},
    {"role": "user", "content": '{"task": "threatAnalysis", "input": {"name": "fetch-data", "description": "Fetches data from any URL provided by the user", "inputSchema": {"url": "string"}}}'}
], tokenize=False, add_generation_prompt=True)

# 512 tokens leaves room for a complete JSON analysis object
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
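Generated text can carry whitespace or stray tokens around the JSON object, so it is worth parsing defensively. A minimal helper sketch (my own, not part of mlx_lm) that pulls the first JSON object out of a response:

```python
import json

def extract_json(text: str) -> dict:
    """Parse the first {...} object found in a model response.

    Defensive sketch: responses may wrap the JSON in extra text or tokens.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object in response")
    decoder = json.JSONDecoder()
    obj, _end = decoder.raw_decode(text[start:])
    return obj

# Hypothetical response with text around the JSON object:
analysis = extract_json('Analysis:\n{"verdict": "suspicious", "confidence": 0.72} <eos>')
print(analysis["verdict"])  # suspicious
```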

Training Details

  • Optimizer: Adam, lr=1e-5
  • Batch size: 4, gradient accumulation 1, gradient checkpointing enabled
  • Best validation loss: 1.534 (iteration 1200), final: 1.578 (iteration 1821)
  • Prompt masking: enabled (loss computed on completions only)
  • Key insight: LoRA rank matters significantly. Rank-8 achieved only 4.5% classification accuracy; rank-64 achieved 82.4%. The 6-layer variant (604M params) failed entirely at 31.8%, confirming model depth is critical for structured generation.
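Prompt masking means the loss is computed only over completion tokens: prompt positions are zeroed out of the per-token loss so the model is not trained to reproduce its own inputs. A framework-agnostic sketch of how such a mask is typically built (the token losses below are made-up numbers):

```python
import numpy as np

def completion_mask(prompt_len: int, total_len: int) -> np.ndarray:
    """1.0 where loss is computed (completion), 0.0 where it is masked (prompt)."""
    mask = np.zeros(total_len, dtype=np.float32)
    mask[prompt_len:] = 1.0
    return mask

# Hypothetical per-token losses for a 6-token sequence, 4 prompt tokens.
token_losses = np.array([2.1, 1.8, 2.4, 1.9, 1.2, 0.9], dtype=np.float32)
mask = completion_mask(prompt_len=4, total_len=6)

# Masked mean: only the two completion tokens contribute.
loss = float((token_losses * mask).sum() / mask.sum())
print(round(loss, 3))  # 1.05
```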

Known Limitations

  • falsePositiveDetection is the weakest task at 65% accuracy -- needs more diverse real-world training scenarios
  • governanceReasoning structure score regressed from 71% to 43% (only 7 eval samples, high variance)
  • intelReport has 1 eval sample -- not statistically meaningful
  • Early stopping at iteration 1200 may yield slightly better generalization (val loss 1.534 vs final 1.578)
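The last bullet (val loss 1.534 at iteration 1200 vs 1.578 at 1821) is the standard argument for keeping the best checkpoint rather than the last. A generic best-checkpoint tracker sketch, not the actual training loop used here (the 600-iteration point is hypothetical):

```python
def best_checkpoint(history):
    """Return the (iteration, val_loss) pair with the lowest validation loss."""
    return min(history, key=lambda point: point[1])

# Two losses come from the card; the intermediate point is hypothetical.
history = [(600, 1.61), (1200, 1.534), (1821, 1.578)]
it, loss = best_checkpoint(history)
print(it, loss)  # 1200 1.534
```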

License

Apache 2.0
