Spaces:
Sleeping
Sleeping
File size: 3,785 Bytes
a508c5b c28e991 a508c5b c28e991 a508c5b 3c98a3e a508c5b 83874aa a508c5b 83874aa 3c98a3e |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 |
---
title: Cyber Ner
emoji: π
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Streamlit template space
---
# NER Benchmarking & On-Prem Deployment
This repository contains the complete solution for benchmarking, selecting, and deploying a Named Entity Recognition (NER) model for the DNRTI dataset, along with an optional web interface for end users.
The project was completed as part of a mission to evaluate SecureBert-NER vs CyNER, choose the best performer, and make it easy to use entirely offline.
The work was carried out in three main stages:
1) Benchmarking Analysis
Goal: Compare SecureBert-NER and CyNER on the DNRTI dataset.
Dataset: DNRTI, containing documents with annotated named entities.
Evaluation Metrics:
1) Precision
2) Recall
3) Latency
Special Handling: Class mapping applied to align DNRTI labels with model outputs
2) On-Prem NER Service
Goal: Package the chosen NER model into an offline-capable API service.
Features:
- HTTP-based API that accepts raw text and returns detected entities + their classes.
- Fully Dockerized for easy on-prem installation.
- No internet connection required after setup.
Deliverables:
- Dockerfile and deployment instructions.
- Local test scripts to validate API functionality.
3) Web UI with Streamlit
The web provides a simple, user-friendly interface for non-technical users.
Features:
- Upload a text file (one at a time).
- Process file content using the deployed NER model.
- Display results in a clean table:
# DNRTI dataset
28 classes
['B-Area', 'B-Exp', 'B-Features', 'B-HackOrg', 'B-Idus', 'B-OffAct', 'B-Org', 'B-Purp', 'B-SamFile', 'B-SecTeam', 'B-Time', 'B-Tool', 'B-Way', 'Despite', 'I-Area', 'I-Exp', 'I-Features', 'I-HackOrg', 'I-Idus', 'I-OffAct', 'I-Org', 'I-Purp', 'I-SamFile', 'I-SecTeam', 'I-Time', 'I-Tool', 'I-Way', 'O']
## Important classes
HackOrg, SecTeam, [dus, Org], [OffAct, Way], Exp, Tool, SamFile, Time, Area, [Purp, Features]
# CyberPeace-Institute/SecureBERT-NER
28 classes
['B-ACT', 'B-APT', 'B-DOM', 'B-ENCR', 'B-FILE', 'B-IDTY', 'B-IP', 'B-LOC', 'B-MAL', 'B-MD5', 'B-OS', 'B-PROT', 'B-SECTEAM', 'B-TIME', 'B-TOOL', 'B-VULID', 'B-VULNAME', 'I-ACT', 'I-APT', 'I-FILE', 'I-IDTY', 'I-LOC', 'I-MAL', 'I-OS', 'I-SECTEAM', 'I-TIME', 'I-TOOL', 'I-VULNAME']
=== Entity-Level Evaluation (IOB2) ===
Precision: 0.4605
Recall: 0.4811
F1-Score: 0.4705
Latency per sentence: 0.192s (CPU)
One can improve it by ignoring errors between similar groups.
# PranavaKailash/CyNER-2.0-DeBERTa-v3-base
13 classes
['B-Indicator', 'B-Malware', 'B-Organization', 'B-System', 'B-Threat_group', 'B-Vulnerability', 'I-Date', 'I-Indicator', 'I-Malware', 'I-Organization', 'I-System', 'I-Threat_group', 'I-Vulnerability']
=== Entity-Level Evaluation (IOB2) ===
Precision: 0.2006
Recall: 0.1345
F1-Score: 0.1611
Latency per sentence: 0.614s
The performances are affected by my comparison and the mapping that I chose.
dnrti_to_syner = {
"HackOrg": "Organization",
"SecTeam": "Organization",
"Idus": "Indicator",
"Org": "Indicator",
"OffAct": "System",
"Way": "System",
"Exp": "Vulnerability",
"Tool": "Malware",
"SamFile": "System",
"Time": "Date",
"Area": "O",
"Purp": "O",
"Features": "O"
}
# How to run?
# evaluation
1) python install -r requirements.txt
2) python create_dataset.py
3) python evaluate.py
# service
1) docker build -f Dockerfile_api . -t ner-app
2) docker run -p 8000:8000 ner-app
3) python tests/test.py
# streamlit
1) docker build . -t streamlit-app
2) docker run -p 8501:8501 streamlit-app
See https://huggingface.co/spaces/yairgalili/cyber-ner |