metadata

title: PII Protection Tool
emoji: 🛡️
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: apache-2.0

PII De-identification with Custom Models

This Hugging Face Space detects and de-identifies personally identifiable information (PII) in text documents. The app combines regex patterns with NER models to identify sensitive information and offers three protection methods:

Replace: Replaces PII with entity type tags (e.g., <NAME>)
Mask: Masks PII with asterisks
Synthesize: Replaces PII with realistic synthetic data
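
The three methods above can be sketched as follows. This is a minimal illustration, not the Space's actual code: the `protect` function, the `(start, end, label)` span format, and the `SYNTHETIC_VALUES` table are all assumptions made for the example, and a real app would likely draw synthetic values from a dedicated generator rather than a fixed list.

```python
import random

# Hypothetical placeholder values, invented for this example only.
SYNTHETIC_VALUES = {
    "NAME": ["Alex Morgan", "Priya Nair"],
    "EMAIL": ["user@example.com"],
}

def protect(text, spans, method, rng=None):
    """Apply one protection method to a list of (start, end, label) spans."""
    rng = rng or random.Random(0)
    # Work right to left so earlier offsets stay valid after each substitution.
    for start, end, label in sorted(spans, reverse=True):
        if method == "replace":
            substitute = f"<{label}>"
        elif method == "mask":
            substitute = "*" * (end - start)
        elif method == "synthesize":
            # Fall back to the entity tag when no synthetic value is defined.
            substitute = rng.choice(SYNTHETIC_VALUES.get(label, [f"<{label}>"]))
        else:
            raise ValueError(f"unknown method: {method}")
        text = text[:start] + substitute + text[end:]
    return text

sample = "Contact Ravi at ravi@mail.com"
spans = [(8, 12, "NAME"), (16, 29, "EMAIL")]
print(protect(sample, spans, "replace"))  # Contact <NAME> at <EMAIL>
print(protect(sample, spans, "mask"))     # Contact **** at *************
```

Masking here preserves the original length of each value, which keeps the layout of the text intact but also reveals how long the hidden value was. Replacing with a tag avoids that at the cost of changing the text length.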

Features

Detects general PII such as names, email addresses, phone numbers, and credit card numbers
Specialized detection for Indian identifiers (PAN, Aadhaar)
Optional medical entity recognition
Multiple de-identification methods
Detailed findings report
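
As a rough sketch of how regex-based detection and a findings report might fit together, the example below matches email addresses, PAN numbers, and Aadhaar numbers. The patterns and the `find_pii` helper are assumptions for illustration; the Space's actual expressions are not shown in this README. The PAN pattern assumes the usual five letters, four digits, one letter layout, and the Aadhaar pattern assumes 12 digits, optionally grouped in fours, with a first digit of 2 to 9. No checksum validation is attempted.

```python
import re

# Illustrative patterns only; they are not taken from the Space's source.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    "AADHAAR": re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),
}

def find_pii(text):
    """Return (start, end, label, value) tuples sorted by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((match.start(), match.end(), label, match.group()))
    return sorted(findings)

text = "PAN ABCDE1234F, Aadhaar 2345 6789 0123, mail a.b@example.org"
for start, end, label, value in find_pii(text):
    # One report line per finding: entity type, character range, matched text.
    print(f"{label:8} {start:3}-{end:<3} {value}")
```

Patterns like these catch well-structured identifiers; free-form entities such as names generally need the NER model instead.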

How to Use

Enter or paste text containing PII in the input box
Select a model type (general or medical)
Choose a de-identification approach
Click "Process Text" to analyze and protect the content

Models

Main model: Kashish-jain/pii-protection-model
Medical model: Medical entity recognition (when selected)

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference