metadata
title: PII Protection Tool
emoji: 🛡️
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: apache-2.0
PII De-identification with Custom Models
This Hugging Face Space provides a tool for detecting and de-identifying personally identifiable information (PII) in text documents. The app uses a combination of regex patterns and NER models to identify sensitive information and offers three protection methods:
- Replace: Replaces PII with entity type tags (e.g., <NAME>)
- Mask: Masks PII with asterisks
- Synthesize: Replaces PII with realistic synthetic data
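The three protection methods can be illustrated with a minimal sketch. This is not the app's actual code: the `PATTERNS` table, the `SYNTHETIC` placeholder values, and the `find_pii` and `protect` helpers are hypothetical names introduced here, and only two simple regex detectors are used in place of the full regex and NER combination.

```python
import re

# Hypothetical detectors for illustration only; the app's real patterns
# and NER model output are not shown in this README.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{10}\b"),
}

# Fixed placeholder values standing in for a real synthetic-data generator.
SYNTHETIC = {"EMAIL": "user@example.com", "PHONE": "5550100000"}

METHODS = ("replace", "mask", "synthesize")


def find_pii(text):
    """Return (start, end, label) spans sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def protect(text, method="replace"):
    """Apply one of the three protection methods to every detected span."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    pieces = []
    cursor = 0
    for start, end, label in find_pii(text):
        if start < cursor:
            continue  # skip spans that overlap one already handled
        pieces.append(text[cursor:start])
        if method == "replace":
            pieces.append(f"<{label}>")        # entity type tag
        elif method == "mask":
            pieces.append("*" * (end - start))  # same-length asterisks
        else:
            pieces.append(SYNTHETIC[label])     # stand-in synthetic value
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


sample = "Mail a@b.com or call 9876543210."
print(protect(sample, "replace"))
print(protect(sample, "mask"))
print(protect(sample, "synthesize"))
```

Masking with a same-length run of asterisks preserves the text layout, while tag replacement keeps the entity type visible to a reader; which trade-off the app makes internally is not stated here.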
Features
- Detects general PII like names, emails, phone numbers, credit cards
- Specialized detection for Indian identifiers (PAN, Aadhaar)
- Optional medical entity recognition
- Multiple de-identification methods
- Detailed findings report
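As a rough illustration of the Indian identifier detection and the findings report, the sketch below matches the public formats of the two identifiers: a PAN is five uppercase letters, four digits, and one uppercase letter, and an Aadhaar number is twelve digits that do not start with 0 or 1, often written in groups of four. The regexes, the `find_indian_ids` helper, and the report fields are assumptions made for this example, not the app's actual patterns or report layout. The Aadhaar checksum digit is not verified.

```python
import re

# Assumed patterns based on the public identifier formats, not the app's own.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
AADHAAR_RE = re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")


def find_indian_ids(text):
    """Return a list of findings (type, matched text, offsets) in text order."""
    findings = []
    for label, pattern in (("PAN", PAN_RE), ("AADHAAR", AADHAAR_RE)):
        for match in pattern.finditer(text):
            findings.append(
                {
                    "type": label,
                    "text": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return sorted(findings, key=lambda f: f["start"])


sample = "PAN ABCDE1234F, Aadhaar 2345 6789 0123."
for finding in find_indian_ids(sample):
    print(finding)
```

Format-only matching like this can flag look-alike strings, so a production detector would likely add checksum or context checks.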
How to Use
- Enter or paste text containing PII in the input box
- Select a model type (general or medical)
- Choose a de-identification approach
- Click "Process Text" to analyze and protect the content
Models
- Main model: Kashish-jain/pii-protection-model
- Medical model: Medical entity recognition (when selected)
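For readers who want to try the main model outside the Space, the following sketch loads it through the `transformers` pipeline API. It assumes the checkpoint is a token-classification (NER) model, which this README does not confirm. The `load_ner` and `to_findings` helpers and the `min_score` threshold are hypothetical names introduced for this example.

```python
def load_ner(model_id="Kashish-jain/pii-protection-model"):
    """Load the model as an NER pipeline.

    Assumption: the checkpoint is a token-classification model.
    Requires the `transformers` package and downloads weights on first use.
    """
    from transformers import pipeline  # imported lazily so the helper below works without it

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word tokens into whole entities
    )


def to_findings(entities, min_score=0.5):
    """Convert aggregated pipeline output into simple finding dicts.

    `min_score` is a hypothetical confidence cutoff, not a value from the app.
    """
    return [
        {
            "type": entity["entity_group"],
            "text": entity["word"],
            "start": entity["start"],
            "end": entity["end"],
        }
        for entity in entities
        if entity["score"] >= min_score
    ]


if __name__ == "__main__":
    ner = load_ner()
    print(to_findings(ner("Contact Priya Sharma at priya@example.com")))
```

The entity labels returned depend on how the model was trained, so the `type` values may not match the tag names used by the app.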
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference