---
title: PII Protection Tool
emoji: 🛡️
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: apache-2.0
---

# PII De-identification with Custom Models

This Hugging Face Space detects and de-identifies personally identifiable information (PII) in text. The app combines regex patterns with named-entity recognition (NER) models to find sensitive information, and offers three protection methods:

1. **Replace**: Replaces PII with entity type tags (e.g., `<NAME>`)
2. **Mask**: Masks PII with asterisks
3. **Synthesize**: Replaces PII with realistic synthetic data
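
The three methods above can be illustrated with a minimal sketch. It is not the Space's actual implementation: the span format `(start, end, entity_type)`, the `deidentify` function, and the `SAMPLE_FAKES` table are hypothetical names chosen for illustration, and a real synthesizer would likely draw on a much richer source of fake data.

```python
import random

# Hypothetical lookup of stand-in values, used only for the "synthesize" method.
SAMPLE_FAKES = {
    "NAME": ["Alex Morgan", "Priya Nair"],
    "EMAIL": ["user@example.com"],
}


def deidentify(text, spans, method="replace", rng=None):
    """Rewrite `text`, protecting each (start, end, entity_type) span.

    Spans are assumed to be non-overlapping character offsets into `text`.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # untouched text before the span
        original = text[start:end]
        if method == "replace":
            pieces.append(f"<{label}>")  # entity type tag
        elif method == "mask":
            pieces.append("*" * len(original))  # one asterisk per character
        elif method == "synthesize":
            # Fall back to the tag when no stand-in value is available.
            pieces.append(rng.choice(SAMPLE_FAKES.get(label, [f"<{label}>"])))
        else:
            raise ValueError(f"unknown method: {method}")
        cursor = end
    pieces.append(text[cursor:])  # untouched text after the last span
    return "".join(pieces)


if __name__ == "__main__":
    sample = "Contact John Doe at john@x.com"
    found = [(8, 16, "NAME"), (20, 30, "EMAIL")]
    for method in ("replace", "mask", "synthesize"):
        print(method, "->", deidentify(sample, found, method))
```

With the `replace` method, the sample sentence becomes `Contact <NAME> at <EMAIL>`. Processing spans in sorted order and copying the text between them avoids the offset drift that in-place substitution would cause when a replacement differs in length from the original.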

## Features

- Detects general PII such as names, email addresses, phone numbers, and credit card numbers
- Specialized detection for Indian identifiers (PAN, Aadhaar)
- Optional medical entity recognition
- Multiple de-identification methods
- Detailed findings report
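
As a rough illustration of the regex side of detection, the sketch below matches the standard published formats: a PAN is five uppercase letters, four digits, and one uppercase letter, and an Aadhaar number is twelve digits, often written in groups of four. The patterns, the `PATTERNS` table, and the `find_pii` function are assumptions made for this example, not the patterns the Space actually uses, and they check format only (no checksum validation).

```python
import re

# Illustrative patterns only; the Space's real patterns may differ.
PATTERNS = {
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    "AADHAAR": re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii(text):
    """Return (start, end, entity_type, matched_text) tuples sorted by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((match.start(), match.end(), label, match.group()))
    return sorted(findings)


if __name__ == "__main__":
    sample = "PAN ABCDE1234F, Aadhaar 2345 6789 0123, mail a.b@example.com"
    for start, end, label, value in find_pii(sample):
        print(f"{label:8} [{start}:{end}] {value}")
```

A list of findings in this shape could also back a findings report, since each entry records the entity type, its position, and the matched text.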

## How to Use

1. Enter or paste text containing PII in the input box
2. Select a model type (general or medical)
3. Choose a de-identification approach
4. Click "Process Text" to analyze and protect the content

## Models

- **Main model**: Kashish-jain/pii-protection-model
- **Medical model**: Medical entity recognition (when selected)

For Space configuration options, see the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference