metadata
title: PII Protection Tool
emoji: 🛡️
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: apache-2.0

PII De-identification with Custom Models

This Hugging Face Space detects and de-identifies personally identifiable information (PII) in text. The app combines regex patterns with named entity recognition (NER) models to find sensitive information and offers three protection methods:

  1. Replace: Replaces PII with entity type tags (e.g., <NAME>)
  2. Mask: Masks PII with asterisks
  3. Synthesize: Replaces PII with realistic synthetic data
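
As a rough illustration of how the three methods differ, here is a minimal sketch that applies each one to spans that have already been detected. It is not taken from app.py: the (start, end, entity_type) span format, the function names, and the shape-preserving randomizer used for "synthesize" are all assumptions. A real synthesizer would more likely draw realistic names and numbers from a data generator.

```python
import random
import string


def synthesize(value, rng):
    """Stand-in synthesizer (assumption): keep the shape, randomize the characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)


def deidentify(text, spans, method="replace", seed=None):
    """Apply one protection method to non-overlapping (start, end, entity_type) spans."""
    rng = random.Random(seed)
    out = text
    # Work from right to left so earlier offsets stay valid after each substitution.
    for start, end, label in sorted(spans, reverse=True):
        original = text[start:end]
        if method == "replace":
            new = f"<{label}>"
        elif method == "mask":
            new = "*" * len(original)
        elif method == "synthesize":
            new = synthesize(original, rng)
        else:
            raise ValueError(f"unknown method: {method}")
        out = out[:start] + new + out[end:]
    return out


text = "Call Asha at 98765 43210."
spans = [(5, 9, "NAME"), (13, 24, "PHONE")]  # hypothetical detector output
print(deidentify(text, spans, "replace"))  # Call <NAME> at <PHONE>.
print(deidentify(text, spans, "mask"))
print(deidentify(text, spans, "synthesize", seed=0))
```

Replace keeps the entity type but changes the text length, while mask and this sketch's synthesize keep the length, which can matter when downstream tools rely on character offsets.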

Features

  • Detects general PII such as names, email addresses, phone numbers, and credit card numbers
  • Specialized detection for Indian identifiers (PAN, Aadhaar)
  • Optional medical entity recognition
  • Multiple de-identification methods
  • Detailed findings report
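
The regex side of detection could look roughly like the sketch below. The patterns are illustrative assumptions, not the ones used by this Space: a PAN is written as five letters, four digits, and one letter, and an Aadhaar number as 12 digits, often grouped in fours. The pattern names and the findings format are hypothetical as well.

```python
import re

# Illustrative patterns only (assumptions); the app's actual regexes may differ.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),
    "AADHAAR": re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),
}


def find_pii(text):
    """Return one finding per regex match, ordered by position in the text."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                {
                    "type": label,
                    "text": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    return sorted(findings, key=lambda f: f["start"])


sample = "PAN ABCDE1234F, Aadhaar 2345 6789 0123, mail a.b@example.com"
for finding in find_pii(sample):
    print(finding["type"], finding["text"])
```

A pattern match only shows that a string has the right shape. It does not validate the identifier (for example, it ignores the Aadhaar check digit), so matches like these are best treated as candidates for the findings report.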

How to Use

  1. Enter or paste text containing PII in the input box
  2. Select a model type (general or medical)
  3. Choose a de-identification approach
  4. Click "Process Text" to analyze and protect the content

Models

  • Main model: Kashish-jain/pii-protection-model
  • Medical model: Medical entity recognition (when selected)
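
For readers who want to call the main model outside the app, the sketch below shows one possible way. It assumes the model is a token-classification checkpoint that loads with the transformers pipeline API, which this README does not confirm. The helper names, the 0.5 score threshold, and the (start, end, entity_type) span format are hypothetical.

```python
MODEL_ID = "Kashish-jain/pii-protection-model"


def load_ner(model_id=MODEL_ID):
    """Load the model as a token-classification pipeline (assumed model type)."""
    # Imported lazily so ner_to_spans can be used without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )


def ner_to_spans(entities, min_score=0.5):
    """Convert aggregated pipeline output into sorted (start, end, entity_type) spans."""
    spans = []
    for ent in entities:
        if ent["score"] >= min_score:  # drop low-confidence predictions
            spans.append((ent["start"], ent["end"], ent["entity_group"]))
    return sorted(spans)


# Example with hand-written entities in the pipeline's aggregated output format.
# The labels here are made up; the model's real label set may differ.
example_entities = [
    {"entity_group": "NAME", "score": 0.98, "word": "Asha", "start": 5, "end": 9},
    {"entity_group": "PHONE", "score": 0.30, "word": "98765", "start": 13, "end": 18},
]
print(ner_to_spans(example_entities))
```

With the model downloaded, the same helper would be used as ner_to_spans(load_ner()(text)).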

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference