Active October 2025
AI Data Scrubber
A lightweight, privacy-focused tool that removes personal information from text documents before they are uploaded to large language models (LLMs).
Python Privacy NLP spaCy CLI
What it does
Uses regex and spaCy’s named entity recognition to scrub sensitive data from text, replacing it with labelled placeholders like [NAME] and [EMAIL]. Handles:
- Names
- Email addresses
- Phone numbers
- Physical addresses and ZIP codes
- URLs
- US license plates
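The two-stage approach described above could look roughly like the sketch below. It is not the tool's actual implementation: the patterns, the function names `scrub_with_regex` and `scrub_names`, and the [PHONE], [URL] and [ZIP] placeholder labels are assumptions made for illustration, and physical addresses and license plates are left out. The `nlp` argument is assumed to be a loaded spaCy pipeline, for example `spacy.load("en_core_web_sm")`.

```python
import re

# Hypothetical patterns for illustration; the tool's real patterns may differ.
# URLs and emails are replaced first so that digits inside them are not
# later mistaken for phone numbers or ZIP codes.
PATTERNS = [
    ("URL", re.compile(r"https?://\S+")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("PHONE", re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")),
    ("ZIP", re.compile(r"\b\d{5}(?:-\d{4})?\b")),
]


def scrub_with_regex(text: str) -> str:
    """Replace every pattern match with a labelled placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


def scrub_names(text: str, nlp) -> str:
    """Replace PERSON entities found by a spaCy pipeline with [NAME]."""
    doc = nlp(text)
    # Work backwards so earlier character offsets stay valid after each edit.
    for ent in reversed(doc.ents):
        if ent.label_ == "PERSON":
            text = text[:ent.start_char] + "[NAME]" + text[ent.end_char:]
    return text


print(scrub_with_regex("Mail jane@example.com, call 555-123-4567"))
# Mail [EMAIL], call [PHONE]
```

One reasonable order is to call `scrub_names` on the original text first and `scrub_with_regex` on its result, so the entity recognizer sees natural text rather than placeholders.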
Usage
Available as both a CLI tool and a Python API:
from ai_data_scrubber import scrub_text
clean = scrub_text("Send a message to Jane Smith at jane@example.com")
# "Send a message to [NAME] at [EMAIL]"
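For whole documents, the Python API can be wrapped in a small file helper. The sketch below is an illustration only: `scrub_file` and the file names are hypothetical and not part of the package. The scrubbing function is passed in as an argument, so `scrub_text` from the example above is the intended choice.

```python
from pathlib import Path
from typing import Callable


def scrub_file(src: str, dst: str, scrub: Callable[[str], str]) -> None:
    """Read src, scrub its text, and write the result to dst (hypothetical helper)."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(scrub(text), encoding="utf-8")


# Intended use, assuming the package is installed:
#   from ai_data_scrubber import scrub_text
#   scrub_file("notes.txt", "notes.scrubbed.txt", scrub_text)
```

Writing to a separate output file keeps the original untouched, which makes it easier to compare the two by hand before uploading.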
Note: No automated tool catches everything. Always verify manually before uploading sensitive documents to an LLM.