cloak.business — Enterprise-Grade PII Detection & Anonymization
Regex-first PII protection with 317 deterministic pattern recognizers, plus NLP for names and locations. Covers 390+ entity types and 48 languages, with OCR-based image redaction. Built on Microsoft Presidio and hosted on ISO 27001-certified German servers.
Platform Overview
cloak.business is an enterprise-grade PII detection and anonymization platform built on Microsoft Presidio. It takes a regex-first approach: 317 deterministic pattern recognizers handle structured data, and NLP engines handle names and locations. It also includes image redaction with Tesseract OCR, native MCP Server integration for AI tools, and Zero-Knowledge authentication.
What Makes cloak.business Unique
Regex-First Detection
317 deterministic pattern recognizers process structured data (emails, IBANs, credit cards, SSNs) before NLP engines handle names and locations. Predictable, auditable results with no model drift.
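As a rough illustration of the regex-first idea, the sketch below scans text with two simplified patterns and discards card-number candidates that fail the Luhn checksum. The patterns, the `Detection` class, and the `scan` and `luhn_valid` helpers are assumptions made for this example; they are not the platform's recognizers or the Presidio API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    entity_type: str
    start: int
    end: int
    text: str


# Simplified, illustrative patterns; production recognizers are much stricter.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text: str) -> list[Detection]:
    """Run every pattern over the text and keep only validated matches."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            # A digit run that fails the checksum is not reported as a card.
            if entity_type == "CREDIT_CARD" and not luhn_valid(m.group()):
                continue
            found.append(Detection(entity_type, m.start(), m.end(), m.group()))
    return sorted(found, key=lambda d: d.start)
```

Because the same input always yields the same matches, the output of a step like this can be replayed and audited, which is the property the regex-first design relies on.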
Image Redaction
Extract text from images using Tesseract OCR in 38 languages, detect PII, and redact directly on the image. Supports JPEG, PNG, BMP, TIFF, and WebP formats.
MCP Server for AI Tools
Native integration with Claude Desktop over stdio, and with Cursor and VS Code (via the Continue and Cline extensions) over HTTP. Six operators: encrypt, hash, mask, redact, replace, keep. Supports entity groups and presets.
Zero-Knowledge Authentication
Your password never leaves your device. Built with Argon2id key derivation, XChaCha20-Poly1305 authenticated encryption, and AES-256-GCM vault encryption. A 24-word recovery phrase is used for account recovery.
How Detection Works
cloak.business uses a 10-step pipeline that runs deterministic regex matching before engaging the NLP engines.
- Input Reception — Text received via web app, API, MCP Server, or batch upload
- Language Detection — Automatic identification of text language for engine selection
- Regex Scanning — 317 pattern recognizers scan for structured PII (emails, IBANs, SSNs, credit cards, phone numbers)
- Checksum Validation — Detected patterns validated using checksums (Luhn, IBAN, SSN format rules)
- Context Enhancement — Surrounding text analyzed to boost or reduce confidence scores
- NLP Processing — spaCy, Stanza, or XLM-RoBERTa processes text for names, organizations, and locations
- Result Merging — Regex and NLP results merged with conflict resolution (regex wins for structured data)
- Confidence Scoring — Each detection receives a confidence score (0.0 to 1.0)
- Anonymization — Selected method applied: replace, redact, hash, encrypt, or mask
- Output Delivery — Anonymized text returned with detection report and audit trail
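The context enhancement and result merging steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Detection` fields, the context word lists, the 30-character window, and the 0.3 boost are all hypothetical values chosen for the example, not the platform's actual settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Detection:
    entity_type: str
    start: int
    end: int
    score: float  # confidence between 0.0 and 1.0
    source: str   # "regex" or "nlp"


# Hypothetical context words per entity type (assumption for this sketch).
CONTEXT_WORDS = {
    "PHONE_NUMBER": ("phone", "call", "tel"),
    "IBAN_CODE": ("iban", "account"),
}


def boost_with_context(text: str, det: Detection,
                       window: int = 30, boost: float = 0.3) -> Detection:
    """Raise the score when a context word appears near the match."""
    lo = max(0, det.start - window)
    nearby = (text[lo:det.start] + " " + text[det.end:det.end + window]).lower()
    if any(word in nearby for word in CONTEXT_WORDS.get(det.entity_type, ())):
        return replace(det, score=min(1.0, det.score + boost))
    return det


def overlaps(a: Detection, b: Detection) -> bool:
    return a.start < b.end and b.start < a.end


def merge(regex_hits: list[Detection], nlp_hits: list[Detection]) -> list[Detection]:
    """Keep all regex hits; drop any NLP hit that overlaps one of them."""
    kept_nlp = [n for n in nlp_hits
                if not any(overlaps(n, r) for r in regex_hits)]
    return sorted([*regex_hits, *kept_nlp], key=lambda d: d.start)
```

In this sketch an NLP span that collides with a regex span is simply discarded, which is one straightforward reading of "regex wins for structured data"; a real implementation may resolve conflicts with more nuance.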
Multi-Engine Language Processing
cloak.business uses three NLP engines, each optimized for different language families, with lazy-loaded models. Regex handles structured data; NLP handles names and organizations.
spaCy Engine
25 Languages
Fast industrial-strength NLP for European languages and major world languages.
Stanza Engine
7 Languages
Stanford NLP engine for specialized language processing and academic accuracy.
XLM-RoBERTa Transformer
16 Languages
Cross-lingual transformer for low-resource languages and multilingual documents.
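One way to picture engine selection with lazy-loaded models is the sketch below. The language-to-engine table is a placeholder (the real assignment of the 48 languages is not listed here), the fallback to the cross-lingual engine is an assumption, and `load_engine` is a hypothetical stand-in for an expensive model load.

```python
from functools import lru_cache

# Placeholder mapping for illustration only; not the platform's real table.
ENGINE_BY_LANGUAGE = {
    "en": "spacy",
    "de": "spacy",
    "xx-stanza-example": "stanza",
}

# Records which engines were actually loaded, to make laziness visible.
LOAD_LOG: list[str] = []


@lru_cache(maxsize=None)
def load_engine(name: str) -> dict:
    """Stand-in for loading a model; runs once per engine name."""
    LOAD_LOG.append(name)
    return {"engine": name}


def engine_for(language: str, default: str = "xlm-roberta") -> dict:
    """Pick an engine by language code, loading it only on first use.

    Falling back to the cross-lingual engine is an assumption of this sketch.
    """
    return load_engine(ENGINE_BY_LANGUAGE.get(language, default))
```

With this shape, a model is loaded the first time a language needs it and then reused, so languages that never appear in the input cost nothing.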
PII Detection & Anonymization
390+ Entity Types
Names, emails, phone numbers, credit cards, SSNs, IBANs, IP addresses, medical records, and more. 317 regex recognizers for structured data, NLP-based NER for names and organizations. Confidence scoring for all detections.
5 Anonymization Methods
- Replace — Substitute with fake data
- Redact — Complete removal
- Hash — SHA-256 hashing
- Encrypt — AES-256-GCM encryption
- Mask — Partial obscuring
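To make the methods above concrete, here is a minimal sketch of four of them using only the Python standard library. The `anonymize` function, the `FAKE_VALUES` table, and the "keep the last four characters" masking rule are assumptions for illustration. Encrypt is left out because AES-256-GCM needs a third-party cryptography library.

```python
import hashlib

# Hypothetical fake values used by the "replace" method in this sketch.
FAKE_VALUES = {
    "PERSON": "Alex Example",
    "EMAIL_ADDRESS": "user@example.com",
}


def anonymize(text: str, spans: list[tuple[int, int, str]], method: str) -> str:
    """Apply one method to every (start, end, entity_type) span.

    Spans are assumed not to overlap. They are processed right to left so
    that earlier offsets stay valid while the text changes length.
    """
    for start, end, entity_type in sorted(spans, reverse=True):
        original = text[start:end]
        if method == "replace":
            new = FAKE_VALUES.get(entity_type, f"<{entity_type}>")
        elif method == "redact":
            new = ""  # complete removal
        elif method == "hash":
            new = hashlib.sha256(original.encode("utf-8")).hexdigest()
        elif method == "mask":
            # Partial obscuring: keep only the last four characters.
            new = "*" * max(0, len(original) - 4) + original[-4:]
        else:
            raise ValueError(f"unsupported method: {method}")
        text = text[:start] + new + text[end:]
    return text
```

Hashing maps the same input to the same digest, so it keeps records linkable across documents, while redaction removes the value entirely; which trade-off fits depends on the use case.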
75+ Countries
Country-specific recognizers for national IDs, tax numbers, social security numbers, and regional data formats across 75+ countries.
Deterministic Results
317 regex recognizers give 100% reproducible results for structured data. NLP provides high consistency for names. Fully auditable for compliance. No model drift.
OCR-Powered Image Anonymization
Extract text from images, detect PII, and redact directly on the image. Powered by Tesseract OCR.
38 OCR Languages
Tesseract OCR extracts text from images in 38 languages, enabling PII detection on scanned documents, screenshots, and photos.
Supported Formats
JPEG, PNG, BMP, TIFF, and WebP. Upload images and receive redacted versions with PII masked or removed.
Visual Redaction
Detected PII is redacted directly on the image with configurable redaction boxes. Original text positions preserved for accurate coverage.
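The link between detected text and boxes on the image can be sketched like this. It assumes the OCR step yields one bounding box per word (left, top, width, height), that the analyzed text is those words joined by single spaces, and that a small padding is added around each box. The `OcrWord` type and both helper functions are hypothetical, and drawing the boxes onto the image is left to an imaging library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    text: str
    left: int
    top: int
    width: int
    height: int


def words_with_offsets(words: list[OcrWord]) -> list[tuple[int, int, OcrWord]]:
    """Give each word its character span in the space-joined text."""
    out = []
    pos = 0
    for w in words:
        out.append((pos, pos + len(w.text), w))
        pos += len(w.text) + 1  # +1 for the joining space
    return out


def redaction_boxes(words: list[OcrWord],
                    pii_spans: list[tuple[int, int]],
                    padding: int = 2) -> list[tuple[int, int, int, int]]:
    """Return (x0, y0, x1, y1) boxes for every word touching a PII span."""
    boxes = []
    for start, end, w in words_with_offsets(words):
        if any(start < s_end and s_start < end for s_start, s_end in pii_spans):
            boxes.append((
                max(0, w.left - padding),
                max(0, w.top - padding),
                w.left + w.width + padding,
                w.top + w.height + padding,
            ))
    return boxes
```

Keeping the word positions from the OCR pass is what lets a character-level detection be mapped back to pixel coordinates.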
Platform Components
Web Application
Cloud-Based Processing
Full-featured web interface with Zero-Knowledge authentication. No software installation required.
Desktop App
Windows 10+
Files stay on your device; only the extracted text is sent to the cloud for entity detection. AES-256-GCM encryption with Argon2id key derivation. Supports PDF, DOCX, XLSX, TXT, CSV, JSON, and XML.
Office Add-in
Word, Excel & PowerPoint
Real-time PII detection directly in Microsoft Office. Anonymize without leaving your document.
MCP Server
Claude Desktop, Cursor, VS Code (Continue & Cline)
Native stdio integration for Claude Desktop. HTTP endpoints for Cursor and VS Code via Continue and Cline extensions. 6 operators with entity groups and presets.
REST API
JWT Authentication
RESTful endpoints for workflow automation and CI/CD pipeline integration.
Batch Processing
Multi-Document Upload
Upload multiple documents and process them in parallel. Limits depend on your plan.
Image Redaction
Tesseract OCR — 38 Languages
Upload images, extract text via OCR, detect PII, and receive redacted images. JPEG, PNG, BMP, TIFF, WebP supported.
Industries & Applications
Enterprise
GDPR compliance at scale. Centralized PII detection across departments with role-based access control and batch processing.
Developers
REST API and MCP Server for CI/CD pipelines. Safe test dataset generation without production PII exposure.
Legal
Contract anonymization & e-discovery. Redact sensitive information from court filings and discovery documents with audit trails.
Healthcare
Patient data protection & HIPAA support. Medical records anonymization for research and administrative documents.
Financial
PCI-DSS compliance & fraud prevention. Transaction and customer data protection with regulatory reporting.
Research
Anonymize research datasets for publication. Remove personal identifiers while preserving data utility for analysis.
Government
Public records & FOIA compliance. Automated redaction for FOIA requests and inter-agency data sharing.
Zero-Knowledge Architecture
Password Never Leaves Device
True Zero-Knowledge authentication. Your master password is used locally to derive encryption keys. Server never sees your password.
Argon2id + XChaCha20-Poly1305
Argon2id provides memory-hard key derivation; XChaCha20-Poly1305 provides modern authenticated encryption.
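The general pattern of deriving keys locally so that the server never sees the password can be sketched as below. This is not the platform's actual protocol. Python's standard library has no Argon2id, so the sketch swaps in `hashlib.scrypt`, another memory-hard function, as a stand-in; the `derive_keys` name, the cost parameters, and the "auth" and "vault" labels are assumptions.

```python
import hashlib
import hmac


def derive_keys(password: str, salt: bytes) -> tuple[bytes, bytes]:
    """Derive an auth token and a vault key from a password on the client.

    scrypt is a stand-in for Argon2id here (assumption of this sketch).
    """
    master = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )
    # Separate the two purposes so the token sent to the server
    # reveals nothing usable about the encryption key.
    auth_token = hmac.new(master, b"auth", hashlib.sha256).digest()
    vault_key = hmac.new(master, b"vault", hashlib.sha256).digest()
    return auth_token, vault_key
```

In a scheme of this shape only `auth_token` would be transmitted for login, while `vault_key` stays on the device and is used to encrypt and decrypt stored data.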
24-Word Recovery Phrase
BIP-39 compatible recovery phrase for account recovery. No password reset emails — you control your keys.
ISO 27001:2022 Certified
Hosted in Hetzner data centers in Germany. Full GDPR compliance with EU data residency. AES-256-GCM encryption. TLS 1.2+.
Transparent Token System
Each operation costs tokens based on text length, entities detected, and operations applied.
Free
€0/month
200 tokens per cycle
- Online account
- Analyzer & Anonymizer
- 48 languages
- 317 regex recognizers
- Image redaction
- No credit card required
Basic (Most Popular)
€3/month
1,000 tokens per cycle
- All Free features
- API access
- Batch processing
- PDF/DOCX/TXT/CSV support
- Token top-ups available
Pro (Best Value)
€15/month
4,000 tokens per cycle
- All Basic features
- MCP Server access
- All file types supported
- Unlimited uploads
- Token top-ups available
Business (Enterprise)
€29/month
10,000 tokens per cycle
- All Pro features
- Priority support
- Custom integrations
- Extended history
- Token top-ups available
cloak.business vs anonym.legal vs anonymize.today
Three platforms, different strengths. All built for GDPR compliance on German servers.
| Feature | cloak.business | anonym.legal | anonymize.today |
|---|---|---|---|
| Focus | Enterprise & Developers | Legal & Privacy-First | Simple & Transparent |
| Entity Types | 390+ | 260+ | 256 |
| Languages | 48 | 48 + RTL | 27 |
| Regex Recognizers | 317 | n/a | n/a |
| Image Redaction | Yes (38 OCR langs) | No | No |
| MCP Server | Yes | Yes | No |
| Zero-Knowledge Auth | Yes | Yes | No |
| Desktop App | Windows | Win/Mac/Linux | Win/Mac/Linux |
| Office Add-in | Word/Excel/PPT | Word/Excel/PPT | Word/Excel/PPT |
| Free Tokens | 200/cycle | 200/cycle | 300/month |
| Technology | Microsoft Presidio | Presidio-based | Regex-based |
Try cloak.business
Start with 200 free tokens per cycle. No credit card required. Zero-Knowledge from day one.
Related Platforms: anonym.legal — Zero-Knowledge PII anonymization with MCP Server | anonymize.today — Simple, transparent PII detection
Need enterprise-grade PII detection?
Let's discuss how cloak.business can support your compliance, image redaction, and API integration requirements.