Frequently Asked Questions

40 questions about the PII anonymization action hub, answered with data.

Zero-Knowledge Authentication

How do I verify a SaaS vendor uses true zero-knowledge encryption and cannot access my data?

Argon2id key derivation runs entirely in the browser/app (64MB memory, 3 iterations). AES-256-GCM encryption happens before any data leaves the device. The server never receives the plaintext password or the derived encryption key. Even a full anonym.legal server breach would yield only encrypted blobs without the keys to decrypt them. Example: A compliance officer at a German health insurer needs to process patient complaint logs using a cloud anonymization tool. GDPR Article 32 requires appropriate technical measures. The insurer's DPO will not approve any tool that transmits unencrypted PII or holds encryption keys server-side. Zero-knowledge architecture removes this blocker from the vendor assessment process entirely.
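
The client-side derivation step can be sketched as follows. This is a minimal illustration, not anonym.legal's implementation: Python's standard library has no Argon2id, so `hashlib.scrypt` (another memory-hard KDF) is swapped in as a stand-in, and the cost parameters, the `derive_key` name, and the sample password are assumptions chosen for the example.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key on the client.

    scrypt is a stand-in for Argon2id here; the parameters below are
    illustrative and do not reproduce the 64MB / 3-iteration settings.
    """
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost (assumed value)
        r=8,      # block size (assumed value)
        p=1,      # parallelism (assumed value)
        dklen=32, # 32 bytes = the key size AES-256-GCM expects
    )


salt = os.urandom(16)  # random per-user salt; it is not secret
key = derive_key("correct horse battery staple", salt)

# In a zero-knowledge design, only the salt and the ciphertext would leave
# the device. The password and `key` are never transmitted.
```

Because the same password and salt always yield the same key, the client can re-derive it at each login without the server ever storing it.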

My company processes PHI. Can we use cloud anonymization tools, or do we need on-premise only?

Zero-knowledge design means original text is never stored on anonym.legal servers. Data is stored in Europe (Hetzner EU data centers). The tool processes anonymization logic without retaining the source documents. This removes the primary blocker for HIPAA-covered entity adoption. Example: A hospital system's IT security team is evaluating tools for clinical documentation anonymization before sharing with a research partner. The HIPAA Privacy Officer needs to demonstrate compliance under 45 CFR 164.514. anonym.legal's zero-knowledge architecture means the BAA covers a tool that provably cannot expose PHI.

SaaS breaches are up 300%. How can I trust any cloud tool with PII?

Zero-knowledge architecture means a full anonym.legal server compromise provides attackers with AES-256-GCM ciphertext without the keys to decrypt it. Combined with EU-based data storage and ISO 27001 controls, this provides the strongest possible breach impact minimization. Example: A CISO at a German insurance company is reviewing their 2025 vendor risk posture after the industry-wide SaaS breach surge. They require all PII-handling vendors to demonstrate cryptographic data isolation. anonym.legal's zero-knowledge design is included in the approved vendor list specifically because a server breach cannot expose policyholder data.

How do I know the PII anonymization tool I'm using isn't storing my sensitive data on their servers where it could be breached?

Argon2id (64MB memory, 3 iterations) key derivation runs entirely in the browser/desktop client. The derived AES-256-GCM key never leaves the device. anonym.legal servers receive only encrypted ciphertext and cannot decrypt it even with full database access. 24-word BIP39 recovery phrase enables key recovery without server involvement. Example: A CISO at a German health insurer evaluating anonymization tools for GDPR compliance. Their procurement checklist requires proof that the vendor cannot access patient data. anonym.legal's zero-knowledge architecture satisfies Article 25 (Privacy by Design) and allows the CISO to tell the DPA: "even if the vendor is breached, our data is cryptographically inaccessible."

After the LastPass breach, can I trust any cloud service with my company's sensitive data?

Zero-knowledge authentication with open architecture documentation. The 24-word BIP39 recovery phrase is the only way to restore access, meaning even anonym.legal staff cannot reset accounts or access user data. Session management with remote logout prevents persistent access after device loss. Example: A CISO at a 500-person law firm is reviewing vendor security after their password manager vendor suffered a breach. They need to demonstrate to their malpractice insurer that all tools handling client data use verified zero-knowledge architecture. anonym.legal's client-side encryption approach allows the CISO to demonstrate that even a complete server compromise would not expose client communication data.

How do I pass a security questionnaire for a vendor that handles our sensitive documents?

Zero-knowledge authentication + ISO 27001 certification provides the strongest possible answer to VSQ encryption questions. anonym.legal can truthfully state that server compromise yields no usable plaintext data. Example: A Fortune 500 financial services company is adding anonym.legal to their approved vendor list. Their vendor risk team sends a 150-question security questionnaire. The zero-knowledge architecture allows the anonym.legal team to answer encryption, key management, and data access questions definitively, shortening the approval cycle from months to weeks.

How do we pass vendor security assessments faster without sharing our encryption architecture documentation every time?

ISO 27001 certification provides the baseline framework. Zero-knowledge architecture documentation answers the specific question of server-side data access. DPIA completion satisfies GDPR Article 35 requirements. The combination dramatically shortens procurement cycles for regulated industries. Example: A procurement officer at a Fortune 500 financial services firm needs to onboard an anonymization tool for their data science team within Q4. anonym.legal's ISO 27001 certificate + zero-knowledge architecture documentation + completed security questionnaire template allows the CISO to approve the vendor without a full custom assessment, saving 6-8 weeks.

Multi-Language Support (48 Languages)

Why does my PII detection tool miss names and IDs in German, French, and Polish documents?

Three-tier language support: spaCy language-native models for 25 high-resource languages (provides semantic understanding of names, places, organizations in native language), Stanza for 7 additional languages, XLM-RoBERTa cross-lingual transformers for 16 lower-resource languages. This mirrors the academic best practice identified in 2024 hybrid PII detection research. Example: A compliance officer at a European BPO processing customer service data from Germany, France, Poland, and the Netherlands. Each country's customer records contain different national identifier formats. A single English-centric tool misses all non-English PII. anonym.legal's 48-language support with region-specific entity types (Steuer-ID, NIR, PESEL, BSN) provides complete coverage in a single platform.

How do I anonymize customer data across DACH and Benelux regions with GDPR-compliant accuracy?

48-language detection stack with three complementary models. spaCy covers 25 high-resource languages natively, Stanza covers 7 more, and XLM-RoBERTa handles cross-lingual transfer for the remaining 16. 260+ entity types include DACH-specific identifiers (Steuer-ID, AHV-Nr, Sozialversicherungsnummer), French NIR/SIRET, Nordic personnummers, and UK NHS/NI numbers. Example: A multinational HR software company processes employee onboarding documents across 18 EU countries. Their existing English-language PII tool misses 40% of non-English PII, creating GDPR Article 5 (data minimization) compliance gaps. anonym.legal's 48-language support closes this gap with pre-built regional identifiers, eliminating the need for country-specific custom configurations.

How do I detect PII in Arabic and Hebrew text with RTL formatting?

Full RTL support for Arabic, Hebrew, Persian, and Urdu. XLM-RoBERTa (cross-lingual transformer) provides language-agnostic entity recognition that works across script types. Stanza NER handles Hebrew (HE) specifically. Example: An Israeli legal tech firm processes employment contracts in Hebrew and English. Their US-built redaction tool fails entirely on the Hebrew sections, requiring manual review for every bilingual document. anonym.legal's Stanza-powered Hebrew NER detects names, addresses, and Israeli ID numbers (Teudat Zehut) without requiring transliteration or manual preprocessing.

We outsource customer support to a BPO in the Philippines. How do we ensure their agents' multilingual chat logs are anonymized before analysis?

48-language support includes APAC languages: Indonesian (ID), Thai (TH), Vietnamese (VI), Filipino (TL), and others via XLM-RoBERTa. Stanza covers additional APAC languages. Single deployment handles global customer support log anonymization. Example: A Singapore-based fintech processes 500,000 customer support chat logs monthly across 12 APAC languages. PDPA (Personal Data Protection Act) requires anonymization before analytics. Their current tool only processes English accurately. anonym.legal's multilingual support reduces their manual review burden from 60% of non-English logs to near-zero.

We process data from Brazil, India, and the EU. Do we need three different tools for CPF, PAN, and IBAN detection?

260+ entity types include Brazil CPF, India PAN, all EU IBAN formats, Brazilian CNPJ, Indian Aadhaar, and many more. The entity library is maintained and updated by the anonym.legal team. Organizations with global operations get comprehensive coverage from a single tool. Example: A London-based marketplace processes seller onboarding documents for merchants from 45 countries. They need to detect and anonymize national ID numbers for GDPR (EU), LGPD (Brazil), and DPDP (India) compliance. anonym.legal's 260+ entity type library covers all their regional identifier requirements without custom development.

How do I detect PII in Arabic and Hebrew text? Our RTL documents are completely missed by standard NER tools.

XLM-RoBERTa provides cross-lingual entity recognition for Arabic and Hebrew with full RTL text handling. The platform includes Arabic, Hebrew, Persian, and Urdu in its 48-language support stack. Example: A fintech company in Dubai processing KYC documents for EU clients. Documents contain Arabic customer names and UAE Emirates IDs alongside English business data. GDPR applies to the EU client relationship data. Without RTL PII detection, Arabic name fields are invisible to the compliance system.

We have documents mixing English and German. Does NER get confused when languages switch mid-document?

XLM-RoBERTa's cross-lingual transformer architecture is trained on multilingual corpora and handles mixed-language text natively without requiring explicit language switching. Combined with language-specific spaCy models for high-accuracy regions, the hybrid approach handles multilingual documents robustly. Example: A Swiss pharmaceutical company processes employment contracts that mix German, French, and English within a single document (Switzerland has four official languages). Their current tool misses French-section PII when configured for German. anonym.legal's multilingual stack processes all three languages simultaneously within the same document pass.

Hybrid Recognizer System

Our de-identification tool misses PHI in clinical notes, and LLM studies show a >50% miss rate. What should we use instead?

Hybrid three-tier detection provides both high recall (ML-based NER for names and contextual PHI) and high precision (regex for structured identifiers). The 260+ entity types include medical-specific identifiers: MRN formats, NPI, DEA numbers, health plan IDs. Confidence thresholds can be set for maximum recall in high-risk PHI scenarios. Example: A hospital system is building a de-identified research dataset from 500,000 clinical notes. Their current tool (Presidio default) misses ~30% of PHI based on internal testing. This creates research IRB compliance issues and potential HIPAA violations. anonym.legal's hybrid approach with healthcare-specific entity types reduces the miss rate to under 5%.

Over-redaction in e-discovery is causing sanctions because our tool blacks out too much. What causes this and how do we fix it?

Configurable confidence thresholds per entity type allow legal teams to calibrate precision vs. recall. The hybrid system's regex component provides reproducible, defensible detection for structured PII. The preview modal in the Chrome Extension shows what will be redacted before committing; the same principle applies across platforms. Example: A litigation support team at a large law firm handles 200,000-document e-discovery productions monthly. Their previous ML-only tool's 35% false positive rate exposed them to over-redaction sanctions. anonym.legal's configurable threshold system reduces false positives while maintaining privilege protection, and generates the entity-level audit log needed for privilege logs.

How do I ensure my automated redaction tool doesn't over-redact and hide evidence that opposing counsel needs?

Confidence scoring per entity (0-100%) provides the basis for audit trails. Per-entity operator configuration allows legal teams to apply different handling rules to different entity types (e.g., replace party names with pseudonyms but redact SSNs). Reversible encryption maintains the ability to restore original text when authorized review is needed. Example: A legal technology team at a large law firm preparing document production in a commercial litigation matter. They need to redact client identifiers from 15,000 DOCX and PDF files while preserving all non-protected content. anonym.legal's hybrid detection with per-entity configuration and confidence scoring allows them to produce a defensible redaction log for the court.
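
Per-entity operator configuration can be pictured as a lookup from entity type to a handling function. The sketch below is an assumption-laden illustration: the `Detection` class, the operator names, the entity labels, and the sample text are all hypothetical and are not anonym.legal's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Detection:
    entity_type: str
    start: int
    end: int


def pseudonymize(value: str) -> str:
    return "[PARTY_A]"  # hypothetical fixed pseudonym for a party name


def redact(value: str) -> str:
    return "[REDACTED]"


def mask(value: str) -> str:
    # Keep only the last four characters visible.
    return "*" * max(len(value) - 4, 0) + value[-4:]


# One operator per entity type (labels are assumed for the example).
OPERATORS: dict[str, Callable[[str], str]] = {
    "PERSON": pseudonymize,
    "US_SSN": redact,
    "PHONE_NUMBER": mask,
}


def apply_operators(text: str, detections: list[Detection]) -> str:
    # Work right to left so earlier offsets stay valid after each replacement.
    for d in sorted(detections, key=lambda d: d.start, reverse=True):
        operator = OPERATORS.get(d.entity_type, redact)
        text = text[:d.start] + operator(text[d.start:d.end]) + text[d.end:]
    return text


def span(text: str, value: str, entity_type: str) -> Detection:
    start = text.index(value)
    return Detection(entity_type, start, start + len(value))


text = "Jane Roe, SSN 078-05-1120, phone 555-0100"
detections = [
    span(text, "Jane Roe", "PERSON"),
    span(text, "078-05-1120", "US_SSN"),
    span(text, "555-0100", "PHONE_NUMBER"),
]
result = apply_operators(text, detections)
print(result)  # [PARTY_A], SSN [REDACTED], phone ****0100
```

Unknown entity types fall back to `redact` in this sketch, which errs toward removing more rather than less.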

Our PII detection tool redacts too many things that aren't PII, and it's creating a huge manual review burden. How do we reduce false positives?

Three-tier hybrid: regex handles structured data with 100% reproducibility; spaCy NLP handles contextual name/org/location detection; XLM-RoBERTa handles cross-lingual ambiguity. Confidence thresholds are configurable per entity type: a legal team can set names to 90% confidence while keeping phone numbers at regex-certainty. Example: A large law firm's e-discovery team processes 50,000 documents per litigation matter. Their ML-only redaction tool produces a 35% false positive rate, requiring attorney review for each flagged item. At $400/hour and 10 false positives per document, the manual review cost exceeds the automation savings. anonym.legal's hybrid approach with configurable thresholds reduces the false positive rate to under 5%, making automation economically viable.
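
A per-entity threshold is, at its simplest, a filter applied after detection. The following sketch assumes detections arrive as dictionaries with `entity_type` and `score` fields. The field names, the default threshold, and the sample detections are hypothetical.

```python
# Per-entity confidence thresholds: names need 0.90, phone numbers must be
# regex-certain (1.0). Values mirror the example in the text above.
THRESHOLDS = {"PERSON": 0.90, "PHONE_NUMBER": 1.00}
DEFAULT_THRESHOLD = 0.50  # assumed fallback for unlisted entity types


def filter_detections(detections: list[dict]) -> list[dict]:
    """Keep only detections at or above the threshold for their entity type."""
    return [
        d for d in detections
        if d["score"] >= THRESHOLDS.get(d["entity_type"], DEFAULT_THRESHOLD)
    ]


detections = [
    {"entity_type": "PERSON", "text": "Maria Müller", "score": 0.95},
    {"entity_type": "PERSON", "text": "Apple", "score": 0.62},  # likely false positive
    {"entity_type": "PHONE_NUMBER", "text": "+49 30 1234567", "score": 1.0},
]
kept = filter_detections(detections)
print([d["text"] for d in kept])  # ['Maria Müller', '+49 30 1234567']
```

Raising a threshold trades recall for precision, so the right value depends on whether a missed entity or an extra review item is the more expensive error.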

How do I explain to auditors exactly why a specific piece of text was redacted or not redacted?

Confidence scoring per entity provides the audit trail foundation. The hybrid approach's use of regex for structured data makes those detections fully reproducible and explainable (exact pattern matched). NLP detections include entity type, model, and confidence, which is sufficient for compliance documentation. Example: A clinical research organization must demonstrate to an IRB (Institutional Review Board) that their de-identification process meets HIPAA Expert Determination standards. The audit requires documentation showing which identifiers were removed and by what method. anonym.legal's confidence scoring and entity-type classification provide the audit evidence required.
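
An audit record for a regex detection might carry the fields named above: entity type, method, the exact pattern, and a score. This is a hypothetical record layout for illustration, not the product's log format. Note that it stores offsets rather than the matched value, so the log itself holds no PII.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class AuditEntry:
    entity_type: str
    method: str   # "regex" or "nlp"
    detail: str   # exact pattern for regex, model name for NLP
    score: float
    start: int
    end: int


# Illustrative SSN-shaped pattern (format only, no validity checks).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def audit_regex(text: str) -> list[AuditEntry]:
    """Record each regex hit with the pattern that produced it."""
    return [
        AuditEntry("US_SSN", "regex", SSN_RE.pattern, 1.0, m.start(), m.end())
        for m in SSN_RE.finditer(text)
    ]


def to_json_lines(entries: list[AuditEntry]) -> str:
    return "\n".join(json.dumps(asdict(e)) for e in entries)


entries = audit_regex("SSN 078-05-1120 on file")
log = to_json_lines(entries)
print(log)
```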

We need PII detection for KYC document processing, but false positives slow down customer onboarding. How do we balance speed and accuracy?

Context-aware hybrid detection with configurable thresholds per entity type. Financial-specific entity types (bank accounts, SWIFT codes, BICs, IBAN formats) use regex for deterministic detection. Names use NLP with context words and confidence scoring. Threshold configuration allows financial teams to tune for their specific volume/accuracy trade-off. Example: A digital banking platform processes 5,000 KYC applications daily across 15 European countries. Their PII detection step creates a 2-day backlog due to false positive rates requiring manual review. anonym.legal's hybrid approach reduces manual review to under 3% of documents, eliminating the bottleneck while maintaining AML compliance.
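
Deterministic detection of a structured identifier usually pairs a format regex with a checksum. The sketch below applies the standard IBAN mod-97 check to compact (unspaced) IBANs. The regex is deliberately loose, and the function names are illustrative rather than taken from anonym.legal.

```python
import re

# Two-letter country code, two check digits, then 11-30 alphanumerics.
# Loose on purpose: per-country lengths are not enforced in this sketch.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def iban_checksum_ok(iban: str) -> bool:
    """Standard mod-97 check: move the first four characters to the end,
    map letters to numbers (A=10 ... Z=35), and test remainder == 1."""
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list[str]:
    return [m.group() for m in IBAN_RE.finditer(text) if iban_checksum_ok(m.group())]


sample = "Pay to DE89370400440532013000 not DE89370400440532013001."
found = find_ibans(sample)
print(found)  # ['DE89370400440532013000']
```

The checksum step is what removes false positives: the second string has the right shape but fails mod-97, so it is not reported.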

Presidio is flagging everything as PII in our log files. How do I reduce false positives without missing real PII?

The hybrid three-tier architecture separates structured data (regex with 100% reproducibility) from contextual detection (NLP) from cross-lingual detection (transformers). Confidence thresholds are configurable per entity type. Context-aware enhancement boosts scores when context words appear near matches and suppresses false positives when context is absent. The result is dramatically lower false positive rates than Presidio defaults. Example: A data engineering team at a healthcare company running Presidio on clinical notes exported to JSON. The raw Presidio output flags hundreds of numeric sequences as SSNs and phone numbers that are actually medical record numbers, dosage amounts, and procedure codes. Manual review of false positives consumes 3+ hours per batch. anonym.legal's hybrid system with configurable thresholds and the MRN entity type reduces false positives by ~70% while maintaining PHI recall.
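
Context-aware enhancement can be sketched as a base score for the pattern match plus a boost when a context word appears nearby. The scores, window size, and word list here are assumptions for illustration. They are not Presidio's or anonym.legal's actual values.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CONTEXT_WORDS = {"ssn", "social", "security"}  # assumed context vocabulary
BASE_SCORE = 0.5   # pattern matched, no supporting context
BOOST = 0.35       # added when a context word is nearby
WINDOW = 40        # characters inspected on each side of the match


def score_matches(text: str) -> list[tuple[str, float]]:
    results = []
    for m in SSN_RE.finditer(text):
        nearby = text[max(0, m.start() - WINDOW): m.end() + WINDOW].lower()
        words = set(re.findall(r"[a-z]+", nearby))
        has_context = bool(words & CONTEXT_WORDS)
        score = BASE_SCORE + (BOOST if has_context else 0.0)
        results.append((m.group(), round(score, 2)))
    return results


with_context = score_matches("Patient SSN: 078-05-1120.")
without_context = score_matches("Order ref 123-45-6789 shipped.")
print(with_context)     # [('078-05-1120', 0.85)]
print(without_context)  # [('123-45-6789', 0.5)]
```

With a threshold between the two scores, the order reference is left alone while the labeled SSN is still caught.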

Office Add-in (Word & Excel)

The DOJ's Epstein files showed that PDF black-box redaction can be reversed with copy-paste. Are Word documents safer?

Office Add-in performs true PII replacement within the Word document itself. Text is permanently replaced with tokens, redacted marks, or anonymized placeholders. The original text is not hidden; it is gone from the document. Formatting (fonts, styles, bold, italic) is preserved. Headers, footers, and comments are processed. Full undo support for iterative review. Example: A government agency's legal team must produce 3,000 documents in response to a litigation hold. Previous productions using PDF black-highlighting were challenged when opposing counsel discovered the highlighting was reversible. anonym.legal's Word Add-in is deployed for the document review team. True text replacement ensures no underlying data remains. The production withstands forensic examination.

Our legal team spends 2-3 days manually redacting Word documents for each discovery production. Is there a faster way?

Word Add-in works natively inside Microsoft Word, with no conversion required. Preserves all formatting: fonts, styles, bold, italics, tables, headers, footers, footnotes, and comments. Supports per-entity operator configuration (different handling for names vs. SSNs vs. dates). Full undo support for iterative review. Reduces 2-3 days of manual work to hours. Example: A litigation boutique law firm handles 15 major matters annually, each requiring 5,000-50,000 document productions. Manual redaction was costing $400,000/year in paralegal and associate time. anonym.legal's Word Add-in reduces redaction time by 85%, saving $340,000 annually. The attorneys retain control through the review and approval workflow.

We need to anonymize Excel spreadsheets with 100,000 rows of employee data. Does existing redaction software handle structured data?

Excel Add-in processes spreadsheets natively. Cell-level PII detection across all visible and hidden sheets. Handles up to 100,000 rows per plan. Preserves spreadsheet structure and formulas. Per-entity configuration allows different handling for names (replace with pseudonym) vs. SSNs (replace with X's) vs. phone numbers (mask with partial display). Example: A German manufacturing company's HR department must share 50,000 employee records with an external compensation consultant. GDPR requires anonymization before sharing with third parties. The Excel file contains 37 columns including names, salaries, addresses, and performance ratings. anonym.legal's Excel Add-in processes the full dataset in minutes, anonymizing all PII fields while preserving the spreadsheet structure for analysis.

How do I redact sensitive data in Word documents without destroying the formatting?

Word Add-in works natively inside Microsoft Office. No export or conversion. Formatting is preserved at the paragraph, character, and style level. Bold names remain bold after anonymization. Table structures are preserved. Headers and footers are processed without disrupting page layout. The result is a properly formatted document ready for immediate use. Example: A UK law firm specializing in employment tribunals must produce witness statements with names and identifying information anonymized per court order. Previous attempts using PDF redaction tools destroyed the document formatting, requiring manual reconstruction. anonym.legal's Word Add-in preserves formatting exactly: the anonymized statement looks professionally formatted and is court-ready without additional work.

FOIA requests requiring redaction of thousands of Word documents are creating backlogs. What automation tools help?

Office Add-in processes Word documents natively with automation support. Batch processing (1-5,000 files via Desktop App) enables volume handling. Per-entity configuration allows agency-specific redaction rules (FOIA exemption B6 for personal information, B7 for law enforcement). Presets allow FOIA staff to apply consistent configurations across the entire request. Example: A federal agency's FOIA office receives a request for 8,000 Word documents related to a policy decision. With 5,638 FOIA staff processing 1.5 million requests annually (about 266 requests per staff member per year), each staff member has roughly one day per request. anonym.legal's batch-capable Word Add-in processes all 8,000 documents in hours, with human review focused on edge cases rather than every document.

What Word redaction tools preserve styles, tables, and tracked changes during PII removal?

The Office Add-in operates directly within the Word document object model, with no conversion to an intermediate format. PII entities are detected in text runs, paragraphs, headers, footers, footnotes, and comments. Anonymization is applied in-place with full formatting preservation. Ctrl+Z undo reverts any change. This is architecturally distinct from all redaction tools that work at the rendered-document level. Example: A partner at a 50-person law firm needs to redact a 200-page merger agreement before sharing with regulatory authorities. The document contains 15 defined terms that include party names, 47 cross-references to those defined terms, and tables with financial figures linked to party identities. anonym.legal's Office Add-in detects all name instances (including in defined term contexts), applies consistent pseudonymization, and preserves all formatting, reducing a 6-hour manual redaction task to 15 minutes.

How do I anonymize PII in Excel spreadsheets that have thousands of rows of customer data without losing the structure?

The Office Add-in processes Excel at the cell level, supporting up to 100,000 rows and 20MB files. Per-entity operator configuration allows different handling for different entity types within the same spreadsheet. The full undo capability allows recovery if a formula column is accidentally flagged. Example: A data analyst at a retail company preparing customer purchase history for an external marketing analytics vendor. The 50,000-row Excel file contains customer names, emails, and loyalty IDs alongside purchase amounts and product categories. anonym.legal's Excel add-in replaces names and emails with pseudonyms while hashing loyalty IDs for referential integrity, allowing the analytics vendor to track behavior patterns without accessing real identities.
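
Hashing for referential integrity means the same input always maps to the same token, so rows can still be joined or grouped after anonymization. The sketch below uses a keyed hash (HMAC-SHA256), which is one common way to do this. The source does not say which hash the add-in uses, and the key, prefix, and sample IDs are hypothetical.

```python
import hashlib
import hmac


def pseudonymize_id(value: str, key: bytes) -> str:
    """Map an ID to a stable token. The secret key prevents anyone without
    it from rebuilding the mapping by hashing guessed IDs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "LID_" + digest[:12]


key = b"demo-key-kept-by-the-data-owner"  # hypothetical secret
rows = [("L-1001", 19.99), ("L-1002", 5.00), ("L-1001", 42.50)]
anonymized = [(pseudonymize_id(loyalty_id, key), amount) for loyalty_id, amount in rows]

# The first and third rows share a token, so purchase history still links up.
print(anonymized[0][0] == anonymized[2][0])
```

Truncating the digest to 12 hex characters keeps tokens short but raises the collision probability on very large datasets; a longer slice is safer at scale.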

Chrome Extension (JIT Anonymization)

How do I stop my team from accidentally pasting customer data into ChatGPT through the browser?

Chrome Extension intercepts clipboard content before it appears in ChatGPT, Claude.ai, or Gemini input fields. Real-time PII detection with a preview modal shows employees exactly what will be anonymized before they submit. Employees continue their workflow; the protection is automatic and requires no behavior change. Example: A customer support team at a European e-commerce company uses ChatGPT to draft responses. Agents regularly paste customer names, order numbers, and addresses into prompts. anonym.legal Chrome Extension anonymizes this data before it reaches ChatGPT. Agents see tokenized placeholders in their prompts and ChatGPT's responses are de-anonymized automatically. Customer service quality is maintained; GDPR Article 5 data minimization is satisfied.

Two malicious Chrome extensions stole 900,000 people's ChatGPT conversations. How do I know a privacy extension is safe?

anonym.legal Chrome Extension processes everything locally: no data is sent to a C2 server or any third party during PII detection. The extension is published by the verified anonym.legal publisher. Zero-knowledge architecture means even anonym.legal cannot access the PII that passes through the extension. ISO 27001 certification provides independent security verification. Example: A privacy-conscious enterprise IT team wants to deploy AI PII protection for their workforce but is concerned about the malicious extension risk after the 900K-user incident. anonym.legal's verified publisher identity, local processing architecture, and ISO 27001 certification provide the assurance needed to add the extension to the corporate approved list.

Can I use ChatGPT for customer support tasks without violating GDPR?

Chrome Extension intercepts customer data before it reaches ChatGPT. Customer names are replaced with tokens (e.g., "[CUSTOMER_1]"), order numbers with "[ORDER_1]". ChatGPT processes anonymized context and produces a response using tokens. The extension's auto-decrypt feature restores real names in the AI response. Agents see real names; ChatGPT never processes them. Example: A French e-commerce company's 50-person support team uses ChatGPT for response drafting. The DPO is concerned about GDPR compliance. anonym.legal Chrome Extension anonymizes all customer PII before ChatGPT submission and automatically de-anonymizes the AI's draft responses. GDPR Article 5 data minimization is satisfied because ChatGPT receives no real customer identifiers. The DPO approves continued AI use.
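
The token round trip can be sketched with a plain lookup table held on the client. This is a simplification under stated assumptions: the extension is described as using encryption for restoration, whereas this example keeps an in-memory mapping, and the function names and sample data are hypothetical.

```python
def anonymize(text: str, entities: list[tuple[str, str]]) -> tuple[str, dict[str, str]]:
    """Replace each (surface string, label) with a numbered token.

    Returns the anonymized text and a token -> original mapping that
    never leaves the client.
    """
    mapping: dict[str, str] = {}
    counters: dict[str, int] = {}
    for surface, label in entities:
        counters[label] = counters.get(label, 0) + 1
        token = f"[{label}_{counters[label]}]"
        mapping[token] = surface
        text = text.replace(surface, token)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the AI's response."""
    for token, surface in mapping.items():
        text = text.replace(token, surface)
    return text


prompt = "Draft a reply to Claire Dubois about order 48213."
entities = [("Claire Dubois", "CUSTOMER"), ("48213", "ORDER")]
safe_prompt, mapping = anonymize(prompt, entities)
print(safe_prompt)  # Draft a reply to [CUSTOMER_1] about order [ORDER_1].

ai_reply = "Dear [CUSTOMER_1], your order [ORDER_1] has shipped."
restored_reply = restore(ai_reply, mapping)
print(restored_reply)  # Dear Claire Dubois, your order 48213 has shipped.
```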

How do I prevent employees from accidentally sending customer PII to ChatGPT when they're writing support responses?

The Chrome Extension v1.0.141 operates as a Manifest V3 extension with pre-submission interception. It detects PII in the input field using the same Presidio-based engine as all other anonym.legal platforms. A preview modal shows detected entities and the proposed anonymization before the message is sent. The user can proceed in one click. For encrypted mode, the AI response is automatically decrypted to restore context in the user's view. Example: A customer support team lead at a German e-commerce company uses ChatGPT to draft email responses to customer complaints. The workflow: copy customer complaint (contains name, order number, address) → paste into ChatGPT → generate response draft → send. The Chrome Extension intercepts at the paste step, shows that "Maria Müller, Hauptstraße 15, 10115 Berlin" was detected, replaces with "Customer_A, [ADDRESS_1]", sends the anonymized prompt to ChatGPT, and presents the response. GDPR compliance is maintained; workflow is unchanged.

Every Chrome extension for AI privacy claims to protect my data. How do I know a privacy extension isn't itself stealing my data?

The Chrome Extension processes PII detection locally using the same Presidio-based engine. The anonymization occurs client-side before the modified prompt is submitted to the AI service. No intercepted conversation content is transmitted to anonym.legal servers. The extension's data flow is: intercept prompt → detect PII locally → anonymize locally → submit anonymized prompt to AI. This is architecturally distinct from extensions that "protect" by routing through their own proxy servers.

Developers use Claude for debugging but paste environment variables and secrets. How do we catch this at the browser level?

Chrome Extension intercepts developer-pasted content before submission to Claude.ai. Custom entity patterns for developer-specific secrets (API key formats, connection string patterns, JWT tokens) complement the built-in entity library. The preview modal shows developers exactly what will be anonymized before submission, creating an educational feedback loop. Example: A development team at a SaaS company has the MCP Server deployed for Cursor but developers also use Claude.ai in the browser for design discussions and code review. The Chrome Extension fills the gap by intercepting API keys and connection strings that appear in browser-pasted content. The two-tool deployment covers both IDE and browser AI use cases.
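
Custom patterns for developer secrets might look like the following. These regexes are illustrative assumptions, not the extension's built-in rules: they cover the general shape of a JWT, the AWS access key ID format, and URL-style database connection strings with embedded credentials.

```python
import re

# Illustrative patterns only; real deployments would tune and extend these.
SECRET_PATTERNS = {
    # Three base64url segments; JWT headers start with "eyJ" ('{"' encoded).
    "JWT": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # scheme://user:password@host... for a few common databases.
    "DB_CONNECTION_STRING": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:@/]+:[^\s@/]+@\S+"
    ),
}


def redact_secrets(text: str) -> str:
    """Replace every match with a bracketed label naming the secret type."""
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


pasted = (
    "DATABASE_URL=postgres://app:s3cret@db.internal:5432/prod\n"
    "AWS_KEY=AKIAIOSFODNN7EXAMPLE"
)
cleaned = redact_secrets(pasted)
print(cleaned)
```

The variable names are left in place, so the developer can still ask about `DATABASE_URL` without exposing its value.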

We need to share clinical cases with an AI for learning, but patient names and DOBs can't be included. How?

Chrome Extension detects and anonymizes healthcare-specific PHI (patient names, DOBs, MRNs, health plan IDs, addresses) in real time before clinical case text reaches ChatGPT or Claude.ai. Physicians can paste clinical notes directly; the extension handles HIPAA-required de-identification automatically. Example: A medical school's internal medicine teaching program uses Claude.ai for case-based learning discussions. Faculty members paste de-identified case summaries into Claude, but manual de-identification occasionally misses details. anonym.legal Chrome Extension provides automatic PHI detection as a safety net, catching missed identifiers before they reach Claude. HIPAA compliance is maintained with minimal workflow friction.

260+ Entity Types

Our tool detects US SSNs perfectly but misses German Steuer-IDs, French NIRs, and Swedish Personnummer. How do we get complete EU coverage?

260+ entity types include complete DACH coverage (Steuer-ID, AHV-Nr, Sozialversicherungsnummer), French identifiers (NIR, Carte Vitale, SIRET, SIREN), UK identifiers (NHS Number, NI Number, UTR), Nordic identifiers (Swedish Personnummer, Norwegian Fodselsnummer, Finnish Henkilotunnus), and all EU IBAN formats. This is 13x the coverage of standard Presidio (~20 default entity types). Example: A global HR manager at a multinational company processing payroll data for employees across 12 EU countries. Each country's national ID format is different. anonym.legal's 260+ entity types cover all 12 countries' formats in a single detection pass, eliminating the need for country-specific tool configurations or manual review for missed regional identifiers.

How do I detect Medical Record Numbers (MRNs) in clinical notes when every hospital has a different format?

The 260+ entity types include NPI numbers, DEA numbers, Medicare IDs, and health plan identifiers. The Custom Entity Creation feature allows healthcare organizations to define their specific MRN format once and apply it consistently. The AI-assisted pattern helper generates the regex from examples, removing the technical barrier for clinical informatics teams without regex expertise.

Our PII tool detects US SSNs but not German Steuer-IDs or French NIR numbers. How do we cover EU-specific identifiers?

260+ entity types include major European national identifiers: DACH (Steuer-ID, AHV-Nr, Sozialversicherungsnummer), France (NIR, Carte Vitale, SIRET, SIREN), UK (NHS Number, NI Number, UTR), Nordic (Swedish Personnummer, Norwegian Fodselsnummer, Finnish Henkilotunnus), and others. Pre-built and maintained by the anonym.legal team. Example: A pan-European HR software provider processes onboarding documents for clients in 18 EU countries. Each country has its own national identifier format. Their US-built PII tool detects SSNs reliably but misses 14 of 18 EU country identifiers. anonym.legal's 260+ entity library covers all 18 countries' identifiers, closing the EU compliance gap without requiring custom development.

We process healthcare records and need to detect MRN numbers that are unique to each hospital. How do we build custom patterns?

Custom Entity Creation feature includes an AI-assisted pattern helper that suggests regex from provided examples. Healthcare teams provide 3-5 sample MRN values; the AI generates the appropriate regex pattern. The pattern is validated against additional examples. The custom entity is saved as a preset for reuse across all anonymization sessions. Example: A regional hospital system uses MRN format "SVHS-[0-9]{7}" for their 350,000 patient records. Their HIPAA compliance team needs to include MRN detection in their de-identification pipeline. Using anonym.legal's AI pattern helper, the team provides 5 example MRNs and receives a validated regex in under 2 minutes, without writing a single line of code.
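
The validation step for a custom pattern can be sketched as checking the regex against examples that should and should not match. The pattern is the "SVHS-[0-9]{7}" format from the example above, with word boundaries added. The sample MRN values and the `validate` helper are hypothetical.

```python
import re

# Pattern from the example above, with word boundaries added so that longer
# digit runs (e.g. eight digits) are not partially matched.
MRN_RE = re.compile(r"\bSVHS-[0-9]{7}\b")


def validate(pattern: re.Pattern, should_match: list[str], should_not_match: list[str]) -> bool:
    """True only if every positive example matches fully and no negative one does."""
    positives_ok = all(pattern.fullmatch(s) for s in should_match)
    negatives_ok = not any(pattern.fullmatch(s) for s in should_not_match)
    return positives_ok and negatives_ok


# Made-up sample values for illustration.
is_valid = validate(
    MRN_RE,
    should_match=["SVHS-0042317", "SVHS-1234567", "SVHS-9999999"],
    should_not_match=["SVHS-123456", "SVHS-12345678", "svhs-1234567"],
)

note = "Pt seen 3/4. MRN SVHS-0042317, follow-up in 2 weeks."
found = MRN_RE.findall(note)
print(is_valid, found)  # True ['SVHS-0042317']
```

Including near-miss negatives (one digit short, one digit long, wrong case) is what catches a pattern that is too permissive.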

We need to anonymize data containing internal employee IDs that don't follow any standard format. What do we do?

AI-assisted custom entity creation allows non-programmers to define internal identifier patterns. Visual regex pattern builder provides a guided interface. Test interface validates patterns against sample data. Custom entities integrate with the full detection pipeline alongside all 260+ built-in types. Presets allow custom patterns to be saved and shared across the team. Example: A global logistics company's compliance team must anonymize employee records for an external HR audit. Employee IDs follow the format "EMP-[REGION]-[0-9]{6}" (e.g., "EMP-EU-123456"). anonym.legal's AI pattern helper generates the regex from 3 examples in 30 seconds. The custom pattern is added to the team's GDPR compliance preset. All subsequent anonymization sessions detect employee IDs automatically.

Published by George Curta, Founder of anonym.legal