DLP & Content Intelligence
Detect, classify, and act on sensitive data before it leaves your organization. Multi-stage pipeline with pattern matching, ML models, and GenAI classification across six detection categories.
Detection Engine
Multi-stage pipeline combining pattern matching, machine learning, and generative AI for high-accuracy sensitive data detection.
This is not basic keyword matching. MnemoShare's DLP engine runs a three-stage pipeline: regex patterns flag candidates, ML models score the confidence of each candidate, and an optional GenAI integration (Anthropic or OpenAI) provides semantic classification for edge cases. Each stage filters out more false positives.
- 40+ built-in patterns across 6 detection categories
- PHI: medical record numbers (MRN), ICD-10, CPT, NPI, DEA, and NAILS identifiers
- PII: Social Security numbers, driver's license, passport numbers
- PCI: credit card numbers with Luhn algorithm validation
- SECRETS: API keys, tokens, passwords, and credentials
- INFRA: IP addresses, connection strings, and infrastructure identifiers
- REGULATORY: international regulatory identifiers
- Confidence scoring with configurable thresholds per pattern
- Post-match validation callbacks (Luhn for credit cards, structural validity checks for SSNs, which carry no check digit)
- GenAI integration with Anthropic and OpenAI for high-accuracy semantic classification
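The first stage and its post-match validation can be sketched in a few lines of Python. This is a minimal illustration, not MnemoShare's implementation: the `Pattern` and `Finding` types, the `scan` function, the card regex, and the confidence values (0.5 for a raw regex hit, 0.95 after validation) are all assumptions made for the example. The ML and GenAI stages are not modeled here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


@dataclass
class Pattern:
    name: str
    category: str
    regex: "re.Pattern[str]"
    base_confidence: float  # hypothetical score for an unvalidated regex hit
    threshold: float        # per-pattern minimum confidence to report
    validator: Optional[Callable[[str], bool]] = None


@dataclass
class Finding:
    pattern: str
    category: str
    value: str
    confidence: float


# Hypothetical PCI pattern: 13 to 19 digits, optionally separated by spaces or dashes.
CARD = Pattern(
    name="credit_card",
    category="PCI",
    regex=re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
    base_confidence=0.5,
    threshold=0.8,
    validator=luhn_valid,
)


def scan(text: str, patterns: List[Pattern]) -> List[Finding]:
    """Flag regex candidates, validate them, and apply each pattern's threshold."""
    findings = []
    for p in patterns:
        for m in p.regex.finditer(text):
            confidence = p.base_confidence
            if p.validator is not None:
                if not p.validator(m.group()):
                    continue  # failed post-match validation: drop the candidate
                confidence = 0.95  # assumed boost for a validated match
            if confidence >= p.threshold:
                findings.append(Finding(p.name, p.category, m.group(), confidence))
    return findings
```

Calling `scan("pay 4111 1111 1111 1111 or 4111 1111 1111 1112", [CARD])` returns a single finding for the first number. The second candidate matches the regex but fails the Luhn check, so it never reaches the threshold comparison.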
Policy & Response
Configure detection policies with custom rules, violation thresholds, and response actions tailored to your compliance requirements.
- Configurable policy actions: log, warn, or block
- Violation threshold: the minimum number of matches required to trigger an action
- Filename scanning for sensitive data embedded in file names
- Automatic masking of findings in logs and alerts
- Per-policy rule selection from any of the 6 detection categories
- Real-time scan results dashboard with status tracking
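How a violation threshold, per-policy category selection, and masking might fit together is sketched below. The `Policy`, `Finding`, `evaluate`, and `mask` names are hypothetical, and the choices to take no action below the threshold and to keep the last four characters visible are assumptions for illustration, not documented product behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Set, Tuple


class Action(Enum):
    LOG = "log"
    WARN = "warn"
    BLOCK = "block"


@dataclass
class Finding:
    category: str
    value: str


@dataclass
class Policy:
    name: str
    categories: Set[str]      # detection categories this policy cares about
    violation_threshold: int  # minimum matches required to trigger the action
    action: Action


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a finding."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def evaluate(policy: Policy, findings: List[Finding]) -> Tuple[Optional[Action], List[str]]:
    """Return the triggered action (or None) and the masked relevant findings."""
    relevant = [f for f in findings if f.category in policy.categories]
    if len(relevant) < policy.violation_threshold:
        return None, []  # assumed behavior: below the threshold, nothing is triggered
    return policy.action, [mask(f.value) for f in relevant]
```

With a policy covering PII and PCI, a threshold of 2, and the `block` action, two matching findings trigger the block and are reported only in masked form. A finding from a category the policy does not select, such as INFRA, is ignored.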
Presidio Integration
Augment rule-based detection with Microsoft Presidio's ML-powered PII analysis for maximum coverage.
- Microsoft Presidio integration for ML-powered PII detection
- AI-powered semantic content analysis beyond regex patterns
- Combines rule-based and ML-based detection for maximum coverage
Beyond traditional MFT
Most managed file transfer platforms were designed before modern threats existed. Here is how MnemoShare compares.
| Capability | Traditional MFT | MnemoShare |
|---|---|---|
| Detection method | Keyword matching or regex only | Multi-stage: regex, ML, and GenAI with confidence scoring |
| Coverage | Basic PII patterns | 40+ patterns across PHI, PII, PCI, secrets, infrastructure, regulatory |
| Accuracy | High false positive rates | Post-match validation (Luhn, checksums) + confidence thresholds |
| Response | Block or allow | Configurable: log, warn, or block with violation thresholds |
| Scanning scope | Uploaded files only | Files, filenames, email content, email attachments |
Real-world use cases
PHI leak prevention
Healthcare org scans all outbound files for protected health information (MRN, ICD-10 codes, patient names). Policy blocks transfers containing PHI unless the recipient is on the approved partner list.
Financial data governance
Bank scans documents for credit card numbers, SSNs, and account numbers before external sharing. Luhn validation filters out most false positives on credit card patterns, since a random digit string passes the check only about one time in ten. Findings are logged for compliance audit.
Credential exposure detection
DevOps team uses the SECRETS category to catch API keys, tokens, and connection strings accidentally included in file transfers. Automatic masking prevents credentials from appearing in logs.
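As a rough illustration of masking credentials before they reach a log line, the sketch below redacts two kinds of secrets in place. The pattern set and the `redact` function are assumptions made for this example. The AWS access key ID shape (`AKIA` followed by 16 uppercase letters or digits) is a widely known format, but nothing here reflects MnemoShare's built-in SECRETS rules.

```python
import re
from typing import List, Tuple

# Illustrative patterns only, not the product's rule set.
SECRET_PATTERNS = {
    # AWS access key ID: "AKIA" plus 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # user:password credentials embedded in a connection string URL.
    "url_password": re.compile(r"(?<=://)[^/\s:@]+:[^/\s@]+(?=@)"),
}


def redact(text: str) -> Tuple[str, List[str]]:
    """Replace each detected secret with a placeholder and list what was found."""
    hits: List[str] = []
    for name, rx in SECRET_PATTERNS.items():
        def _sub(match: "re.Match[str]", name: str = name) -> str:
            hits.append(name)
            return f"[{name} redacted]"
        text = rx.sub(_sub, text)
    return text, hits
```

For the input `key=AKIAIOSFODNN7EXAMPLE db=postgres://app:s3cret@db.internal/prod`, `redact` returns the text with both secrets replaced by placeholders, plus the list `["aws_access_key_id", "url_password"]`, so the log records what was found without recording the values.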
Frequently asked questions
What types of sensitive data can MnemoShare detect?
MnemoShare detects data across six categories: PHI (medical record numbers, ICD-10 codes, NPI, DEA numbers), PII (SSN, driver's license, passport), PCI (credit cards with Luhn validation), SECRETS (API keys, tokens), INFRA (IP addresses, connection strings), and REGULATORY (international identifiers). The engine ships with more than 40 built-in patterns.
How does the multi-stage detection pipeline work?
Files pass through three stages: pattern matching flags potential sensitive data, ML models score confidence on each finding, and optional GenAI integration (Anthropic or OpenAI) provides high-accuracy semantic classification for edge cases. Each stage filters out more false positives.
Can DLP policies be customized per organization?
Yes. Administrators create policies with custom rule selections from any detection category, set violation thresholds (minimum matches to trigger), choose actions (log, warn, or block), and enable or disable specific patterns.
Does DLP scanning work on email?
Yes. The email security gateway applies DLP scanning to email body content and attachments before forwarding, using the same detection engine and policies as file uploads.