De-duplication

Overview

De-duplication detects and merges duplicate customer or entity records to create a single source of truth for screening, monitoring, and reporting. Duplicates arise from typos, transliteration, channel silos, migrations, and intentional multi-accounting. Effective programs combine deterministic keys (SSN, EIN, document numbers), probabilistic matching (name, DOB, address, phone), and device or behavior signals, all normalized and scored.
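A minimal sketch of that combination, blending a deterministic key check with weighted fuzzy similarity. The field names, weights, and the use of `difflib.SequenceMatcher` as the string comparator are illustrative assumptions, not a fixed schema:

```python
from difflib import SequenceMatcher

def normalize(s):
    """Lowercase, trim, and collapse whitespace so comparisons are stable."""
    return " ".join(s.lower().split())

def match_score(a, b):
    """Score a candidate record pair in [0, 1].

    Deterministic keys short-circuit; otherwise a weighted blend of
    per-field fuzzy similarities is returned. Weights are illustrative.
    """
    # Deterministic: an exact document-number match is near-conclusive.
    if a.get("doc_number") and a.get("doc_number") == b.get("doc_number"):
        return 1.0
    # Probabilistic: weighted similarity over name, DOB, and address.
    weights = {"name": 0.5, "dob": 0.3, "address": 0.2}
    score = 0.0
    for field, w in weights.items():
        x, y = normalize(a.get(field, "")), normalize(b.get(field, ""))
        if x and y:
            score += w * SequenceMatcher(None, x, y).ratio()
    return score

a = {"name": "Jon Smith", "dob": "1990-01-02", "address": "12 Main St"}
b = {"name": "John Smith", "dob": "1990-01-02", "address": "12 Main Street"}
print(round(match_score(a, b), 2))  # high similarity despite the typo
```

A production comparator would swap in locale-aware tokenization and per-field distance functions, but the shape (normalize, compare per field, weight, combine) stays the same.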
Governance defines merge rules, survivorship preferences, and reversible audit trails. Avoid over-merging by requiring evidence thresholds and manual review for gray zones; under-linking inflates false positives and hides risk. Cleaner master records improve sanctions and PEP accuracy, reduce duplicate cases, and power graph analytics for ring detection. Quality dashboards (duplicate rate, merge accuracy, re-openings) drive continuous tuning and retraining.
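The survivorship and reversibility points can be sketched as follows; the per-field rule names and record layout are assumptions for illustration:

```python
import copy

# Survivorship preference per retained field -- names are illustrative.
SURVIVORSHIP = {
    "name": "longest",       # keep the more complete name
    "address": "duplicate",  # assume the incoming record is fresher
    "phone": "primary",      # trust the established record
}

def merge(primary, duplicate, audit_log):
    """Merge `duplicate` into `primary`; log enough lineage to reverse it."""
    before = copy.deepcopy(primary)
    for field, rule in SURVIVORSHIP.items():
        p, d = primary.get(field, ""), duplicate.get(field, "")
        if rule == "longest":
            primary[field] = max(p, d, key=len)
        elif rule == "duplicate" and d:
            primary[field] = d
        # rule == "primary" keeps the existing value: nothing to do.
    audit_log.append({"kept": primary["id"], "merged": duplicate["id"],
                      "primary_before": before, "duplicate": duplicate})
    return primary

def unmerge(audit_log):
    """Reverse the most recent merge from the preserved lineage."""
    entry = audit_log.pop()
    return entry["primary_before"], entry["duplicate"]
```

Because every merge stores the pre-merge primary and the full duplicate, a wrong merge is a pop-and-restore rather than a data loss event.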

FAQ

Why is de-duplication so important?

Duplicates inflate alerts, fragment history, and mask coordinated abuse. Consolidation improves screening precision, investigation speed, and reporting accuracy.

How do we merge safely without losing information?

Use weighted scores with reason codes and manual review for borderlines. Preserve lineage so merges can be reversed; record survivorship rules for each retained field.
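A small sketch of that routing step. The threshold values and reason-code strings are illustrative assumptions:

```python
# Illustrative thresholds: scores above AUTO_MERGE merge automatically,
# scores in the gray zone go to a human, everything else is no-match.
AUTO_MERGE, REVIEW = 0.92, 0.75

def decide(score, reasons):
    """Route a scored candidate pair, carrying its reason codes along."""
    if score >= AUTO_MERGE:
        return {"action": "merge", "reasons": reasons}
    if score >= REVIEW:
        return {"action": "manual_review", "reasons": reasons}
    return {"action": "no_match", "reasons": reasons}

print(decide(0.95, ["DOC_EXACT"])["action"])                 # merge
print(decide(0.80, ["NAME_FUZZY", "DOB_EXACT"])["action"])   # manual_review
```

Keeping the reason codes attached to every decision is what lets reviewers and auditors see why a pair was merged, not just that it was.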

Which signals work best for linking?

Strong IDs, document numbers, device and payment linkages, and stable addresses. Phonetic and locale rules reduce spelling and transliteration noise.
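As one example of a phonetic rule, classic Soundex reduces a name to a letter plus three digits so that common spelling variants collide on the same key. This is a sketch of the standard algorithm; production systems typically use locale-aware variants:

```python
def soundex(name):
    """Classic Soundex code: first letter plus three consonant digits."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out, prev = letters[0].upper(), codes.get(letters[0], "")
    for c in letters[1:]:
        code = codes.get(c, "")
        if code and code != prev:
            out += code
        if c not in "hw":  # h and w do not separate adjacent same codes
            prev = code
    return (out + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Smith"), soundex("Smyth"))    # S530 S530
```

Blocking candidate pairs by phonetic key (rather than comparing every record to every other) is also a common way to keep matching tractable at scale.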

How often should we run de-duplication?

Continuously for new records, with periodic batch sweeps after migrations or acquisitions; monitor KPIs to tune thresholds.
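The KPIs mentioned above (duplicate rate, merge accuracy, re-openings) roll up from simple counters; the counter names below are assumptions for illustration:

```python
def dedup_kpis(total_records, merged_pairs, reviewed, confirmed, reopened):
    """Headline de-duplication KPIs used to tune match thresholds.

    - duplicate_rate: merged pairs as a share of all records
    - merge_accuracy: share of reviewed merges confirmed correct
    - reopen_rate: share of merges later reversed
    """
    return {
        "duplicate_rate": merged_pairs / total_records,
        "merge_accuracy": confirmed / reviewed if reviewed else None,
        "reopen_rate": reopened / merged_pairs if merged_pairs else None,
    }

print(dedup_kpis(10_000, 250, 100, 93, 4))
```

A rising reopen rate suggests the auto-merge threshold is too loose; a rising duplicate rate after a migration is the usual trigger for a batch sweep.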