The Alert That Almost Got Closed: What Sanctions Screening Hides in the Noise

A compliance lead at a payments firm walked us through her alert queue last quarter. Not the system. The queue. The actual screen her analysts stare at for eight hours a day.

On that particular morning there were 1,300 open sanctions alerts. By her team's own historical numbers, somewhere between 12 and 60 of them were worth a second look. The rest were noise. Different people. Same-sounding names. A date of birth that shared a year and nothing else.

Her analysts had a median disposition time of under 30 seconds per alert. Read the name, glance at the match, close it as a false positive, move to the next one. Thirteen hundred times a day across the team.

Then she asked the question that should keep every sanctions officer up at night. "If a real one showed up in that queue, would we catch it? Or would it just look like the other 1,290?"

That is the actual sanctions screening problem. Not whether your system fires. It fires constantly. The problem is what happens to a true match when it arrives wearing the same clothes as the noise.

The 95 percent false positive rate nobody wants to say out loud

Here is a number the industry has quietly accepted. Across financial institutions, the false positive rate on name screening commonly sits between 85 and 95 percent, the band a 2026 sanctions screening accuracy benchmark reported for name-screening systems. Multiple vendors report institutions where over 90 percent of alerts resolve as false. Some describe "thousands of false positives for every valid alert."

Read that again. At the far end of those reports, for every real sanctioned party your system catches, it may flag a thousand people who simply have a similar name.

This is not a bug. It is a design choice, made on purpose, for a defensible reason.

Sanctions screening is built recall-first. Miss a sanctioned party and you face an OFAC penalty, a regulatory finding, and a headline. Flag a thousand innocent people and you face an annoyed analyst team. Faced with that trade, every compliance function in the world tunes the same way. Cast a wide net. Set the match threshold low. Let borderline matches become alerts. Better a flood than a miss.

The logic is sound. The consequence is a queue where the signal and the noise are indistinguishable at a glance.

Why fuzzy matching noise is structural, not a threshold you can slide away

Most teams treat false positives as a tuning problem. Turn the threshold up, fewer alerts. Turn it down, more alerts. If only it were that simple.

The noise comes from the data itself, on both sides of the match. Three drivers in particular.

Transliteration. Sanctions lists are global. The OFAC SDN List, UN, EU, and HMT lists carry names that originated in Arabic, Cyrillic, and Chinese scripts, then got romanized into Latin characters by different people using different conventions. One Arabic name becomes Mohammad, Mohammed, Mohamed, Muhammad, Muhamad. A Russian name becomes Yevgeniy or Evgenii depending on who typed it. A Chinese name appears in Pinyin, Wade-Giles, or whatever the passport office used that year.

To catch the genuine variant, your fuzzy matching has to tolerate all of them. The moment it does, it also matches every unrelated person whose romanized name falls inside that window.
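
A minimal sketch of that trade, using Python's standard-library difflib as a stand-in for a screening engine's fuzzy matcher. Real engines use other algorithms, and the listed name and both thresholds here are illustrative assumptions, not recommended settings. The score depends only on the strings, so a genuine alias and an unrelated customer with the same spelling score identically.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a vendor match score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

LISTED = "Mohammad"  # hypothetical list entry
SPELLINGS = ["Mohammed", "Mohamed", "Muhammad", "Muhamad"]

def alerts_at(threshold: float) -> list[str]:
    """Spellings that would raise an alert against LISTED at this threshold."""
    return [s for s in SPELLINGS if similarity(LISTED, s) >= threshold]

for threshold in (0.75, 0.85):
    print(threshold, alerts_at(threshold))
```

At the looser setting every spelling alerts. At the tighter one, two romanizations of the same name drop out, along with every customer who spells it that way.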

Common names. A sanctioned individual named Mohammad Ali sits on a list. Your retail book has hundreds of Mohammed, Muhammad, and Mohamad Alis. Every one becomes an alert. Asian and Arabic naming conventions, with their high-frequency given names and multi-part surnames, multiply this across large populations. The system is not wrong to flag them. It simply cannot tell them apart from the name alone.

Partial identifiers. This is the one that quietly does the most damage. Sanctions lists are often incomplete. Many entries carry only a year of birth, or no date at all. No national ID. No address. Your own KYC data has gaps too. So when a fuzzy name match comes in with a date of birth that shares a year and differs by everything else, the system cannot safely rule it out. And because auto-discounting a partial match on a sanctions name feels reckless, nobody configures it to. So it becomes an alert. And it resolves as a false positive. Every time, until the time it does not.
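
Why a shared birth year cannot clear an alert is easier to see in code. This is a sketch under our own assumptions: the PartialDate type and the three outcome labels are hypothetical, not any vendor's data model.

```python
from typing import NamedTuple, Optional

class PartialDate(NamedTuple):
    """A date of birth where any component may be missing (hypothetical type)."""
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None

def compare_dob(listed: PartialDate, customer: PartialDate) -> str:
    """Return 'mismatch', 'match', or 'inconclusive'."""
    compared = 0
    for listed_part, customer_part in zip(listed, customer):
        if listed_part is None or customer_part is None:
            continue  # a missing component can neither confirm nor rule out
        if listed_part != customer_part:
            return "mismatch"
        compared += 1
    # Only a full year-month-day agreement counts as a match.
    return "match" if compared == 3 else "inconclusive"

# List entry carries only a year; the customer record has a full date.
print(compare_dob(PartialDate(1975), PartialDate(1975, 3, 14)))  # inconclusive
```

Only a known component that disagrees can rule a match out. A year-only list entry never supplies one beyond the year itself.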

You cannot push a single threshold high enough to remove this noise without also pushing it past real matches. The variation that creates false positives is the same variation that hides true ones. They live in the same band.

The step nobody audits: how fast your analysts close sanctions alerts

Every sanctions program audits whether the system screens. Almost nobody audits how fast their analysts close.

This is the gap. When 95 percent of a queue is noise, a human being adapts. They get fast. They develop a reflex. Same name pattern, close it. Partial date of birth, close it. After the four-hundredth false positive of the day, the four-hundred-and-first gets three seconds, not three minutes.

That reflex is rational. It is also exactly how a true match gets closed.

The UK regulator made this point with money in November 2025. The Office of Financial Sanctions Implementation fined Bank of Scotland 160,000 pounds. The detail that matters for everyone else: the bank's screening system was configured correctly. It still failed to flag certain sanctioned parties.

Configured correctly and still wrong. Sit with that. It means the thing your audit checks, that the system is set up properly, is not the thing that determines whether you catch the sanctioned party. Good-faith configuration is the floor, not the proof. The regulator's implicit message is that you have to demonstrate the system actually detects relevant sanctions in practice, with evidence, not just show that the settings look reasonable.

And the human layer sits underneath all of it. A system that surfaces the right alert still fails if the analyst reviewing it has been trained by 1,290 false positives to close on reflex.

What a loose match threshold is actually costing you

There is a second cost most teams never measure, because it never produces an error message.

A cleared payment, a completed onboarding. Both happen when an alert is dispositioned as false. If that disposition was wrong, nothing breaks today. The customer transacts. The payment clears. The exposure goes live and stays invisible.

Then the list updates. More than 6,650 Russia-related sanctions were added after February 2022, on top of the 2,754 that existed before. Tranches land multiple times a week. A name that was clean last month is designated this month. If your screening only checks at onboarding and at payment, and you rescreen the book monthly, you have a window of up to 29 days where a freshly designated party transacts freely through your rails.
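
The list-lag window is simple date arithmetic, which is exactly why it is worth measuring. A small sketch with hypothetical dates and function names:

```python
from datetime import date

def exposure_days(designated: date, detected: date) -> int:
    """Days a newly designated party could transact before screening caught up."""
    return max((detected - designated).days, 0)

def breaches_lag(designated: date, detected: date, max_lag_days: int) -> bool:
    """True when the lag exceeds the maximum your policy allows."""
    return exposure_days(designated, detected) > max_lag_days

# Hypothetical schedule: the designation lands the day after a monthly rescreen.
designated = date(2026, 3, 2)
next_rescreen = date(2026, 3, 31)

print(exposure_days(designated, next_rescreen))                  # 29
print(breaches_lag(designated, next_rescreen, max_lag_days=2))   # True
```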

The 50 Percent Rule widens the gap further. OFAC treats an entity as blocked if sanctioned parties own 50 percent or more of it, directly or indirectly, in aggregate, even when that entity's name appears on no list. These are hidden SDNs. Name-only screening cannot see them. You need ownership look-through, the kind of beneficial ownership and UBO data that business verification surfaces, and external research feeding the screening decision. Barclays paid over 2.4 million dollars in an early 50 Percent Rule case where the ownership evidence was sitting in its own KYC files and the screening never connected it.
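
Ownership look-through can be sketched as a loop that keeps promoting entities until nothing changes. This is a simplified reading of the rule for illustration only: it assumes an entity blocked by ownership then counts as a blocked owner of whatever it holds, and that clean percentage data exists. The data shape and every name below are hypothetical, and real determinations need legal review.

```python
def blocked_by_ownership(listed: set[str], holdings: dict[str, dict[str, float]]) -> set[str]:
    """Return entities on no list that reach 50 percent aggregate blocked ownership.

    holdings maps each entity to {owner: percent held}.
    """
    blocked = set(listed)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for entity, owners in holdings.items():
            if entity in blocked:
                continue
            stake = sum(pct for owner, pct in owners.items() if owner in blocked)
            if stake >= 50:
                blocked.add(entity)
                changed = True
    return blocked - set(listed)

listed = {"Person X", "Person Y"}                      # hypothetical SDNs
holdings = {
    "Entity A": {"Person X": 30, "Person Y": 20},     # 50 in aggregate
    "Entity B": {"Entity A": 50, "Investor Z": 50},   # held by a blocked entity
    "Entity C": {"Person X": 25, "Investor Q": 75},   # below the line
}
print(sorted(blocked_by_ownership(listed, holdings)))  # ['Entity A', 'Entity B']
```

None of the three entity names would ever match a list entry. Two of them are blocked anyway.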

So the true cost of a loose threshold is not just analyst hours. It is the false-negative disposition that clears a real party, the list-lag window where a new designation goes unscreened, and the ownership chain your name match was never built to follow.

The 10-year statute of limitations turns your screening config into a record

In April 2024, the statute of limitations for the laws underpinning most US sanctions doubled, from five years to ten. OFAC also moved to extend recordkeeping from five years to ten.

This changes what your screening configuration is. It used to be an operational setting. It is now a ten-year evidentiary record.

Every threshold you set, every tuning decision you make, every period where your list updates lagged, every alert your team dispositioned as false, all of it is now examinable for a decade. If an examiner in 2034 asks which SDN list version your system was running on a Tuesday in 2026, and why your match threshold was set where it was, "it seemed reasonable" is not an answer. "Here is the historical alert analysis that justified this threshold, here is the validation testing, here is the disposition data" is.
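
One way to make that answer retrievable is to snapshot every configuration change as a record with a content fingerprint. The field names and sample values below are our own assumptions, not a regulatory template.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class ScreeningConfigRecord:
    """One point-in-time snapshot of a screening setting (hypothetical schema)."""
    effective_from: str    # ISO 8601 timestamp
    list_name: str
    list_version: str
    segment: str
    match_threshold: float
    rationale: str         # why the threshold sits where it does
    validation_ref: str    # pointer to the supporting test evidence

def fingerprint(record: ScreeningConfigRecord) -> str:
    """Stable SHA-256 over the record, so later edits are detectable."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

record = ScreeningConfigRecord(
    effective_from="2026-03-03T00:00:00Z",
    list_name="OFAC SDN",
    list_version="example-version-id",          # placeholder value
    segment="retail_onboarding",
    match_threshold=0.85,
    rationale="Set from historical disposition analysis",
    validation_ref="example-validation-report", # placeholder value
)
print(fingerprint(record))
```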

Sanctions screening configuration has quietly become a model-risk discipline. Most teams have not caught up to that.

What actually reduces sanctions screening false positives without hiding the signal

The instinct is to chase a lower false positive rate. That is the wrong target if you chase it by raising one global threshold, because a threshold high enough to cut the noise will also push past true matches.

Here is what works, drawn from how the better-run programs we see actually operate.

Tune on your own disposition data, not vendor defaults. Pull your historical alerts. Map similarity scores to actual outcomes. Find the score band where true matches live and where pure noise lives. Set thresholds from that evidence. This is also the only tuning you can defend to an examiner over a ten-year window.
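
A sketch of that mapping, assuming your history can be exported as pairs of an integer match score from 0 to 100 and the final disposition. The function name and sample data are hypothetical.

```python
from collections import defaultdict

def outcomes_by_band(history, band_width=5):
    """Group past alerts into score bands.

    history: iterable of (score, was_true_match) with integer scores 0-100.
    Returns {band_lower_bound: (alert_count, true_match_count)}.
    """
    bands = defaultdict(lambda: [0, 0])
    for score, was_true_match in history:
        lower = (score // band_width) * band_width
        bands[lower][0] += 1
        bands[lower][1] += int(was_true_match)
    return {band: tuple(counts) for band, counts in sorted(bands.items())}

# Hypothetical history: (score, confirmed true match?)
history = [(72, False), (74, False), (81, False), (83, True), (96, True)]
print(outcomes_by_band(history))
# {70: (2, 0), 80: (2, 1), 95: (1, 1)}
```

Bands with many alerts and no confirmed matches are candidates for easing. A band with even one confirmed match is where the threshold has to stay.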

Segment your thresholds. A correspondent banking payment, a trade finance counterparty, and a low-risk retail onboarding do not deserve the same sensitivity. Run high sensitivity where the risk concentrates. Ease it, with documented justification, where it does not. One threshold for everything is what guarantees both too much noise and too little coverage at the same time.
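
Segmented thresholds can live in a small table that carries its own justification. The segment names, scores, and wording below are illustrative assumptions. Note the fallback: an unmapped segment gets the most sensitive setting, not the most convenient one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SegmentPolicy:
    threshold: int        # minimum match score (0-100) that raises an alert
    justification: str    # the documented reason, kept next to the number

# Hypothetical segments and values, for illustration only.
POLICIES = {
    "correspondent_banking": SegmentPolicy(75, "Highest inherent sanctions risk"),
    "trade_finance": SegmentPolicy(78, "Cross-border counterparties"),
    "retail_onboarding_low_risk": SegmentPolicy(85, "Eased per disposition analysis"),
}
MOST_SENSITIVE = min(policy.threshold for policy in POLICIES.values())

def should_alert(segment: str, score: int) -> bool:
    """Apply the segment's threshold; unknown segments use the most sensitive one."""
    policy = POLICIES.get(segment)
    threshold = policy.threshold if policy else MOST_SENSITIVE
    return score >= threshold

print(should_alert("correspondent_banking", 80))       # True
print(should_alert("retail_onboarding_low_risk", 80))  # False
```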

Test the variants deliberately. Build a test set of transliterations, reversed name orders, missing middle names, and known aliases. Run it through. If your system misses genuine variants, your threshold is too tight in the wrong place. If it floods on them, too loose. You cannot know without testing the exact failure modes that generate the noise.
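
A sketch of such a test harness. The similarity function is a standard-library stand-in for your real matcher, and the name, helper functions, and threshold are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Stand-in for the screening engine's match score, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def build_cases(first, middle, last, transliterations=(), aliases=()):
    """Generate labeled variants of one listed name."""
    cases = {
        "reversed_order": f"{last} {first} {middle}",
        "missing_middle": f"{first} {last}",
    }
    for i, alt in enumerate(transliterations):
        cases[f"transliteration_{i}"] = f"{alt} {middle} {last}"
    for i, alias in enumerate(aliases):
        cases[f"alias_{i}"] = alias
    return cases

def missed(listed: str, cases: dict, threshold: float) -> list:
    """Labels of genuine variants the matcher would fail to alert on."""
    return [label for label, name in cases.items()
            if similarity(listed, name) < threshold]

listed = "Yevgeniy Ivanovich Petrov"  # hypothetical list entry
cases = build_cases("Yevgeniy", "Ivanovich", "Petrov", transliterations=["Evgenii"])
print(missed(listed, cases, threshold=0.85))
```

Run it at each candidate threshold and keep the output. The list of misses is the evidence that the setting was tested against the failure modes, not just chosen.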

Add identifiers before the alert, not after. Date of birth, nationality, and address should narrow a fuzzy name match before it ever reaches a human. The fewer name-only alerts you generate, the less reflex-closing your team does, and the more attention each remaining alert gets.
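
A sketch of what narrowing before the alert could look like. The policy encoded here is an assumption for illustration: discount only when at least two identifiers positively disagree and none agree. Whether any auto-discounting is acceptable, and on what evidence, is a decision for your own program to make and document.

```python
def triage(name_score: int, identifiers: dict, threshold: int = 80,
           min_mismatches: int = 2) -> str:
    """Decide what happens to a fuzzy name match before a human sees it.

    identifiers maps a field name to 'match', 'mismatch', or 'inconclusive'.
    Returns 'no_alert', 'auto_discount', or 'alert'.
    """
    if name_score < threshold:
        return "no_alert"
    results = list(identifiers.values())
    if results.count("mismatch") >= min_mismatches and "match" not in results:
        return "auto_discount"  # should still be logged with the evidence used
    return "alert"

print(triage(88, {"dob": "mismatch", "nationality": "mismatch",
                  "address": "inconclusive"}))                   # auto_discount
print(triage(88, {"dob": "inconclusive", "nationality": "mismatch"}))  # alert
```

An inconclusive identifier never counts toward a discount. Missing data sends the alert to a human.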

Screen ownership, not just names. Integrate beneficial ownership data and look-through logic so the 50 percent threshold is calculated, not guessed. A name match alone will never find a hidden SDN.

Watch your own close rate. Track how fast each analyst dispositions alerts. A median of under 30 seconds on a sanctions queue is not efficiency. It is a warning. It tells you the noise has trained your team to stop reading.
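
Tracking this takes a few lines once disposition timestamps are exported. A minimal sketch, with a hypothetical input shape and the 30-second floor taken from the figure above:

```python
from collections import defaultdict
from statistics import median

def fast_closers(dispositions, floor_seconds=30):
    """Return {analyst: median_seconds} for analysts whose median is under the floor.

    dispositions: iterable of (analyst_id, seconds_to_close).
    """
    by_analyst = defaultdict(list)
    for analyst, seconds in dispositions:
        by_analyst[analyst].append(seconds)
    medians = {analyst: median(times) for analyst, times in by_analyst.items()}
    return {analyst: m for analyst, m in medians.items() if m < floor_seconds}

# Hypothetical sample: (analyst, seconds spent on one alert)
sample = [("a1", 12), ("a1", 20), ("a1", 25), ("a2", 45), ("a2", 90), ("a2", 200)]
print(fast_closers(sample))  # {'a1': 20}
```

Treat the output as a prompt for a conversation about queue volume, not a performance score. The analyst is usually responding rationally to the noise.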

This is the layer we focus on across the AML and sanctions screening we run for financial institutions, where the platform screens against 1,000-plus sanctions lists, PEP databases, and adverse media. The reason AI-driven false-positive reduction matters is not that it makes the number prettier. It is that every false positive you remove before it reaches a human is one less alert teaching that human to close on reflex. The goal is not a quieter queue. It is a queue where, when the real one shows up, somebody actually looks at it.

The one thing to change in your sanctions screening program

Stop measuring your sanctions program by whether the system generates alerts. It does. It generates far too many.

Measure it by what happens to a true match when it lands in a queue that is 95 percent noise. If your analysts are closing alerts in 30 seconds because the volume forces them to, your real exposure is not a system that fails to fire. It is a system that fires so often the one that matters looks like all the others.

The noise is not a nuisance to be tuned away. It is the thing hiding your actual risk. Audit the close, not just the catch.

FAQ

Our false positive rate is around 95 percent. Is that normal or a red flag?

It is normal, and that is the problem. 85 to 95 percent is the common band for name-screening systems across the industry. Normal does not mean safe. A 95 percent rate means your analysts are dispositioning huge volumes of noise, which trains them to close fast. The risk is not the rate itself. It is what that rate does to the attention each real alert receives. Track analyst disposition time alongside the false positive rate. If close times are dropping as volume rises, your team is adapting to the noise in a way that endangers true-match detection.

If our screening system is configured correctly, are we covered?

No, and a 2025 enforcement action proves it. The UK's OFSI fined Bank of Scotland 160,000 pounds even though the screening system was configured correctly. It still failed to flag certain sanctioned parties. Regulators increasingly expect you to demonstrate that the system actually detects relevant sanctions in practice, through evidence-based testing and validation, not just show that the settings are reasonable. Correct configuration is the floor. Proven detection is the bar.

Why can't we just raise the match threshold to cut false positives?

Because the variation that creates false positives is the same variation that hides true matches. Transliterations, common names, and partial dates of birth all live in the fuzzy-match band. Raise the threshold high enough to remove the noise and you start excluding genuine variants of sanctioned names. The fix is not one higher threshold. It is segment-based thresholds tuned on your own disposition data, plus stronger use of secondary identifiers before an alert is generated.

How does the 50 Percent Rule affect screening if the entity isn't on any list?

Directly. OFAC treats an entity as blocked if sanctioned parties own 50 percent or more, directly or indirectly, in aggregate, even when the entity's name is on no published list. These are effectively hidden SDNs. Name-based screening cannot detect them. You need beneficial ownership data and look-through logic that aggregates ownership across blocked persons. Barclays paid over 2.4 million dollars in a 50 Percent Rule case where the ownership evidence was in its own KYC files and screening never connected it.

We screen at onboarding and on payments. Is that enough?

Only if you also rescreen your existing book quickly after list updates. Russia-related designations have landed multiple times a week since 2022. If a customer becomes an SDN today and your batch rescreen runs monthly, you have up to 29 days where that party transacts freely. Define a maximum allowable lag from list publication to detection, monitor it, and consider interim filters for the highest-profile additions between vendor updates and full system refresh.

Why does the extended statute of limitations matter for screening configuration?

In April 2024, the statute of limitations for the laws behind most US sanctions doubled from five to ten years, with recordkeeping moving to ten years as well. Your screening configuration is now a ten-year evidentiary record. Threshold decisions, tuning rationale, list-update lags, and alert dispositions are all examinable for a decade. This effectively makes screening a model-risk discipline. You need documented design criteria, independent validation, and change controls for any tuning adjustment.

What's the single most useful metric we're probably not tracking?

Analyst disposition time per alert, segmented by analyst and by alert type. Most teams track alert volume and false positive rate. Almost nobody tracks how fast humans are closing. A median under 30 seconds on a sanctions queue tells you the noise has trained your team to stop reading carefully. That number is your early warning that a true match could get closed on reflex, long before an enforcement action tells you the same thing for a much higher price.

Saurin Parikh

Saurin is a Sales & Growth Leader at Signzy with deep expertise in digital onboarding, KYC/KYB, crypto compliance, and RegTech. With over a decade of professional experience across sales, strategy, and operations, he’s known for driving global expansions, building strategic partnerships, and leading cross-functional teams to scale secure, AI-powered fintech infrastructure.
