Architecture

Real-time decisioning architecture: how the operator decides in 200ms

Next best action (NBA), retention trigger, fraud block: each decision needs a fast profile, policies, models and an audit trail. This page describes the architecture of the real-time decisioning layer.


What the real-time decisioning layer is

When a customer tops up their account, opens the app, or calls support, the operator has a 1-3 second window to make a contextual decision: which offer to show, which queue to route to, whether to block a transaction, whether to send a push. The decisioning layer itself gets only a fraction of that window, typically 100-300ms.

Real-time decisioning is the layer that makes that decision. Input: trigger event + customer context. Output: decision + audit trail.

Without a dedicated layer, decisions are scattered across systems: some in the campaign tool, some in IVR, some in the app, some in the fraud engine. Inconsistency is guaranteed.

What it consists of

Trigger ingress. Channels and source systems emit events through the event bus. The decisioning layer subscribes to the events in its scope.

Context fetch. Profile lookup from CDP, check of current state (balance, last transaction, active case), feature enrichment for the model.
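
A minimal sketch of a deadline-bound context fetch, assuming the profile and state lookups are exposed as async functions. The names `fetch_with_deadline` and `build_context` and the 80ms default are illustrative assumptions, not part of any specific CDP API. The lookups run in parallel, and a source that misses the deadline yields a fallback instead of stalling the decision.

```python
import asyncio


async def fetch_with_deadline(fetch, customer_id, timeout_s, fallback=None):
    """Await one lookup, returning `fallback` if it misses the deadline."""
    try:
        return await asyncio.wait_for(fetch(customer_id), timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback


async def build_context(customer_id, sources, timeout_s=0.08):
    """Query all sources concurrently; `sources` maps a name to an async fetcher."""
    names = list(sources)
    results = await asyncio.gather(
        *(fetch_with_deadline(sources[name], customer_id, timeout_s) for name in names)
    )
    return dict(zip(names, results))
```

Downstream logic then has to decide what a missing source means. For a fraud check, a missing profile might force a conservative default; for an offer, it might simply mean showing nothing.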

Policy engine. Declarative rules that fire first: regulatory blocks, customer preferences (do-not-contact), business policies (no credit offer to a delinquent customer).
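
One way to keep such rules declarative is to store them as data and evaluate them with a small generic function. The sketch below is an illustration under assumptions: the rule schema (`when`, `block`), the attribute names and the action names are hypothetical, not a specific product's format.

```python
# Hypothetical rule set: each rule lists the customer attributes it matches on
# and the actions it blocks. Rules are plain data, so they can be stored and
# versioned outside the application code.
POLICIES = [
    {"id": "do_not_contact", "version": 2,
     "when": {"do_not_contact": True}, "block": ["send_push", "send_sms"]},
    {"id": "no_credit_to_delinquent", "version": 5,
     "when": {"delinquent": True}, "block": ["offer_credit"]},
]


def blocking_policy(policies, context, action):
    """Return the first policy that blocks `action` for this customer, or None."""
    for policy in policies:
        matches = all(context.get(key) == value for key, value in policy["when"].items())
        if matches and action in policy["block"]:
            return policy
    return None
```

For example, `blocking_policy(POLICIES, {"delinquent": True}, "offer_credit")` returns the `no_credit_to_delinquent` rule, and its `id` and `version` can go straight into the audit record.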

Model serving. ML models (churn risk, propensity, fraud score) provide probability estimates. These scores feed into the decision logic as inputs.

Decision logic. The combination of policies + model scores + business rules produces a decision. Multi-armed bandits can choose between variants.
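
As an illustration of variant selection, here is a sketch of epsilon-greedy, one of the simplest multi-armed bandit strategies. The text does not say which bandit algorithm is used, so the algorithm choice, the class name and the variant names are assumptions made for the example.

```python
import random


class EpsilonGreedy:
    """Mostly show the best-converting variant, but explore with probability epsilon."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.shown = {variant: 0 for variant in variants}
        self.converted = {variant: 0 for variant in variants}
        self.rng = random.Random(seed)

    def rate(self, variant):
        """Observed conversion rate; 0.0 for a variant that has not been shown yet."""
        shown = self.shown[variant]
        return self.converted[variant] / shown if shown else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shown))  # explore: random variant
        return max(self.shown, key=self.rate)         # exploit: best observed rate

    def record(self, variant, converted):
        """Feed back the outcome of one shown variant."""
        self.shown[variant] += 1
        self.converted[variant] += int(converted)
```

In a decisioning flow the bandit would only choose among variants that have already passed the policy checks, so exploration never overrides a regulatory block.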

Audit trail. Each decision is logged with input data, model, policy version, output. Without this there is neither governance nor product iteration.
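
A sketch of what one audit record could contain, covering the four items listed above. The field names and the JSON serialisation are assumptions for illustration; a real schema depends on the operator's logging and retention requirements.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    trigger: str           # event that started the decision
    inputs: dict           # customer context and features used
    model_versions: dict   # model name -> version that produced each score
    policy_versions: dict  # policy id -> version evaluated
    decision: str          # final output returned to the channel
    decision_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialise with sorted keys so records are easy to diff and replay."""
        return json.dumps(asdict(self), sort_keys=True)
```

Recording versions alongside inputs is what makes it possible to replay a past decision against the exact policy and model that produced it.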

Where it usually breaks

Latency budget is not defined. The team builds a “common decisioning” engine that then cannot keep up with channel SLAs (IVR demands <300ms, app <500ms). The engine has to be reworked.

Policies are hidden in code. Changing a rule such as “do not offer tariff X to customers with debt” requires a release. The business loses agility and decisioning becomes a bottleneck.

Models are not refreshed. A churn model trained a year ago keeps running on today’s data. Accuracy declines and nobody notices.
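
One common way to notice that a model's scores have drifted is the population stability index (PSI), which compares the score distribution at training time with the current one. The text does not prescribe a monitoring method, so the sketch below is an assumption. PSI is also only a proxy: it flags a shifted distribution, not a drop in accuracy as such.

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between a baseline and a current score sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all baseline scores are equal

    def shares(values):
        counts = [0] * bins
        for value in values:
            # Clamp so scores outside the baseline range land in the edge bins.
            index = min(max(int((value - lo) / width), 0), bins - 1)
            counts[index] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    baseline, current = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(baseline, current))
```

A frequently cited rule of thumb treats values above roughly 0.25 as a significant shift, but any threshold should be calibrated against the operator's own data.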

Audit is incomplete. When investigating an incident (why a customer received an offer they should not have), the team cannot reconstruct the logic.

A/B testing is impossible. Decisioning returns one answer, with no option to compare an alternative. The product does not learn.
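
To make comparison possible, the decisioning layer needs a stable way to split customers between variants. A minimal sketch, assuming hash-based bucketing keyed on a customer id and an experiment name (both identifiers are hypothetical):

```python
import hashlib


def assign_variant(customer_id, experiment, variants):
    """Deterministically map a customer to one variant of an experiment.

    The same customer always gets the same variant within an experiment, and
    including the experiment name keeps different experiments independent.
    """
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Because the assignment is a pure function of its inputs, it needs no lookup table and can be recomputed later when analysing the audit trail.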

Latency budget

Typical budget for decisioning: 100-300ms end-to-end, including context fetch and audit. A typical breakdown:

  • 20-50ms — trigger ingress + queue
  • 30-80ms — context fetch (profile from CDP, recent state)
  • 20-60ms — model serving
  • 10-30ms — policy + decision logic
  • 10-30ms — response + async audit

If actual latency is not tracked against the budget, it creeps up and channels start timing out.
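
The stage figures above can be written down as data and checked automatically. The sketch below uses the ranges from the list; the stage keys and function names are illustrative assumptions. The lower and upper bounds sum to 90ms and 250ms, so the stages sit within the 100-300ms envelope with some headroom at the top.

```python
# Per-stage budget in milliseconds as (lower, upper), taken from the list above.
STAGE_BUDGET_MS = {
    "trigger_ingress": (20, 50),
    "context_fetch": (30, 80),
    "model_serving": (20, 60),
    "policy_and_decision": (10, 30),
    "response_and_audit": (10, 30),
}


def budget_range(stages):
    """Sum the lower and upper bounds across all stages."""
    low = sum(lower for lower, _ in stages.values())
    high = sum(upper for _, upper in stages.values())
    return low, high


def over_budget(measured_ms, stages=STAGE_BUDGET_MS):
    """Return the stages whose measured latency exceeds their upper bound."""
    return [name for name, ms in measured_ms.items() if ms > stages[name][1]]
```

Feeding measured per-stage latencies (for example a high percentile per stage) into `over_budget` shows which stage is consuming the budget before channels start timing out.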

Operating model

Owner — Head of Decisioning or Customer Experience platform lead. Close to marketing but with engineering rigour.

Teams:

  • Platform engineering (latency, availability, deployment)
  • Decision science (models, experiments)
  • Policy management (business rules, regulatory)
  • Channel integration (contracts with consumers)

Routine — weekly review of decisions: which fired, where policies conflict, which models need retraining.

How SamaraliSoft engages

Decisioning Blueprint — 6-8 weeks. Map of current decisions across channels, latency budget, design of policy/model layer, governance, build/buy choice for the decisioning engine, pilot scope (usually 1-2 channels).


Ready to discuss your challenge?

Tell me what's not working or what needs to be built. First conversation — no obligations.

I usually respond within a few hours.

Choose a convenient way to connect:

  • Telegram: fast reply
  • WhatsApp: voice and documents
  • Call: +998 99 838-11-88