Data platform architecture: DWH, Lakehouse, BI, and AI/ML layer

A modern data platform is not a choice between a DWH and a data lake. It is a layered architecture that separates raw data, processed models, BI marts, and an AI/ML layer. This article walks through the components, typical design mistakes, and a path to a mature data platform for a bank, telecom operator, or industrial enterprise.

This article treats the modern data platform as a layered architecture with DWH, data lake, BI, and AI/ML: no 'either/or' choice, a phased roadmap, and realistic expectations.

What architecture challenge is being solved

Most large regional organizations have a DWH built 5-10 years ago on classic OLAP tools. Alongside it, data lake solutions, AI/ML projects, and separate BI tools for each business unit keep appearing. Without an architectural vision, this turns into a tangle of parallel initiatives with duplicated data, contradictory reports, and no way to scale AI/ML. A modern data platform addresses this with a layered architecture in which each layer has an explicitly defined responsibility.

Which systems, data flows and teams are involved

  • Data sources — operational systems (core banking, BSS, CRM, ERP)
  • Ingestion layer — streaming (Kafka, Kinesis) and batch processes (ETL/ELT)
  • Raw data layer — data lake on object storage
  • Processing and modeling layer (refined) — transformed data with business logic
  • Analytical layer (curated) — marts for BI and analysts
  • AI/ML layer — feature store, model training, inference
  • Management layer — data catalog, lineage, quality control, security
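
The layering above can be sketched as a small dependency check. This is a minimal illustration, not a reference implementation: it covers only the three storage layers (raw, refined, curated), and the `Dataset` class and `validate_flow` helper are hypothetical names introduced for the example. The rule it enforces, that a dataset never reads from a layer downstream of its own, is an assumption about how the separation of responsibilities would typically be policed.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    """Storage layers in upstream-to-downstream order."""
    RAW = 1      # data lake on object storage
    REFINED = 2  # transformed data with business logic
    CURATED = 3  # marts for BI and analysts


@dataclass(frozen=True)
class Dataset:
    name: str
    layer: Layer
    inputs: tuple["Dataset", ...] = ()


def validate_flow(dataset: Dataset) -> list[str]:
    """Return violations where a dataset reads from a downstream layer."""
    violations = []
    for parent in dataset.inputs:
        if parent.layer > dataset.layer:
            violations.append(
                f"{dataset.name} ({dataset.layer.name}) reads from "
                f"{parent.name} ({parent.layer.name})"
            )
        # Check the whole upstream chain, not just direct inputs.
        violations.extend(validate_flow(parent))
    return violations


# Hypothetical datasets for illustration only.
raw_tx = Dataset("raw_transactions", Layer.RAW)
refined_tx = Dataset("transactions", Layer.REFINED, (raw_tx,))
mart = Dataset("daily_revenue_mart", Layer.CURATED, (refined_tx,))
bad = Dataset("transactions_patched", Layer.REFINED, (mart,))

print(validate_flow(mart))
print(validate_flow(bad))
```

A check like this could run in CI against pipeline definitions, so that a refined dataset built from a BI mart is flagged before it reaches production.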

Typical architecture mistakes

  • Choosing 'either DWH or data lake': a modern approach needs both, as different layers of one platform
  • No unified data model: each business unit builds its own marts with contradictory business logic
  • ETL without quality control: data arrives in the DWH with errors, and reports lose trust
  • AI/ML on data unavailable to analysts: models run on one dataset, reporting on another
  • No feature store: each ML team rebuilds features from scratch, which duplicates work and produces contradictory features
  • BI tools per business unit without a shared layer: metrics and their definitions diverge
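
To make the "ETL without quality control" mistake concrete, here is a minimal sketch of a quality gate placed between ingestion and the DWH load. The rule names, the record fields (`customer_id`, `amount`), and the `quality_gate` function are hypothetical and chosen only for illustration. In practice the rules would come from the data owners, and a dedicated data quality tool may be a better fit than hand-written checks.

```python
from typing import Callable

Record = dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rules for a transactions feed.
RULES: dict[str, Rule] = {
    "customer_id present": lambda r: bool(r.get("customer_id")),
    "amount is a non-negative number": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
}


def quality_gate(
    records: list[Record],
) -> tuple[list[Record], list[tuple[Record, list[str]]]]:
    """Split records into those passing every rule and those failing, with reasons."""
    accepted, rejected = [], []
    for record in records:
        failed = [name for name, rule in RULES.items() if not rule(record)]
        if failed:
            rejected.append((record, failed))
        else:
            accepted.append(record)
    return accepted, rejected


batch = [
    {"customer_id": "C1", "amount": 120.0},
    {"customer_id": "", "amount": 50.0},
    {"customer_id": "C3", "amount": -5},
]
accepted, rejected = quality_gate(batch)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Rejected records are kept together with the names of the rules they failed, so the problem can be traced back to the source system instead of surfacing later as a report nobody trusts.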

Possible approaches

  • Lakehouse: a unified platform on object storage with ACID transaction support, built on a platform such as Databricks or Snowflake or on an open table format such as Apache Iceberg. A modern approach that unites DWH and data lake
  • Classic DWH + data lake: separate systems for different tasks. A mature approach for large organizations
  • Cloud-native data platform: managed cloud services (BigQuery, Redshift, Synapse). Fast deployment, lower operational load
  • Hybrid: DWH for regulated data and reporting, data lake for analytics and AI/ML
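
For the hybrid option, the placement rule can be written down explicitly instead of being decided dataset by dataset. The sketch below is one possible encoding under stated assumptions: the `DatasetProfile` fields, the store names `dwh` and `data_lake`, and the default of sending everything else to the data lake are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    name: str
    regulated: bool           # subject to regulatory reporting or audit
    used_for_reporting: bool  # feeds official reports or dashboards
    used_for_ml: bool         # feeds model training or inference


def placement(profile: DatasetProfile) -> set[str]:
    """Return the stores a dataset should live in under the hybrid rule."""
    stores: set[str] = set()
    if profile.regulated or profile.used_for_reporting:
        stores.add("dwh")
    # ML workloads, and anything without a reporting role, go to the lake.
    if profile.used_for_ml or not stores:
        stores.add("data_lake")
    return stores


print(placement(DatasetProfile("regulatory_capital", True, True, False)))
print(placement(DatasetProfile("clickstream", False, False, True)))
print(placement(DatasetProfile("transactions", True, True, True)))
```

A dataset that serves both reporting and ML ends up in both stores, which is the case where a shared transformation layer matters most.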

How to avoid unnecessary core replacement

The data platform is built on top of existing operational systems and does not replace them. Core banking, BSS, and ERP remain the sources of truth, and the data platform consumes their events and snapshots. When a source system reaches end of life, it can be replaced without rebuilding the platform. Replacing the existing DWH with the new platform is a separate task, usually run in parallel with rebuilding models and marts.
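
One way to keep a source replaceable is to put a thin adapter contract between each operational system and the raw layer. The following is a minimal sketch under stated assumptions: `SourceAdapter`, `LegacyCoreBanking`, `NewCoreBanking`, `ingest`, and all field names are hypothetical, and the in-memory dictionary stands in for object storage.

```python
from typing import Iterable, Protocol


class SourceAdapter(Protocol):
    """Contract between the platform and any operational source."""

    name: str

    def snapshot(self) -> Iterable[dict]:
        """Return current records in the platform's agreed raw shape."""
        ...


class LegacyCoreBanking:
    name = "core_banking"

    def snapshot(self) -> Iterable[dict]:
        # Stand-in for a nightly extract from the legacy system.
        native = [{"ACCT_NO": "001", "BAL": "150.00"}]
        return [
            {"account_id": row["ACCT_NO"], "balance": float(row["BAL"])}
            for row in native
        ]


class NewCoreBanking:
    name = "core_banking"

    def snapshot(self) -> Iterable[dict]:
        # Stand-in for an API call to the replacement system.
        native = [{"accountId": "001", "balance": 150.0}]
        return [
            {"account_id": row["accountId"], "balance": row["balance"]}
            for row in native
        ]


def ingest(adapter: SourceAdapter, raw_layer: dict[str, list[dict]]) -> None:
    """Land a snapshot in the raw layer under the source's logical name."""
    raw_layer[adapter.name] = list(adapter.snapshot())


raw_layer: dict[str, list[dict]] = {}
ingest(LegacyCoreBanking(), raw_layer)
legacy_view = raw_layer["core_banking"]
ingest(NewCoreBanking(), raw_layer)  # source swapped, platform code unchanged
print(legacy_view == raw_layer["core_banking"])
```

Only the adapter changes when the core system is replaced. Everything downstream of the raw layer keeps reading the same logical source with the same record shape.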

Risks, dependencies, constraints

  • Source data quality: the platform will not fix bad data, but it will make it more visible
  • Data catalog and lineage: without them, users cannot find or understand the data
  • Security and access control: especially critical for banks and government organizations
  • Cloud provider lock-in with a cloud-native platform: migration is expensive, so the choice has to be deliberate
  • Hiring analysts, data engineers, and ML specialists: the organization competes for them with large tech companies
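
Lineage, one of the dependencies listed above, comes down to a graph that answers "where does this number come from?". The sketch below is a toy version with hypothetical dataset names and a hand-written edge list. A real platform would normally populate this from a data catalog or from pipeline metadata, not maintain it by hand.

```python
# Hypothetical lineage edges: dataset -> datasets it is built from.
LINEAGE: dict[str, list[str]] = {
    "churn_dashboard": ["customer_mart"],
    "customer_mart": ["customers_refined", "transactions_refined"],
    "customers_refined": ["crm_raw"],
    "transactions_refined": ["core_banking_raw"],
}


def upstream_sources(dataset: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return every dataset the given one depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(lineage.get(current, []))
    return seen


print(sorted(upstream_sources("churn_dashboard", LINEAGE)))
```

The same traversal run in the opposite direction would answer the impact question: which marts and dashboards are affected when a source table changes.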

How a phased roadmap should look

  1. Months 1-4: target architecture design, technology selection, team formation
  2. Months 5-12: build the ingestion, raw, and refined layers for priority domains (usually customers, products, and transactions)
  3. Months 13-18: BI layer with unified metric definitions, migration of critical dashboards
  4. Months 19-24: AI/ML layer with a feature store, first ML models in production
  5. After month 24: expansion to further domains, gradual migration of the legacy DWH and marts
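
The "unified metric definitions" in the BI phase can be as simple as a shared registry that every dashboard reads from. The sketch below shows the idea under stated assumptions: the `Metric` class, the `METRICS` registry, the `evaluate` helper, and the `avg_transaction_value` metric are hypothetical. Many BI tools provide their own semantic layer for this purpose.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, float]


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[list[Row]], float]


# Hypothetical shared registry: dashboards look metrics up here
# instead of re-implementing the formula.
METRICS = {
    "avg_transaction_value": Metric(
        name="avg_transaction_value",
        description="Total amount divided by number of transactions",
        compute=lambda rows: (
            sum(r["amount"] for r in rows) / len(rows) if rows else 0.0
        ),
    ),
}


def evaluate(metric_name: str, rows: list[Row]) -> float:
    """Compute a registered metric over the given rows."""
    return METRICS[metric_name].compute(rows)


transactions = [{"amount": 100.0}, {"amount": 50.0}, {"amount": 30.0}]
retail_dashboard_value = evaluate("avg_transaction_value", transactions)
finance_dashboard_value = evaluate("avg_transaction_value", transactions)
print(retail_dashboard_value, finance_dashboard_value)
```

Because both dashboards call the same definition, a change to the formula is made once and reaches every consumer together.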