Data platform architecture: DWH, Lakehouse, BI, and AI/ML layer
A modern data platform is not a choice between a DWH and a data lake but a layered architecture that separates raw data, processed models, BI marts, and the AI/ML layer. This article walks through the components, the typical mistakes made when building one, and an approach to a mature data platform for a bank, operator, or industrial enterprise.
This architectural article covers a modern data platform as a layered architecture with DWH, data lake, BI, and AI/ML: no ‘either/or’ choice, a phased roadmap, and realistic expectations.
What architecture challenge is being solved
Most large regional organizations have a DWH built 5-10 years ago on classic OLAP tools. In parallel, data lake solutions, AI/ML projects, and separate BI tools for each business block emerge. Without an architectural vision, this turns into a chaos of parallel initiatives: data is duplicated, reports contradict each other, and AI/ML cannot scale. A modern data platform addresses this with a layered architecture in which each layer has an explicit, separate responsibility.
Which systems, data flows and teams are involved
- Data sources — operational systems (core banking, BSS, CRM, ERP)
- Ingestion layer — streaming (Kafka, Kinesis) and batch processes (ETL/ELT)
- Raw data layer — data lake on object storage
- Processing and modeling layer (refined) — transformed data with business logic
- Analytical layer (curated) — marts for BI and analysts
- AI/ML layer — feature store, model training, inference
- Management layer — data catalog, lineage, quality control, security
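The flow between the raw, refined, and curated layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event fields, the `Transaction` type, and the function names `to_refined` and `to_curated` are hypothetical, and a real platform would run these steps in an ELT tool or a distributed engine rather than in plain Python.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical events as they might land in the raw layer (illustrative data only).
RAW_EVENTS = [
    {"customer_id": "c1", "amount": "120.50", "currency": "eur"},
    {"customer_id": "c2", "amount": "80.00", "currency": "EUR"},
    {"customer_id": "c1", "amount": "not-a-number", "currency": "EUR"},
]


@dataclass(frozen=True)
class Transaction:
    """Typed record of the refined layer (assumed schema)."""
    customer_id: str
    amount: float
    currency: str


def to_refined(raw_events):
    """Refined layer: parse and normalize raw rows; keep rejects for inspection."""
    refined, rejected = [], []
    for event in raw_events:
        try:
            refined.append(
                Transaction(
                    customer_id=event["customer_id"],
                    amount=float(event["amount"]),
                    currency=event["currency"].upper(),
                )
            )
        except (KeyError, ValueError):
            # The raw layer stays untouched; bad rows are set aside, not lost.
            rejected.append(event)
    return refined, rejected


def to_curated(transactions):
    """Curated layer: a small mart with the total amount per customer."""
    totals = defaultdict(float)
    for transaction in transactions:
        totals[transaction.customer_id] += transaction.amount
    return dict(totals)


refined, rejected = to_refined(RAW_EVENTS)
customer_totals = to_curated(refined)
```

The point of the separation is that the raw events are never modified: the refined and curated results can be rebuilt from them at any time if the business logic changes.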
Typical architecture mistakes
- Choosing 'either DWH or data lake': a modern platform needs both, as different layers of one architecture
- No unified data model: each business block builds its own marts with contradictory business logic
- ETL without quality control: data arrives in the DWH with errors, and reports lose trust
- AI/ML on data unavailable to analysts: models run on one dataset while reporting runs on another
- No feature store: each ML team rebuilds features from scratch, which duplicates work and produces contradictory features
- BI tools per business block without a shared layer: metrics and their definitions diverge
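The "ETL without quality control" mistake is the easiest one to illustrate. Below is a minimal sketch of a quality gate that refuses to load a batch when required fields are missing too often. The function names, the 1% default threshold, and the list used as a stand-in for a DWH table are all assumptions made for the example, not a reference to any specific tool.

```python
def check_batch(rows, required_fields, max_missing_ratio=0.01):
    """Return a list of violations; an empty list means the batch may be loaded."""
    if not rows:
        return ["batch is empty"]
    violations = []
    for field in required_fields:
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        ratio = missing / len(rows)
        if ratio > max_missing_ratio:
            violations.append(
                f"{field}: {ratio:.0%} missing exceeds {max_missing_ratio:.0%}"
            )
    return violations


def load_if_clean(rows, required_fields, target):
    """Append rows to the target only if the batch passes the checks."""
    violations = check_batch(rows, required_fields)
    if violations:
        # Fail loudly instead of letting bad data reach the reports.
        raise ValueError("; ".join(violations))
    target.extend(rows)
    return len(rows)


# Usage with a plain list standing in for a DWH table (hypothetical).
warehouse_table = []
load_if_clean(
    [{"id": 1, "amount": 10}, {"id": 2, "amount": 5}],
    required_fields=["id", "amount"],
    target=warehouse_table,
)
```

In practice the same idea is usually expressed as declarative checks in a data quality tool, but the principle is the same: a failed check blocks the load rather than producing a report nobody trusts.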
Possible approaches
- Lakehouse: a unified platform on object storage with ACID transaction support (for example Databricks, Snowflake, or an open table format such as Apache Iceberg), combining DWH and data lake in one system
- Classic DWH + Data Lake: separate systems for different tasks; a mature approach for large organizations
- Cloud-native data platform: managed cloud services (BigQuery, Redshift, Synapse) with fast deployment and a lower operational load
- Hybrid: DWH for regulated data and reporting, data lake for analytics and AI/ML
How to avoid unnecessary core replacement
The data platform is built on top of the existing operational systems rather than replacing them. Core banking, BSS, and ERP remain the sources of truth, and the data platform consumes their events and snapshots. When a source system reaches the end of its life, it can be replaced without rebuilding the platform. Replacing the existing DWH with the new platform is a separate task, usually run in parallel with rebuilding the models and marts.
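One way to picture why a source can be replaced without rebuilding the platform is a thin adapter per source that maps source-specific fields to a schema owned by the platform. The sketch below is an assumption-laden illustration: the `Source` protocol, both adapter classes, and the field names (`CUST_NO`, `BAL`, `customerId`, `balance`) are hypothetical and do not describe any real core banking product.

```python
from typing import Iterable, Protocol


class Source(Protocol):
    """Contract the platform expects from any source adapter (assumed)."""
    name: str

    def snapshot(self) -> Iterable[dict]: ...


class LegacyCoreAdapter:
    """Maps the field names of a hypothetical legacy core to the platform schema."""
    name = "core_banking"

    def __init__(self, rows):
        self._rows = rows

    def snapshot(self):
        for row in self._rows:
            yield {"customer_id": row["CUST_NO"], "balance": float(row["BAL"])}


class ReplacementCoreAdapter:
    """Same contract for a hypothetical replacement system with different fields."""
    name = "core_banking"

    def __init__(self, rows):
        self._rows = rows

    def snapshot(self):
        for row in self._rows:
            yield {"customer_id": row["customerId"], "balance": float(row["balance"])}


def ingest(source: Source, raw_layer: dict) -> int:
    """Ingestion depends only on the contract, not on a concrete source."""
    records = list(source.snapshot())
    raw_layer.setdefault(source.name, []).extend(records)
    return len(records)


raw_before = {}
ingest(LegacyCoreAdapter([{"CUST_NO": "001", "BAL": "250.00"}]), raw_before)

raw_after = {}
ingest(ReplacementCoreAdapter([{"customerId": "001", "balance": 250.0}]), raw_after)
```

Both runs land identical records in the raw layer, so everything downstream of ingestion is unaffected by the swap; only the adapter changes.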
Risks, dependencies, constraints
- Source data quality: the platform will not fix garbage, but it will make it more visible
- Data catalog and lineage: without them, users cannot find or understand the data
- Security and access control: especially critical for banks and government organizations
- Cloud provider dependency with a cloud-native approach: migration is expensive, so the choice must be deliberate
- Talent: hiring analysts, data engineers, and ML specialists means competing with large tech companies
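To make the lineage point concrete, here is a minimal sketch of a lineage registry that records which datasets each dataset is built from and answers the question "where does this mart come from?". The `Lineage` class and the dataset names are hypothetical; real data catalogs store far more metadata (owners, schemas, quality status) than this toy structure.

```python
class Lineage:
    """Toy lineage registry: dataset name -> set of direct input datasets."""

    def __init__(self):
        self._inputs = {}

    def register(self, dataset, inputs=()):
        """Record a dataset and the datasets it is derived from."""
        self._inputs.setdefault(dataset, set()).update(inputs)
        for name in inputs:
            # Make sure every input is known, even if it has no inputs itself.
            self._inputs.setdefault(name, set())

    def upstream(self, dataset):
        """Return every dataset that the given dataset depends on, transitively."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent in self._inputs.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = Lineage()
lineage.register("refined.transactions", ["raw.transactions"])
lineage.register("refined.customers", ["raw.crm_customers"])
lineage.register(
    "curated.customer_totals",
    ["refined.transactions", "refined.customers"],
)

sources_of_mart = lineage.upstream("curated.customer_totals")
```

With such a graph in place, a quality problem in a raw dataset can be traced forward to the marts it affects, and a suspicious number in a report can be traced back to its sources.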
How a phased roadmap should look
- Months 1-4: target architecture design, technology selection, team formation
- Months 5-12: build ingestion, raw, refined layers for priority domains (usually customers + products + transactions)
- Months 13-18: BI layer with unified metric definitions, migration of critical dashboards
- Months 19-24: AI/ML layer with feature store, first ML models in production
- After 24 months: expansion across domains, gradual migration of legacy DWH/marts
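The "unified metric definitions" of the BI phase can be pictured as a single registry that every dashboard calls instead of re-implementing the formula. The sketch below assumes hypothetical metric names and row fields; in a real platform this role is typically played by a semantic or metrics layer rather than a Python dictionary.

```python
# One shared place where each metric is defined exactly once (hypothetical metrics).
METRICS = {
    "total_volume": lambda rows: sum(row["amount"] for row in rows),
    "active_customers": lambda rows: len(
        {row["customer_id"] for row in rows if row["amount"] > 0}
    ),
}


def compute(metric_name, rows):
    """Every consumer computes a metric through the shared definition."""
    try:
        definition = METRICS[metric_name]
    except KeyError:
        raise KeyError(f"metric {metric_name!r} is not defined in the shared layer")
    return definition(rows)


# Illustrative rows from a curated mart.
rows = [
    {"customer_id": "c1", "amount": 100},
    {"customer_id": "c1", "amount": 50},
    {"customer_id": "c2", "amount": 0},
]

retail_dashboard_value = compute("active_customers", rows)
finance_dashboard_value = compute("active_customers", rows)
```

Because both dashboards go through `compute`, they cannot disagree on what an "active customer" is; changing the definition changes it everywhere at once.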