The three-stage streaming pipeline is not a new idea — it is the streaming equivalent of a well-understood batch ETL pattern. In batch, you stage raw files before loading them. In streaming, you land raw messages before deserializing them. The principle is the same; the implementation details differ because the data is arriving continuously instead of in scheduled drops.
I keep coming back to this pattern because I keep seeing pipelines built without it. A single job that reads from Kafka, deserializes the message, applies business logic, and writes to the serving layer — all in one pass. It works until the schema changes, or the transformation logic has a bug, or the serving layer is briefly unavailable. Then everything breaks at once and you do not know which part failed.
Stage 1: Raw Ingest
The raw ingest job has one responsibility: read messages from the Kafka topic and write them to durable storage, unchanged. No parsing. No transformation. No validation beyond "is this a valid byte sequence."
What "durable storage" looks like depends on your stack. In 2015, common choices are:
- Azure Blob Storage or ADLS (especially for Event Hubs pipelines)
- HDFS (for on-premises Hadoop clusters)
- A staging SQL Server table with a
raw_payload NVARCHAR(MAX)column - A local filesystem with a reliable backup (not recommended for anything critical)
The key properties of your raw store: it must be durable (survives machine failures), it must be queryable by arrival time (so stage 2 can find the messages it has not yet processed), and it must be append-only (so you can reprocess without risk of overwriting what you are reprocessing from).
-- Raw zone table in SQL Server (works for moderate volumes)
CREATE TABLE kafka_raw_landing (
id BIGINT IDENTITY(1,1) PRIMARY KEY,
source_topic NVARCHAR(256) NOT NULL,
partition_id INT NOT NULL,
partition_offset BIGINT NOT NULL,
arrived_at DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
raw_payload NVARCHAR(MAX) NOT NULL,
processing_status NVARCHAR(32) NOT NULL DEFAULT 'pending',
UNIQUE (source_topic, partition_id, partition_offset) -- idempotent writes
);
The UNIQUE constraint on topic/partition/offset means you can safely replay the ingest job — duplicate messages are rejected rather than double-landed.
Stage 2: Deserialize
The deserialize job reads from the raw zone, parses messages into structured records, and writes them to a structured staging area. This is where schema enforcement happens. This is where you catch format changes. This is where bad messages go to a dead letter store rather than crashing the pipeline.
-- Structured staging table (populated by the deserialize job)
CREATE TABLE sensor_readings_staged (
raw_id BIGINT NOT NULL, -- FK back to raw_landing
sensor_id NVARCHAR(64) NOT NULL,
reading_value DECIMAL(18,4) NOT NULL,
reading_unit NVARCHAR(16) NOT NULL,
reading_ts DATETIME2 NOT NULL,
deserialized_at DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
CREATE TABLE deserialize_errors (
raw_id BIGINT NOT NULL,
error_message NVARCHAR(MAX) NOT NULL,
failed_at DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
The deserialize job runs as a micro-batch — every N minutes, process all pending rows in kafka_raw_landing. Successes move to sensor_readings_staged. Failures move to deserialize_errors with the error message. Both outcomes update processing_status in the raw table so the record is not re-processed on the next run.
When the schema changes and the deserializer fails, you fix the deserializer, reset the affected records back to pending, and re-run. The raw data is still there. Nothing is lost.
Stage 3: Transform and Merge
The transform job reads from the structured staging area and writes to your serving layer — a fact table, a Delta table, a data mart. This is where business logic lives: slowly changing dimension handling, deduplication, derived column calculations, surrogate key lookup.
Separating this from stage 2 matters because transform logic changes more frequently than deserialization logic. If you bundle them, every transform change requires retesting the deserializer. Kept separate, each stage has a clear interface (the structured staging table) and can be independently deployed, tested, and replayed.
The Operational Win
The pattern sounds like more moving parts, and it is. But the operational properties are better:
- Stage 1 fails: no data lost, messages still in Kafka (within retention window)
- Stage 2 fails: no data lost, raw messages in raw zone, just unprocessed
- Stage 3 fails: no data lost, structured records in staging, just un-merged
- Schema changes: fix stage 2, replay from raw zone, stages 1 and 3 unchanged
- Transform bug: fix stage 3, replay from structured staging, stages 1 and 2 unchanged
Compare that to the monolithic approach, where any of these failures requires you to figure out how far the processing got, reconstruct state, and decide whether to replay from Kafka (if the data is still within retention) or from a backup (if it is not).
Three stages. Clear interfaces. Independent failure domains. I have built this pattern across enough client environments to stop questioning it. I am here to help if you want to adapt it to your specific stack.