Advanced LabHard

Lab: Real-Time Dashboard Pipeline

Estimated time: 60 mins

Who This Lab Is For

Advanced developers seeking to master speculative triggers, accumulation modes, and low-latency metrics dashboards.

What You Will Learn

  • How to write custom trigger configurations to emit data before watermarks pass.
  • How to manage window states using AccumulationMode.
  • How to calculate rolling metrics for real-time visual dashboards.

1. Business Scenario

Calculate continuous dashboard aggregations with dynamic triggers.

2. Input Dataset (\`dataset.csv\`)

Save the following raw rows locally as \`dataset.csv\` to test your pipeline:

text
timestamp,metric_name,value
1719830400,cpu_util,45.2
1719830405,cpu_util,48.0
1719830412,cpu_util,55.1
1719830420,cpu_util,60.5

3. Starter Code Skeleton

Create a local file named \`starter.py\` and copy the following skeleton. Complete the missing transformations:

python
# starter.py - Real-Time Dashboard Pipeline
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run_pipeline():
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        # TODO: Set sliding windows (600s length, 60s slide)
        # TODO: Setup custom trigger logic
        pass

if __name__ == "__main__":
    run_pipeline()

4. Lab Requirements

  • Assign event-time timestamps to telemetry logs.
  • Apply 10-minute sliding windows (sliding every 1 minute).
  • Configure triggers to fire early updates before watermarks pass when values exceed 50.0.

5. Step-by-Step Guide & Solution

Solution for Real-Time Dashboard Pipeline

Click below to reveal the complete, runnable Python SDK implementation solution and the step-by-step walkthrough to complete the lab.

Advertisement
AdSense Slot #847392Leaderboard Banner (728x90)