Streaming LabHard

Lab: IoT Analytics

Estimated time: 45 mins

Who This Lab Is For

Advanced developers practicing sliding window aggregates, sensor streaming, and value comparisons over time.

What You Will Learn

  • How to configure overlapping Sliding Windows in Apache Beam.
  • How to find maximum values per key inside active sliding windows.
  • How to format window bounds dynamically for time-series outputs.

1. Business Scenario

Aggregate device temperature readings using sliding windows.

2. Input Dataset (\`dataset.csv\`)

Save the following raw rows locally as \`dataset.csv\` to test your pipeline:

text
timestamp,device_id,temperature
1719830400,d1,68.5
1719830410,d2,72.1
1719830415,d1,69.0
1719830440,d2,74.5
1719830450,d1,70.2

3. Starter Code Skeleton

Create a local file named \`starter.py\` and copy the following skeleton. Complete the missing transformations:

python
# starter.py - IoT Analytics
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run_pipeline():
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        # TODO: Assign timestamps
        # TODO: Apply Sliding Windows (duration=60, slide=30)
        # TODO: Find max temperature per device
        pass

if __name__ == "__main__":
    run_pipeline()

4. Lab Requirements

  • Extract event timestamps for temperature readings.
  • Apply 60-second Sliding Windows that slide every 30 seconds.
  • Identify the maximum temperature recorded per device in each sliding window.

5. Step-by-Step Guide & Solution

Solution for IoT Analytics

Click below to reveal the complete, runnable Python SDK implementation solution and the step-by-step walkthrough to complete the lab.

Advertisement
AdSense Slot #847392Leaderboard Banner (728x90)