
Cheatsheet: Windowing

Revision time: 4 mins

Topic Overview

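Windowing groups the elements of an unbounded stream into finite time intervals, based on each element's timestamp, so that aggregations can be computed per interval.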

Syntax Snapshot

python
import apache_beam as beam
from apache_beam.transforms.window import FixedWindows, SlidingWindows, Sessions

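# `stream` is assumed to be an unbounded PCollection of timestamped elements.
# All durations are in seconds.

# 1. Fixed windows: non-overlapping, 60 s each
fixed = stream | beam.WindowInto(FixedWindows(size=60))

# 2. Sliding windows: 60 s long, a new window starts every 30 s (overlapping)
sliding = stream | beam.WindowInto(SlidingWindows(size=60, period=30))

# 3. Session windows: closed after a 300 s gap in activity, tracked per key
sessions = stream | beam.WindowInto(Sessions(gap_size=300))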

Key Points

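  • Windowing divides an unbounded collection into finite windows so that aggregations can produce results.
  • Fixed: Non-overlapping windows of equal size with aligned boundaries (e.g. hourly, daily).
  • Sliding: Overlapping windows defined by a size and a period, useful for moving averages (e.g. 5-min average every 1 min). An element can belong to several windows.
  • Sessions: Data-driven, unaligned windows per key; a session closes once no new element arrives within the gap duration.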
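
Worked example: the three strategies above can be sketched in plain Python to show which windows a timestamp lands in. This is an illustration of the assignment arithmetic only, not Beam's implementation. The function names are hypothetical, timestamps are plain seconds, and window offsets are assumed to be zero.

```python
def fixed_window(ts, size):
    """Return the [start, end) fixed window that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, period):
    """Return every [start, end) sliding window that contains ts, newest first."""
    start = ts - ts % period  # latest window start at or before ts
    windows = []
    while start > ts - size:  # window still covers ts
        windows.append((start, start + size))
        start -= period
    return windows


def session_windows(timestamps, gap):
    """Merge the timestamps of one key into sessions split by `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Within the gap of the current session: extend it.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            # Gap exceeded (or first element): start a new session.
            sessions.append((ts, ts + gap))
    return sessions


print(fixed_window(125, 60))                     # (120, 180)
print(sliding_windows(125, 60, 30))              # [(120, 180), (90, 150)]
print(session_windows([0, 100, 250, 900], 300))  # [(0, 550), (900, 1200)]
```

Note that the element at 125 s falls into one fixed window but two sliding windows, and that the session boundaries depend entirely on the data.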

Production Recommendations

Developer Checklist
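Ensure window sizes and periods match business logic requirements and downstream storage limits. Very long windows, or sliding windows with a period much smaller than their size, increase the amount of state that must be held and can cause state bloat.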
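
Quick check: a rough way to reason about state bloat is to count how many windows each element is assigned to. The sketch below is a back-of-the-envelope model with hypothetical helper names. It ignores triggers, allowed lateness, and any pre-combining a runner may apply, so treat the numbers as an upper-bound estimate rather than a measurement.

```python
import math


def windows_per_element(size_s, period_s=None):
    """Upper bound on the number of windows one element is assigned to.

    Fixed windows (period_s is None) always assign exactly one window.
    Sliding windows assign at most ceil(size / period) windows.
    """
    if period_s is None:
        return 1
    return math.ceil(size_s / period_s)


def window_assignments_per_second(events_per_s, size_s, period_s=None):
    """Estimated (element, window) pairs produced per second."""
    return events_per_s * windows_per_element(size_s, period_s)


# 5-min sliding window emitted every 1 min, at 1000 events/s:
print(windows_per_element(300, 60))                  # 5
print(window_assignments_per_second(1000, 300, 60))  # 5000

# Same size as a fixed window:
print(window_assignments_per_second(1000, 300))      # 1000
```

If the estimate is too high for the target storage, increasing the period or shortening the window size are the usual first adjustments.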