advanced

Triggers

8 min readLast updated: 2026-06-30

1. Introduction

A Trigger determines exactly when a window's aggregated results are emitted (fired) during stream processing. While windowing groups elements by time, triggers decide when to output the results of those groupings.

2. Why This Concept Exists

By default, Apache Beam waits until the watermark passes the end of a window before emitting the results. However, in real-world scenarios, you might need:

  1. Early results: Seeing a hourly total update every 5 minutes (before the hour finishes).
  2. Late updates: Re-calculating the window's total if delayed data arrives. Triggers allow you to configure exactly when these outputs (panes) fire.

3. Key Terminology

  • Pane: An individual output emission from a window. A single window can fire multiple panes (e.g., early, on-time, and late panes).
  • Accumulating Mode: Emitted panes contain all data processed in the window so far.
  • Retracting Mode: Emitted panes contain only the difference (delta) from the previous emission, or output a retraction of the old value.

4. Trigger Categories

  • Event Time Triggers: Fire based on watermark progress (e.g. AfterWatermark).
  • Processing Time Triggers: Fire based on system wall-clock time (e.g., AfterProcessingTime).
  • Element Count Triggers: Fire after a specific number of records arrive (e.g. AfterCount).
  • Composite Triggers: Combine multiple trigger strategies using logic gates (e.g. Repeatedly, AfterFirst, AfterAll).

5. Visual Diagram

Pane 1 (Early)
Triggered by count

Pane 2 (Early)
Triggered by count

Pane 3 (On-Time)
Watermark passes end

Pane 4 (Late)
Triggered by late event

6. Code Example

Configuring a trigger that fires every 10 items or when the watermark passes:

python
import apache_beam as beam
from apache_beam.transforms.trigger import AfterWatermark, AfterCount, Repeatedly
from apache_beam.transforms.window import FixedWindows

with beam.Pipeline() as p:
    (p
     | "Read" >> beam.io.ReadFromPubSub(subscription="projects/my-project/subs/my-sub")
     | "Window" >> beam.WindowInto(
         FixedWindows(60 * 60), # 1 hour
         trigger=Repeatedly(AfterFirst(AfterWatermark(), AfterCount(10))),
         accumulation_mode=beam.transforms.trigger.AccumulationMode.ACCUMULATING
     ))

7. Code Explanation

  • FixedWindows(60 * 60) groups elements into hourly intervals.
  • trigger=Repeatedly(...) allows the window to fire multiple times.
  • AfterFirst(...) fires a pane as soon as either sub-trigger is met:
    • AfterWatermark(): The watermark passes the hour.
    • AfterCount(10): 10 new elements have arrived.
  • AccumulationMode.ACCUMULATING tells Beam to include previous values in subsequent updates.

8. Real Production Example

In live web traffic dashboards, you use processing-time triggers to refresh chart stats every 10 seconds:

python
trigger=Repeatedly(AfterProcessingTime(10))

This ensures the dashboard graphs look alive and update continuously, rather than updating once per hour.

9. Common Mistakes

  • Forgetting to set accumulation mode: If you define a custom trigger, you must explicitly specify accumulation_mode. Omitting it will throw a configuration exception.
  • Infinite loops with Count Triggers: If you set AfterCount(1) without wrapping it in Repeatedly, the window will fire once on the first element and never fire again.

10. Interview Perspective

  • Question: What is the difference between Accumulating and Retracting mode?
  • Answer: In Accumulating mode, subsequent window firings output the total running sum. In Retracting mode, subsequent firings output only the difference (delta) or issue a retraction of the previous value to let downstream databases adjust their counts.
  • Question: When does a window state get deleted?
  • Answer: The window state is held until the watermark passes the end of the window plus the allowed_lateness duration. After this limit, the state is cleared.

11. Best Practices

  • Use Accumulating mode when writing to databases that support overwrites/upserts.
  • Avoid using low element count triggers (like AfterCount(1)) on massive streams, as this creates excessive network noise and disk load.

12. Summary

  • Triggers control when window results are emitted.
  • Supports Event-Time, Processing-Time, and Element-Count conditions.
  • Must be configured with either Accumulating or Retracting pane modes.

13. Interactive Challenges

Challenge 1: Element Count Trigger (Beginner)

Write a windowing transform segment that groups elements into 10-minute fixed windows and fires a pane every time 100 new elements are processed.

Challenge 2: Processing-Time Ticker (Intermediate)

Write a windowing segment that applies a 1-hour fixed window, but fires a pane every 30 seconds of processing time to refresh dashboards.

Challenge 3: Watermark or Count (Advanced)

Write a windowing segment that applies a 30-minute fixed window, firing a pane either when the watermark passes the window end, OR early if 50 elements arrive.

14. Related Content

Advertisement
AdSense Slot #000001Leaderboard Banner (728x90)