Last Minute Cheatsheets

High-density cheat sheets optimized for quick revision and recall before interviews or engineering assignments.

Cheatsheet · 3 min read
Pipeline

Learn the fundamentals of creating, running, and managing Apache Beam pipelines.

Cheatsheet · 3 min read
PCollection

Understand the core data container, its types, and state representations.

Cheatsheet · 3 min read
ParDo

Master the fundamental transform for general-purpose parallel data processing.

Cheatsheet · 4 min read
DoFn

Define custom element processing logic using the standard DoFn lifecycle.

Cheatsheet · 4 min read
Windowing

Group unbounded streaming data into logical time intervals.

Cheatsheet · 4 min read
Watermarks

Track progress and event-time completeness in streaming pipelines.

Cheatsheet · 4 min read
Triggers

Control exactly when window results are materialized and sent downstream.

Cheatsheet · 3 min read
Side Inputs

Pass supplementary lookup tables and configuration data into transforms.

Cheatsheet · 4 min read
BigQuery IO

Read from and write to Google BigQuery at high throughput and scale.

Cheatsheet · 3 min read
Pub/Sub IO

Integrate pipelines with Google Cloud Pub/Sub for serverless messaging.

Cheatsheet · 4 min read
Dataflow

Deploy, run, and scale Apache Beam pipelines on Google Cloud Dataflow.
