Advanced developers designing serverless data ingestion architectures in Google Cloud Platform.
Design a streaming ingestion pipeline from Google Cloud Pub/Sub to Google BigQuery.
Save the following raw rows locally as \`dataset.csv\` to test your pipeline:
subscription: "projects/corp/subscriptions/clicks-sub"
payload: {"click_id": "c982", "user_id": "u43", "url": "/pricing", "ts": 1719830400}Create a local file named \`starter.py\` and copy the following skeleton. Complete the missing transformations:
# starter.py - Pub/Sub to BigQuery
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
def run_pipeline():
options = PipelineOptions()
with beam.Pipeline(options=options) as p:
# TODO: Read from Pub/Sub
# TODO: Parse schema and handle errors
# TODO: Write to BigQuery (Storage Write API)
pass
if __name__ == "__main__":
run_pipeline()