Intermediate developers practicing transaction processing, business status filtering, and customer key aggregation.
Analyze retail transactions and generate total customer spending aggregations.
Save the following raw rows locally as \`dataset.csv\` to test your pipeline:
order_id,customer_id,amount,status
o1,c1,120.50,COMPLETED
o2,c2,80.00,PENDING
o3,c1,45.00,COMPLETED
o4,c3,300.00,COMPLETED
o5,c2,150.00,FAILEDCreate a local file named \`starter.py\` and copy the following skeleton. Complete the missing transformations:
# starter.py - Customer Orders
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
def run_pipeline():
options = PipelineOptions()
with beam.Pipeline(options=options) as p:
# TODO: Read customer orders
# TODO: Filter out FAILED and PENDING orders
# TODO: Aggregate spending per customer
pass
if __name__ == "__main__":
run_pipeline()