This exercise is for beginner developers who want to practice row-level calculations and global aggregations.
The goal is to summarize transactions and calculate total sales revenue.
Save the following raw rows locally as `dataset.csv` to test your pipeline:
transaction_id,product_id,quantity,unit_price
tx1,p1,1,999.99
tx2,p2,2,149.99
tx3,p3,1,250.00
tx4,p2,1,149.99
tx5,p4,10,4.99

Create a local file named `starter.py` and copy the following skeleton. Complete the missing transformations:
# starter.py - Sales Report
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run_pipeline():
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        # TODO: Parse sales quantities and unit prices
        # TODO: Calculate revenue per transaction
        # TODO: Sum global sales revenue
        pass


if __name__ == "__main__":
    run_pipeline()
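
To check the number your finished pipeline produces, the same arithmetic can be sketched in plain Python without Beam. This is only a reference calculation, not the solution to the exercise: the helper names `revenue_per_transaction` and `total_revenue` are hypothetical, the rows are inlined so the snippet runs without `dataset.csv`, and `Decimal` is an assumed choice to avoid floating-point rounding noise.

```python
import csv
import io
from decimal import Decimal

# Same rows as dataset.csv, inlined so the check runs without the file.
RAW_CSV = """transaction_id,product_id,quantity,unit_price
tx1,p1,1,999.99
tx2,p2,2,149.99
tx3,p3,1,250.00
tx4,p2,1,149.99
tx5,p4,10,4.99
"""


def revenue_per_transaction(raw_csv):
    """Row-level step: return (transaction_id, quantity * unit_price) per row."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [
        (row["transaction_id"], int(row["quantity"]) * Decimal(row["unit_price"]))
        for row in reader
    ]


def total_revenue(raw_csv):
    """Global step: sum the per-transaction revenue over all rows."""
    return sum(
        (revenue for _, revenue in revenue_per_transaction(raw_csv)),
        Decimal("0"),
    )


if __name__ == "__main__":
    for tx_id, revenue in revenue_per_transaction(RAW_CSV):
        print(tx_id, revenue)
    print("total", total_revenue(RAW_CSV))
```

For the five rows above, the total is 1749.86. A Beam pipeline that parses `unit_price` as `float` may print a value that differs slightly in its trailing digits.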