Foundation Lab | Easy

Lab: Product Catalog

Estimated time: 15 mins

Who This Lab Is For

Beginner developers starting out with basic value filtering, string parsing, and simple conditional operations.

What You Will Learn

  • How to parse raw CSV rows into typed, dictionary-like records.
  • How to filter collections based on price threshold criteria.
  • How to format records into clean, readable output lines.

1. Business Scenario

Filter a product catalog by a minimum price and write a summary line for each product that qualifies.

2. Input Dataset (`dataset.csv`)

Save the following raw rows locally as `dataset.csv` to test your pipeline:

```text
product_id,name,category,price
p1,Laptop,Electronics,999.99
p2,Headphones,Electronics,149.99
p3,Desk Chair,Furniture,250.00
p4,Notebook,Stationery,4.99
p5,Monitor,Electronics,300.00
```
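
Before wiring anything into a pipeline, it can help to see what a single parsed row might look like. The sketch below is one possible approach, not the lab's official one: `FIELDS` and `parse_product` are hypothetical names chosen for illustration, and the column order is assumed to match the header row above.

```python
import csv

# Column order assumed from the header row of dataset.csv.
FIELDS = ["product_id", "name", "category", "price"]


def parse_product(line):
    """Turn one CSV row (a string) into a dict with a float price."""
    # csv.reader handles quoting rules that a plain split(",") would miss.
    values = next(csv.reader([line]))
    record = dict(zip(FIELDS, values))
    record["price"] = float(record["price"])  # the only numeric column
    return record


print(parse_product("p3,Desk Chair,Furniture,250.00"))
# {'product_id': 'p3', 'name': 'Desk Chair', 'category': 'Furniture', 'price': 250.0}
```

Converting `price` at parse time means every later step can compare numbers directly instead of re-parsing strings.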

3. Starter Code Skeleton

Create a local file named `starter.py` and copy the following skeleton. Complete the missing transformations:

```python
# starter.py - Product Catalog
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run_pipeline():
    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        # TODO: Parse catalog records
        # TODO: Filter items >= 100
        # TODO: Write to text output
        pass

if __name__ == "__main__":
    run_pipeline()
```

4. Lab Requirements

  • Parse product attributes including price and category.
  • Filter out any product that is priced under $100.00.
  • Collect the qualifying products and write them to the final sink (a text file).
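
The filtering and formatting requirements can be tried out in plain Python before any Beam code is involved. This is a minimal sketch under stated assumptions: `MIN_PRICE`, `meets_min_price`, and `format_product` are hypothetical names, and the pipe-separated output layout is an illustrative choice, since the lab does not prescribe one.

```python
MIN_PRICE = 100.00  # threshold taken from the lab requirements

# Already-parsed records matching the rows in dataset.csv.
products = [
    {"product_id": "p1", "name": "Laptop", "category": "Electronics", "price": 999.99},
    {"product_id": "p2", "name": "Headphones", "category": "Electronics", "price": 149.99},
    {"product_id": "p3", "name": "Desk Chair", "category": "Furniture", "price": 250.00},
    {"product_id": "p4", "name": "Notebook", "category": "Stationery", "price": 4.99},
    {"product_id": "p5", "name": "Monitor", "category": "Electronics", "price": 300.00},
]


def meets_min_price(product, min_price=MIN_PRICE):
    """Keep products priced at or above the threshold."""
    return product["price"] >= min_price


def format_product(product):
    """Render one record as a readable output line (layout is an assumption)."""
    return (
        f"{product['product_id']} | {product['name']} | "
        f"{product['category']} | ${product['price']:.2f}"
    )


qualifying = [p for p in products if meets_min_price(p)]
for product in qualifying:
    print(format_product(product))
# p1 | Laptop | Electronics | $999.99
# p2 | Headphones | Electronics | $149.99
# p3 | Desk Chair | Furniture | $250.00
# p5 | Monitor | Electronics | $300.00
```

Only the Notebook (p4, $4.99) falls under the threshold, so four of the five rows survive the filter.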

5. Step-by-Step Guide & Solution

Solution for Product Catalog
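
One possible way to complete the skeleton is sketched below. It is not the lab's official solution. The helper names, the `output/catalog` prefix, the step labels, and the pipe-separated line layout are all assumptions made for illustration. The Beam transforms used (`ReadFromText`, `Map`, `Filter`, `WriteToText`) are standard Python SDK calls.

```python
# One possible completion of run_pipeline (a sketch, not the official solution).
import csv

MIN_PRICE = 100.00  # threshold from the lab requirements


def parse_product(line):
    """Turn one CSV row into a dict with a float price."""
    product_id, name, category, price = next(csv.reader([line]))
    return {"product_id": product_id, "name": name,
            "category": category, "price": float(price)}


def meets_min_price(product):
    """Keep products priced at or above MIN_PRICE."""
    return product["price"] >= MIN_PRICE


def format_product(product):
    """Render one record as an output line (layout is an assumption)."""
    return "{product_id} | {name} | {category} | ${price:.2f}".format(**product)


def run_pipeline(input_path="dataset.csv", output_prefix="output/catalog"):
    # Imported here so the helpers above can be tried without Beam installed;
    # in starter.py these imports already sit at the top of the file.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions()
    with beam.Pipeline(options=options) as p:
        (
            p
            # skip_header_lines=1 drops the column-name row.
            | "ReadRows" >> beam.io.ReadFromText(input_path, skip_header_lines=1)
            # Guard against trailing blank lines in the file.
            | "DropBlankLines" >> beam.Filter(lambda line: bool(line.strip()))
            | "ParseRecords" >> beam.Map(parse_product)
            | "FilterByPrice" >> beam.Filter(meets_min_price)
            | "FormatLines" >> beam.Map(format_product)
            | "WriteOutput" >> beam.io.WriteToText(output_prefix, file_name_suffix=".txt")
        )
```

Calling `run_pipeline()` from the starter's existing `__main__` guard runs the whole chain on the default local runner. `WriteToText` writes sharded files whose names begin with the given prefix, and the order of lines within them is not guaranteed.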

