GitHub Repository | Duration: 2 hours | Difficulty: Beginner to Advanced
Overview
This hands-on workshop walks you through building a complete real-time data pipeline using Confluent Cloud. You’ll stream live cryptocurrency price data from the CoinGecko API, process it with Apache Flink SQL, and materialize it as Apache Iceberg tables using Tableflow — all queryable through DuckDB.
What You’ll Learn
Set up and configure a Kafka cluster on Confluent Cloud
Create topics and ingest live data using the HTTP Source Connector
Materialize Kafka topics as Iceberg tables with Tableflow
Write Flink SQL for real-time stream processing
Query real-time and historical data with DuckDB
Technologies
Technology | Purpose
Apache Kafka | Data streaming and topic management
Apache Flink | Real-time stream processing via Flink SQL
Tableflow | Materializing Kafka topics as Apache Iceberg tables
DuckDB | Lightweight analytics on Iceberg tables
Schema Registry | Avro schema validation
CoinGecko API | Live cryptocurrency price data source
Workshop Modules
Module 1: Setting Up Confluent Cloud
15 minutes
Validate prerequisite tools, authenticate with Confluent Cloud, create a Kafka cluster, generate API keys, and validate Tableflow access.
Module 2: Kafka Hands-On with CoinGecko Data
30 minutes
Create Kafka topics, set up an HTTP Source Connector for live cryptocurrency data, and produce and consume real-time price events.
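To make the ingestion step more concrete, here is a minimal Python sketch of the kind of request an HTTP source might poll and the nested shape of the response. It assumes CoinGecko's public "simple price" endpoint; the workshop's connector configuration may point elsewhere. The helper names and the sample prices are hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Assumption: the polled endpoint is CoinGecko's public "simple price" API.
BASE_URL = "https://api.coingecko.com/api/v3/simple/price"


def build_poll_url(coin_ids, currency="usd"):
    """Build the URL a poller could request for the given coins (hypothetical helper)."""
    query = urlencode({"ids": ",".join(coin_ids), "vs_currencies": currency})
    return f"{BASE_URL}?{query}"


def decode_response(body):
    """Parse a JSON response body into a nested dict: coin -> currency -> price."""
    return json.loads(body)


# Illustrative response body with made-up prices, not real market data.
sample_body = '{"bitcoin": {"usd": 50000.0}, "ethereum": {"usd": 3000.0}}'

url = build_poll_url(["bitcoin", "ethereum"])
snapshot = decode_response(sample_body)

print(url)
for coin, quotes in snapshot.items():
    print(coin, quotes["usd"])
```

Each poll returns one nested document covering all requested coins, which is why a later flattening step is useful before per-coin analysis.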
Module 3: Tableflow & Iceberg Setup
25 minutes
Materialize the crypto-prices topic as an Iceberg table via Tableflow, connect DuckDB to the Iceberg REST Catalog, and run real-time analytics queries.
Module 4: Flink Stream Processing
45 minutes
The core module. Create a Flink compute pool, transform nested cryptocurrency data, write SQL queries for real-time analysis, and create derived tables:
price-alerts: threshold-based price alerts
crypto-trends: rolling trend analysis
crypto-predictions: pattern-based predictions
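As a rough illustration of what "transforming nested data" and "rolling trend analysis" can mean, the Python sketch below flattens nested price snapshots into one row per coin and then computes a per-symbol moving average. It is an analogy for the logic, not the workshop's Flink SQL. The function names, the window size, and the sample prices are assumptions made for illustration.

```python
from collections import defaultdict, deque


def explode(snapshot, fetched_at, currency="usd"):
    """Flatten one nested snapshot into (symbol, price, timestamp) rows."""
    return [
        {"symbol": coin, "price": quotes[currency], "timestamp": fetched_at}
        for coin, quotes in snapshot.items()
        if currency in quotes
    ]


def rolling_average(rows, window=3):
    """Attach the average of the last `window` prices seen for each symbol."""
    history = defaultdict(lambda: deque(maxlen=window))
    result = []
    for row in rows:
        prices = history[row["symbol"]]
        prices.append(row["price"])
        result.append({**row, "avg_price": sum(prices) / len(prices)})
    return result


# Made-up snapshots, one per polling interval (not real market data).
snapshots = [
    ("2024-01-01T00:00:00Z", {"bitcoin": {"usd": 50000.0}, "ethereum": {"usd": 3000.0}}),
    ("2024-01-01T00:01:00Z", {"bitcoin": {"usd": 51000.0}, "ethereum": {"usd": 2900.0}}),
    ("2024-01-01T00:02:00Z", {"bitcoin": {"usd": 52000.0}, "ethereum": {"usd": 3100.0}}),
]

rows = [row for ts, snap in snapshots for row in explode(snap, ts)]
trends = rolling_average(rows, window=2)

for t in trends:
    print(t["timestamp"], t["symbol"], t["price"], t["avg_price"])
```

In a streaming engine the same idea is usually expressed with a window over event time rather than a fixed count of rows; the count-based window here just keeps the sketch short.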
-- Example: Real-time price alerts
SELECT symbol, price, `timestamp`
FROM crypto_prices_exploded
WHERE price > 50000 AND symbol = 'bitcoin';
Getting Started
The workshop supports multiple environments:
GitHub Codespaces — one-click setup, everything pre-configured
VS Code Dev Containers — local Docker-based environment
Local setup — bring your own tools (Confluent CLI, DuckDB, jq)
git clone https://github.com/gAmUssA/cc-workshop.git
cd cc-workshop
See the prerequisites in the repo README.