Feature: Enrichment using Bytewax (#13)

- Initial commit of enrichment logic using Bytewax by @gAmUssA and @oli-kitty 
- Added logic to extract items and perform transformations.
- Implemented named joins with data formatting for compatibility. @oli-kitty 
- Integrated Redpanda console for streaming data management.
- Cleaned up code and updated Docker Compose and Makefile for better maintainability.
- Updated to a more recent version of Streamlit for improved UI. @carolinedlu 
- Replaced HTML tables with data_editor for a more user-friendly interface. @carolinedlu 
- Added wait logic for Pinot controller to ensure system readiness. @gAmUssA 
- Tidied up README and commands.md for clearer documentation. @gAmUssA 

Co-authored-by: Oli <[email protected]>
Co-authored-by: Caroline Frasca (Lu) <[email protected]>
3 people committed Mar 14, 2024
1 parent cf6f210 commit 0698315
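
The enrichment flow described in the commit message (extract the items from each order, then join them with product data) can be sketched in plain Python. This is not the Bytewax dataflow from the commit and it does not use the Bytewax API. The function names, the `productId` and `quantity` item fields, and the sample records are assumptions for illustration; only the order and product column names are taken from the queries in demo/commands.md.

```python
from typing import Any, Dict, Iterable, Iterator


def extract_items(order: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Flatten one order into one record per ordered product."""
    for item in order.get("productsOrdered", []):
        yield {
            "orderId": order["id"],
            "userId": order["userId"],
            "ts": order["ts"],
            "productId": str(item["productId"]),  # hypothetical field name
            "quantity": item["quantity"],  # hypothetical field name
        }


def enrich_items(
    items: Iterable[Dict[str, Any]],
    products_by_id: Dict[str, Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Attach product details to each item; skip items with no known product."""
    for item in items:
        product = products_by_id.get(item["productId"])
        if product is None:
            # A streaming join would typically wait for the product record;
            # this batch-style sketch simply drops the unmatched item.
            continue
        yield {
            **item,
            "product": {
                "name": product["name"],
                "category": product["category"],
                "price": product["price"],
            },
        }


# Illustrative sample data (not taken from the repository).
products_by_id = {
    "7": {"id": "7", "name": "Margherita", "category": "pizza", "price": 9.5},
}
order = {
    "id": "o-1",
    "userId": 42,
    "ts": "2024-03-14T10:00:00Z",
    "status": "PLACED_ORDER",
    "productsOrdered": [
        {"productId": "7", "quantity": 2},
        {"productId": "99", "quantity": 1},  # no matching product
    ],
}

enriched = list(enrich_items(extract_items(order), products_by_id))
print(enriched)
```

Keying both sides by product id as a string mirrors the "data formatting for compatibility" mentioned in the commit message: a join only matches when both streams use the same key type.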
Showing 29 changed files with 586 additions and 226 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
venv
raw_data
.cache
*/__pycache__
44 changes: 44 additions & 0 deletions Makefile
@@ -0,0 +1,44 @@
all: build run

build:
	docker compose build --no-cache

run:
	docker compose up -d

	@echo "Waiting for Pinot Controller to be ready..."
	@while true; do \
		STATUS_CODE=$$(curl -s -o /dev/null -w '%{http_code}' \
			'http://localhost:9000/health'); \
		if [ "$$STATUS_CODE" -eq 200 ]; then \
			break; \
		fi; \
		sleep 2; \
		echo "Waiting for Pinot Controller..."; \
	done
	@echo "🍷 Pinot Controller is ready."

	@echo "Waiting for Pinot Broker to be ready..."
	@while true; do \
		STATUS_CODE=$$(curl -s -o /dev/null -w '%{http_code}' \
			'http://localhost:8099/health'); \
		if [ "$$STATUS_CODE" -eq 200 ]; then \
			break; \
		fi; \
		sleep 1; \
		echo "Waiting for Pinot Broker..."; \
	done
	@echo "🍷 Pinot Broker is ready to receive queries."

	@echo "🪲 Waiting for Kafka to be ready..."
	@while ! nc -z localhost 9092; do \
		sleep 1; \
		echo "Waiting for Kafka..."; \
	done
	@echo "🪲 Kafka is ready."

	@printf "Pinot Query UI - \033[4mhttp://localhost:9000\033[0m\n"
	@printf "Streamlit Dashboard - \033[4mhttp://localhost:8502\033[0m"

stop:
	docker compose down -v
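
The readiness checks in the `run` target follow one pattern: probe, sleep, retry. A minimal Python sketch of the same pattern is shown here, assuming the health URLs and the Kafka port from the Makefile. The helper names `http_ok`, `port_open`, and `wait_until` are hypothetical, and unlike the Makefile loops, which retry forever, this sketch gives up after `max_wait` seconds.

```python
import http.client
import socket
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or a malformed reply.
        return False


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection succeeds (roughly what `nc -z` checks)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(
    probe: Callable[[], bool],
    interval: float = 1.0,
    max_wait: float = 120.0,
) -> bool:
    """Call `probe` every `interval` seconds until it succeeds or time runs out."""
    deadline = time.monotonic() + max_wait
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the stack running, `wait_until(lambda: http_ok("http://localhost:9000/health"), interval=2.0)` would correspond to the controller check, and `wait_until(lambda: port_open("localhost", 9092))` to the Kafka check. The upper bound on waiting is a design choice of this sketch so that a service that never starts does not block indefinitely.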
46 changes: 24 additions & 22 deletions README.md
@@ -1,39 +1,41 @@
# Pizza Shop Demo

-This repository contains the code for the Pizza Shop Demo.
+This repository contains the code for the Pizza Shop Demo, showcasing a real-time analytics application using Apache Pinot.

-![](images/architecture.png)
+## Prerequisites

-## Run the demo
+- [Docker Desktop](https://www.docker.com/products/docker-desktop): Ensure Docker Desktop is installed and running on your machine before starting the demo.

-```bash
-docker-compose \
-  -f docker-compose-base.yml \
-  -f docker-compose-pinot.yml \
-  -f docker-compose-dashboard-enriched-quarkus.yml \
-  up
-```
+![Architecture Diagram](images/architecture.png)

+## How to Run the Demo

+To start the demo, execute the following command in your terminal:

```bash
-docker-compose \
-  -f docker-compose-base.yml \
-  -f docker-compose-pinot-m1.yml \
-  -f docker-compose-dashboard-enriched-quarkus.yml \
-  up
+make
```

-Once that's run, you can navigate to the following:
+After the demo is up and running, you can access the following interfaces in your web browser:

-* Pinot UI - http://localhost:9000
-* Streamlit Dashboard - http://localhost:8502
+- **Pinot UI:** [http://localhost:9000](http://localhost:9000)
+- **Streamlit Dashboard:** [http://localhost:8502](http://localhost:8502)

-You can find a deeper dive on each of the components at https://dev.startree.ai/docs/pinot/demo-apps/pizza-shop
+For a detailed explanation of each component in the demo, visit the [documentation page](https://dev.startree.ai/docs/pinot/demo-apps/pizza-shop).

-## Things that sometimes don't work
+## Common Issues

-Less frequently, the `pinot-add-table` service never returns code 0 if it created one table and not the other. To stop that service, run this:
+Occasionally, the `pinot-add-table` service may not return a success code (0) if it creates one table but fails to create the other.
+To resolve this issue, you can manually stop the service by running:

-```
+```bash
docker stop pinot-add-table
```

+## Stopping the Demo

+To stop all the services and clean up the resources used by the demo, run:

+```bash
+make stop
+```
80 changes: 48 additions & 32 deletions demo/commands.md
@@ -1,75 +1,91 @@
-# Demo Script Commands to run
+# Demo Script Commands to Run

This is the demo for the Real-Time Analytics Stack Demo.

## Existing Architecture

+To connect to the MySQL database, run:

```bash
docker exec -it mysql mysql -u mysqluser -p
```

+Execute the following SQL queries to view sample data:

```sql
-select id, first_name, last_name, email, country
-FROM pizzashop.users
-LIMIT 5;
+select id, first_name, last_name, email, country
+FROM pizzashop.users LIMIT 5;
```

```sql
-select id, name, category, price
-FROM pizzashop.products
-LIMIT 5;
+select id, name, category, price
+FROM pizzashop.products LIMIT 5;
```

+Exit the MySQL prompt:

```sql
exit
```

## RTA Architecture

+Consume messages from the `orders` topic using kcat:

```bash
kcat -C -b localhost:29092 -t orders -u | jq '.'
```

-Open VS Code
+View configuration files using a generic editor or `cat` command:

+```bash
+cat pinot/config/orders/schema.json
+```

-* Show pinot/config/orders/schema.json
-* Show pinot/config/orders/table.json
+```bash
+cat pinot/config/orders/table.json
+```

-Open Pinot UI
+Access the Pinot UI and execute the following SQL query:

```sql
-select id, price, productsOrdered, status, totalQuantity, ts, userId
-from orders
-limit 10
+select id, price, productsOrdered, status, totalQuantity, ts, userId
+from orders limit 10
```
-Back to the terminal

+Return to the terminal and consume messages from the `products` topic:

```bash
kcat -C -b localhost:29092 -t products -u | jq '.'
```

-Open VS Code
+View the `TopologyProducer.java` file:

-* Show TopologyProducer.java
+```bash
+cat TopologyProducer.java
+```

-Back to terminal
+Back in the terminal, consume a single message from the `enriched-order-items` topic:

```bash
kcat -C -b localhost:29092 -t enriched-order-items -c1 | jq '.'
```

-Open VS Code
+View the configuration files for enriched order items:

-* Show pinot/config/order_items_enriched/schema.json
-* Show pinot/config/order_items_enriched/table.json
+```bash
+cat pinot/config/order_items_enriched/schema.json
+```

-Back to Pinot UI
+```bash
+cat pinot/config/order_items_enriched/table.json
+```

+Switch to the Pinot UI and execute the following SQL queries:

```sql
-select *
-from order_items_enriched
-LIMIT 10;
+select *
+from order_items_enriched LIMIT 10;
```

```sql
@@ -80,13 +96,13 @@ group by category
order by count(*) DESC
```

-Open Streamlit
-
-http://localhost:8502
+Open the Streamlit dashboard at [http://localhost:8502](http://localhost:8502):

-* Change the amount of data being shown
-* Change the refresh rate
+- Change the amount of data being shown.
+- Change the refresh rate.

-Open VS Code
+Finally, review the Streamlit app code:

-* Show app_enriched.py
+```bash
+cat app_enriched.py
+```
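
The Pinot queries in this script are run in the Pinot UI, but they can also be sent to the broker over HTTP. The sketch below assumes the broker is reachable at `http://localhost:8099` (the address the Makefile health check uses) and posts to the broker's `/query/sql` endpoint. The helper names `rows_as_dicts` and `run_query` are hypothetical, and the sample response is illustrative data, not output from the demo.

```python
import json
import urllib.request
from typing import Any, Dict, List


def rows_as_dicts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a Pinot broker response into a list of {column: value} dicts."""
    if response.get("exceptions"):
        raise RuntimeError(f"Pinot query failed: {response['exceptions']}")
    table = response["resultTable"]
    columns = table["dataSchema"]["columnNames"]
    return [dict(zip(columns, row)) for row in table["rows"]]


def run_query(sql: str, broker_url: str = "http://localhost:8099") -> List[Dict[str, Any]]:
    """POST a SQL query to the broker and return the rows as dicts."""
    request = urllib.request.Request(
        f"{broker_url}/query/sql",
        data=json.dumps({"sql": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return rows_as_dicts(json.load(response))


# Illustrative response shape; the values are made up for this sketch.
sample_response = {
    "resultTable": {
        "dataSchema": {
            "columnNames": ["category", "count(*)"],
            "columnDataTypes": ["STRING", "LONG"],
        },
        "rows": [["pizza", 12]],
    },
    "exceptions": [],
}
print(rows_as_dicts(sample_response))
```

With the demo running, `run_query("select * from order_items_enriched limit 10")` would return the same rows the Pinot UI shows for that query.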
54 changes: 0 additions & 54 deletions docker-compose-base.yml

This file was deleted.

80 changes: 80 additions & 0 deletions docker-compose-kafka.yml
@@ -0,0 +1,80 @@
version: "3.8"

services:
  kafka:
    image: docker.io/bitnami/kafka:3.6
    hostname: kafka
    container_name: kafka
    ports:
      - "9092:9092"
      - "29092:29092"
    healthcheck: { test: nc -z localhost 9092, interval: 20s }
    environment:
      # KRaft settings
      - KAFKA_CFG_NODE_ID=0
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093
      # Listeners
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,PLAINTEXT_HOST://:29092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://:9092,PLAINTEXT_HOST://localhost:29092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT

  orders-service:
    build: orders-service
    restart: unless-stopped
    container_name: orders-service
    depends_on:
      - mysql
      - kafka
    environment:
      - MYSQL_SERVER=mysql
      - KAFKA_BROKER_HOSTNAME=kafka
      - KAFKA_BROKER_PORT=9092

  mysql:
    image: mysql/mysql-server:8.0.27
    hostname: mysql
    container_name: mysql
    ports:
      - "3306:3306"
    environment:
      - MYSQL_ROOT_PASSWORD=debezium
      - MYSQL_USER=mysqluser
      - MYSQL_PASSWORD=mysqlpw
    volumes:
      - ./mysql/mysql.cnf:/etc/mysql/conf.d
      - ./mysql/mysql_bootstrap.sql:/docker-entrypoint-initdb.d/mysql_bootstrap.sql
      - ./mysql/data:/var/lib/mysql-files/data

  dataflow:
    image: enrichment-bytewax
    build: enrichment-bytewax
    container_name: enrichment-bytewax
    depends_on:
      - kafka

  console:
    hostname: console
    container_name: console
    image: docker.redpanda.com/redpandadata/console:latest
    restart: on-failure
    entrypoint: /bin/sh
    command: -c "echo \"$$CONSOLE_CONFIG_FILE\" > /tmp/config.yml; /app/console"
    environment:
      CONFIG_FILEPATH: /tmp/config.yml
      CONSOLE_CONFIG_FILE: |
        server:
          listenPort: 9080
        kafka:
          brokers: ["kafka:9092"]
          schemaRegistry:
            enabled: false
            urls: ["http://schema-registry:8081"]
        connect:
          enabled: false
    ports:
      - "9080:9080"
    depends_on:
      - kafka
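
The Kafka service advertises two client listeners, which is why containers in this file connect to `kafka:9092` while the kcat commands in demo/commands.md use `localhost:29092`. The small sketch below only splits the `KAFKA_CFG_ADVERTISED_LISTENERS` value from this file into a name-to-address map to make that split visible; `parse_listeners` is a hypothetical helper, not part of Kafka or the repository.

```python
from typing import Dict


def parse_listeners(value: str) -> Dict[str, str]:
    """Map each listener name to its 'host:port' part."""
    listeners: Dict[str, str] = {}
    for entry in value.split(","):
        # Each entry has the form NAME://host:port; the host may be empty.
        name, _, address = entry.strip().partition("://")
        listeners[name] = address
    return listeners


# Value copied from the kafka service's environment in this Compose file.
ADVERTISED = "PLAINTEXT://:9092,PLAINTEXT_HOST://localhost:29092"

advertised = parse_listeners(ADVERTISED)
print(advertised)
# prints {'PLAINTEXT': ':9092', 'PLAINTEXT_HOST': 'localhost:29092'}
```

The `PLAINTEXT` entry has no explicit host, so the broker falls back to its own host name, which is expected to resolve only inside the Compose network. The `PLAINTEXT_HOST` entry names `localhost`, which suits tools running on the host machine through the published port 29092.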