This project builds a real-time data pipeline that collects, processes, and visualizes telemetry data from Assetto Corsa Competizione (ACC), a popular racing simulation game. It uses Mage for ETL processes, Kafka as the streaming broker, PostgreSQL as the data warehouse, and Metabase for interactive visualizations. Together, these open-source tools provide real-time insight into vehicle performance metrics and driver statistics.
The primary objectives of this project are:
- Automate Data Collection: Collect telemetry data from ACC, such as speed, tire pressure, and lap times, using a custom producer.
- Streaming ETL Pipeline: Use Mage to implement an ETL pipeline for processing real-time data and transforming it before loading it into a warehouse.
- Data Storage & Visualization: Store transformed data in PostgreSQL and create interactive dashboards using Metabase to visualize driver and vehicle statistics in real time.
- Scalability: Ensure the pipeline can handle large volumes of data efficiently by using distributed, containerized components.

The pipeline is built with the following tools:
- Mage: Open-source ETL tool for managing data transformations.
- Apache Kafka: For message brokering between producer and consumer.
- PostgreSQL: To store the telemetry data for analysis.
- Metabase: For building and visualizing interactive dashboards.
- Docker & Docker Compose: To containerize all components for simplified deployment.
- Python: For the custom producer that streams telemetry data.
- PyAccSharedMemory: To read telemetry data from ACC via shared memory in real time.

Data flows through the pipeline as follows:
- Data Producer: The `scripts/telemetry_producer.py` script reads telemetry data from Assetto Corsa Competizione using the PyAccSharedMemory library and streams it to a Kafka topic.
- Kafka Broker: Kafka manages the real-time data flow, making it available for further transformation.
- Mage ETL Pipeline: The data passes through a Mage ETL pipeline for transformation and cleaning, ensuring it is properly formatted before being loaded.
- Data Storage: The cleaned data is loaded into PostgreSQL for storage.
- Visualization: Metabase uses the data from PostgreSQL to create real-time dashboards, providing insight into driver performance.
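
The first three stages above can be sketched in plain Python. This is a minimal illustration rather than the project's actual code: the field names (`speed_kmh`, `tyre_pressure_psi`, `lap_time_ms`) and the helpers `encode_record` and `clean_record` are hypothetical, and an in-memory list stands in for the Kafka topic. In the real pipeline the producer reads from PyAccSharedMemory and the cleaning step runs inside Mage.

```python
from __future__ import annotations

import json
from typing import Any, Optional

# Hypothetical field names; the real producer's payload may differ.
REQUIRED_FIELDS = ("speed_kmh", "tyre_pressure_psi", "lap_time_ms")


def encode_record(record: dict[str, Any]) -> bytes:
    """Serialize one telemetry sample to UTF-8 JSON for use as a message payload."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def clean_record(payload: bytes) -> Optional[dict[str, float]]:
    """Decode a payload and return a cleaned row, or None if it is unusable."""
    try:
        raw = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(raw, dict):
        return None
    row: dict[str, float] = {}
    for name in REQUIRED_FIELDS:
        value = raw.get(name)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        row[name] = float(value)
    # Reject physically impossible samples.
    if row["speed_kmh"] < 0 or row["lap_time_ms"] < 0:
        return None
    return row


# An in-memory list stands in for the Kafka topic in this sketch.
topic: list[bytes] = []
topic.append(encode_record({"speed_kmh": 212.4, "tyre_pressure_psi": 27.6, "lap_time_ms": 104233}))
topic.append(b"not json")  # a malformed message that the cleaning step should drop

rows = [row for row in map(clean_record, topic) if row is not None]
```

Returning `None` for unusable messages keeps the cleaning step a pure function, so malformed payloads can be filtered out before anything is loaded into the warehouse.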
.
├── assetto-corsa-mage/ # Directory containing Mage ETL pipeline configurations
├── images/ # Directory with visual assets for documentation
│ ├── architecture_diagram.png # Architecture diagram for the project
│ └── sample_visualization.png # Example of the Metabase dashboard visualization
├── scripts/ # Directory for Python scripts
│ ├── telemetry_producer.py # Script to produce telemetry data from ACC
│ └── init.sql # Initialization script for setting up PostgreSQL schema
├── docker-compose.yaml # Docker Compose file to orchestrate all services
├── README.md # Project README file
└── requirements.txt # Python dependencies for running the telemetry producer script
- `assetto-corsa-mage/`: Contains configurations for the Mage ETL pipeline, which defines data extraction, transformation, and loading tasks.
- `scripts/`:
  - `telemetry_producer.py`: The Python script responsible for collecting telemetry data from Assetto Corsa Competizione using PyAccSharedMemory and streaming it to Kafka.
  - `init.sql`: SQL script to set up the PostgreSQL database schema, including creating tables and any necessary initial configurations.
- `docker-compose.yaml`: Configuration file for Docker Compose, used to deploy and run Kafka, PostgreSQL, Mage, and Metabase as containerized services.
- `requirements.txt`: Lists the Python dependencies needed to run the telemetry producer script.
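
To illustrate the kind of setup a schema script such as `init.sql` performs, the sketch below creates a telemetry table and loads one row. The table name, columns, and sample values are hypothetical and not taken from the repository, and Python's built-in `sqlite3` module stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical schema; the repository's init.sql defines the real one.
SCHEMA = """
CREATE TABLE IF NOT EXISTS telemetry (
    id INTEGER PRIMARY KEY,
    recorded_at TEXT NOT NULL,
    speed_kmh REAL NOT NULL,
    tyre_pressure_psi REAL,
    lap_time_ms INTEGER
);
"""

conn = sqlite3.connect(":memory:")  # in-memory stand-in for PostgreSQL
conn.executescript(SCHEMA)

# Parameterized insert, as a loader might issue for each cleaned record.
conn.execute(
    "INSERT INTO telemetry (recorded_at, speed_kmh, tyre_pressure_psi, lap_time_ms) "
    "VALUES (?, ?, ?, ?)",
    ("2024-01-01T12:00:00Z", 212.4, 27.6, 104233),  # made-up sample values
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM telemetry").fetchone()[0]
top_speed = conn.execute("SELECT MAX(speed_kmh) FROM telemetry").fetchone()[0]
```

Column types and placeholder syntax differ between SQLite and PostgreSQL, so treat this only as an outline of the create-then-load pattern.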

Before you begin, make sure you have the following:
- Docker and Docker Compose installed on your machine.
- A Python environment to run the telemetry producer script.
- Assetto Corsa or Assetto Corsa Competizione installed on your local machine.

To set up and run the project:
- Clone the Repository

      git clone https://github.com/robin-ede/ac-data-eng-project.git
      cd ac-data-eng-project

- Install Python Dependencies
  - Create a virtual environment and install the required packages.

        python -m venv venv
        source venv/bin/activate  # On Windows use `venv\Scripts\activate`
        pip install -r requirements.txt

- Build and Start Docker Containers

      docker-compose up -d

- Verify Mage Pipeline Setup
  - Open your browser, go to http://localhost:6789, and navigate to the ac_pipeline in the Mage UI.
  - Initialize the pipeline to ensure it is ready to handle Kafka streams.

- Run the Data Producer
  - Start Assetto Corsa Competizione or Assetto Corsa on your machine.
  - Begin a race session in the game to generate telemetry data.
  - Navigate to the `scripts` directory.

        cd scripts
        python telemetry_producer.py

- Configure Metabase
  - Start Metabase using Docker Compose and open your browser to http://localhost:3000.
  - Complete the initial setup with your information (e.g., name, email, and password).
  - Connect to PostgreSQL:
    - Database Type: PostgreSQL; Host: `analytics-postgres`; Port: `5432`.
    - Database Name: `analyticsdb`; Username: `analytics`; Password: `analytics_password`.
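
If a browser page does not load during setup, a quick check is whether the service ports accept TCP connections at all. The helper below is a small sketch using only the standard library. The ports come from the steps above, while the function names and the choice of `localhost` as host are assumptions for illustration.

```python
from __future__ import annotations

import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports taken from the setup steps (Mage UI and Metabase UI).
SERVICES = {"Mage": 6789, "Metabase": 3000}


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only shows that something is listening; it does not confirm that the Mage pipeline or the Metabase setup has finished initializing.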
After successfully setting up and running the pipeline, the following outputs are generated:
- Real-Time Data in PostgreSQL: The telemetry data from Assetto Corsa Competizione is collected, processed, and stored in PostgreSQL.
- Interactive Dashboards: Using Metabase, users can visualize driver stats such as speed, tire pressure, race position, and lap times.
- Live Dashboard Demonstration: Watch the following video for a demonstration of the live dashboard in action during an actual race: https://www.youtube.com/watch?v=y1Xymd8Zgic

Future enhancements and notes:
- Expand Visualizations: Add more telemetry metrics, such as brake force and gear usage, to the Metabase dashboards.
- Performance Optimization: Tune the Mage ETL pipeline to handle larger volumes of telemetry data.
- Machine Learning Integration: Incorporate predictive analytics to anticipate vehicle performance based on telemetry data.
- Game Data API: Develop an API for users to query real-time game stats for other applications.
- Game Compatibility: PyAccSharedMemory was developed for Assetto Corsa Competizione, but this project uses the older game, Assetto Corsa. This may explain why some telemetry metrics, such as brake temperature and gap ahead/behind, are not recognized.
- Expand Script Functionality: Modify the telemetry producer script to collect data not only from user-controlled cars but also from other player-controlled or AI-controlled cars, providing broader insight into overall race dynamics.
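
To follow up on the Game Compatibility note, one way to find which metrics a given game does not expose is to compare each sample against a list of expected fields. The sketch below is illustrative only: the metric names are hypothetical, and the convention that an unsupported metric is either absent or `None` is an assumption, not documented PyAccSharedMemory behavior.

```python
from __future__ import annotations

from collections import Counter
from typing import Any, Iterable

# Hypothetical metric names; adjust to the fields the producer actually emits.
EXPECTED_METRICS = ("speed_kmh", "brake_temp_c", "gap_ahead_ms", "gap_behind_ms")


def missing_metrics(sample: dict[str, Any]) -> list[str]:
    """Return the expected metrics that are absent or None in one sample."""
    return [name for name in EXPECTED_METRICS if sample.get(name) is None]


def summarize_missing(samples: Iterable[dict[str, Any]]) -> Counter[str]:
    """Count how often each expected metric is missing across all samples."""
    counts: Counter[str] = Counter()
    for sample in samples:
        counts.update(missing_metrics(sample))
    return counts


# Made-up samples for demonstration.
samples = [
    {"speed_kmh": 180.0, "brake_temp_c": None, "gap_ahead_ms": None},
    {"speed_kmh": 195.5, "brake_temp_c": None, "gap_ahead_ms": 850, "gap_behind_ms": 1200},
]
summary = summarize_missing(samples)
```

A metric that is missing in every sample is a likely compatibility gap, while one that is missing only occasionally may simply be unavailable in certain session states.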