This project uses reinforcement learning to train AI agents for the racing game Trackmania. Built on the TMRL framework, we implemented deep reinforcement learning strategies to enable real-time learning and adaptation to dynamic racing conditions.
- Algorithm: Soft Actor-Critic (SAC); a minimal sketch of its update target follows this list
- Inputs: LIDAR data, fed to MLP and Recurrent Neural Network (RNN) models
- Environments:
  - Pure LIDAR
  - LIDAR with track progress
  - Hybrid environments combining visual and sensory data
- Best lap time: 35 seconds (nearing the best human performance of 30 seconds)
- Significant improvements in training efficiency and track navigation
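For reference, SAC regresses its two critics toward an entropy-regularized TD target. Below is a minimal, generic PyTorch sketch of that target computation; it is illustrative only, not this repo's training code:

```python
import torch

def sac_critic_target(reward, done, next_q1, next_q2, next_logp,
                      gamma=0.99, alpha=0.2):
    """Entropy-regularized TD target used by Soft Actor-Critic.

    reward, done: (batch,) tensors sampled from the replay buffer.
    next_q1, next_q2: target-critic estimates at the next state/action.
    next_logp: log-probability of the sampled next action under the policy.
    """
    # Clipped double-Q estimate minus the entropy bonus (alpha * log pi).
    next_q = torch.min(next_q1, next_q2) - alpha * next_logp
    return reward + gamma * (1.0 - done) * next_q
```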
Trackmania (2020) is a renowned racing game offering a blend of high-speed driving and intricate track design. Applying Reinforcement Learning (RL) to Trackmania allows for AI-driven optimization in a complex, interactive environment. Using the TMRL framework, developers can train agents to navigate the game’s challenging tracks efficiently.
- Observations: Images, speed, telemetry data, and velocity norms.
- Actions: Analog inputs emulating an Xbox 360 controller, or binary arrow-key presses, for gas, brake, steering angle, etc.
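For illustration, these observation and action spaces could be declared along the following lines in Gymnasium conventions; the field names, shapes, and bounds here are assumptions, not TMRL's exact definitions:

```python
import numpy as np
from gymnasium import spaces

# Illustrative spaces only; field names, shapes, and bounds are assumptions.
observation_space = spaces.Dict({
    "image": spaces.Box(low=0, high=255, shape=(64, 64, 3), dtype=np.uint8),  # screen capture
    "speed": spaces.Box(low=0.0, high=1000.0, shape=(1,), dtype=np.float32),  # telemetry
})

# Analog gamepad emulation: gas, brake, steering, each in [-1, 1].
action_space = spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32)
```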
Inspired by behaviorism, the environment provides rewards based on performance metrics, such as how efficiently the agent covers sections of the track.
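A coverage-based reward of this kind might look like the following sketch; this is a hypothetical helper, not the framework's actual reward function:

```python
import numpy as np

def coverage_reward(position, reward_points, next_idx, radius=2.0):
    """Hypothetical coverage reward: credit each recorded track point the car
    passes, in order, within `radius` meters.

    position: (3,) current car position; reward_points: (N, 3) recorded path.
    Returns (reward, updated next_idx).
    """
    passed = 0
    # Advance through consecutive recorded points the car has reached.
    while (next_idx < len(reward_points)
           and np.linalg.norm(position - reward_points[next_idx]) < radius):
        next_idx += 1
        passed += 1
    return float(passed), next_idx
```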
- LIDAR Environment: Simplified input with 19-beam LIDAR measurements, optimized for MLP models (see the sketch after this list).
- LIDAR with Track Progress Environment: Enhanced LIDAR environment with track completion data for predictive capabilities.
- Hybrid Environment: Combines visual data and LIDAR measurements for a comprehensive training approach.
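As a concrete example of the first setup, an MLP policy over the 19 beams could be as small as this; the layer sizes are illustrative assumptions, not the ones used in this project:

```python
import torch.nn as nn

# Minimal MLP head over the 19-beam LIDAR vector.
lidar_policy = nn.Sequential(
    nn.Linear(19, 256),  # 19 distance measurements fanned across the windshield
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, 3),   # gas, brake, steering
    nn.Tanh(),           # squash analog actions into [-1, 1]
)
```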
- LIDAR Environment
  - Performance: Improved training speed and track navigation, reduced collision frequency.
  - Lap Time: 45-50 seconds
- LIDAR with Track Progress Environment
  - Performance: Rapid training progress, better anticipation of track sections.
  - Lap Time: 35 seconds
- Hybrid Environment
  - Performance: Gradual improvements but slow training, tendency to hug track edges.
Each experiment provided valuable insights into AI-driven racing strategies, demonstrating the potential of incorporating contextual awareness into AI systems. The integration of RNNs with LIDAR track progress data achieved significant performance improvements, suggesting promising strategies for refining autonomous systems in high-speed racing environments.
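To make the RNN idea concrete, a recurrent policy over LIDAR-plus-progress sequences could be structured as follows; this is a sketch with assumed dimensions, not the exact model trained here:

```python
import torch
import torch.nn as nn

class RecurrentLidarPolicy(nn.Module):
    """Sketch: an LSTM consumes a sequence of 19-beam LIDAR scans plus a
    scalar track-progress value per timestep, then emits analog controls."""
    def __init__(self, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=19 + 1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)  # gas, brake, steering

    def forward(self, lidar_seq, progress_seq, state=None):
        # lidar_seq: (B, T, 19); progress_seq: (B, T, 1), progress in [0, 1].
        x = torch.cat([lidar_seq, progress_seq], dim=-1)
        out, state = self.lstm(x, state)
        return torch.tanh(self.head(out[:, -1])), state
```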
- Clone the project.
- Navigate into the folder and install the package in editable mode: `pip install -e tmrl-drive`
- Download the Config for LIDAR from here.
- Use the Trackmania track editor to create your desired track.
- Next, run `python -m tmrl --record-reward` to generate the reward file. This step records the global points you travel through on the track and helps the model penalize itself if it doesn't reach all of the points.
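To sanity-check the recording, you can plot the captured points. The snippet below assumes the reward file is a pickled sequence of 3-D positions; the filename and format are guesses, so check your TMRL data folder:

```python
import pickle
import matplotlib.pyplot as plt

# Assumed path/format: a pickled (N, 3) sequence of positions.
with open("reward.pkl", "rb") as f:
    points = pickle.load(f)

plt.plot([p[0] for p in points], [p[2] for p in points])  # top-down view (x vs z)
plt.title("Recorded reward points")
plt.show()
```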
- Run these 3 commands in 3 separate terminals:

  ```
  python -m tmrl --server
  python -m tmrl --train
  python -m tmrl --worker
  ```

  The `server` is responsible for consolidating model weights and passing them to the trainer, the `trainer` is responsible for actually training the model, and the `worker` is responsible for interacting with the game.
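Conceptually, the three processes form a relay. The toy sketch below mimics that layout with in-process queues purely for intuition; TMRL's real components communicate over the network:

```python
from multiprocessing import Process, Queue

def server(from_worker, to_trainer):
    for _ in range(3):
        to_trainer.put(from_worker.get())   # relay gameplay samples to the trainer

def trainer(to_trainer, weights_out):
    model = 0
    for _ in range(3):
        model += to_trainer.get()           # stand-in for a gradient update
        weights_out.put(model)              # broadcast updated weights

def worker(from_worker, weights_out):
    for step in range(3):
        from_worker.put(step)               # stand-in for playing the game
        print("worker received weights:", weights_out.get())

if __name__ == "__main__":
    q_samples, q_train, q_weights = Queue(), Queue(), Queue()
    procs = [Process(target=server, args=(q_samples, q_train)),
             Process(target=trainer, args=(q_train, q_weights)),
             Process(target=worker, args=(q_samples, q_weights))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```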
Training will take anywhere between 1 and 3 days on an RTX 3070.
- Download the Config for LIDAR from above.
- Paste the weights found in the folder into your `home/weights` folder.
- Run `python -m tmrl --test`
- Check out the `hybrid_environment` branch: `git checkout hybrid_environment`
- Follow the above steps, but use `config-imgs.json` instead of the LIDAR config.
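For intuition, a hybrid model might fuse a CNN image encoder with the LIDAR vector along these lines; this is a sketch with assumed layer sizes, not this branch's exact architecture:

```python
import torch
import torch.nn as nn

class HybridEncoder(nn.Module):
    """Sketch: fuse a small CNN over screen captures with the 19-beam
    LIDAR vector into one feature for the policy/critic heads."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.LazyLinear(128)  # infers the flattened CNN size on first use

    def forward(self, image, lidar):
        # image: (B, 3, H, W) screen capture; lidar: (B, 19) beam distances.
        feats = torch.relu(self.fc(self.cnn(image)))
        return torch.cat([feats, lidar], dim=-1)  # joint feature vector
```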