| Run ID | Epochs | Reward Function | Training Setup | Avg 10-Lap Time | Average # Crashes/Lap | Comments | Run Script |
|---|---|---|---|---|---|---|---|
| exp_0 | 1,000,000 | None | Collision Reset | N/A | N/A | The kart performs seemingly random actions and never makes it to the first checkpoint. One interesting observation is that it seems to slow down and turn away when it is about to collide with a wall; it is not clear why. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_0 --results-dir ./Assets/results/` |
| exp_1 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall | Collision Reset | N/A | N/A | The training mean reward starts negative and converges to 0, and the model performs very poorly. This is somewhat expected, since most of the time the kart is not passing checkpoints and thus receives no reward. A reward is needed that guides the kart towards the checkpoint, which is added in exp_2. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_1 --results-dir ./Assets/results/` |
| exp_2 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall<br>+0.05 for driving towards the checkpoint | Collision Reset | 00:01:57.92 | 313 | Simply adding a reward for heading towards the checkpoint greatly improves the model's performance. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_2 --results-dir ./Assets/results/` |
| exp_3 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall<br>+0.05 for driving towards the checkpoint<br>+0.05 for local speed | Collision Reset | 00:01:26.45 | 600 | The model drives faster (expected) but also crashes more often (also expected). | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_3 --results-dir ./Assets/results/` |
| exp_4 | 1,000,000 | -1 for hitting a wall<br>+0.05 for driving towards the checkpoint | | | | | |
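The reward terms in the table map naturally onto ML-Agents' `AddReward` API. Below is a minimal sketch of how they might be wired into the kart agent, assuming a recent ML-Agents release. The class name `KartAgent`, the `Checkpoint` component, the `Wall` tag, and the exact scaling of the two 0.05 terms are assumptions for illustration, not the project's actual code; action handling and observations are omitted.

```csharp
using UnityEngine;
using Unity.MLAgents;

// Sketch of the reward shaping from the table above (exp_3 variant).
// "KartAgent", the Checkpoint component, and the "Wall" tag are hypothetical names.
public class KartAgent : Agent
{
    public Transform nextCheckpoint;   // updated externally as checkpoints are passed
    Rigidbody rb;

    public override void Initialize()
    {
        rb = GetComponent<Rigidbody>();
    }

    void FixedUpdate()
    {
        // +0.05, scaled by how directly the kart is heading towards the checkpoint.
        Vector3 toCheckpoint = (nextCheckpoint.position - transform.position).normalized;
        AddReward(0.05f * Vector3.Dot(transform.forward, toCheckpoint));

        // +0.05, scaled by the local (forward) speed; the normalization is a guess.
        AddReward(0.05f * Vector3.Dot(transform.forward, rb.velocity));
    }

    void OnTriggerEnter(Collider other)
    {
        if (other.TryGetComponent<Checkpoint>(out _))
            AddReward(1f);             // +1 for passing a checkpoint
    }

    void OnCollisionEnter(Collision collision)
    {
        if (collision.gameObject.CompareTag("Wall"))
        {
            AddReward(-1f);            // -1 for hitting a wall
            EndEpisode();              // the "Collision Reset" training setup
        }
    }
}
```

Note that under the Collision Reset setup, ending the episode on wall contact is what lets the -1 penalty double as the reset signal.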
- Incrementally add to the reward function design
- Define performance metrics for the entire experiment
- Define a clear experiment setup for each experiment

- Experiment Setup
- Effectiveness of the Reward Function
- Ablation Study Under the Standard Setting (Hit Penalty = -1, Pass Ckpt = +1)
- Self-Designed Reward Function
  - speed + direction + baseline + self-designed reward function 1
- Parameter Effectiveness Experiment
Hello everyone, this is a tutorial on training your own model in our Formula-RL environment. We assume that you have Unity3D and ml-agents pre-installed. If you haven't installed them already, please refer to the official documentation from Unity; I'll post the link in the description below.
The first thing we need to do is open our project in Unity. To do that, start Unity Hub, select Open in the upper-right corner, and select our source project folder.


Once Unity has loaded, you will probably see a scene pop up. We have several scenes in our project; for training your model specifically, select the scene located under ./Assets/Karting/Scene/MountainTopTrainingScene.unity.

Double-click the scene file. If everything goes right, the scene you see right now will pop up; this is the training environment for your own model.
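As the run scripts in the table above show, training is driven by `mlagents-learn` together with `./Assets/Karting/Prefabs/AI/train_config.yaml`. The project's actual config is not reproduced here; the following is a minimal sketch of what such a file can look like, assuming the `behaviors:` schema of recent ML-Agents releases (older releases that still accept the `--train` flag use a flat per-behavior layout instead) and the hypothetical behavior name `KartAgent`.

```yaml
behaviors:
  KartAgent:                 # assumed name; must match the agent's Behavior Parameters
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:             # the checkpoint/wall/speed rewards from the table
        gamma: 0.99
        strength: 1.0
    max_steps: 1000000       # the 1,000,000 budget listed in the experiment table
    time_horizon: 64
    summary_freq: 10000
```

With the training scene open, run the command from the table in a terminal and press Play in the editor when `mlagents-learn` prompts you to; training progress can then be monitored with TensorBoard pointed at the results directory.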