| Run ID | Epochs | Reward Function | Training Setup | Avg 10-Lap Time | Average # Crashes/Lap | Comments | Run Script |
|---|---|---|---|---|---|---|---|
| exp_0 | 1,000,000 | None | Collision Reset | N/A | N/A | The kart performs seemingly random actions and never makes it to the first checkpoint. One interesting observation is that it seems to slow down and turn away when it is about to collide with a wall; it is not clear why. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_0 --results-dir ./Assets/results/` |
| exp_1 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall | Collision Reset | N/A | N/A | The training mean reward starts negative and converges to 0, and the model performs very poorly. This is somewhat expected, since most of the time the kart is not passing checkpoints and thus receives no reward. A reward is needed that guides the kart towards the checkpoint, which is added in exp_2. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_1 --results-dir ./Assets/results/` |
| exp_2 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall<br>+0.05 for driving towards the checkpoint | Collision Reset | 00:01:57.92 | 313 | Simply adding a reward for heading towards the checkpoint greatly improves the model's performance. | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_2 --results-dir ./Assets/results/` |
| exp_3 | 1,000,000 | +1 for passing a checkpoint<br>-1 for hitting a wall<br>+0.05 for driving towards the checkpoint<br>+0.05 for local speed | Collision Reset | 00:01:26.45 | 600 | The model drives faster (expected) but also crashes more often (also expected). | `mlagents-learn ./Assets/Karting/Prefabs/AI/train_config.yaml --train --run-id=exp_3 --results-dir ./Assets/results/` |
| exp_4 | 1,000,000 | -1 for hitting a wall<br>+0.05 for driving towards the checkpoint | | | | | |
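The reward terms in the table map naturally onto ML-Agents' `AddReward` API. Below is a minimal sketch of how they might be wired into the kart agent, assuming a recent ML-Agents release. The class name `KartAgent`, the `Checkpoint` component, the `Wall` tag, and the exact scaling of the two 0.05 terms are assumptions for illustration, not the project's actual code; action handling and observations are omitted.

```csharp
using UnityEngine;
using Unity.MLAgents;

// Sketch of the reward shaping from the table above (exp_3 variant).
// "KartAgent", the Checkpoint component, and the "Wall" tag are hypothetical names.
public class KartAgent : Agent
{
    public Transform nextCheckpoint;   // updated externally as checkpoints are passed
    Rigidbody rb;

    public override void Initialize()
    {
        rb = GetComponent<Rigidbody>();
    }

    void FixedUpdate()
    {
        // +0.05, scaled by how directly the kart is heading towards the checkpoint.
        Vector3 toCheckpoint = (nextCheckpoint.position - transform.position).normalized;
        AddReward(0.05f * Vector3.Dot(transform.forward, toCheckpoint));

        // +0.05, scaled by the local (forward) speed; the normalization is a guess.
        AddReward(0.05f * Vector3.Dot(transform.forward, rb.velocity));
    }

    void OnTriggerEnter(Collider other)
    {
        if (other.TryGetComponent<Checkpoint>(out _))
            AddReward(1f);             // +1 for passing a checkpoint
    }

    void OnCollisionEnter(Collision collision)
    {
        if (collision.gameObject.CompareTag("Wall"))
        {
            AddReward(-1f);            // -1 for hitting a wall
            EndEpisode();              // the "Collision Reset" training setup
        }
    }
}
```

Note that under the Collision Reset setup, ending the episode on wall contact is what lets the -1 penalty double as the reset signal.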
- Incrementally add to the reward function design
- Define performance metrics for the entire experiment
- Define a clear experiment setup for each experiment

- Experiment Setup
- Effectiveness of the Reward Function
- Ablation Study Under the Standard Setting (Hit Penalty = -1, Pass Ckpt = +1)
- Self-Designed Reward Function
  - speed + direction + baseline + self-designed reward function 1
- Parameter Effectiveness Experiment
Hello everyone, this is a tutorial on training your own model in our Formula-RL environment. We assume that you have Unity3D and ml-agents pre-installed. If you haven't installed them already, please refer to the official documentation from Unity; I'll post the link in the description below.
The first thing we need to do is open our project in Unity. To do that, start Unity Hub, select Open in the upper-right corner, and select our source project folder.


Once Unity has loaded, you will probably see a scene pop up. We have several scenes in our project; for training your model specifically, select the scene located under ./Assets/Karting/Scene/MountainTopTrainingScene.unity.

Double-click the scene file. If everything goes right, the scene you see right now will pop up; this is the training environment for your own model.
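As the run scripts in the table above show, training is driven by `mlagents-learn` together with `./Assets/Karting/Prefabs/AI/train_config.yaml`. The project's actual config is not reproduced here; the following is a minimal sketch of what such a file can look like, assuming the `behaviors:` schema of recent ML-Agents releases (older releases that still accept the `--train` flag use a flat per-behavior layout instead) and the hypothetical behavior name `KartAgent`.

```yaml
behaviors:
  KartAgent:                 # assumed name; must match the agent's Behavior Parameters
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:             # the checkpoint/wall/speed rewards from the table
        gamma: 0.99
        strength: 1.0
    max_steps: 1000000       # the 1,000,000 budget listed in the experiment table
    time_horizon: 64
    summary_freq: 10000
```

With the training scene open, run the command from the table in a terminal and press Play in the editor when `mlagents-learn` prompts you to; training progress can then be monitored with TensorBoard pointed at the results directory.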