DL Experiments

Every Conv/FC layer is followed by a BN layer.

Some of the validation loss plots are wrong!


| # | Model | Model Details | Dataset | Epochs | Optimizer | Batch size | Train Acc % | Val Acc % | Data Aug | Others |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Deep-Emotion | Vanilla, same as deep-emotion repo, baseline | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 64.853 | 62.573 | RandomFlip | params=67,037 |
| 2 | Deep-Emotion | Wider, channel: 10⇒64 | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 85.233 | 64.375 | RandomFlip | params=394,763 |
| 3 | Deep-Emotion | Deeper, two more conv layers | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 67.977 | 58.509 | RandomFlip | params=68,897 |
| 4 | Deep-Emotion | Wider + Deformable Conv (2 layers) | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 89.101 | 68.561 | RandomFlip | params=352,065 |
| 5 | VGG | Vanilla VGG | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 98.587 | 70.241 | RandomFlip | params=40,376,519 |
| 6 | Deep-Emotion | Wider, channel: 10⇒64 | CK_PLUS_256<br>(48x48) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 100 | 72.727<br>[80+] | RandomFlip | params=394,763 |
| 7 | Deep-Emotion | Wider, channel: 10⇒64 | CK_PLUS_256<br>(256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 98.639 | 93.939 | RandomFlip<br>RandomCrop (224x224) | params=9,969,163 |
| 8 | Deep-Emotion | Wider, channel: 10⇒64<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 99.660<br>(10-fold) | 92.055<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=9,969,163 |
| 9 | Deep-Emotion | Wider + Deformable Conv (2 layers)<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 100<br>(10-fold) | 95.095<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=9,926,465 |
| 10 | Deep-Emotion224 | Deep_Emotion224<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 200 | Adam, lr=5e-5, weight_decay=1e-4 | 16 | 99.660<br>(10-fold) | 94.186<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=1,344,279 |
| 11 | Deep-Emotion | Baseline Deep-Emotion<br>(10-fold cross validation) | CK_PLUS_256<br>(48x48) | 300 | Adam, lr=1e-4, weight_decay=1e-4 | 32 | 86.646<br>(10-fold) | 85.597<br>(10-fold) | RandomFlip<br>~~RandomCrop (224x224)~~ | params=67,037 |
| 12 | Deep-Emotion | Baseline Deep-Emotion | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 66.571 | 51.212 | None | params=67,037 |
| 13 | Deep-Emotion | Baseline Deep-Emotion | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 58.783 | 54.611 | RandomFlip | params=67,037 |
| 14 | Deep-Emotion | Baseline Deep-Emotion<br>(closest to paper settings) | FER_2013 | 500 | Adam, lr=5e-3, weight_decay=1e-4 | 64 | - | Loss exploded | RandomFlip | params=67,037 |
| 15 | Deep-Emotion | Baseline Deep-Emotion<br>(changed the order of ReLU) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 60.256 | 54.249 | RandomFlip | params=67,037 |
| 16 | Deep-Emotion | Baseline Deep-Emotion<br>(changed the order of ReLU) | FER_2013 | 500 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 69.563 | 54.444 | RandomFlip | params=67,037 |
| 17 | Deep-Emotion | Baseline Deep-Emotion<br>(changed the order of ReLU) | FER_2013 | 500 | Adam, lr=1e-4, weight_decay=1e-4, ReduceLROnPlateau | 64 | ReduceLROnPlateau just keeps decreasing the LR to a tiny value | and it does not help lower the val loss | RandomFlip | params=67,037<br>patience=5 |
| 18 | Deep-Emotion | Baseline + Deformable Conv<br>(2 layers) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 61.754 | 56.617 | RandomFlip<br>ColorJitter(0.3,0.3,0.3) | params=70,131 |
| 19 | Simple CNN | 3Conv + 2FC, simple form<br>(yet gets similar acc to Deep-Emotion???) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 60.782 | 55.782 | RandomFlip<br>ColorJitter(0.3,0.3,0.3) | params=44,935 |
| 20 | Deep-Emotion | Baseline + Deformable Conv<br>(2 layers)<br>Corrected val loss plot | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 59.915 | 56.590 | RandomFlip<br>ColorJitter(0.3,0.3,0.3) | params=70,131 |
| 21 | Deep-Emotion | Deep_Emotion224<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 99.966<br>(10-fold) | 94.517<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=1,344,279<br>compare w/ 10 |
| 22 | Deep-Emotion | Deep_Emotion224 w/ 2 dropout (0.5)<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 58.443<br>(too much dropout) | 91.126<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=1,344,279 |
| 23 | Deep-Emotion | Deep_Emotion224 w/ 1 dropout (0.8) (after first FC)<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 80.667<br>(10-fold) | 89.015<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=1,344,279 |
| 24 | Deep-Emotion | Deep_Emotion224 w/ 2 dropout (0.2)<br>(10-fold cross validation) | CK_PLUS_256<br>(256x256) | 400 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 90.927<br>(10-fold) | 96.932<br>(10-fold) | RandomFlip<br>RandomCrop (224x224) | params=1,344,279 |
| 25 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2)<br>(10-fold cross validation) | CK_PLUS_256<br>(48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 89.092<br>(10-fold) | 95.133<br>(10-fold) | RandomFlip | params=352,065 |
| 26 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2)<br>(10-fold cross validation) | CK_PLUS_256<br>(48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=15) | 64 | 88.175<br>(10-fold) | 96.628<br>(10-fold) | RandomFlip | params=352,065 |
| 27 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2)<br>(10-fold cross validation) | CK_PLUS_256<br>(48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 89.058<br>(10-fold) | 96.648<br>(10-fold) | RandomFlip | params=352,065 |
| 28 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.4)<br>(10-fold cross validation)<br>2 * dropout(0.4) seems to be too much regularization | CK_PLUS_256<br>(48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 70.473<br>(10-fold) | 95.417<br>(10-fold) | RandomFlip | params=352,065 |
| 29 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2)<br>(10-fold cross validation) | CK_PLUS_256<br>(48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 92.490<br>(10-fold) | 90.237<br>(10-fold) | RandomFlip<br>Corrected augmentation: only the train set is augmented, the val set is not | params=352,065 |
| 30 | | | | | | | | | | |
| 31 | | | | | | | | | | |
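
The correction noted for experiment 29 (augment the train set only, leave the val set untouched) can be sketched in plain Python as follows. This is a minimal illustration, not the code used for these runs: `k_fold_indices`, `random_flip`, and `make_split` are hypothetical helpers, the 2x2 nested lists stand in for real images, and the way the folds are drawn is an assumption.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def random_flip(image, rng):
    """Hypothetical stand-in for RandomFlip: mirror each row with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def make_split(images, folds, val_fold, rng):
    """Build one train/val split; only the training images are augmented."""
    val_idx = set(folds[val_fold])
    train = [random_flip(images[i], rng)
             for i in range(len(images)) if i not in val_idx]
    val = [images[i] for i in sorted(val_idx)]  # validation images stay untouched
    return train, val


# Toy data: 25 tiny 2x2 "images" standing in for the real dataset.
images = [[[i, i + 1], [i + 2, i + 3]] for i in range(25)]
folds = k_fold_indices(len(images), k=10)
rng = random.Random(0)
for fold in range(len(folds)):
    train, val = make_split(images, folds, fold, rng)
    print(f"fold {fold}: {len(train)} train (augmented), {len(val)} val (clean)")
```

Augmenting the val set as well makes the validation accuracy depend on the random transforms drawn at evaluation time, which may partly explain why experiment 29 reports different numbers from experiments 25 to 27 under otherwise similar settings.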

Plots


(associated experiment is specified in the caption)


Some Intermediate Conclusions
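
Experiment 17 notes that ReduceLROnPlateau kept shrinking the LR to a tiny value without lowering the val loss. The toy simulation below suggests why: once the val loss stops improving, a plateau rule with a short patience and no floor cuts the LR again and again. It is a simplified sketch of the reduce-on-plateau idea, not PyTorch's exact implementation. The val loss curve is made up for illustration, and `factor=0.1` is assumed (it is PyTorch's default, but the table does not state which value was used).

```python
def simulate_plateau_lr(val_losses, lr=1e-4, factor=0.1, patience=5, min_lr=0.0):
    """Simplified reduce-on-plateau rule.

    The LR is multiplied by `factor` whenever the validation loss has failed
    to beat its best value for more than `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            lr = max(lr * factor, min_lr)  # never go below the floor
            bad_epochs = 0
        history.append(lr)
    return history


# Made-up curve: val loss improves for 20 epochs, then sits above its best value.
val_losses = [2.0 - 0.05 * e for e in range(20)] + [1.2] * 80

no_floor = simulate_plateau_lr(val_losses, patience=5)
with_floor = simulate_plateau_lr(val_losses, patience=20, min_lr=1e-7)
print(f"patience=5, no floor:      final lr = {no_floor[-1]:.1e}")
print(f"patience=20, min_lr=1e-7:  final lr = {with_floor[-1]:.1e}")
```

A longer patience and an explicit `min_lr`, as used from experiment 27 onward, keep the LR from collapsing when the val loss has simply saturated.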