All Conv/FC layers are followed by a BN layer.
Some of the validation loss plots are wrong!
| # | Model | Model Details | Dataset | Epochs | Optimizer | Batch Size | Train Acc% | Val Acc% | Data Aug | Others |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Deep-Emotion | Vanilla, same as deep-emotion repo, baseline | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 64.853 | 62.573 | RandomFlip | params=67,037 |
| 2 | Deep-Emotion | Wider, channel: 10⇒64 | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 85.233 | 64.375 | RandomFlip | params=394,763 |
| 3 | Deep-Emotion | Deeper, two more conv layers | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 67.977 | 58.509 | RandomFlip | params=68,897 |
| 4 | Deep-Emotion | Wider + Deformable Conv (2 lyrs) | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 89.101 | 68.561 | RandomFlip | params=352,065 |
| 5 | VGG | Vanilla VGG | FER_CKPLUS | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 98.587 | 70.241 | RandomFlip | params=40,376,519 |
| 6 | Deep-Emotion | Wider, channel: 10⇒64 | CK_PLUS_256 (48x48) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 100 | 72.727 [80+] | RandomFlip | params=394,763 |
| 7 | Deep-Emotion | Wider, channel: 10⇒64 | CK_PLUS_256 (256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 98.639 | 93.939 | RandomFlip, RandomCrop (224x224) | params=9,969,163 |
| 8 | Deep-Emotion | Wider, channel: 10⇒64 (10-fold cross validation) | CK_PLUS_256 (256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 99.660 (10-fold) | 92.055 (10-fold) | RandomFlip, RandomCrop (224x224) | params=9,969,163 |
| 9 | Deep-Emotion | Wider + Deformable Conv (2 lyrs) (10-fold cross validation) | CK_PLUS_256 (256x256) | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 16 | 100 (10-fold) | 95.095 (10-fold) | RandomFlip, RandomCrop (224x224) | params=9,926,465 |
| 10 | Deep-Emotion224 | Deep_Emotion224 (10-fold cross validation) | CK_PLUS_256 (256x256) | 200 | Adam, lr=5e-5, weight_decay=1e-4 | 16 | 99.660 (10-fold) | 94.186 (10-fold) | RandomFlip, RandomCrop (224x224) | params=1,344,279 |
| 11 | Deep-Emotion | Baseline Deep-Emotion (10-fold cross validation) | CK_PLUS_256 (48x48) | 300 | Adam, lr=1e-4, weight_decay=1e-4 | 32 | 86.646 (10-fold) | 85.597 (10-fold) | RandomFlip, ~~RandomCrop (224x224)~~ | params=67,037 |
| 12 | Deep-Emotion | Baseline Deep-Emotion | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 66.571 | 51.212 | None | params=67,037 |
| 13 | Deep-Emotion | Baseline Deep-Emotion | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 58.783 | 54.611 | RandomFlip | params=67,037 |
| 14 | | (Closest to paper settings) | | | | | - | Loss exploded | | |
| 15 | Deep-Emotion | Baseline Deep-Emotion (changed the order of ReLU) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 60.256 | 54.249 | RandomFlip | params=67,037 |
| 16 | Deep-Emotion | Baseline Deep-Emotion (changed the order of ReLU) | FER_2013 | 500 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 69.563 | 54.444 | RandomFlip | params=67,037 |
| 17 | Deep-Emotion | Baseline Deep-Emotion (changed the order of ReLU) | FER_2013 | 500 | Adam, lr=1e-4, weight_decay=1e-4, ReduceLROnPlateau | 64 | ReduceLROnPlateau just keeps decreasing the LR to a tiny value | and it does not help lower the val loss | RandomFlip | params=67,037, patience=5 |
| 18 | Deep-Emotion | Baseline + Deformable Conv (2 layers) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 61.754 | 56.617 | RandomFlip, ColorJitter(0.3,0.3,0.3) | params=70,131 |
| 19 | Simple CNN | 3Conv + 2FC, simple form (yet gets similar acc to Deep-Emotion???) | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-4 | 64 | 60.782 | 55.782 | RandomFlip, ColorJitter(0.3,0.3,0.3) | params=44,935 |
| 20 | Deep-Emotion | Baseline + Deformable Conv (2 layers), corrected val loss plot | FER_2013 | 100 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 59.915 | 56.590 | RandomFlip, ColorJitter(0.3,0.3,0.3) | params=70,131 |
| 21 | Deep-Emotion | Deep_Emotion224 (10-fold cross validation) | CK_PLUS_256 (256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 99.966 (10-fold) | 94.517 (10-fold) | RandomFlip, RandomCrop (224x224) | params=1,344,279, compare w/ 10 |
| 22 | Deep-Emotion | Deep_Emotion224 w/ 2 dropout (0.5) (10-fold cross validation) | CK_PLUS_256 (256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 58.443 (dropout too much) | 91.126 (10-fold) | RandomFlip, RandomCrop (224x224) | params=1,344,279 |
| 23 | Deep-Emotion | Deep_Emotion224 w/ 1 dropout (0.8) (after first FC) (10-fold cross validation) | CK_PLUS_256 (256x256) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 80.667 (10-fold) | 89.015 (10-fold) | RandomFlip, RandomCrop (224x224) | params=1,344,279 |
| 24 | Deep-Emotion | Deep_Emotion224 w/ 2 dropout (0.2) (10-fold cross validation) | CK_PLUS_256 (256x256) | 400 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 90.927 (10-fold) | 96.932 (10-fold) | RandomFlip, RandomCrop (224x224) | params=1,344,279 |
| 25 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2) (10-fold cross validation) | CK_PLUS_256 (48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2 | 64 | 89.092 (10-fold) | 95.133 (10-fold) | RandomFlip | params=352,065 |
| 26 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2) (10-fold cross validation) | CK_PLUS_256 (48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=15) | 64 | 88.175 (10-fold) | 96.628 (10-fold) | RandomFlip | params=352,065 |
| 27 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2) (10-fold cross validation) | CK_PLUS_256 (48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 89.058 (10-fold) | 96.648 (10-fold) | RandomFlip | params=352,065 |
| 28 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.4) (10-fold cross validation). 2 * dropout(0.4) seems to be too much regularization | CK_PLUS_256 (48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 70.473 (10-fold) | 95.417 (10-fold) | RandomFlip | params=352,065 |
| 29 | Deep-Emotion | Deep_Emotion + wider + Deformable Conv w/ 2 dropout (0.2) (10-fold cross validation) | CK_PLUS_256 (48x48) | 200 | Adam, lr=1e-4, weight_decay=1e-2, ReduceLROnPlateau(patience=20, min_lr=1e-7) | 64 | 92.490 (10-fold) | 90.237 (10-fold) | RandomFlip. Corrected augmentation: only the train set is augmented, the val set is not | params=352,065 |
| 30 | | | | | | | | | | |
| 31 | | | | | | | | | | |
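
Experiment 17 notes that ReduceLROnPlateau keeps shrinking the LR to a tiny value, and experiments 27 to 29 add `min_lr=1e-7`. A rough pure-Python sketch of a reduce-on-plateau rule shows why the floor matters. This is a simplified stand-in, not PyTorch's implementation; `plateau_schedule`, the `factor=0.1` setting, and the toy loss curve are all assumptions made for illustration.

```python
def plateau_schedule(val_losses, lr=1e-4, factor=0.1, patience=5, min_lr=0.0):
    """Return the LR used at each epoch under a simplified reduce-on-plateau rule."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        history.append(lr)           # LR in effect for this epoch
        if loss < best:              # val loss improved: reset the counter
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:    # plateau: shrink LR, but never below min_lr
            lr = max(lr * factor, min_lr)
            bad_epochs = 0
    return history


# Hypothetical val-loss curve: improves for 10 epochs, then stays flat for 90.
losses = [1.5 - 0.05 * e for e in range(10)] + [1.1] * 90

unbounded = plateau_schedule(losses, patience=5)              # no floor
floored = plateau_schedule(losses, patience=20, min_lr=1e-7)  # floor as in exp. 27
print(f"no floor: {unbounded[-1]:.1e}, with min_lr: {floored[-1]:.1e}")
```

With a flat val loss and no floor, the LR is cut again every `patience + 1` epochs, so a long run ends up training with an effectively zero step size. A larger patience and a `min_lr` floor limit that.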
(associated experiment is specified in the caption)
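
Experiment 29 corrects the augmentation so that only the train set is augmented. A minimal sketch of how that fits into 10-fold cross validation is below, in plain Python so it runs without any framework. The helper names (`kfold_indices`, `random_flip`, `build_fold`) and the toy images are hypothetical and are not taken from the actual training code.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def random_flip(image, rng, p=0.5):
    """Horizontally flip a 2-D image (list of rows) with probability p."""
    return [row[::-1] for row in image] if rng.random() < p else image


def build_fold(images, train_idx, val_idx, rng):
    """Augment training images only; validation images pass through unchanged."""
    train = [random_flip(images[i], rng) for i in train_idx]
    val = [images[i] for i in val_idx]
    return train, val


# Toy data: 20 tiny 2x3 "images" standing in for real frames.
images = [[[n, n + 1, n + 2], [n + 3, n + 4, n + 5]] for n in range(20)]
rng = random.Random(0)
for train_idx, val_idx in kfold_indices(len(images), k=10):
    train, val = build_fold(images, train_idx, val_idx, rng)
    # Train on `train`, evaluate on `val`, then average the metric over the folds.
```

Keeping the val set unaugmented means the reported val accuracy is measured on the original images, which makes it comparable across runs.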