QED5 Epsilon decay 10k

Find the optimal epsilon decay
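
As a rough illustration of what an epsilon decay does during epsilon-greedy action selection, here is a minimal sketch. The source does not define what names such as decay-100-80 mean, so the linear schedule below, the function names `decayed_epsilon` and `choose_action`, and the parameters `start_pct`, `end_pct` and `total_steps` are all hypothetical. Only the base epsilon of 0.05 comes from this experiment.

```python
import random


def decayed_epsilon(base_epsilon, step, total_steps, start_pct=100, end_pct=80):
    """Scale base_epsilon linearly from start_pct% to end_pct% over total_steps.

    Hypothetical schedule: the report does not say how its decay names are defined.
    """
    if total_steps <= 1:
        return base_epsilon * start_pct / 100.0
    # Clamp the step so the schedule stays flat outside [0, total_steps - 1].
    progress = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    pct = start_pct + (end_pct - start_pct) * progress
    return base_epsilon * pct / 100.0


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


rng = random.Random(0)
total_steps = 10_000  # assumed step count, chosen only for illustration
epsilon_now = decayed_epsilon(0.05, step=5_000, total_steps=total_steps)
action = choose_action([0.1, 0.9, 0.3], epsilon_now, rng)
```

With a schedule like this, "none" would correspond to keeping epsilon constant, while stronger decays reduce exploration more as training progresses.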

training.parallel.ParallelConfig.q-ed-1

Cross-validation for epsilon decay

parameter        id     value
learning rate    L0     0.2
learning rate    L1     0.2
learning rate    L2     0.2
learning rate    L3     0.2
epsilon          E0     0.05
epsilon decay    ED0    none
epsilon decay    ED1    decay-100-80
epsilon decay    ED2    decay-100-50
epsilon decay    ED3    decay-100-20
discount         D0     0.3
mapping          M0     non-linear-3
reward handler   R0     can-see
fetch mode       F0     eager
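
The run labels used in the sections below (for example L0 E0 ED0 D0 M0 R0 F0) can be read as one index per parameter. The following sketch shows one way such labels could be generated from the parameter listing above. The `GRID` dictionary layout and the `enumerate_runs` helper are assumptions made for illustration, not the actual structure of the experiment configuration.

```python
from itertools import product

# Values copied from the parameter listing; the dict layout itself is an assumption.
GRID = {
    "L": [0.2, 0.2, 0.2, 0.2],   # learning rate
    "E": [0.05],                  # epsilon
    "ED": ["none", "decay-100-80", "decay-100-50", "decay-100-20"],  # epsilon decay
    "D": [0.3],                   # discount
    "M": ["non-linear-3"],        # mapping
    "R": ["can-see"],             # reward handler
    "F": ["eager"],               # fetch mode
}


def enumerate_runs(grid):
    """Yield (label, settings) for every combination of parameter indices."""
    keys = list(grid)
    index_ranges = [range(len(grid[k])) for k in keys]
    for combo in product(*index_ranges):
        label = " ".join(f"{k}{i}" for k, i in zip(keys, combo))
        settings = {k: grid[k][i] for k, i in zip(keys, combo)}
        yield label, settings


runs = list(enumerate_runs(GRID))
for label, settings in runs[:2]:
    print(label, settings["ED"])
```

Because only the learning rate and the epsilon decay have more than one entry, the product varies the epsilon decay fastest within each learning rate index, which matches the order of the labels in this report.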

L0 E0 ED0 D0 M0 R0 F0

q-values
videos: 0 to 10

L0 E0 ED1 D0 M0 R0 F0

q-values
videos: 0 to 10

L0 E0 ED2 D0 M0 R0 F0

q-values
videos: 0 to 10

L0 E0 ED3 D0 M0 R0 F0

q-values
videos: 0 to 10

L1 E0 ED0 D0 M0 R0 F0

q-values
videos: 0 to 10

L1 E0 ED1 D0 M0 R0 F0

q-values
videos: 0 to 10

L1 E0 ED2 D0 M0 R0 F0

q-values
videos: 0 to 10

L1 E0 ED3 D0 M0 R0 F0

q-values
videos: 0 to 10

L2 E0 ED0 D0 M0 R0 F0

q-values
videos: 0 to 10

L2 E0 ED1 D0 M0 R0 F0

q-values
videos: 0 to 10

L2 E0 ED2 D0 M0 R0 F0

q-values
videos: 0 to 10

L2 E0 ED3 D0 M0 R0 F0

q-values
videos: 0 to 10