Find the optimal epsilon decay for decay periods of 1000 and 3000 epochs
Cross-validation for epsilon decay

| | L0 | L1 | L2 | L3 |
|---|---|---|---|---|
| learning rate | 0.2 | 0.2 | 0.2 | 0.2 |

| | E0 |
|---|---|
| epsilon | 0.05 |

| | ED0 | ED1 | ED2 | ED3 |
|---|---|---|---|---|
| epsilon decay | none | decay-1000-80 | decay-1000-50 | decay-1000-20 |

| | ED4 | ED5 | ED6 | ED7 |
|---|---|---|---|---|
| epsilon decay | none | decay-3000-80 | decay-3000-50 | decay-3000-20 |
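
The decay labels above follow a `decay-<epochs>-<value>` pattern. A minimal sketch of how such labels could be split into their parts, assuming the first number is the decay length in epochs (1000 or 3000, as in the title). The meaning of the second number is not stated here, so it is kept as an uninterpreted field. The names `DecaySpec` and `parse_decay_label` are hypothetical.

```python
from typing import NamedTuple, Optional


class DecaySpec(NamedTuple):
    epochs: int  # first number in the label (assumed: decay length in epochs)
    value: int   # second number; its meaning is not stated in the source


def parse_decay_label(label: str) -> Optional[DecaySpec]:
    """Split 'decay-<epochs>-<value>' into its parts; 'none' means no decay."""
    if label == "none":
        return None
    kind, epochs, value = label.split("-")
    if kind != "decay":
        raise ValueError(f"unexpected decay label: {label!r}")
    return DecaySpec(int(epochs), int(value))


# Labels copied from the epsilon decay tables.
EPSILON_DECAY = {
    "ED0": "none",
    "ED1": "decay-1000-80",
    "ED2": "decay-1000-50",
    "ED3": "decay-1000-20",
    "ED4": "none",
    "ED5": "decay-3000-80",
    "ED6": "decay-3000-50",
    "ED7": "decay-3000-20",
}

specs = {key: parse_decay_label(label) for key, label in EPSILON_DECAY.items()}
```

With this, `specs["ED5"].epochs` can be used to group runs by decay length without hard-coding the label strings.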

| | D0 |
|---|---|
| discount | 0.3 |

| | M0 |
|---|---|
| mapping | non-linear-3 |

| | R0 |
|---|---|
| reward handler | can-see |

| | F0 |
|---|---|
| fetch mode | eager |
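
The settings above can be expanded into a list of run configurations. The sketch below assumes the tables describe a full grid and that repeated values (the four identical learning rates, the two `none` decay entries) denote the same setting and can be collapsed. Neither assumption is confirmed by the tables themselves, and the key names and the helper `build_grid` are illustrative only.

```python
from itertools import product

# Values copied from the tables; key names are illustrative.
PARAMS = {
    "learning_rate": [0.2, 0.2, 0.2, 0.2],
    "epsilon": [0.05],
    "epsilon_decay": [
        "none", "decay-1000-80", "decay-1000-50", "decay-1000-20",
        "none", "decay-3000-80", "decay-3000-50", "decay-3000-20",
    ],
    "discount": [0.3],
    "mapping": ["non-linear-3"],
    "reward_handler": ["can-see"],
    "fetch_mode": ["eager"],
}


def unique(values):
    """Drop repeated values while keeping first-seen order."""
    return list(dict.fromkeys(values))


def build_grid(params):
    """Return one dict per combination of the distinct parameter values."""
    names = list(params)
    value_lists = [unique(params[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


configs = build_grid(PARAMS)
for cfg in configs:
    print(cfg["epsilon_decay"], cfg["learning_rate"], cfg["discount"])
```

Because only epsilon decay varies, the grid reduces to one run per distinct decay setting; every other parameter is held fixed across runs.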