Problem QLOW03: Results do not change
Goal: Find better values for the learning rate (L) and epsilon (E), based on the results of QLOW03

| | L0 | L1 | L2 |
|---|---|---|---|
| learning rate | 0.01 | 0.005 | 0.001 |

| | E0 | E1 | E2 |
|---|---|---|---|
| epsilon | 0.01 | 0.005 | 0.001 |

| | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 |
|---|---|---|---|---|---|---|---|---|---|---|
| discount | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |

| | M0 |
|---|---|
| mapping | non-linear-3 |

| | R0 |
|---|---|
| reward handler | speed-bonus |
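
The tables above can be read as a parameter grid. The following is a minimal sketch of how such a grid might be enumerated, assuming every label (L, E, D, M, R) is combined with every other one to form a run. The names `build_runs`, the dictionary layout, and the run-name format are hypothetical and not taken from the experiment code; only the values are copied from the tables.

```python
from itertools import product

# Values copied from the tables above. The dictionary layout is an assumption.
LEARNING_RATES = {"L0": 0.01, "L1": 0.005, "L2": 0.001}
EPSILONS = {"E0": 0.01, "E1": 0.005, "E2": 0.001}
DISCOUNTS = {f"D{i}": 0.3 for i in range(10)}  # D0..D9 all use 0.3
MAPPINGS = {"M0": "non-linear-3"}
REWARD_HANDLERS = {"R0": "speed-bonus"}


def build_runs():
    """Return one config dict per combination of labels (hypothetical helper)."""
    runs = []
    for (l, lr), (e, eps), (d, disc), (m, mapping), (r, handler) in product(
        LEARNING_RATES.items(),
        EPSILONS.items(),
        DISCOUNTS.items(),
        MAPPINGS.items(),
        REWARD_HANDLERS.items(),
    ):
        runs.append(
            {
                "name": f"{l}-{e}-{d}-{m}-{r}",  # run-name format is an assumption
                "learning_rate": lr,
                "epsilon": eps,
                "discount": disc,
                "mapping": mapping,
                "reward_handler": handler,
            }
        )
    return runs


if __name__ == "__main__":
    runs = build_runs()
    print(len(runs))  # 3 * 3 * 10 * 1 * 1 combinations
    print(runs[0]["name"])
```

Because all ten D labels share the discount 0.3, runs that differ only in their D label have identical hyperparameters under this reading, so the grid effectively varies only L and E.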