Package jline.api.rl
Class RlTDAgentGeneral
public final class RlTDAgentGeneral
General Temporal Difference Learning Agent for Queueing Network Control.
Supports multiple value function approximation methods:
Tabular
HashMap-based
Linear function approximation
Quadratic function approximation
-
Nested Class Summary
Nested Classes
    public class RlTDAgentGeneral.Companion
-
Field Summary
Fields
    public final static RlTDAgentGeneral.Companion Companion
-
Constructor Summary
Constructors
    RlTDAgentGeneral(Double lr, Double epsilon, Double epsDecay)
-
Method Summary
final Unit reset(RlEnvGeneral env)
    Reset the agent and environment to initial state
final Array<DoubleArray> getValueFunction()
    Get the current value function
final Array<DoubleArray> solveForFixedPolicy(RlEnvGeneral env, Integer numEpisodes)
    TD learning for value function with fixed heuristic routing policy
final Array<DoubleArray> solve(RlEnvGeneral env, Integer numEpisodes)
    TD Control with tabular value function
final Pair<Array<DoubleArray>, DoubleArray> solveByHashmap(RlEnvGeneral env, Integer numEpisodes)
    TD Control using HashMap value function
final Triple<Array<DoubleArray>, DoubleArray, DoubleArray> solveByLinear(RlEnvGeneral env, Integer numEpisodes)
    TD control using linear value function approximator
final Triple<Array<DoubleArray>, DoubleArray, DoubleArray> solveByQuad(RlEnvGeneral env, Integer numEpisodes)
    TD control using quadratic value function approximator
-
Method Detail
-
reset
final Unit reset(RlEnvGeneral env)
Reset the agent and environment to initial state
-
getValueFunction
final Array<DoubleArray> getValueFunction()
Get the current value function
-
solveForFixedPolicy
final Array<DoubleArray> solveForFixedPolicy(RlEnvGeneral env, Integer numEpisodes)
TD learning for value function with fixed heuristic routing policy
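
For orientation, a generic TD(0) evaluation step on a two-dimensional value table might look like the sketch below. It is not taken from this class: the class name `TdEvaluationSketch`, the method `tdUpdate`, the discount factor `gamma` (the constructor lists no such parameter) and the two-queue state layout are all assumptions made for illustration.

```java
// Hypothetical sketch, not the jline.api.rl implementation.
// Assumes a state is a pair of queue lengths indexing a 2-D value table.
final class TdEvaluationSketch {

    /**
     * One TD(0) update under a fixed policy:
     *   V(s) <- V(s) + lr * (reward + gamma * V(s') - V(s))
     * Returns the TD error so the caller can monitor convergence.
     */
    static double tdUpdate(double[][] v, int[] s, double reward, int[] sNext,
                           double lr, double gamma) {
        double tdError = reward + gamma * v[sNext[0]][sNext[1]] - v[s[0]][s[1]];
        v[s[0]][s[1]] += lr * tdError;
        return tdError;
    }
}
```

Because the policy is fixed, the transition `(s, reward, sNext)` would come from simulating the environment under that policy; only the value estimate is learned.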
-
solve
final Array<DoubleArray> solve(RlEnvGeneral env, Integer numEpisodes)
TD Control with tabular value function
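
One common way to turn a tabular state-value estimate into a control rule is an epsilon-greedy one-step lookahead. The sketch below shows that idea for routing an arriving job between two queues. It rests on several assumptions that this page does not confirm: that table entries are costs-to-go (lower is better), that `epsDecay` is applied multiplicatively, and that the state is a pair of queue lengths. The names `EpsilonGreedyRoutingSketch`, `chooseQueue` and `decay` are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch, not the jline.api.rl implementation.
final class EpsilonGreedyRoutingSketch {

    /**
     * Chooses queue 0 or 1 for an arriving job.
     * With probability epsilon a queue is picked at random (exploration);
     * otherwise the queue whose resulting state has the lower estimated cost is picked.
     * The caller must ensure q0 + 1 and q1 + 1 are valid indices of v.
     */
    static int chooseQueue(double[][] v, int q0, int q1, double epsilon, Random rng) {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(2);
        }
        double costIfQueue0 = v[q0 + 1][q1];
        double costIfQueue1 = v[q0][q1 + 1];
        return costIfQueue0 <= costIfQueue1 ? 0 : 1;
    }

    /** Multiplicative exploration decay (an assumed schedule). */
    static double decay(double epsilon, double epsDecay) {
        return epsilon * epsDecay;
    }
}
```

With `epsilon` set to 0 the rule is purely greedy, which is convenient for evaluating a learned table.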
-
solveByHashmap
final Pair<Array<DoubleArray>, DoubleArray> solveByHashmap(RlEnvGeneral env, Integer numEpisodes)
TD Control using HashMap value function
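
A hash-map value function stores estimates only for states that have actually been visited, which helps when the state space is large or unbounded. The following is a minimal sketch of that data structure with a TD(0) update; the class `HashMapValueSketch`, the `List<Integer>` state key, the default value of 0 and the `gamma` parameter are illustrative assumptions, not details of this class.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the jline.api.rl implementation.
final class HashMapValueSketch {
    // Keys are immutable lists of queue lengths; unseen states default to 0.
    private final Map<List<Integer>, Double> values = new HashMap<>();

    double get(List<Integer> state) {
        return values.getOrDefault(state, 0.0);
    }

    /** TD(0) update on the sparse table; returns the TD error. */
    double update(List<Integer> s, double reward, List<Integer> sNext,
                  double lr, double gamma) {
        double tdError = reward + gamma * get(sNext) - get(s);
        values.put(s, get(s) + lr * tdError);
        return tdError;
    }

    /** Number of states that have received at least one update. */
    int size() {
        return values.size();
    }
}
```

A call such as `update(List.of(3, 1), -4.0, List.of(2, 1), 0.1, 1.0)` creates an entry for the state `[3, 1]` only; the successor state stays out of the map until it is updated itself.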
-
solveByLinear
final Triple<Array<DoubleArray>, DoubleArray, DoubleArray> solveByLinear(RlEnvGeneral env, Integer numEpisodes)
TD control using linear value function approximator
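
As a rough illustration of linear value function approximation, the sketch below represents the value as a weighted sum of features and applies a semi-gradient TD(0) update to the weights. The feature choice (a bias term plus the raw queue lengths), the `gamma` parameter and all names (`LinearTdSketch`, `features`, `value`, `update`) are assumptions; this page does not state which features or update rule the class uses.

```java
// Hypothetical sketch, not the jline.api.rl implementation.
final class LinearTdSketch {

    /** Feature vector [1, q_0, ..., q_{n-1}] for a vector of queue lengths. */
    static double[] features(int[] state) {
        double[] phi = new double[state.length + 1];
        phi[0] = 1.0; // bias term
        for (int i = 0; i < state.length; i++) {
            phi[i + 1] = state[i];
        }
        return phi;
    }

    /** Approximate value: dot product of weights and features. */
    static double value(double[] w, double[] phi) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * phi[i];
        }
        return sum;
    }

    /** Semi-gradient TD(0): w <- w + lr * tdError * phi(s). Returns the TD error. */
    static double update(double[] w, int[] s, double reward, int[] sNext,
                         double lr, double gamma) {
        double[] phi = features(s);
        double tdError = reward + gamma * value(w, features(sNext)) - value(w, phi);
        for (int i = 0; i < w.length; i++) {
            w[i] += lr * tdError * phi[i];
        }
        return tdError;
    }
}
```

The weight array must have one entry per feature, so a two-queue state needs three weights in this sketch.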
-
solveByQuad
final Triple<Array<DoubleArray>, DoubleArray, DoubleArray> solveByQuad(RlEnvGeneral env, Integer numEpisodes)
TD control using quadratic value function approximator
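
A quadratic approximator is still linear in its weights; what changes is the feature vector, which gains squared and cross terms of the state variables. The sketch below builds one plausible quadratic feature vector from a vector of queue lengths. The exact feature set used by this class is not documented here, so the layout and the name `QuadraticFeaturesSketch` are assumptions.

```java
// Hypothetical sketch, not the jline.api.rl implementation.
final class QuadraticFeaturesSketch {

    /**
     * Builds [1, q_0..q_{n-1}, q_i * q_j for all i <= j].
     * For n state variables this yields 1 + n + n(n+1)/2 features.
     */
    static double[] features(int[] state) {
        int n = state.length;
        double[] phi = new double[1 + n + n * (n + 1) / 2];
        int k = 0;
        phi[k++] = 1.0;                       // bias term
        for (int q : state) {
            phi[k++] = q;                     // linear terms
        }
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                phi[k++] = (double) state[i] * state[j]; // squares and cross terms
            }
        }
        return phi;
    }
}
```

For a two-queue state `{2, 3}` this produces the six features `1, 2, 3, 4, 6, 9`. A value estimate would then be the dot product of a weight vector of the same length with these features.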