Reinforcement Learning
The rl module provides reinforcement learning utilities for queueing control
and optimization problems.
Key function categories:
- Environment setup: rl_env(), rl_env_general()
- TD agents: rl_td_agent(), rl_td_agent_general()
Reinforcement learning environments for queueing networks.
This module provides reinforcement learning (RL) environments that
integrate with LINE queueing network models, enabling RL agents
to learn control policies for queueing systems.
Key classes:
- RlEnv: Basic RL environment for queueing networks
- RlEnvGeneral: General-purpose RL environment
- RlTDAgent: Temporal difference (TD) learning agent
- RlTDAgentGeneral: General-purpose TD agent with tabular and function-approximation solvers
These environments support research into adaptive control of queueing
systems using reinforcement learning techniques.
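As a quick orientation, the following end-to-end sketch builds a small open queueing model, wraps it in an RlEnv, and trains an RlTDAgent on it. The model-building calls assume the line_solver Python API; the rl import path, node indices, and parameter values are illustrative assumptions rather than prescribed usage.
>>> from line_solver import Network, Source, Queue, Sink, OpenClass, Exp, SchedStrategy
>>> model = Network('rl_demo')
>>> source = Source(model, 'Source')
>>> queue = Queue(model, 'Queue1', SchedStrategy.FCFS)
>>> sink = Sink(model, 'Sink')
>>> jobclass = OpenClass(model, 'Class1')
>>> source.setArrival(jobclass, Exp(1.0))  # Poisson arrivals at rate 1.0
>>> queue.setService(jobclass, Exp(2.0))   # exponential service at rate 2.0
>>> model.link(Network.serialRouting(source, queue, sink))
>>> from line_solver.rl import RlEnv, RlTDAgent  # hypothetical import path
>>> env = RlEnv(model, idx_of_queue_in_nodes=[1], idx_of_source_in_nodes=[0],
...             state_size=10, gamma=0.99)
>>> agent = RlTDAgent(lr=0.05, epsilon=1.0, eps_decay=0.99)
>>> agent.reset(env)
>>> agent.solve(env)
>>> V = agent.get_value_function()  # learned values as a numpy array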
-
class RlEnv(model, idx_of_queue_in_nodes, idx_of_source_in_nodes, state_size, gamma)[source]
Bases: object
Basic RL environment for queueing networks.
-
__init__(model, idx_of_queue_in_nodes, idx_of_source_in_nodes, state_size, gamma)[source]
Initialize the RL environment with the network model, the indices of the queue and source nodes, the state-space size, and the discount factor gamma.
-
property model
Get the network model.
-
property action_size
Get the number of possible actions.
-
is_in_state_space(nodes)[source]
Check whether the given node configuration belongs to the state space.
-
is_in_action_space(nodes)[source]
Check whether the given node configuration belongs to the action space.
-
sample()[source]
Sample a state and action from the environment.
-
update(new_state)[source]
Update the environment with the new state.
-
reset()[source]
Reset the environment to its initial state.
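A minimal interaction sketch, reusing the env built in the module-level example above. The assumption that sample() returns a (state, action) pair is based only on its docstring, and passing the sampled items to the membership checks is likewise illustrative.
>>> env.reset()
>>> state, action = env.sample()      # assumed (state, action) pair
>>> env.action_size                   # number of possible actions
>>> env.is_in_state_space(state)      # membership check on the configuration
>>> env.is_in_action_space(action)
>>> env.update(state)                 # advance the environment to the new state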
-
class RlEnvGeneral(model, idx_of_queue_in_nodes, idx_of_action_nodes, state_size, gamma)[source]
Bases: object
General-purpose RL environment for queueing networks.
-
__init__(model, idx_of_queue_in_nodes, idx_of_action_nodes, state_size, gamma)[source]
Initialize the general RL environment with the network model, the indices of the queue nodes, the indices of the action nodes, the state-space size, and the discount factor gamma.
-
property model
Get the network model.
-
property nqueues
Get the number of queues.
-
property action_space
Get the action space mapping.
-
is_in_state_space(state)[source]
Check whether the given state belongs to the state space.
-
is_in_action_space(state)[source]
Check whether the given state belongs to the action space.
-
sample()[source]
Draw a sample from the environment.
-
update(sample)[source]
Update the environment with the given sample.
-
reset()[source]
Reset the environment to its initial state.
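The corresponding sketch for RlEnvGeneral, which takes the indices of arbitrary action nodes rather than a single source node. The index values are illustrative, and feeding the result of sample() straight back into update() is an assumption based on the update(sample) signature.
>>> env = RlEnvGeneral(model, idx_of_queue_in_nodes=[1, 2],
...                    idx_of_action_nodes=[0], state_size=10, gamma=0.99)
>>> env.nqueues        # number of queues in the environment
>>> env.action_space   # mapping describing the available actions
>>> env.reset()
>>> s = env.sample()   # draw a sample from the environment
>>> env.update(s)      # feed it back, per the update(sample) signature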
-
class RlTDAgent(lr=0.05, epsilon=1.0, eps_decay=0.99)[source]
Bases: object
Temporal difference (TD) learning agent.
-
__init__(lr=0.05, epsilon=1.0, eps_decay=0.99)[source]
Initialize the TD agent with learning rate lr, initial exploration rate epsilon, and exploration decay factor eps_decay.
-
reset(env)[source]
Reset the agent for the given environment.
-
get_value_function()[source]
Get the value function as a numpy array.
-
get_q_function()[source]
Get the Q-function as a numpy array.
-
solve(env)[source]
Solve the RL problem for the given environment.
-
static create_greedy_policy(state_q, epsilon, n_a)[source]
Create an epsilon-greedy policy from Q-values.
-
static get_state_from_loc(obj_size, loc)[source]
Get state vector from location indices.
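Once solve() has run, the learned tables can be inspected and turned into a policy. The sketch below assumes that create_greedy_policy returns a length-n_a probability vector over actions for the supplied row of Q-values; that return convention is not documented here and is purely an assumption.
>>> import numpy as np
>>> Q = agent.get_q_function()      # Q-values as a numpy array
>>> V = agent.get_value_function()  # state values as a numpy array
>>> probs = RlTDAgent.create_greedy_policy(Q[0], epsilon=0.1, n_a=env.action_size)
>>> greedy_action = int(np.argmax(probs))  # assumes a probability-vector return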
-
class RlTDAgentGeneral(lr=0.1, epsilon=1.0, eps_decay=0.9999)[source]
Bases: object
General-purpose TD learning agent.
-
__init__(lr=0.1, epsilon=1.0, eps_decay=0.9999)[source]
Initialize the general TD agent with learning rate lr, initial exploration rate epsilon, and exploration decay factor eps_decay.
-
reset(env)[source]
Reset the agent for the given environment.
-
get_value_function()[source]
Get the value function as a numpy array.
-
solve_for_fixed_policy(env, num_episodes=10000)[source]
Solve for a fixed policy over the given number of episodes.
-
solve(env, num_episodes=10000)[source]
Solve the RL problem over the given number of episodes.
-
solve_by_hashmap(env, num_episodes=10000)[source]
Solve using hashmap-based value iteration.
-
solve_by_linear(env, num_episodes=10000)[source]
Solve using linear function approximation.
-
solve_by_quad(env, num_episodes=10000)[source]
Solve using quadratic function approximation.
-
static create_greedy_policy(state_q, epsilon, n_a)[source]
Create an epsilon-greedy policy from Q-values.
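Finally, a sketch contrasting the solver variants, all of which take an environment and an episode budget per the signatures above. The comments merely restate the documented method descriptions; which variant suits a given model, and whether solve() is the intended default path, is not stated here.
>>> agent = RlTDAgentGeneral(lr=0.1, epsilon=1.0, eps_decay=0.9999)
>>> agent.reset(env)
>>> agent.solve(env, num_episodes=10000)
>>> V = agent.get_value_function()
Any of the variants can be substituted for the solve() call above:
>>> # agent.solve_for_fixed_policy(env, num_episodes=10000)  # fixed-policy evaluation
>>> # agent.solve_by_hashmap(env, num_episodes=10000)  # hashmap-based value iteration
>>> # agent.solve_by_linear(env, num_episodes=10000)   # linear function approximation
>>> # agent.solve_by_quad(env, num_episodes=10000)     # quadratic function approximation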