Scheduler
Mdp::DelayedQLearning Class Reference

#include <delayedQLearning.h>

Inheritance diagram for Mdp::DelayedQLearning:
Mdp::EligibilityTraceAlgo Mdp::RlBackupAlgorithm

Public Member Functions

 DelayedQLearning (std::shared_ptr< Context > c, TabularActionValues *av)
 
virtual void updateActionValues (state_t previousState, state_t nextState, action_t previousAction, double reward)
 
- Public Member Functions inherited from Mdp::EligibilityTraceAlgo
 EligibilityTraceAlgo (std::shared_ptr< Context > c, TabularActionValues *av)
 
virtual void init ()
 
virtual void end ()
 
- Public Member Functions inherited from Mdp::RlBackupAlgorithm
 RlBackupAlgorithm (std::shared_ptr< Context > c, ActionValuesFunction *av)
 
virtual ~RlBackupAlgorithm ()
 
virtual double getMaxQ (state_t state)
 
virtual std::pair< action_t, double > getBestActionAndQ (state_t state)
 
virtual action_t getBestAction (state_t state)
 
virtual void updateBestActionAndQ (state_t state)
 
virtual void notifyUpdateNeeded ()
 

Static Public Attributes

static constexpr const char * configKey = "delayedQLearning"
 

Additional Inherited Members

- Protected Member Functions inherited from Mdp::EligibilityTraceAlgo
void updateState (state_t previousState, action_t previousAction, double reward)
 
- Protected Member Functions inherited from Mdp::RlBackupAlgorithm
virtual void initAlpha ()
 
virtual void updateAlpha ()
 
virtual void updateIfNeeded (state_t state)
 
- Protected Attributes inherited from Mdp::EligibilityTraceAlgo
TabularActionValues * tabularAv {nullptr}
 
state_t previousPreviousState {0}
 
action_t previousPreviousAction {0}
 
double previousReward {0.0}
 
std::vector< std::vector< double > > e
 
double lambda {0.5}
 
double discountFactor {0.5}
 
size_t stateSize {0}
 
size_t actionSize {0}
 
- Protected Attributes inherited from Mdp::RlBackupAlgorithm
std::shared_ptr< Context > context {nullptr}
 
ActionValuesFunction * actionValues {nullptr}
 
double alpha {-1.0}
 
double alpha0 {0.1}
 
double alphaCounter {1.0}
 
double alphaDecaySpeed {1.0}
 
bool hyperbolic {false}
 
bool stepwise {false}
 
unsigned long long stepwiseCounter {0}
 
unsigned long long int stepLength {0}
 
std::vector< double > bestQ
 
std::vector< action_t > bestAction
 
std::vector< bool > needsUpdate
 

Detailed Description

Note: this is not an eligibility trace algorithm, although it derives from Mdp::EligibilityTraceAlgo; it is included here for testing purposes.

Definition at line 21 of file delayedQLearning.h.

Constructor & Destructor Documentation

Mdp::DelayedQLearning::DelayedQLearning ( std::shared_ptr< Context > c,
TabularActionValues * av 
)
inline

Definition at line 26 of file delayedQLearning.h.

Member Function Documentation

void DelayedQLearning::updateActionValues ( state_t  previousState,
state_t  nextState,
action_t  previousAction,
double  reward 
)
virtual

Implements Mdp::EligibilityTraceAlgo.

Definition at line 14 of file delayedQLearning.cpp.
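
The body of updateActionValues is not reproduced on this page. As a rough illustration of what a tabular backup with this parameter list could look like, the sketch below applies the standard one-step Q-learning rule to a plain std::vector table. The type aliases, the QTable struct, and the qLearningBackup function are hypothetical stand-ins and are not part of Mdp; the rule shown models no delay, so it may well differ from what DelayedQLearning actually does.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; the real state_t / action_t come from the library.
using state_t = std::size_t;
using action_t = std::size_t;

// Hypothetical tabular action-value store (NOT Mdp::TabularActionValues).
struct QTable {
    std::vector<std::vector<double>> q;

    QTable(std::size_t states, std::size_t actions)
        : q(states, std::vector<double>(actions, 0.0)) {}

    // Largest action value available in state s.
    double maxQ(state_t s) const {
        return *std::max_element(q[s].begin(), q[s].end());
    }
};

// Standard one-step Q-learning backup (assumed here, not taken from the source):
//   Q(s,a) += alpha * (reward + discountFactor * max_a' Q(s',a') - Q(s,a))
void qLearningBackup(QTable &table, state_t previousState, state_t nextState,
                     action_t previousAction, double reward,
                     double alpha, double discountFactor) {
    const double target = reward + discountFactor * table.maxQ(nextState);
    double &q = table.q[previousState][previousAction];
    q += alpha * (target - q);
}
```

In this sketch alpha and discountFactor are passed explicitly; in the class they would presumably come from the inherited alpha and discountFactor attributes listed above.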

Member Data Documentation

constexpr const char* Mdp::DelayedQLearning::configKey = "delayedQLearning"
static

Definition at line 24 of file delayedQLearning.h.
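
This page does not say how configKey is consumed. One plausible use, sketched below, is to select the algorithm by name from a configuration value. Everything in the sketch (Algo, DelayedQLearningStub, makeRegistry, createFromConfig) is a hypothetical stand-in written for illustration and is not the library's API; only the key string "delayedQLearning" is taken from the documentation above.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class standing in for the real algorithm hierarchy.
struct Algo {
    virtual ~Algo() = default;
    virtual std::string name() const = 0;
};

// Hypothetical stub that mirrors only the documented static attribute.
struct DelayedQLearningStub : Algo {
    static constexpr const char *configKey = "delayedQLearning";
    std::string name() const override { return std::string(configKey); }
};

// Hypothetical registry: configuration key -> factory function.
using Factory = std::function<std::unique_ptr<Algo>()>;

std::map<std::string, Factory> makeRegistry() {
    std::map<std::string, Factory> registry;
    registry[std::string(DelayedQLearningStub::configKey)] = [] {
        return std::unique_ptr<Algo>(new DelayedQLearningStub());
    };
    return registry;
}

// Looks the key up in the registry; returns nullptr when it is unknown.
std::unique_ptr<Algo> createFromConfig(const std::string &key) {
    static const std::map<std::string, Factory> registry = makeRegistry();
    auto it = registry.find(key);
    if (it == registry.end()) {
        return nullptr;
    }
    return it->second();
}
```

Keeping the key as a static constant on the class means the string is spelled in one place, so the registry and any configuration parser cannot drift apart.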


The documentation for this class was generated from the following files: