prl.function_approximators package¶

Submodules¶

prl.function_approximators.function_approximators module¶

class FunctionApproximator[source]¶

Bases: prl.typing.FunctionApproximatorABC, abc.ABC

Base class for function approximators used by the agents. For example, it could be a neural network that approximates a value function or a policy.

id¶

Function Approximator UUID

Return type:	`str`

predict(x)[source]¶

Makes a prediction based on the input.

train(x, *loss_args)[source]¶

Trains the function approximator for one or more steps and returns the training loss value.

Return type:	`float`
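As an illustration of the interface described above, here is a minimal sketch of a class that provides `id`, `predict` and `train`. `FunctionApproximatorSketch` and `LinearFA` are hypothetical names introduced for this example; the abstract base is a stand-in written from the member list on this page, not the actual prl class, and the linear model is only there to make the sketch runnable without extra dependencies.

```python
import uuid
from abc import ABC, abstractmethod


class FunctionApproximatorSketch(ABC):
    """Hypothetical stand-in for the documented interface (not the prl class)."""

    @property
    @abstractmethod
    def id(self) -> str:
        """Function approximator UUID."""

    @abstractmethod
    def predict(self, x):
        """Make a prediction based on the input."""

    @abstractmethod
    def train(self, x, *loss_args) -> float:
        """Train for one or more steps and return the training loss."""


class LinearFA(FunctionApproximatorSketch):
    """Fits y = w * x + b with plain gradient descent on squared error."""

    def __init__(self, lr=0.05):
        self._id = str(uuid.uuid4())
        self.w, self.b, self.lr = 0.0, 0.0, lr

    @property
    def id(self) -> str:
        return self._id

    def predict(self, x):
        return [self.w * xi + self.b for xi in x]

    def train(self, x, *loss_args) -> float:
        (targets,) = loss_args  # in this sketch the only loss argument is the target list
        errors = [p - t for p, t in zip(self.predict(x), targets)]
        n = len(x)
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * xi for e, xi in zip(errors, x)) / n
        grad_b = 2.0 * sum(errors) / n
        self.w -= self.lr * grad_w
        self.b -= self.lr * grad_b
        # The returned loss is the one measured before this update.
        return sum(e * e for e in errors) / n


fa = LinearFA()
xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
losses = [fa.train(xs, ys) for _ in range(200)]
```

Each call to `train` performs a single gradient step here; the documented contract also allows several steps per call.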

prl.function_approximators.pytorch_nn module¶

class DQNLoss(mode='huber', size_average=None, reduce=None, reduction='mean')[source]¶

Bases: a PyTorch class that was mocked when this documentation was built

forward(nn_outputs, actions, target_outputs)[source]¶
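This page does not describe what `forward` computes. The sketch below shows one plausible reading in plain Python, under several assumptions that are not confirmed by the source: `nn_outputs` holds one row of per-action Q-values per sample, `actions` holds the index of the action taken, `target_outputs` holds one scalar target per sample, and the `'huber'` mode with `reduction='mean'` means an averaged Huber penalty with a threshold of 1. The function names are hypothetical.

```python
def huber(diff, delta=1.0):
    """Huber penalty: quadratic near zero, linear in the tails."""
    a = abs(diff)
    if a <= delta:
        return 0.5 * diff * diff
    return delta * (a - 0.5 * delta)


def dqn_loss_sketch(nn_outputs, actions, target_outputs):
    """Mean Huber loss between Q(s, a) for the taken actions and the targets."""
    # Pick the Q-value of the action that was actually taken in each sample.
    chosen_q = [q_row[a] for q_row, a in zip(nn_outputs, actions)]
    penalties = [huber(q - t) for q, t in zip(chosen_q, target_outputs)]
    return sum(penalties) / len(penalties)


q_values = [[1.0, 2.0], [0.5, 3.0]]  # batch of 2 states, 2 actions each
taken = [1, 0]                       # action index chosen in each state
targets = [2.5, 3.0]                 # assumed scalar target per sample
loss = dqn_loss_sketch(q_values, taken, targets)
```

With these inputs the selected Q-values are 2.0 and 0.5, so the first sample falls in the quadratic region and the second in the linear region of the penalty.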

class PolicyGradientLoss(size_average=None, reduce=None, reduction='mean')[source]¶

Bases: a PyTorch class that was mocked when this documentation was built

forward(nn_outputs, actions, returns)[source]¶
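The computation behind this `forward` is not documented here either. A common policy gradient (REINFORCE-style) loss is the negative mean of the log-probability of each taken action weighted by its return; the sketch below shows that formula in plain Python. It assumes, without confirmation from the source, that `nn_outputs` contains action probabilities rather than logits. `policy_gradient_loss_sketch` is a hypothetical name.

```python
import math


def policy_gradient_loss_sketch(nn_outputs, actions, returns):
    """Negative mean of log pi(a|s) weighted by the return."""
    terms = [
        -math.log(probs[a]) * g
        for probs, a, g in zip(nn_outputs, actions, returns)
    ]
    return sum(terms) / len(terms)


action_probs = [[0.5, 0.5], [0.25, 0.75]]  # assumed: probabilities per action
taken = [0, 1]                             # action index chosen in each state
episode_returns = [2.0, 1.0]               # return credited to each action
loss = policy_gradient_loss_sketch(action_probs, taken, episode_returns)
```

Minimizing this quantity raises the probability of actions that were followed by high returns.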

class PytorchConv(x_shape, hidden_sizes, y_size)[source]¶

Bases: prl.function_approximators.pytorch_nn.PytorchNet

forward(x)[source]¶

Defines the computation performed at every training step.

Parameters:	x – input data
Returns:	network output

predict(x)[source]¶

Makes prediction based on input data.

Parameters:	x – input data
Returns:	prediction for agent.act(x) method

class PytorchFA(net, loss, optimizer, device='cpu', batch_size=64, last_batch=True, network_id='pytorch_nn')[source]¶

Bases: prl.function_approximators.function_approximators.FunctionApproximator

Class for PyTorch-based neural network function approximators.

Parameters:

net (PytorchNet) – PytorchNet class neural network
loss – loss function
optimizer – optimizer
device (str) – device for computation: “cpu” or “cuda”
batch_size (int) – size of a training batch
last_batch (bool) – whether the last batch (usually shorter than batch_size) is fed into the network
network_id (str) – name of the network for debugging and logging purposes
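The interaction between `batch_size` and `last_batch` described in the parameter list can be sketched as follows. `iter_batches` is a hypothetical helper written for this illustration, not part of prl; it only mirrors the documented behaviour of optionally skipping a trailing batch that is shorter than `batch_size`.

```python
def iter_batches(data, batch_size=64, last_batch=True):
    """Yield consecutive slices of `data` of length `batch_size`."""
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # A trailing batch shorter than batch_size is skipped when last_batch is False.
        if len(batch) < batch_size and not last_batch:
            break
        yield batch


samples = list(range(10))
with_last = [len(b) for b in iter_batches(samples, batch_size=4, last_batch=True)]
without_last = [len(b) for b in iter_batches(samples, batch_size=4, last_batch=False)]
```

With 10 samples and a batch size of 4, keeping the last batch gives batches of 4, 4 and 2 samples, while dropping it gives two batches of 4.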

convert_to_pytorch(y)[source]¶

id¶

Function Approximator UUID

predict(x)[source]¶

Makes a prediction.

train(x, *loss_args)[source]¶

Trains the network on a dataset.

Parameters:

x (`ndarray`) – input array for the network
*loss_args – arguments passed directly to the loss function

class PytorchMLP(x_shape, y_size, output_activation, hidden_sizes)[source]¶

Bases: prl.function_approximators.pytorch_nn.PytorchNet

forward(x)[source]¶

Defines the computation performed at every training step.

Parameters:	x – input data
Returns:	network output

predict(x)[source]¶

Makes prediction based on input data.

Parameters:	x – input data
Returns:	prediction for agent.act(x) method

class PytorchNet(*args, **kwargs)[source]¶

Bases: prl.typing.PytorchNetABC

Neural network for PytorchFA. It has a separate predict() method, used strictly by the Agent.act() method, which can behave differently from the forward() method.

Note

This class has two abstract methods that need to be implemented: forward() and predict().

forward(x)[source]¶

Defines the computation performed at every training step.

Parameters:	x – input data
Returns:	network output

predict(x)[source]¶

Makes prediction based on input data.

Parameters:	x – input data
Returns:	prediction for agent.act(x) method
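To illustrate why `forward` and `predict` are kept separate, here is a dependency-free sketch in which the training path returns a full probability distribution and the acting path reduces it to a single action. `NetSketch` and `SoftmaxPolicy` are hypothetical stand-ins, not prl classes, and returning a greedy action index from `predict` is an assumption; the source only says that the result is meant for `agent.act(x)`.

```python
import math
from abc import ABC, abstractmethod


class NetSketch(ABC):
    """Hypothetical stand-in for the two-method contract (not prl's PytorchNet)."""

    @abstractmethod
    def forward(self, x):
        """Computation used during training."""

    @abstractmethod
    def predict(self, x):
        """Output handed to the agent's act() method."""


class SoftmaxPolicy(NetSketch):
    def __init__(self, weights):
        self.weights = weights  # one weight row per action

    def forward(self, x):
        # Training path: return the full probability distribution over actions.
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(logit - peak) for logit in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, x):
        # Acting path: reduce the distribution to a single greedy action index.
        probs = self.forward(x)
        return max(range(len(probs)), key=probs.__getitem__)


net = SoftmaxPolicy(weights=[[1.0, 0.0], [0.0, 1.0]])
probs = net.forward([0.2, 0.9])
action = net.predict([0.2, 0.9])
```

A real subclass would express the same split with PyTorch tensors and layers; only the division of responsibilities is shown here.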
