podium.models package

Submodules

podium.models.batch_transform_functions module

Module contains functions used to transform batches into tensors that models accept.

podium.models.experiment module

Module defines an experiment: a class used to combine iteration over data, model training, and prediction.

class podium.models.experiment.Experiment(model: Union[Type[podium.models.model.AbstractSupervisedModel], podium.models.model.AbstractSupervisedModel], trainer: podium.models.trainer.AbstractTrainer = None, feature_transformer: Union[podium.models.transformers.FeatureTransformer, Callable[[NamedTuple], numpy.array]] = None, label_transform_fn: Callable[[NamedTuple], numpy.ndarray] = None)

Bases: object

Class used to streamline model fitting and prediction.

fit(dataset: podium.datasets.dataset.Dataset, model_kwargs: Dict = None, trainer_kwargs: Dict = None, feature_transformer: podium.models.transformers.FeatureTransformer = None, trainer: podium.models.trainer.AbstractTrainer = None)

Fits the model to the provided Dataset. During fitting, the provided Iterator and Trainer are used.

Parameters
  • dataset (Dataset) – Dataset to fit the model to.

  • model_kwargs (dict) – Dict containing model arguments. Arguments passed to the model are the default arguments defined with set_default_model_args updated/overridden by model_kwargs.

  • trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by trainer_kwargs.

  • feature_transformer (FeatureTransformer, Optional) – FeatureTransformer that transforms the input part of the batch returned by the iterator into features that can be fed into the model. Will also be fitted during Experiment fitting. If None, the default FeatureTransformer provided in the constructor will be used. Otherwise, this will overwrite the default feature transformer.

  • trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.

  • training_iterator_callable (Callable[[Dataset], Iterator]) – Callable used to instantiate new instances of the Iterator used in fitting the model. If None, the training_iterator_callable provided in the constructor will be used.

Raises

RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.

partial_fit(dataset: podium.datasets.dataset.Dataset, trainer_kwargs: Dict = None, trainer: podium.models.trainer.AbstractTrainer = None)

Fits the model to the data without resetting the model.

Parameters
  • dataset (Dataset) – Dataset to fit the model to.

  • trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by trainer_kwargs.

  • trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.

Raises

RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.

predict(dataset: podium.datasets.dataset.Dataset, batch_size: int = 128, **kwargs) → numpy.ndarray

Computes the prediction of the model for every example in the provided dataset.

Parameters
  • dataset (Dataset) – Dataset to compute predictions for.

  • batch_size (int) – If None, predictions for the whole dataset are computed in a single batch. Otherwise, predictions are computed in batches of batch_size examples. This is useful when the whole dataset can’t be processed in a single batch.

  • kwargs – Keyword arguments passed to the model’s predict method

Returns

Tensor containing predictions for examples in the passed Dataset.

Return type

ndarray
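
As a rough illustration of the batch_size behaviour described above, the sketch below splits the examples into consecutive chunks, predicts each chunk, and joins the results in input order. It is not podium's implementation: plain lists stand in for NumPy arrays, and predict_in_batches and predict_fn are hypothetical names introduced only for this example.

```python
def predict_in_batches(examples, predict_fn, batch_size=128):
    """Apply predict_fn chunk by chunk and join the results in input order."""
    if batch_size is None:
        # One batch covering the whole dataset.
        return list(predict_fn(examples))
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer or None")
    predictions = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        predictions.extend(predict_fn(batch))
    return predictions


# Toy "model" that doubles every input value (hypothetical).
doubled = predict_in_batches(
    [1, 2, 3, 4, 5], lambda xs: [2 * x for x in xs], batch_size=2
)
```

Smaller batches trade speed for a lower peak memory footprint; the joined result should not depend on the batch size chosen.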

set_default_model_args(**kwargs)

Sets the default model arguments. Model arguments are keyword arguments passed to the model constructor. Default arguments can be updated/overridden by arguments in the model_kwargs dict in the fit method.

Parameters

kwargs – Default model arguments.

set_default_trainer_args(**kwargs)

Sets the default trainer arguments. Trainer arguments are keyword arguments passed to the trainer during model fitting. Default arguments can be updated/overridden by arguments in the trainer_kwargs parameter in the fit method.

Parameters

kwargs – Default trainer arguments.
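
The "defaults updated/overridden by per-call arguments" rule used for both model and trainer arguments amounts to a dictionary merge in which per-call values win. The following is a minimal sketch of that rule under this assumption, not podium's actual code; merge_args and the argument names max_epoch and batch_size are hypothetical.

```python
def merge_args(defaults, overrides=None):
    """Return defaults updated by overrides; neither input is mutated."""
    merged = dict(defaults)          # copy so the stored defaults stay untouched
    merged.update(overrides or {})   # per-call values take precedence
    return merged


# Hypothetical argument names, chosen only for illustration.
default_trainer_args = {"max_epoch": 10, "batch_size": 32}
effective = merge_args(default_trainer_args, {"max_epoch": 3})
```

Copying before updating keeps the defaults reusable across several fit calls, which matches the documented idea of defaults that persist between runs.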

podium.models.model module

Module contains base model interfaces.

class podium.models.model.AbstractFrameworkModel

Bases: abc.ABC

Interface for framework models.

abstract load(**kwargs)

Method loads model from given file_path with additional arguments defined in kwargs.

Parameters
  • file_path (str) – path to the file the model should be loaded from

  • **kwargs (dict) – Additional key-value parameters for loading model

Returns

the loaded model

Return type

model

Raises
  • ValueError – if the given path doesn’t exist

  • IOError – if there was an error while reading from a file

abstract save(file_path, **kwargs)

Method saves model to given file_path with additional arguments defined in kwargs.

Parameters
  • file_path (str) – path to file where the model should be saved

  • **kwargs (dict) – Additional key-value parameters for saving the model

Raises

IOError – if there was an error while writing to a file
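
A concrete framework model has to supply both save and load. The sketch below shows one way to honour the documented error contract using pickle. It is only an illustration: FrameworkModelSketch is a local stand-in for the interface rather than the podium class, PickledWeights is hypothetical, and accepting file_path as a keyword argument of load is an assumption based on the parameter list above.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod


class FrameworkModelSketch(ABC):
    """Local stand-in mirroring the documented save/load interface."""

    @abstractmethod
    def save(self, file_path, **kwargs): ...

    @abstractmethod
    def load(self, **kwargs): ...


class PickledWeights(FrameworkModelSketch):
    """Hypothetical model whose whole state is a dict of weights."""

    def __init__(self, weights=None):
        self.weights = dict(weights or {})

    def save(self, file_path, **kwargs):
        # open() raises OSError (IOError is an alias of it) if writing fails.
        with open(file_path, "wb") as f:
            pickle.dump(self.weights, f, **kwargs)

    def load(self, file_path=None, **kwargs):
        if file_path is None or not os.path.exists(file_path):
            raise ValueError(f"Path does not exist: {file_path!r}")
        with open(file_path, "rb") as f:
            self.weights = pickle.load(f, **kwargs)
        return self


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    PickledWeights({"w": 0.5}).save(path)
    restored = PickledWeights().load(file_path=path)
```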

class podium.models.model.AbstractSupervisedModel

Bases: abc.ABC

Interface for supervised models.

PREDICTION_KEY

key for defining prediction return variable

Type

str

abstract fit(X, y, **kwargs)

Trains the model and returns a dictionary of values keyed by model-specific keys.

Parameters
  • X (np.array) – input data

  • y (np.array) – data labels

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary of fit results keyed by model-specific keys

Return type

dict

abstract predict(X, **kwargs)

Predict labels for given data

Parameters
  • X (np.array) – input data

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary of prediction results keyed by model-specific keys

Return type

dict

abstract reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model
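
To make the fit/predict/reset contract concrete, here is a minimal sketch of a model that always predicts the most frequent training label. SupervisedModelSketch is a local stand-in for the interface, not the podium class; the value "pred" for PREDICTION_KEY and the "n_seen" result key are assumptions made for illustration, and plain lists stand in for NumPy arrays.

```python
from abc import ABC, abstractmethod
from collections import Counter


class SupervisedModelSketch(ABC):
    """Local stand-in mirroring the documented interface."""

    PREDICTION_KEY = "pred"  # assumed value, for illustration only

    @abstractmethod
    def fit(self, X, y, **kwargs): ...

    @abstractmethod
    def predict(self, X, **kwargs): ...

    @abstractmethod
    def reset(self, **kwargs): ...


class MajorityClassModel(SupervisedModelSketch):
    """Always predicts the label seen most often during fitting."""

    def __init__(self):
        self.reset()

    def reset(self, **kwargs):
        # Forget everything learned so far.
        self.label_counts = Counter()

    def fit(self, X, y, **kwargs):
        # Counts accumulate across calls until reset() is called.
        self.label_counts.update(y)
        return {"n_seen": sum(self.label_counts.values())}

    def predict(self, X, **kwargs):
        if not self.label_counts:
            raise RuntimeError("Model must be fitted before predicting.")
        majority = self.label_counts.most_common(1)[0][0]
        return {self.PREDICTION_KEY: [majority] * len(X)}


model = MajorityClassModel()
model.fit([[0.1], [0.4], [0.9]], ["neg", "neg", "pos"])
result = model.predict([[0.5], [0.7]])
```

Because fit accumulates state and reset clears it, the same object supports both a fresh fit and continued training without resetting.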

podium.models.trainer module

Module contains interfaces for a trainer.

class podium.models.trainer.AbstractTrainer

Bases: abc.ABC

Interface for base trainer that can train the model.

abstract train(model: podium.models.model.AbstractSupervisedModel, dataset: podium.datasets.dataset.Dataset, feature_transformer: podium.models.transformers.FeatureTransformer, label_transform_fun: Callable[[NamedTuple], numpy.ndarray], **kwargs)

Trains the model on data from the given Dataset.

Parameters
  • model (AbstractSupervisedModel) – The model that needs to be trained.

  • dataset (Dataset) – Dataset the model will be trained on

  • feature_transformer (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the input part of the batch returned by the iterator into features that can be fed into the model.

  • label_transform_fun (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the target part of the batch returned by the iterator into the same format as the model's prediction. For a hypothetical perfect model, the prediction for a set of examples must be identical to the result of this callable for those same examples.

  • kwargs (dict) – Trainer specific parameters.
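
The roles of the train arguments can be sketched with a trainer that makes a fixed number of passes over the data. This is an illustration under stated assumptions rather than a podium trainer: TrainerSketch is a local stand-in for the interface, the dataset is assumed to be an iterable of (input batch, target batch) pairs, and EpochTrainer, RecordingModel, and max_epoch are hypothetical names.

```python
from abc import ABC, abstractmethod
from collections import namedtuple


class TrainerSketch(ABC):
    """Local stand-in mirroring the documented train signature."""

    @abstractmethod
    def train(self, model, dataset, feature_transformer,
              label_transform_fun, **kwargs): ...


class EpochTrainer(TrainerSketch):
    """Hypothetical trainer: a fixed number of passes over batched data."""

    def train(self, model, dataset, feature_transformer,
              label_transform_fun, max_epoch=1, **kwargs):
        for _ in range(max_epoch):
            for x_batch, y_batch in dataset:
                X = feature_transformer(x_batch)   # input part -> features
                y = label_transform_fun(y_batch)   # target part -> labels
                model.fit(X, y)


class RecordingModel:
    """Toy model that only records what it was fitted on."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y, **kwargs):
        self.calls.append((X, y))


InputBatch = namedtuple("InputBatch", ["text"])
TargetBatch = namedtuple("TargetBatch", ["label"])
batches = [
    (InputBatch(text=[1, 2]), TargetBatch(label=[0, 1])),
    (InputBatch(text=[3]), TargetBatch(label=[1])),
]

recorder = RecordingModel()
EpochTrainer().train(recorder, batches, lambda b: b.text,
                     lambda b: b.label, max_epoch=2)
```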

Module contents

Module contains ML models.

class podium.models.AbstractFrameworkModel

Bases: abc.ABC

Interface for framework models.

class podium.models.AbstractSupervisedModel

Bases: abc.ABC

Interface for supervised models.

class podium.models.Experiment(model: Union[Type[podium.models.model.AbstractSupervisedModel], podium.models.model.AbstractSupervisedModel], trainer: podium.models.trainer.AbstractTrainer = None, feature_transformer: Union[podium.models.transformers.FeatureTransformer, Callable[[NamedTuple], numpy.array]] = None, label_transform_fn: Callable[[NamedTuple], numpy.ndarray] = None)

Bases: object

Class used to streamline model fitting and prediction.

class podium.models.AbstractTrainer

Bases: abc.ABC

Interface for base trainer that can train the model.

class podium.models.FeatureTransformer(feature_extraction_fn: Callable[[NamedTuple], numpy.ndarray], tensor_transformer: podium.models.transformers.TensorTransformer = None)

Bases: object

Class used to transform Dataset batches into features used in model prediction and training.

__call__(x: NamedTuple)

Transforms the provided podium feature batch into a numpy array.

Parameters

x (NamedTuple) – Feature batch to be transformed.

Returns

Transformed features.

Return type

np.ndarray

fit(x: NamedTuple, y: numpy.ndarray)

Fits this tensor transformer to the provided data.

Parameters
  • x (NamedTuple) – Podium feature batch containing the features to be transformed.

  • y (np.ndarray) – Labels corresponding to the features in x.

requires_fitting()

Returns True if the contained TensorTransformer exists and requires fitting, else returns False.

Returns

True if the contained TensorTransformer exists and requires fitting, else returns False.

Return type

bool

transform(x: NamedTuple) → numpy.ndarray

Transforms the provided podium feature batch into a numpy array.

Parameters

x (NamedTuple) – Feature batch to be transformed.

Returns

Transformed features.

Return type

np.ndarray
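
The behaviour described for FeatureTransformer (an extraction callable followed by an optional tensor transformer) can be sketched as below. FeatureTransformerSketch is a simplified stand-in written for this example and should not be read as podium's implementation; Halver and the length field are hypothetical, and lists stand in for NumPy arrays.

```python
from collections import namedtuple


class FeatureTransformerSketch:
    """Stand-in: extraction callable plus an optional tensor transformer."""

    def __init__(self, feature_extraction_fn, tensor_transformer=None):
        self.feature_extraction_fn = feature_extraction_fn
        self.tensor_transformer = tensor_transformer

    def requires_fitting(self):
        return (self.tensor_transformer is not None
                and self.tensor_transformer.requires_fitting())

    def fit(self, x, y):
        # Only the contained tensor transformer has anything to learn.
        if self.requires_fitting():
            self.tensor_transformer.fit(self.feature_extraction_fn(x), y)

    def transform(self, x):
        features = self.feature_extraction_fn(x)
        if self.tensor_transformer is None:
            return features
        return self.tensor_transformer.transform(features)

    __call__ = transform  # calling the object is the same as transform


class Halver:
    """Toy tensor transformer that needs no fitting (hypothetical)."""

    def requires_fitting(self):
        return False

    def fit(self, x, y):
        pass

    def transform(self, x):
        return [value / 2 for value in x]


Batch = namedtuple("Batch", ["length"])
batch = Batch(length=[2, 4, 6])
plain = FeatureTransformerSketch(lambda b: b.length)
halved = FeatureTransformerSketch(lambda b: b.length, Halver())
```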

class podium.models.TensorTransformer

Bases: abc.ABC

Abstract class used to transform tensors. Used in feature pre-processing during training and prediction. Usually used in FeatureTransformer to transform tensors returned by the feature extraction callable.

abstract fit(x: numpy.ndarray, y: numpy.ndarray)

Fits the transformer to the provided data.

Parameters
  • x (np.ndarray) – Features in numpy array form.

  • y (np.ndarray) – Labels in numpy array form.

abstract requires_fitting() → bool

Returns True if this TensorTransformer requires fitting.

Returns

True if this TensorTransformer requires fitting, else returns False.

Return type

bool

abstract transform(x: numpy.ndarray) → numpy.ndarray

Transforms the passed features.

Parameters

x (np.ndarray) – Features to be transformed in numpy array form.

Returns

Transformed features.

Return type

np.ndarray
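
A subclass has to implement all three abstract methods. The sketch below shows a transformer that learns a mean in fit and subtracts it in transform. TensorTransformerSketch is a local stand-in for the abstract class and MeanCenterer is hypothetical; plain lists are used in place of NumPy arrays so the example has no dependencies.

```python
from abc import ABC, abstractmethod


class TensorTransformerSketch(ABC):
    """Local stand-in mirroring the documented abstract methods."""

    @abstractmethod
    def fit(self, x, y): ...

    @abstractmethod
    def transform(self, x): ...

    @abstractmethod
    def requires_fitting(self): ...


class MeanCenterer(TensorTransformerSketch):
    """Subtracts the mean learned in fit from every value."""

    def __init__(self):
        self.mean = None

    def requires_fitting(self):
        return True

    def fit(self, x, y):
        # Labels are accepted for interface compatibility but unused here.
        self.mean = sum(x) / len(x)

    def transform(self, x):
        if self.mean is None:
            raise RuntimeError("fit must be called before transform")
        return [value - self.mean for value in x]


centerer = MeanCenterer()
centerer.fit([1.0, 2.0, 3.0], y=[0, 1, 0])
centered = centerer.transform([2.0, 4.0])
```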

class podium.models.SklearnTensorTransformerWrapper(feature_transformer, requires_fitting=True)

Bases: podium.models.transformers.TensorTransformer

Wrapper class for Sklearn feature transformers.

fit(x: numpy.ndarray, y: numpy.ndarray)

Fits the transformer to the provided data.

Parameters
  • x (np.ndarray) – Features in numpy array form.

  • y (np.ndarray) – Labels in numpy array form.

requires_fitting() → bool

Returns True if this TensorTransformer requires fitting.

Returns

True if this TensorTransformer requires fitting, else returns False.

Return type

bool

transform(x: numpy.array) → numpy.ndarray

Transforms the passed features.

Parameters

x (np.ndarray) – Features to be transformed in numpy array form.

Returns

Transformed features.

Return type

np.ndarray