podium.models package¶
Subpackages¶
Submodules¶
podium.models.batch_transform_functions module¶
Module contains functions used to transform batch to tensors that models accept.
podium.models.experiment module¶
Module defines an experiment, a class used to combine iteration over data, model training, and prediction.
-
class
podium.models.experiment.
Experiment
(model: Union[Type[podium.models.model.AbstractSupervisedModel], podium.models.model.AbstractSupervisedModel], trainer: podium.models.trainer.AbstractTrainer = None, feature_transformer: Union[podium.models.transformers.FeatureTransformer, Callable[[NamedTuple], numpy.array]] = None, label_transform_fn: Callable[[NamedTuple], numpy.ndarray] = None)¶ Bases:
object
Class used to streamline model fitting and prediction.
-
fit
(dataset: podium.datasets.dataset.Dataset, model_kwargs: Dict = None, trainer_kwargs: Dict = None, feature_transformer: podium.models.transformers.FeatureTransformer = None, trainer: podium.models.trainer.AbstractTrainer = None)¶ Fits the model to the provided Dataset. During fitting, the provided Iterator and Trainer are used.
- Parameters
dataset (Dataset) – Dataset to fit the model to.
model_kwargs (dict) – Dict containing model arguments. Arguments passed to the model are the default arguments defined with set_default_model_args updated/overridden by model_kwargs.
trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by ‘trainer_kwargs’.
feature_transformer (FeatureTransformer, Optional) – FeatureTransformer that transforms the input part of the batch returned by the iterator into features that can be fed into the model. Will also be fitted during Experiment fitting. If None, the default FeatureTransformer provided in the constructor will be used. Otherwise, this will overwrite the default feature transformer.
trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.
training_iterator_callable (Callable[[Dataset], Iterator]) – Callable used to instantiate new instances of the Iterator used in fitting the model. If None, the training_iterator_callable provided in the constructor will be used.
- Raises
RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.
-
partial_fit
(dataset: podium.datasets.dataset.Dataset, trainer_kwargs: Dict = None, trainer: podium.models.trainer.AbstractTrainer = None)¶ Fits the model to the data without resetting the model.
- Parameters
dataset (Dataset) – Dataset to fit the model to.
trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by ‘trainer_kwargs’.
trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.
- Raises
RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.
-
predict
(dataset: podium.datasets.dataset.Dataset, batch_size: int = 128, **kwargs) → numpy.ndarray¶ Computes the prediction of the model for every example in the provided dataset.
- Parameters
dataset (Dataset) – Dataset to compute predictions for.
batch_size (int) – If None, predictions for the whole dataset will be done in a single batch. Otherwise, predictions will be calculated in batches of batch_size examples. This argument is useful in case the whole dataset can’t be processed in a single batch.
kwargs – Keyword arguments passed to the model’s predict method
- Returns
Tensor containing predictions for examples in the passed Dataset.
- Return type
ndarray
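The batching behaviour described for batch_size can be pictured with the following minimal sketch. It is not Podium's implementation: predict_in_batches and double_all are hypothetical helpers, and plain lists stand in for numpy arrays so the example runs without dependencies.

```python
def predict_in_batches(predict_fn, examples, batch_size=128):
    """Apply predict_fn to examples, optionally in fixed-size batches."""
    if batch_size is None:
        # Single batch: the whole dataset is processed at once.
        return list(predict_fn(examples))

    predictions = []
    for start in range(0, len(examples), batch_size):
        # Slice out at most batch_size examples and predict on them.
        batch = examples[start:start + batch_size]
        predictions.extend(predict_fn(batch))
    return predictions


def double_all(batch):
    # Hypothetical stand-in for a model's predict method.
    return [2 * value for value in batch]


examples = [1, 2, 3, 4, 5]
batched = predict_in_batches(double_all, examples, batch_size=2)
single = predict_in_batches(double_all, examples, batch_size=None)
```

Both calls produce the same predictions; only the peak amount of data handled at once differs.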
-
set_default_model_args
(**kwargs)¶ Sets the default model arguments. Model arguments are keyword arguments passed to the model constructor. Default arguments can be updated/overridden by arguments in the model_kwargs dict in the fit method.
- Parameters
kwargs – Default model arguments.
-
set_default_trainer_args
(**kwargs)¶ Sets the default trainer arguments. Trainer arguments are keyword arguments passed to the trainer during model fitting. Default arguments can be updated/overridden by arguments in the trainer_kwargs parameter in the fit method.
- Parameters
kwargs – Default trainer arguments.
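The "updated/overridden" wording used for model_kwargs and trainer_kwargs can be read as an ordinary dictionary update. The sketch below illustrates that reading only; merge_args and the argument names (hidden_size, dropout) are hypothetical and do not come from Podium.

```python
def merge_args(defaults, overrides=None):
    """Return defaults updated with overrides, leaving defaults untouched."""
    merged = dict(defaults)
    if overrides is not None:
        # Keys present in overrides replace the default values.
        merged.update(overrides)
    return merged


# Defaults, as they might be registered with set_default_model_args.
default_model_args = {"hidden_size": 64, "dropout": 0.1}

# Per-call arguments, as they might be passed to fit via model_kwargs.
fit_model_kwargs = {"dropout": 0.5}

model_args = merge_args(default_model_args, fit_model_kwargs)
```

Under this reading, a key missing from the per-call dict keeps its default value, and passing None leaves the defaults unchanged.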
-
podium.models.model module¶
Module contains base model interfaces.
-
class
podium.models.model.
AbstractFrameworkModel
¶ Bases:
abc.ABC
Interface for framework models.
-
abstract
load
(**kwargs)¶ Method loads model from given file_path with additional arguments defined in kwargs.
- Parameters
file_path (str) – path to the file from which the model should be loaded
**kwargs (dict) – Additional key-value parameters for loading model
- Returns
method returns loaded model
- Return type
model
- Raises
ValueError – if the given path doesn’t exist
IOError – if there was an error while reading from a file
-
abstract
save
(file_path, **kwargs)¶ Method saves model to given file_path with additional arguments defined in kwargs.
- Parameters
file_path (str) – path to file where the model should be saved
**kwargs (dict) – Additional key-value parameters for saving model
- Raises
IOError – if there was an error while writing to a file
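As an illustration of the save/load contract above, here is a minimal sketch that persists a model with pickle. FrameworkModelInterface is a local stand-in that mirrors the documented method signatures, and PickledModel is hypothetical. Treating load as an instance method that reads file_path from kwargs is an assumption, since the source does not say how load is bound.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod


class FrameworkModelInterface(ABC):
    """Local stand-in mirroring the documented save/load interface."""

    @abstractmethod
    def save(self, file_path, **kwargs):
        ...

    @abstractmethod
    def load(self, **kwargs):
        ...


class PickledModel(FrameworkModelInterface):
    """Hypothetical model whose whole state is a list of weights."""

    def __init__(self, weights=None):
        self.weights = weights if weights is not None else []

    def save(self, file_path, **kwargs):
        # open() raises OSError (IOError is an alias) if writing fails.
        with open(file_path, "wb") as model_file:
            pickle.dump(self.weights, model_file)

    def load(self, **kwargs):
        # Assumption: file_path arrives as a keyword argument.
        file_path = kwargs["file_path"]
        if not os.path.exists(file_path):
            raise ValueError(f"Path does not exist: {file_path}")
        with open(file_path, "rb") as model_file:
            self.weights = pickle.load(model_file)
        return self


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    PickledModel([0.1, 0.2]).save(path)
    restored_weights = PickledModel().load(file_path=path).weights
    missing_path = os.path.join(tmp_dir, "missing.pkl")
```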
-
-
class
podium.models.model.
AbstractSupervisedModel
¶ Bases:
abc.ABC
Interface for supervised models.
-
PREDICTION_KEY
¶ key for defining prediction return variable
- Type
str
-
abstract
fit
(X, y, **kwargs)¶ Method trains the model and returns dictionary of values defined by model specific key parameters
- Parameters
X (np.array) – input data
y (np.array) – data labels
**kwargs (dict) – Additional key-value parameters for model
- Returns
result – dictionary mapping fit results to defined model specific key parameters
- Return type
dict
-
abstract
predict
(X, **kwargs)¶ Predict labels for given data
- Parameters
X (np.array) – input data
**kwargs (dict) – Additional key-value parameters for model
- Returns
result – dictionary mapping prediction results to defined model specific key parameters
- Return type
dict
-
abstract
reset
(**kwargs)¶ Resets the model to its initial state so it can be re-trained.
- Parameters
kwargs – Additional key-value parameters for model
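To make the fit/predict/reset contract concrete, the following sketch implements a majority-class baseline. SupervisedModelInterface is a local stand-in that mirrors the documented interface, MajorityClassModel is hypothetical, and the PREDICTION_KEY value "pred" is an assumption, since the source only states that the key is a str. Plain lists stand in for numpy arrays.

```python
from abc import ABC, abstractmethod
from collections import Counter


class SupervisedModelInterface(ABC):
    """Local stand-in mirroring the documented supervised model interface."""

    # Assumed value; the source only says this key is a str.
    PREDICTION_KEY = "pred"

    @abstractmethod
    def fit(self, X, y, **kwargs):
        ...

    @abstractmethod
    def predict(self, X, **kwargs):
        ...

    @abstractmethod
    def reset(self, **kwargs):
        ...


class MajorityClassModel(SupervisedModelInterface):
    """Hypothetical baseline that always predicts the most frequent label."""

    def __init__(self):
        self.reset()

    def reset(self, **kwargs):
        # Forget everything learned so the model can be re-trained.
        self.majority_label = None

    def fit(self, X, y, **kwargs):
        self.majority_label = Counter(y).most_common(1)[0][0]
        return {"majority_label": self.majority_label}

    def predict(self, X, **kwargs):
        if self.majority_label is None:
            raise RuntimeError("Model must be fitted before predicting.")
        # Predictions are returned in a dict under PREDICTION_KEY.
        return {self.PREDICTION_KEY: [self.majority_label] * len(X)}


model = MajorityClassModel()
fit_result = model.fit([[0.0], [1.0], [2.0]], ["pos", "neg", "pos"])
prediction = model.predict([[3.0], [4.0]])
```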
-
podium.models.trainer module¶
Module contains interfaces for a trainer.
-
class
podium.models.trainer.
AbstractTrainer
¶ Bases:
abc.ABC
Interface for base trainer that can train the model.
-
abstract
train
(model: podium.models.model.AbstractSupervisedModel, dataset: podium.datasets.dataset.Dataset, feature_transformer: podium.models.transformers.FeatureTransformer, label_transform_fun: Callable[[NamedTuple], numpy.ndarray], **kwargs)¶ Method trains a model with data from the given Dataset.
- Parameters
model (AbstractSupervisedModel) – The model that needs to be trained.
dataset (Dataset) – Dataset the model will be trained on
feature_transformer (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the input part of the batch returned by the iterator into features that can be fed into the model.
label_transform_fun (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the target part of the batch returned by the iterator into the same format as the model’s predictions. For a hypothetical perfect model, the prediction result for a set of examples must be identical to the result of this callable for those same examples.
kwargs (dict) – Trainer specific parameters.
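The parameters above suggest a training loop of the following shape: iterate over batches, transform the input part and the target part separately, and hand both to the model. This is a sketch under stated assumptions, not Podium's trainer. TrainerInterface is a local stand-in, and EpochTrainer, RecordingModel, iterate_batches, the "text"/"label" fields, and the epochs/batch_size keyword arguments are all hypothetical. A list of dicts stands in for a Dataset.

```python
from abc import ABC, abstractmethod


class TrainerInterface(ABC):
    """Local stand-in mirroring the documented train signature."""

    @abstractmethod
    def train(self, model, dataset, feature_transformer,
              label_transform_fun, **kwargs):
        ...


def iterate_batches(dataset, batch_size):
    # Hypothetical replacement for a Podium Iterator: yields
    # (input part, target part) pairs for consecutive slices.
    for start in range(0, len(dataset), batch_size):
        chunk = dataset[start:start + batch_size]
        inputs = [example["text"] for example in chunk]
        targets = [example["label"] for example in chunk]
        yield inputs, targets


class EpochTrainer(TrainerInterface):
    def train(self, model, dataset, feature_transformer,
              label_transform_fun, **kwargs):
        epochs = kwargs.get("epochs", 1)
        batch_size = kwargs.get("batch_size", 2)
        for _ in range(epochs):
            for inputs, targets in iterate_batches(dataset, batch_size):
                X = feature_transformer(inputs)
                y = label_transform_fun(targets)
                model.fit(X, y)


class RecordingModel:
    """Hypothetical model that only records what it was fitted on."""

    def __init__(self):
        self.seen = []

    def fit(self, X, y, **kwargs):
        self.seen.append((X, y))
        return {}


dataset = [
    {"text": "good", "label": "pos"},
    {"text": "bad", "label": "neg"},
    {"text": "great", "label": "pos"},
]
model = RecordingModel()
EpochTrainer().train(
    model,
    dataset,
    feature_transformer=lambda texts: [[len(text)] for text in texts],
    label_transform_fun=lambda labels: [int(label == "pos") for label in labels],
    epochs=2,
    batch_size=2,
)
```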
-
Module contents¶
Module contains ML models.
-
class
podium.models.
AbstractFrameworkModel
¶ Bases:
abc.ABC
Interface for framework models.
-
abstract
load
(**kwargs)¶ Method loads model from given file_path with additional arguments defined in kwargs.
- Parameters
file_path (str) – path to the file from which the model should be loaded
**kwargs (dict) – Additional key-value parameters for loading model
- Returns
method returns loaded model
- Return type
model
- Raises
ValueError – if the given path doesn’t exist
IOError – if there was an error while reading from a file
-
abstract
save
(file_path, **kwargs)¶ Method saves model to given file_path with additional arguments defined in kwargs.
- Parameters
file_path (str) – path to file where the model should be saved
**kwargs (dict) – Additional key-value parameters for saving model
- Raises
IOError – if there was an error while writing to a file
-
-
class
podium.models.
AbstractSupervisedModel
¶ Bases:
abc.ABC
Interface for supervised models.
-
PREDICTION_KEY
¶ key for defining prediction return variable
- Type
str
-
abstract
fit
(X, y, **kwargs)¶ Method trains the model and returns dictionary of values defined by model specific key parameters
- Parameters
X (np.array) – input data
y (np.array) – data labels
**kwargs (dict) – Additional key-value parameters for model
- Returns
result – dictionary mapping fit results to defined model specific key parameters
- Return type
dict
-
abstract
predict
(X, **kwargs)¶ Predict labels for given data
- Parameters
X (np.array) – input data
**kwargs (dict) – Additional key-value parameters for model
- Returns
result – dictionary mapping prediction results to defined model specific key parameters
- Return type
dict
-
abstract
reset
(**kwargs)¶ Resets the model to its initial state so it can be re-trained.
- Parameters
kwargs – Additional key-value parameters for model
-
-
class
podium.models.
Experiment
(model: Union[Type[podium.models.model.AbstractSupervisedModel], podium.models.model.AbstractSupervisedModel], trainer: podium.models.trainer.AbstractTrainer = None, feature_transformer: Union[podium.models.transformers.FeatureTransformer, Callable[[NamedTuple], numpy.array]] = None, label_transform_fn: Callable[[NamedTuple], numpy.ndarray] = None)¶ Bases:
object
Class used to streamline model fitting and prediction.
-
fit
(dataset: podium.datasets.dataset.Dataset, model_kwargs: Dict = None, trainer_kwargs: Dict = None, feature_transformer: podium.models.transformers.FeatureTransformer = None, trainer: podium.models.trainer.AbstractTrainer = None)¶ Fits the model to the provided Dataset. During fitting, the provided Iterator and Trainer are used.
- Parameters
dataset (Dataset) – Dataset to fit the model to.
model_kwargs (dict) – Dict containing model arguments. Arguments passed to the model are the default arguments defined with set_default_model_args updated/overridden by model_kwargs.
trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by ‘trainer_kwargs’.
feature_transformer (FeatureTransformer, Optional) – FeatureTransformer that transforms the input part of the batch returned by the iterator into features that can be fed into the model. Will also be fitted during Experiment fitting. If None, the default FeatureTransformer provided in the constructor will be used. Otherwise, this will overwrite the default feature transformer.
trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.
training_iterator_callable (Callable[[Dataset], Iterator]) – Callable used to instantiate new instances of the Iterator used in fitting the model. If None, the training_iterator_callable provided in the constructor will be used.
- Raises
RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.
-
partial_fit
(dataset: podium.datasets.dataset.Dataset, trainer_kwargs: Dict = None, trainer: podium.models.trainer.AbstractTrainer = None)¶ Fits the model to the data without resetting the model.
- Parameters
dataset (Dataset) – Dataset to fit the model to.
trainer_kwargs (dict) – Dict containing trainer arguments. Arguments passed to the trainer are the default arguments defined with set_default_trainer_args updated/overridden by ‘trainer_kwargs’.
trainer (AbstractTrainer, Optional) – Trainer used to fit the model. If None, the trainer provided in the constructor will be used.
- Raises
RuntimeError – If trainer is not provided either in the constructor or as an argument to the method.
-
predict
(dataset: podium.datasets.dataset.Dataset, batch_size: int = 128, **kwargs) → numpy.ndarray¶ Computes the prediction of the model for every example in the provided dataset.
- Parameters
dataset (Dataset) – Dataset to compute predictions for.
batch_size (int) – If None, predictions for the whole dataset will be done in a single batch. Otherwise, predictions will be calculated in batches of batch_size examples. This argument is useful in case the whole dataset can’t be processed in a single batch.
kwargs – Keyword arguments passed to the model’s predict method
- Returns
Tensor containing predictions for examples in the passed Dataset.
- Return type
ndarray
-
set_default_model_args
(**kwargs)¶ Sets the default model arguments. Model arguments are keyword arguments passed to the model constructor. Default arguments can be updated/overridden by arguments in the model_kwargs dict in the fit method.
- Parameters
kwargs – Default model arguments.
-
set_default_trainer_args
(**kwargs)¶ Sets the default trainer arguments. Trainer arguments are keyword arguments passed to the trainer during model fitting. Default arguments can be updated/overridden by arguments in the trainer_kwargs parameter in the fit method.
- Parameters
kwargs – Default trainer arguments.
-
-
class
podium.models.
AbstractTrainer
¶ Bases:
abc.ABC
Interface for base trainer that can train the model.
-
abstract
train
(model: podium.models.model.AbstractSupervisedModel, dataset: podium.datasets.dataset.Dataset, feature_transformer: podium.models.transformers.FeatureTransformer, label_transform_fun: Callable[[NamedTuple], numpy.ndarray], **kwargs)¶ Method trains a model with data from the given Dataset.
- Parameters
model (AbstractSupervisedModel) – The model that needs to be trained.
dataset (Dataset) – Dataset the model will be trained on
feature_transformer (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the input part of the batch returned by the iterator into features that can be fed into the model.
label_transform_fun (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the target part of the batch returned by the iterator into the same format as the model’s predictions. For a hypothetical perfect model, the prediction result for a set of examples must be identical to the result of this callable for those same examples.
kwargs (dict) – Trainer specific parameters.
-
-
class
podium.models.
FeatureTransformer
(feature_extraction_fn: Callable[[NamedTuple], numpy.ndarray], tensor_transformer: podium.models.transformers.TensorTransformer = None)¶ Bases:
object
Class used to transform Dataset batches into features used in model prediction and training.
-
__call__
(x: NamedTuple)¶ Transforms the provided podium feature batch into a numpy array.
- Parameters
x (NamedTuple) – Feature batch to be transformed.
- Returns
Transformed features.
- Return type
np.ndarray
-
fit
(x: NamedTuple, y: numpy.ndarray)¶ Fits this tensor transformer to the provided data.
- Parameters
x (NamedTuple) – Podium feature batch containing the features to be transformed.
y (np.ndarray) – Labels corresponding to the features in x.
-
requires_fitting
()¶ Returns True if the contained TensorTransformer exists and requires fitting, else returns False.
- Returns
True if the contained TensorTransformer exists and requires fitting, else returns False.
- Return type
bool
-
transform
(x: NamedTuple) → numpy.ndarray¶ Transforms the provided podium feature batch into a numpy array.
- Parameters
x (NamedTuple) – Feature batch to be transformed.
- Returns
Transformed features.
- Return type
np.ndarray
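One way to read the FeatureTransformer description is as a two-stage pipeline: a feature extraction callable followed by an optional tensor transformer. The sketch below follows that reading and is not Podium's code. FeatureTransformerSketch, MaxScaler, and the Batch namedtuple with its length field are hypothetical, and plain lists stand in for numpy arrays.

```python
from collections import namedtuple


class MaxScaler:
    """Hypothetical tensor transformer: divides by the largest absolute value seen in fit."""

    def fit(self, x, y):
        self.max_value = max(abs(value) for row in x for value in row)

    def transform(self, x):
        return [[value / self.max_value for value in row] for row in x]

    def requires_fitting(self):
        return True


class FeatureTransformerSketch:
    def __init__(self, feature_extraction_fn, tensor_transformer=None):
        self.feature_extraction_fn = feature_extraction_fn
        self.tensor_transformer = tensor_transformer

    def requires_fitting(self):
        # False when there is no tensor transformer at all.
        return (self.tensor_transformer is not None
                and self.tensor_transformer.requires_fitting())

    def fit(self, x, y):
        if self.requires_fitting():
            # The tensor transformer is fitted on extracted features.
            self.tensor_transformer.fit(self.feature_extraction_fn(x), y)

    def transform(self, x):
        features = self.feature_extraction_fn(x)
        if self.tensor_transformer is None:
            return features
        return self.tensor_transformer.transform(features)

    def __call__(self, x):
        return self.transform(x)


# Hypothetical batch with a single input field.
Batch = namedtuple("Batch", ["length"])
batch = Batch(length=[[2.0], [4.0]])

transformer = FeatureTransformerSketch(lambda b: b.length, MaxScaler())
transformer.fit(batch, [0, 1])
scaled = transformer(batch)

plain = FeatureTransformerSketch(lambda b: b.length)
```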
-
-
class
podium.models.
TensorTransformer
¶ Bases:
abc.ABC
Abstract class used to transform tensors. Used in feature pre-processing during training and prediction. Usually used in FeatureTransformer to transform tensors returned by the feature extraction callable.
-
abstract
fit
(x: numpy.ndarray, y: numpy.ndarray)¶ Fits the transformer to the provided data.
- Parameters
x (np.ndarray) – Features in numpy array form.
y (np.ndarray) – Labels in numpy array form.
-
abstract
requires_fitting
() → bool¶ Returns True if this TensorTransformer requires fitting.
- Returns
True if this TensorTransformer requires fitting, else returns False.
- Return type
bool
-
abstract
transform
(x: numpy.ndarray) → numpy.ndarray¶ Transforms the passed features.
- Parameters
x (np.ndarray) – Features to be transformed in numpy array form.
- Returns
Transformed features.
- Return type
np.ndarray
-
-
class
podium.models.
SklearnTensorTransformerWrapper
(feature_transformer, requires_fitting=True)¶ Bases:
podium.models.transformers.TensorTransformer
Wrapper class for Sklearn feature transformers.
-
fit
(x: numpy.ndarray, y: numpy.ndarray)¶ Fits the transformer to the provided data.
- Parameters
x (np.ndarray) – Features in numpy array form.
y (np.ndarray) – Labels in numpy array form.
-
requires_fitting
() → bool¶ Returns True if this TensorTransformer requires fitting.
- Returns
True if this TensorTransformer requires fitting, else returns False.
- Return type
bool
-
transform
(x: numpy.array) → numpy.ndarray¶ Transforms the passed features.
- Parameters
x (np.ndarray) – Features to be transformed in numpy array form.
- Returns
Transformed features.
- Return type
np.ndarray
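A wrapper of this kind can be sketched as simple delegation to an object that follows scikit-learn's fit/transform convention. The code below is an illustration under assumptions, not the library's implementation. SklearnWrapperSketch is hypothetical, and MeanCenterer is a dependency-free stand-in for a real scikit-learn transformer; skipping fit when requires_fitting is False is also an assumption.

```python
class MeanCenterer:
    """Stand-in that follows scikit-learn's fit/transform convention."""

    def fit(self, X, y=None):
        n_rows = len(X)
        # Column-wise means of the feature matrix.
        self.means_ = [sum(column) / n_rows for column in zip(*X)]
        return self

    def transform(self, X):
        return [[value - mean for value, mean in zip(row, self.means_)]
                for row in X]


class SklearnWrapperSketch:
    def __init__(self, feature_transformer, requires_fitting=True):
        self.feature_transformer = feature_transformer
        # Stored under a private name so it does not shadow the method.
        self._requires_fitting = requires_fitting

    def fit(self, x, y):
        # Assumption: fitting is skipped when it is declared unnecessary.
        if self._requires_fitting:
            self.feature_transformer.fit(x, y)

    def transform(self, x):
        return self.feature_transformer.transform(x)

    def requires_fitting(self):
        return self._requires_fitting


features = [[1.0, 10.0], [3.0, 30.0]]
labels = [0, 1]

wrapper = SklearnWrapperSketch(MeanCenterer())
wrapper.fit(features, labels)
centered = wrapper.transform(features)
```

The requires_fitting=False option would suit stateless transformers, where calling fit would be wasted work.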
-