podium.models.impl package

Submodules

podium.models.impl.fc_model module

Module contains fully connected neural network models.

class podium.models.impl.fc_model.ScikitMLPClassifier(classes, **kwargs)

Bases: podium.models.model.AbstractSupervisedModel

Simple scikit-learn multilayer perceptron model.

fit(X, y, **kwargs)

Calls fit on the multilayer perceptron model with the given batch. It is intended for online learning, where the model is updated one batch at a time.

predict(X, **kwargs)

Predict labels for given data

Parameters
  • X (np.array) – input data

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping prediction results to the keys defined by the model

Return type

dict

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model
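
The fit/predict/reset contract described above can be illustrated with a toy stand-in. The sketch below is not the ScikitMLPClassifier implementation: ToyOnlineModel and the "pred" result key are hypothetical names, and plain lists stand in for np.array. It only shows repeated fit calls accumulating state batch by batch, predict returning a dict, and reset discarding what was learned.

```python
from collections import Counter


class ToyOnlineModel:
    """Hypothetical stand-in that predicts the most frequent label seen so far."""

    PREDICTION_KEY = "pred"  # assumed key name, for illustration only

    def __init__(self, classes):
        self.classes = list(classes)
        self.reset()

    def reset(self, **kwargs):
        # Drop everything learned so far so the model can be re-trained.
        self._counts = Counter()

    def fit(self, X, y, **kwargs):
        # Online learning: each call updates the state with one more batch.
        self._counts.update(y)

    def predict(self, X, **kwargs):
        # Fall back to the first known class before any batch has been seen.
        if self._counts:
            label = self._counts.most_common(1)[0][0]
        else:
            label = self.classes[0]
        return {self.PREDICTION_KEY: [label for _ in X]}


model = ToyOnlineModel(classes=["neg", "pos"])
model.fit([[0.1], [0.9]], ["pos", "pos"])  # first batch
model.fit([[0.4]], ["neg"])                # second batch
after_fit = model.predict([[0.5], [0.2]])
model.reset()
after_reset = model.predict([[0.5]])
```

After the two fit calls the toy model has seen "pos" twice and "neg" once; after reset it no longer remembers either batch.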

podium.models.impl.simple_trainers module

Module contains simple trainer classes.

class podium.models.impl.simple_trainers.SimpleTrainer

Bases: podium.models.trainer.AbstractTrainer

Simple trainer class.

MAX_EPOCH_KEY

keyword argument key for maximal number of epochs used for training

Type

str

train(model, dataset, feature_transformer, label_transform_fun, max_epoch, iterator=None)

Method trains a model with data from given Iterator.

Parameters
  • model (AbstractSupervisedModel) – The model that needs to be trained.

  • dataset (Dataset) – Dataset the model will be trained on

  • feature_transformer (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the input part of the batch returned by the iterator into features that can be fed into the model.

  • label_transform_fun (Callable[[NamedTuple], np.ndarray]) – Callable that transforms the target part of the batch returned by the iterator into the same format as the model's predictions. For a hypothetical perfect model, the prediction for a set of examples must be identical to the result of this callable for those same examples.

  • kwargs (dict) – Trainer specific parameters.
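
The parameters above suggest a loop of the following shape. This is a minimal sketch, not SimpleTrainer's actual code: train_sketch, RecordingModel, InputBatch and TargetBatch are hypothetical names, a list of (input, target) pairs stands in for the iterator, and plain lists stand in for np.ndarray.

```python
from collections import namedtuple

# Hypothetical batch parts; real batches are NamedTuples produced by an iterator.
InputBatch = namedtuple("InputBatch", ["text"])
TargetBatch = namedtuple("TargetBatch", ["label"])


class RecordingModel:
    """Hypothetical model that only records what fit() receives."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y, **kwargs):
        self.calls.append((X, y))


def train_sketch(model, batches, feature_transformer, label_transform_fun, max_epoch):
    """Call model.fit once per batch, for max_epoch passes over the data."""
    for _ in range(max_epoch):
        for x_batch, y_batch in batches:
            X = feature_transformer(x_batch)   # input part -> model features
            y = label_transform_fun(y_batch)   # target part -> prediction format
            model.fit(X, y)


batches = [
    (InputBatch(text=[[1, 2], [3, 4]]), TargetBatch(label=[0, 1])),
    (InputBatch(text=[[5, 6]]), TargetBatch(label=[1])),
]
model = RecordingModel()
train_sketch(
    model,
    batches,
    feature_transformer=lambda x: x.text,
    label_transform_fun=lambda y: y.label,
    max_epoch=2,
)
```

With two batches and max_epoch=2, fit is called four times, each time with the transformed features and labels of one batch.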

podium.models.impl.svm_model module

Module contains SVM models.

class podium.models.impl.svm_model.ScikitLinearSVCModel(**kwargs)

Bases: podium.models.impl.svm_model.ScikitSVCModel

Simple scikit-learn linear SVM model.

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model

class podium.models.impl.svm_model.ScikitSVCModel(**kwargs)

Bases: podium.models.model.AbstractSupervisedModel

Simple scikit-learn SVM model.

fit(X, y, **kwargs)

Trains the model and returns a dictionary of fit results stored under the keys defined by the model.

Parameters
  • X (np.array) – input data

  • y (np.array) – data labels

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping fit results to defined model specific key parameters

Return type

dict

predict(X, **kwargs)

Predict labels for given data

Parameters
  • X (np.array) – input data

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping prediction results to the keys defined by the model

Return type

dict

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model

Module contents

Package contains implementations of concrete models.

class podium.models.impl.ScikitSVCModel(**kwargs)

Bases: podium.models.model.AbstractSupervisedModel

Simple scikit-learn SVM model.

fit(X, y, **kwargs)

Trains the model and returns a dictionary of fit results stored under the keys defined by the model.

Parameters
  • X (np.array) – input data

  • y (np.array) – data labels

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping fit results to defined model specific key parameters

Return type

dict

predict(X, **kwargs)

Predict labels for given data

Parameters
  • X (np.array) – input data

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping prediction results to the keys defined by the model

Return type

dict

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model

class podium.models.impl.ScikitLinearSVCModel(**kwargs)

Bases: podium.models.impl.svm_model.ScikitSVCModel

Simple scikit-learn linear SVM model.

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model

class podium.models.impl.ScikitMLPClassifier(classes, **kwargs)

Bases: podium.models.model.AbstractSupervisedModel

Simple scikit-learn multilayer perceptron model.

fit(X, y, **kwargs)

Calls fit on the multilayer perceptron model with the given batch. It is intended for online learning, where the model is updated one batch at a time.

predict(X, **kwargs)

Predict labels for given data

Parameters
  • X (np.array) – input data

  • **kwargs (dict) – Additional key-value parameters for model

Returns

result – dictionary mapping prediction results to the keys defined by the model

Return type

dict

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model

class podium.models.impl.MultilabelSVM

Bases: podium.models.model.AbstractSupervisedModel

Multilabel SVM with hyperparameter optimization via grid search using K-fold cross-validation.

Multilabel SVM is implemented as a set of binary SVM classifiers, one for each class in dataset (one vs. rest).

fit(X, y, parameter_grid, n_splits=3, max_iter=10000, cutoff=1, scoring='f1', n_jobs=1)

Fits the model on given data.

For each class present in y (for each column of the y matrix), a separate SVM model is trained. If there are no positive training instances for some label (the entire column is filled with zeros), no model is trained. Upon calling the predict function, a zero vector is returned for that class. The indexes of the columns containing such labels are stored and can be retrieved using the get_indexes_of_missing_models method.

Parameters
  • X (np.array) – input data

  • y (np.array) – data labels, 2D array (number of examples, number of labels)

  • parameter_grid (dict or list(dict)) – Dictionary with parameter names (string) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grids spanned by each dictionary in the list are explored. This enables searching over any sequence of parameter settings. For more information, refer to https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html. The parameter_grid may contain any of the parameters used to train an instance of the LinearSVC model, most notably the penalty parameter ‘C’ and the regularization penalty ‘penalty’, which can be set to ‘l1’ or ‘l2’. For more information, refer to https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

  • n_splits (int) – Number of splits for K-fold cross-validation

  • max_iter (int) – Maximum number of iterations for training a single SVM within the model.

  • cutoff (int >= 1) – If the number of positive training examples for a class is less than the cutoff, no model is trained for that class and the index of the label is added to the missing model indexes.

  • scoring (string, callable, list/tuple, dict or None) – Indicates what scoring function to use in order to determine the best hyperparameters via grid search. For more details, view https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

  • n_jobs (int) – Number of threads to be used.

Raises

ValueError – If cutoff is not a positive integer >= 1. If n_jobs is not a positive integer or -1. If n_splits is not a positive integer >= 1. If max_iter is not a positive integer >= 1.
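
To make the two accepted shapes of parameter_grid concrete, the sketch below lists the combinations a grid spans. scikit-learn provides sklearn.model_selection.ParameterGrid for this purpose; expand_grid is a hypothetical standard-library re-implementation written only for illustration, and the example values are arbitrary.

```python
from itertools import product


def expand_grid(parameter_grid):
    """List every parameter combination spanned by a dict or a list of dicts."""
    grids = [parameter_grid] if isinstance(parameter_grid, dict) else parameter_grid
    combinations = []
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the value lists of one dictionary.
        for values in product(*(grid[name] for name in names)):
            combinations.append(dict(zip(names, values)))
    return combinations


# A single dictionary spans one grid: 3 values of C x 1 penalty.
single = expand_grid({"C": [0.1, 1.0, 10.0], "penalty": ["l2"]})

# A list of dictionaries spans one grid per dictionary; the grids are explored in turn.
several = expand_grid([
    {"C": [0.1, 1.0], "penalty": ["l2"]},
    {"C": [1.0], "penalty": ["l1"], "dual": [False]},
])
```

A list of dictionaries is useful when some settings only make sense together, since the grids are not crossed with each other.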

get_indexes_of_missing_models()

Returns the indexes of classes for which the models have not been trained due to the lack of positive training examples.

Returns

result – Indexes of models that were not trained.

Return type

list(int)

Raises

RuntimeError – If the model instance is not fitted.
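
The rule that decides which labels end up without a model can be sketched as follows. This is an illustration under stated assumptions, not the library's code: missing_model_indexes is a hypothetical helper, and a list of rows stands in for the 2D label array.

```python
def missing_model_indexes(y, cutoff=1):
    """Return column indexes of y with fewer than `cutoff` positive examples."""
    if not isinstance(cutoff, int) or cutoff < 1:
        raise ValueError("cutoff must be a positive integer >= 1")
    n_labels = len(y[0])
    # Count positive examples per label, i.e. per column of y.
    positives = [sum(row[j] for row in y) for j in range(n_labels)]
    return [j for j, count in enumerate(positives) if count < cutoff]


# 4 examples, 3 labels; label 1 never occurs, label 2 occurs once.
y = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 0, 0],
]
with_default_cutoff = missing_model_indexes(y)
with_cutoff_two = missing_model_indexes(y, cutoff=2)
```

Raising the cutoff from 1 to 2 also excludes label 2, which has only a single positive example.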

predict(X)

Predict labels for given data.

If no model has been trained for some class (because there were not enough examples for this label in the training set), a zero column is returned. To exclude such labels from the evaluation, retrieve their indexes through the get_indexes_of_missing_models method.

Parameters

X (np.array) – input data

Returns

result – Predictions of the model for the given examples.

Return type

2D np.array (number of examples, number of classes)

Raises

RuntimeError – If the model instance is not fitted.
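
Excluding untrained labels from an evaluation could look like the sketch below. The data is made up, drop_columns is a hypothetical helper, and lists of rows stand in for the 2D arrays; in practice the missing indexes would come from get_indexes_of_missing_models.

```python
def drop_columns(matrix, indexes):
    """Return a copy of a 2D list without the given column indexes."""
    skip = set(indexes)
    return [[v for j, v in enumerate(row) if j not in skip] for row in matrix]


# Hypothetical values: column 1 had no trained model, so its predictions are all zero.
missing = [1]
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 1], [0, 0, 1]]

y_true_kept = drop_columns(y_true, missing)
y_pred_kept = drop_columns(y_pred, missing)

# Fraction of label decisions that match on the remaining columns.
matches = sum(
    t == p
    for t_row, p_row in zip(y_true_kept, y_pred_kept)
    for t, p in zip(t_row, p_row)
)
accuracy = matches / (len(y_true_kept) * len(y_true_kept[0]))
```

Dropping the column keeps the always-zero predictions for an untrained label from counting as errors against the model.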

reset(**kwargs)

Resets the model to its initial state so it can be re-trained.

Parameters

kwargs – Additional key-value parameters for model