GaleShapleyFeatureSelectionTransform

class GaleShapleyFeatureSelectionTransform(relevance_table: etna.analysis.feature_relevance.relevance.RelevanceTable, top_k: int, features_to_use: Union[List[str], Literal['all']] = 'all', use_rank: bool = False, return_features: bool = False, **relevance_params)

Bases: etna.transforms.feature_selection.base.BaseFeatureSelectionTransform

Transform that filters features using the Gale-Shapley matching algorithm, according to the relevance table.

The transform works with any type of features; however, most models work only with regressors. Therefore, it is recommended to pass regressors to the feature selection transforms.

As input, we have a table of relevances of size \(N_{f} \times N_{s}\), where \(N_{f}\) is the number of features and \(N_{s}\) is the number of segments. The procedure of filtering features consists of \(\lceil \frac{k}{N_{s}} \rceil\) iterations.
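As a small worked example of the iteration count, the snippet below evaluates the ceiling formula for a couple of inputs. It assumes that \(k\) in the formula is the top_k parameter; the helper name n_iterations is hypothetical and is not part of the library.

```python
import math


def n_iterations(top_k: int, n_segments: int) -> int:
    """Number of matching rounds: ceil(top_k / n_segments)."""
    return math.ceil(top_k / n_segments)


# With 2 segments, each round can add at most 2 features,
# so selecting 5 features takes 3 rounds.
print(n_iterations(top_k=5, n_segments=2))  # 3
# With 4 segments, 4 features can be picked in a single round.
print(n_iterations(top_k=4, n_segments=4))  # 1
```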

Algorithm of each iteration:

  • according to the relevance table, during the matching, segments send proposals to features;

  • select features to add by taking matched feature for each segment;

  • add selected features to accumulated list of selected features taking into account that this list shouldn’t exceed the size of top_k;

  • remove added features from future consideration.
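The iteration described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the relevance table is a nested dict (segment → feature → relevance) in which every segment scores the same features, a feature prefers the segment for which it has the higher relevance, and on the final round the matched features are added in order of decreasing relevance until top_k is reached. The names gale_shapley_step and select_features are hypothetical.

```python
import math
from typing import Dict, List

Relevance = Dict[str, Dict[str, float]]  # segment -> feature -> relevance


def gale_shapley_step(relevance: Relevance, features: List[str]) -> Dict[str, str]:
    """One matching round: segments propose to features; returns segment -> feature."""
    # Each segment ranks the remaining features by its own relevance, best first.
    prefs = {
        seg: sorted(features, key=lambda f: -scores[f])
        for seg, scores in relevance.items()
    }
    next_choice = {seg: 0 for seg in relevance}
    holder: Dict[str, str] = {}  # feature -> segment currently matched to it
    free = list(relevance)
    while free:
        seg = free.pop(0)
        if next_choice[seg] >= len(prefs[seg]):
            continue  # the segment has proposed to every feature; it stays unmatched
        feat = prefs[seg][next_choice[seg]]
        next_choice[seg] += 1
        rival = holder.get(feat)
        if rival is None:
            holder[feat] = seg
        elif relevance[seg][feat] > relevance[rival][feat]:
            holder[feat] = seg  # the feature prefers the new proposer
            free.append(rival)
        else:
            free.append(seg)  # rejected; the segment will try its next choice
    return {seg: feat for feat, seg in holder.items()}


def select_features(relevance: Relevance, top_k: int) -> List[str]:
    """Accumulate matched features over ceil(top_k / n_segments) rounds."""
    remaining = sorted({f for scores in relevance.values() for f in scores})
    selected: List[str] = []
    for _ in range(math.ceil(top_k / len(relevance))):
        if not remaining:
            break
        matching = gale_shapley_step(relevance, remaining)
        # Assumption: add the most relevant matches first, so truncation at top_k
        # drops the weakest match of the round.
        ranked = sorted(matching.items(), key=lambda item: -relevance[item[0]][item[1]])
        for _seg, feat in ranked:
            if len(selected) < top_k:
                selected.append(feat)
        # Remove added features from future consideration.
        remaining = [f for f in remaining if f not in selected]
    return selected


relevance = {
    "segment_a": {"f1": 0.9, "f2": 0.5, "f3": 0.1, "f4": 0.3},
    "segment_b": {"f1": 0.8, "f2": 0.7, "f3": 0.2, "f4": 0.6},
}
print(select_features(relevance, top_k=3))  # ['f1', 'f2', 'f4']
```

In the first round segment_a wins f1 and segment_b falls back to f2; in the second round f4 and f3 are matched, but only f4 fits within top_k = 3.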

Init GaleShapleyFeatureSelectionTransform.

Parameters
  • relevance_table (etna.analysis.feature_relevance.relevance.RelevanceTable) – class to build relevance table

  • top_k (int) – number of features that should be selected from all the given ones

  • features_to_use (Union[List[str], Literal['all']]) – columns of the dataset to select from; if the value “all” is given, all columns are used

  • use_rank (bool) – if True, use rank in relevance table computation

  • return_features (bool) – indicates whether to return features or not.

Methods

fit(ts)

Fit the transform.

fit_transform(ts)

Fit and transform TSDataset.

get_regressors_info()

Return the list with regressors created by the transform.

inverse_transform(ts)

Inverse transform TSDataset.

load(path)

Load an object.

params_to_tune()

Get default grid for tuning hyperparameters.

save(path)

Save the object.

set_params(**params)

Return new object instance with modified parameters.

to_dict()

Collect all information about etna object in dict.

transform(ts)

Transform TSDataset inplace.

params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]

Get default grid for tuning hyperparameters.

This grid tunes parameters: top_k, use_rank. Other parameters are expected to be set by the user.

For the top_k parameter, the maximum suggested value is not greater than self.top_k.

Returns

Grid to tune.

Return type

Dict[str, etna.distributions.distributions.BaseDistribution]
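To make the shape of the grid returned by params_to_tune() concrete, here is a rough sketch that uses hypothetical stand-in classes (IntRange, Choice) instead of the real etna.distributions classes. Only the tuned parameter names and the upper bound on top_k come from the description above; the lower bound of 1 and the helper name sketch_grid are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class IntRange:
    """Hypothetical stand-in for an integer search distribution."""
    low: int
    high: int


@dataclass(frozen=True)
class Choice:
    """Hypothetical stand-in for a categorical search distribution."""
    choices: Tuple[bool, ...]


def sketch_grid(top_k: int) -> Dict[str, Union[IntRange, Choice]]:
    """Build an illustrative grid for the two tuned parameters."""
    return {
        # Upper bound follows the docs: never suggest more than the current top_k.
        # Assumption: the search starts at 1.
        "top_k": IntRange(low=1, high=top_k),
        "use_rank": Choice(choices=(False, True)),
    }


grid = sketch_grid(top_k=10)
print(sorted(grid))  # ['top_k', 'use_rank']
```

All other constructor arguments, such as relevance_table and features_to_use, stay fixed at the values chosen by the user.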