Code Documentation

Method

class cv_pruner.Method(value)

Bases: enum.Enum

Extrapolation method for the threshold-based pruner.

MEDIAN

No extrapolation. The current median is compared directly against the threshold.

MAX_DEVIATION_TO_MEDIAN

The maximum deviation from the median in the direction of optimization serves as the basis for extrapolating the missing performance evaluation values of the complete inner cross-validation.

MEAN_DEVIATION_TO_MEDIAN

The mean deviation from the median in the direction of optimization serves as the basis for extrapolating the missing performance evaluation values of the complete inner cross-validation.

OPTIMAL_METRIC

The optimal value of the performance evaluation metric serves as the basis for extrapolating the missing performance evaluation values of the complete inner cross-validation.
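The four methods differ only in which value is assumed for the inner folds that have not been evaluated yet. The following is a minimal sketch of that idea, not the library's implementation: `SketchMethod` and `extrapolation_value` are hypothetical names, and the way deviations are collected (including the median itself as a zero deviation) is an assumption made for illustration.

```python
from enum import Enum, auto
from statistics import mean, median


class SketchMethod(Enum):
    """Hypothetical stand-in for cv_pruner.Method (member values are assumptions)."""
    MEDIAN = auto()
    MAX_DEVIATION_TO_MEDIAN = auto()
    MEAN_DEVIATION_TO_MEDIAN = auto()
    OPTIMAL_METRIC = auto()


def extrapolation_value(history, minimize, optimal_metric, method):
    """Return the value assumed for each missing inner fold (illustrative only).

    `history` must be a non-empty list of metric values.
    """
    center = median(history)
    if method is SketchMethod.MEDIAN:
        return center
    if method is SketchMethod.OPTIMAL_METRIC:
        return optimal_metric
    # Keep only deviations that point in the direction to optimize.
    if minimize:
        deviations = [center - v for v in history if v <= center]
    else:
        deviations = [v - center for v in history if v >= center]
    if method is SketchMethod.MAX_DEVIATION_TO_MEDIAN:
        shift = max(deviations)
    else:
        shift = mean(deviations)
    return center - shift if minimize else center + shift


# Usage: metric to maximize, three folds evaluated so far.
history = [2.0, 4.0, 3.0]
for m in SketchMethod:
    print(m.name, extrapolation_value(history, False, 10.0, m))
```

Under these assumptions, MEDIAN is the most pessimistic choice and OPTIMAL_METRIC the most optimistic one, with the two deviation-based methods in between.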

no_features_selected()

cv_pruner.no_features_selected(feature_importances)

Pruner to detect semantically meaningless trials.

Parameters

feature_importances (Union[numpy.ndarray, List[float]]) – Weights, importances or coefficients for each feature after training.

Returns

Whether the trial should be pruned: True if the trial includes a training result without any selected features, False otherwise.

Return type

bool
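As an illustration of the check described above, here is a minimal sketch. It assumes that a feature counts as "selected" when its weight, importance or coefficient is nonzero; the function name `no_features_selected_sketch` is hypothetical and the code is not taken from the library.

```python
def no_features_selected_sketch(feature_importances):
    """Return True if every importance is zero (illustrative only).

    Accepts any one-dimensional iterable of numbers, for example a list
    or a 1-D numpy array.
    """
    # Assumption: "selected" means a nonzero importance value.
    return all(float(value) == 0.0 for value in feature_importances)


# Usage: a model whose coefficients were all shrunk to zero would be pruned.
print(no_features_selected_sketch([0.0, 0.0, 0.0]))  # True
print(no_features_selected_sketch([0.0, 0.7, 0.0]))  # False
```

Such a check is typically useful for models with embedded feature selection, where strong regularization can leave no feature with a nonzero weight.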

should_prune_against_threshold()

cv_pruner.should_prune_against_threshold(current_step_of_complete_nested_cross_validation, folds_outer_cv, folds_inner_cv, validation_metric_history, threshold_for_pruning, direction_to_optimize_is_minimize, optimal_metric, method=Method.OPTIMAL_METRIC)

Pruner to detect an invalid performance evaluation value of a trial.

Parameters
  • current_step_of_complete_nested_cross_validation (int) – One-based step of the complete nested cross-validation.

  • folds_outer_cv (int) – Absolute number of folds for the outer cross-validation loop (one-based). Set to zero for standard cross-validation.

  • folds_inner_cv (int) – Absolute number of folds for the inner cross-validation loop (one-based).

  • validation_metric_history (List[float]) – List of all previously calculated performance evaluation metric values.

  • threshold_for_pruning (float) – Threshold that must not be exceeded (when minimizing) or undercut (when maximizing).

  • direction_to_optimize_is_minimize (bool) – True when minimizing, False when maximizing.

  • optimal_metric (float) – Optimal value for the performance evaluation metric.

  • method (cv_pruner.cv_pruner.Method) – The extrapolation method to be used (see Method).

Returns

Whether the trial should be pruned: True if the final performance evaluation metric is likely to exceed the threshold (when minimizing) or fall below it (when maximizing), False otherwise.

Return type

bool
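The idea behind threshold-based pruning with the default OPTIMAL_METRIC method can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the function name `should_prune_sketch` is hypothetical, the history is assumed to already contain the value of the current step, the median is assumed to be the aggregate that is compared against the threshold, and the outer loop (`folds_outer_cv`) is ignored.

```python
from statistics import median


def should_prune_sketch(step, folds_inner_cv, history, threshold,
                        minimize, optimal_metric):
    """Best-case extrapolation of the current inner loop (illustrative only)."""
    # Number of folds already evaluated in the current inner loop
    # (step is one-based).
    completed = (step - 1) % folds_inner_cv + 1
    current_loop = history[-completed:]
    missing = folds_inner_cv - completed
    # Assumption: every missing fold reaches the optimal metric value.
    best_case = median(current_loop + [optimal_metric] * missing)
    # Prune if even the best case ends up on the wrong side of the threshold.
    return best_case > threshold if minimize else best_case < threshold


# Usage: minimizing an error with 5 inner folds and a threshold of 0.5.
print(should_prune_sketch(2, 5, [0.9, 0.8], 0.5, True, 0.0))        # False
print(should_prune_sketch(3, 5, [0.9, 0.8, 0.85], 0.5, True, 0.0))  # True
```

In this sketch, two poor folds out of five are not enough to prune, because three optimal folds could still pull the median to the optimum. After the third poor fold the median can no longer reach the threshold, so the remaining folds can be skipped.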