biomedical_data_generator.features.informative

Generation of free informative features and class separation.

This module builds numeric class labels from DatasetConfig.class_configs, samples base values for free informative features according to per-class distributions, and applies class-wise mean shifts controlled by DatasetConfig.class_sep.

Correlated clusters (including anchors) are handled in correlated.py. Noise features are handled in noise.py. The shifting logic is implemented in shift_classes and can be reused by other modules (for example, for anchor effects in correlated clusters).
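As an illustration of what a class-wise mean shift can look like, here is a minimal sketch. It is not the library's shift_classes: the function name shift_class_means, its signature, and the choice of centered offsets scaled by class_sep are assumptions made for this example only.

```python
import numpy as np


def shift_class_means(x, y, class_sep):
    """Hypothetical stand-in for a class-wise mean shift.

    Assumption: class k out of K receives the offset
    (k - (K - 1) / 2) * class_sep, so the offsets are centered on zero.
    The real shift_classes may use a different rule.
    """
    classes = np.unique(y)
    n_classes = classes.size
    # Centered offsets, e.g. K=3 -> [-1, 0, 1] * class_sep.
    offsets = (np.arange(n_classes) - (n_classes - 1) / 2.0) * class_sep

    shifted = np.array(x, dtype=float, copy=True)
    for label, offset in zip(classes, offsets):
        # Add the same offset to every feature of the samples in this class.
        shifted[y == label] += offset
    return shifted


rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 4)            # 3 classes, 4 samples each
x = rng.normal(size=(y.size, 2))       # unshifted base values
x_shifted = shift_class_means(x, y, class_sep=1.5)
```

Because the shift only needs a feature matrix, a label vector, and a separation scale, a function of this shape can be applied equally to free informative features and to anchor features in other modules.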

Functions

biomedical_data_generator.features.informative.generate_informative_features(cfg, rng)

Generate all free informative features (no anchors, no clusters).

The function performs the following steps:

  1. Build numeric labels y from cfg.class_configs.
  2. Allocate a matrix x_informative of shape (n_samples, n_informative_free).
  3. Sample base values for each class via sample_2d_array.
  4. Apply class-wise mean shifts (multi-class offsets only).
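The four steps above can be sketched with plain NumPy. This is a rough illustration under stated assumptions, not the function's actual implementation: class_sizes stands in for whatever cfg.class_configs provides, standard-normal sampling stands in for sample_2d_array and the per-class distributions, and the centered offset rule in step 4 is a guess at how class_sep is applied.

```python
import numpy as np

# Hypothetical inputs standing in for fields of DatasetConfig.
class_sizes = [5, 3, 4]        # assumed samples per class
n_informative_free = 2         # number of free informative features
class_sep = 1.0                # separation scale
rng = np.random.default_rng(42)

n_classes = len(class_sizes)

# 1. Build numeric labels y in {0, ..., K-1}, one block per class.
y = np.repeat(np.arange(n_classes), class_sizes)

# 2. Allocate the feature matrix of shape (n_samples, n_informative_free).
x_informative = np.empty((y.size, n_informative_free), dtype=float)

# 3. Sample base values class by class.
#    Assumption: standard normal for every class; the real module
#    delegates this to sample_2d_array with per-class distributions.
for k, n_k in enumerate(class_sizes):
    x_informative[y == k] = rng.normal(size=(n_k, n_informative_free))

# 4. Apply a class-wise mean shift.
#    Assumption: centered offsets (k - (K - 1) / 2) * class_sep.
for k in range(n_classes):
    x_informative[y == k] += (k - (n_classes - 1) / 2.0) * class_sep
```

The resulting pair (x_informative, y) has the shapes described under Returns: (n_samples, n_informative_free) and (n_samples,).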

Parameters:
  • cfg (DatasetConfig) – Dataset configuration with validated fields and derived quantities.

  • rng (Generator) – NumPy random Generator.

Returns:

x_informative: Array of shape (n_samples, n_informative_free) with free informative features.

y: Array of shape (n_samples,) with class labels in {0, …, K-1}.

Return type:

tuple