biomedical_data_generator.features.informative
Generation of free informative features and class separation.
This module builds numeric class labels from DatasetConfig.class_configs, samples base values for free informative features according to per-class distributions, and applies class-wise mean shifts controlled by DatasetConfig.class_sep.
Correlated clusters (including anchors) are handled in correlated.py. Noise features are handled in noise.py. The shifting logic is implemented in shift_classes and can be reused by other modules (for example, for anchor effects in correlated clusters).
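As an illustration of what a class-wise mean shift can look like, here is a minimal sketch. It is not the library's shift_classes: the helper name shift_by_class_sketch and the offset scheme (equally spaced offsets centered on zero and scaled by class_sep) are assumptions made for this example only.

```python
import numpy as np


def shift_by_class_sketch(x, y, class_sep):
    """Add one constant offset per class to every column of x.

    Hypothetical helper for illustration; not the library's shift_classes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n_classes = int(y.max()) + 1
    # Assumed scheme: equally spaced offsets, centered on zero, scaled by class_sep.
    offsets = class_sep * (np.arange(n_classes) - (n_classes - 1) / 2.0)
    # offsets[y] picks each sample's offset; [:, None] broadcasts it over columns.
    return x + offsets[y][:, None]


rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 4)              # 3 classes, 4 samples each
x = rng.normal(size=(y.size, 2))         # base values before shifting
shifted = shift_by_class_sketch(x, y, class_sep=2.0)
```

Under this assumed scheme, a larger class_sep moves the class means further apart while leaving the within-class spread unchanged.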
Functions
generate_informative_features(cfg, rng): Generate all free informative features (no anchors, no clusters).
- biomedical_data_generator.features.informative.generate_informative_features(cfg, rng)
Generate all free informative features (no anchors, no clusters).
The function performs these steps:
1. Build numeric labels y from cfg.class_configs.
2. Allocate a matrix x_informative of shape (n_samples, n_informative_free).
3. Sample base values for each class via sample_2d_array.
4. Apply class-wise mean shifts (multi-class offsets only).
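The four steps can be sketched with plain NumPy as follows. This is a simplified stand-in, not the library's implementation: the function name generate_informative_sketch, the class_sizes argument (used here in place of cfg.class_configs), the standard-normal sampling (used in place of sample_2d_array), and the centered, equally spaced offsets are all assumptions for illustration.

```python
import numpy as np


def generate_informative_sketch(class_sizes, n_informative_free, class_sep, rng):
    """Hypothetical stand-in that mirrors the four documented steps."""
    n_classes = len(class_sizes)
    # 1. Build numeric labels: class k is repeated class_sizes[k] times.
    y = np.repeat(np.arange(n_classes), class_sizes)
    # 2. Allocate the feature matrix of shape (n_samples, n_informative_free).
    x = np.empty((y.size, n_informative_free), dtype=float)
    # 3. Sample base values per class (assumption: standard normal for every class).
    for k in range(n_classes):
        mask = y == k
        x[mask] = rng.normal(size=(int(mask.sum()), n_informative_free))
    # 4. Apply class-wise mean shifts (assumed offsets, centered on zero).
    offsets = class_sep * (np.arange(n_classes) - (n_classes - 1) / 2.0)
    x += offsets[y][:, None]
    return x, y


rng = np.random.default_rng(42)
x_informative, y = generate_informative_sketch(
    class_sizes=[5, 3, 2], n_informative_free=4, class_sep=1.5, rng=rng
)
```

Passing the Generator in explicitly, as the documented signature does, keeps the output reproducible for a given seed.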
- Parameters:
cfg (DatasetConfig) – DatasetConfig with validated fields and derived quantities.
rng (Generator) – NumPy random Generator.
- Returns:
- x_informative: Array of shape (n_samples, n_informative_free) with free informative features.
- y: Array of shape (n_samples,) with class labels in {0, …, K-1}.
- Return type: