biomedical_data_generator.CorrClusterConfig

class biomedical_data_generator.CorrClusterConfig(*, n_cluster_features, structure='equicorrelated', correlation=0.8, anchor_role='informative', anchor_effect_size=None, anchor_class=None, label=None)

Bases: BaseModel

Correlated feature cluster simulating coordinated biomarker patterns.

A cluster represents a group of biomarkers that move together, such as markers in a metabolic pathway or proteins in a signaling cascade. One marker acts as the “anchor” (driver), while the others are “proxies” (followers).

Two correlation modes are supported:

  1. Global correlation (most common):

    correlation: float
    structure: “equicorrelated” or “toeplitz”

    Example:

    correlation = 0.7
    structure = "equicorrelated"

    All samples share the same correlation pattern.

  2. Class-specific correlation:

    correlation: dict[int, float]

    Example (pathway only active in class 1):

    correlation = {0: 0.0, 1: 0.8}

    Classes not listed in the dict default to 0.0 (independent cluster).

The correlation structure is global for the cluster and applies to all classes.
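The class-specific example above can be illustrated without the library. The following is a minimal sketch, not the package's sampling code: it assumes a one-factor construction (a shared Gaussian factor plus independent noise), which is one standard way to obtain an equicorrelated cluster. The helper names `pearson` and `sample_equicorrelated` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def sample_equicorrelated(n_samples, n_features, rho, rng):
    """Assumed one-factor model: x_j = sqrt(rho) * z + sqrt(1 - rho) * e_j.

    Each feature has unit variance and every pair of features has
    population correlation rho (valid for 0 <= rho <= 1).
    """
    rows = []
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # factor shared by all features in the cluster
        rows.append([
            math.sqrt(rho) * z + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_features)
        ])
    return rows


rng = random.Random(0)
class_correlation = {0: 0.0, 1: 0.8}  # the class-specific example from the text

empirical = {}
for class_idx, rho in class_correlation.items():
    rows = sample_equicorrelated(5000, 3, rho, rng)
    first = [r[0] for r in rows]
    second = [r[1] for r in rows]
    empirical[class_idx] = pearson(first, second)
    print(f"class {class_idx}: target {rho:.1f}, empirical {empirical[class_idx]:.2f}")
```

Samples drawn for class 0 should show near-zero correlation between cluster features, while class 1 samples should sit close to 0.8.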

Parameters:
  • n_cluster_features (int) – Number of biomarkers in the cluster (including anchor). Must be >= 1.

  • structure (Literal['equicorrelated', 'toeplitz']) –

    Correlation structure for this cluster:
    • “equicorrelated”: all pairwise correlations are equal.

    • “toeplitz”: correlation decays with feature distance.

  • correlation (float | dict[int, float]) –

    Either a single global correlation strength (float) or a mapping {class_index -> correlation} for class-specific correlations. Typical magnitudes:

    • 0.0 = independent

    • 0.3 ≈ weak correlation

    • 0.5 ≈ moderate correlation

    • 0.8+ ≈ strong correlation

  • anchor_role (Literal['informative', 'noise']) – “informative” or “noise”.

  • anchor_effect_size (Literal['small', 'medium', 'large'] | float | None) – One of “small” (0.5), “medium” (1.0), “large” (1.5), a custom float > 0, or None.

  • anchor_class (int | None) – Class index the anchor predicts (if informative). None → all classes.

  • label (str | None) – Descriptive name for documentation.
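To make the two structure options concrete, here is a small sketch of the correlation matrices they might correspond to. It is an illustration only: the page says that “toeplitz” correlation "decays with feature distance" but does not give the formula, so the power decay rho ** |i - j| used below is an assumption, and both function names are hypothetical.

```python
def equicorrelated_matrix(n, rho):
    """Ones on the diagonal, rho for every off-diagonal pair."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def toeplitz_matrix(n, rho):
    """Entry (i, j) is rho ** |i - j| (assumed AR(1)-style decay)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


eq = equicorrelated_matrix(4, 0.5)
tp = toeplitz_matrix(4, 0.5)

print("equicorrelated:")
for row in eq:
    print(row)

print("toeplitz (assumed power decay):")
for row in tp:
    print(row)
```

Under this assumption, features far apart in the cluster are only weakly correlated in the Toeplitz case, whereas the equicorrelated case treats every pair identically.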

__init__(**data)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

Methods

__init__(**data)

Create a new model by parsing and validating input data from keyword arguments.

construct([_fields_set])

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

dict(*[, include, exclude, by_alias, ...])

from_orm(obj)

get_correlation_for_class(class_idx)

Resolve correlation for a specific class.

is_class_specific()

Return True if this cluster uses class-specific correlations.

json(*[, include, exclude, by_alias, ...])

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

model_dump(*[, mode, include, exclude, ...])

model_dump_json(*[, indent, ensure_ascii, ...])

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, extra, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

parse_file(path, *[, content_type, ...])

parse_obj(obj)

parse_raw(b, *[, content_type, encoding, ...])

resolve_anchor_effect_size()

Convert anchor_effect_size to a numeric effect size.

schema([by_alias, ref_template])

schema_json(*[, by_alias, ref_template])

update_forward_refs(**localns)

validate(value)

Attributes

model_computed_fields

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

n_cluster_features

structure

correlation

anchor_role

anchor_effect_size

anchor_class

label

get_correlation_for_class(class_idx)

Resolve correlation for a specific class.

  • Global mode: return the single correlation value.

  • Class-specific mode: return mapping value or 0.0 if not specified.

Parameters:

class_idx (int)

Return type:

float

is_class_specific()

Return True if this cluster uses class-specific correlations.

Return type:

bool
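The documented behaviour of get_correlation_for_class and is_class_specific can be mirrored with two standalone functions. This is a sketch written from the descriptions above, not the library's source; the functions take the correlation value as an explicit argument rather than reading it from a model instance.

```python
from typing import Dict, Union

Correlation = Union[float, Dict[int, float]]


def is_class_specific(correlation: Correlation) -> bool:
    """True when correlation is a per-class mapping rather than one float."""
    return isinstance(correlation, dict)


def get_correlation_for_class(correlation: Correlation, class_idx: int) -> float:
    """Global mode: return the single value.

    Class-specific mode: return the mapped value, or 0.0 if the class is absent.
    """
    if isinstance(correlation, dict):
        return float(correlation.get(class_idx, 0.0))
    return float(correlation)


global_corr = 0.7
per_class = {0: 0.0, 1: 0.8}

print(is_class_specific(global_corr), get_correlation_for_class(global_corr, 2))
print(is_class_specific(per_class), get_correlation_for_class(per_class, 1))
print(get_correlation_for_class(per_class, 2))  # class 2 is not listed
```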

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

resolve_anchor_effect_size()

Convert anchor_effect_size to a numeric effect size.

Return type:

float
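The mapping from named effect sizes to numbers given in the parameter list ("small" = 0.5, "medium" = 1.0, "large" = 1.5) can be sketched as a standalone function. This is not the library's implementation. In particular, this page does not say what the method returns when anchor_effect_size is None, so the sketch requires the caller to pass a `default` explicitly; both that parameter and the function name `resolve_effect_size` are hypothetical.

```python
from typing import Optional, Union

# Named sizes as listed in the parameter documentation.
NAMED_EFFECT_SIZES = {"small": 0.5, "medium": 1.0, "large": 1.5}


def resolve_effect_size(value: Optional[Union[str, float]], default: float) -> float:
    """Turn a named or numeric effect size into a positive float.

    `default` is used when value is None; the library's own fallback is
    not documented here, so this is an assumption of the sketch.
    """
    if value is None:
        return float(default)
    if isinstance(value, str):
        if value not in NAMED_EFFECT_SIZES:
            raise ValueError(f"unknown effect size name: {value!r}")
        return NAMED_EFFECT_SIZES[value]
    size = float(value)
    if size <= 0:
        raise ValueError("custom effect size must be > 0")
    return size


print(resolve_effect_size("small", default=1.0))
print(resolve_effect_size(2.5, default=1.0))
print(resolve_effect_size(None, default=1.0))
```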