biomedical_data_generator.CorrClusterConfig

class biomedical_data_generator.CorrClusterConfig(*, n_cluster_features, structure='equicorrelated', correlation=0.8, anchor_role='informative', anchor_effect_size=None, anchor_class=None, label=None)

Bases: BaseModel

Correlated feature cluster simulating coordinated biomarker patterns.

A cluster represents a group of biomarkers that move together, such as markers in a metabolic pathway or proteins in a signaling cascade. One marker acts as the “anchor” (driver), while the others are “proxies” (followers).

Two correlation modes are supported:

Global correlation (most common):
correlation: float
structure: "equicorrelated" or "toeplitz"

Example:
correlation = 0.7
structure = "equicorrelated"

All samples share the same correlation pattern.

Class-specific correlation:
correlation: dict[int, float]

Example (pathway only active in class 1):
correlation = {0: 0.0, 1: 0.8}

Classes not listed in the dict default to 0.0 (independent cluster).

The correlation structure (the structure parameter) is set once per cluster and applies to all classes; only the correlation strength can differ between classes.

Parameters:

n_cluster_features (int) – Number of biomarkers in the cluster (including anchor). Must be >= 1.
structure (Literal['equicorrelated', 'toeplitz']) –
Correlation structure for this cluster:
- “equicorrelated”: all pairwise correlations are equal.
- “toeplitz”: correlation decays with feature distance.
correlation (float | dict[int, float]) –
Either a single global correlation strength (float) or a mapping {class_index -> correlation} for class-specific correlations. Typical magnitudes:
- 0.0 = independent
- 0.3 ≈ weak correlation
- 0.5 ≈ moderate correlation
- 0.8+ ≈ strong correlation
anchor_role (Literal['informative', 'noise']) – “informative” or “noise”.
anchor_effect_size (Literal['small', 'medium', 'large'] | float | None) – “small” (0.5), “medium” (1.0), “large” (1.5), a custom float > 0, or None.
anchor_class (int | None) – Class index the anchor predicts (if informative). None → all classes.
label (str | None) – Descriptive name for documentation.
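
The two structure options can be illustrated with a small pure-Python sketch. It is not the library's implementation: the helper names `equicorrelated` and `toeplitz_decay` are hypothetical, and the decay rule `rho ** |i - j|` is an assumption about what "decays with feature distance" means here.

```python
def equicorrelated(n: int, rho: float) -> list[list[float]]:
    """Correlation matrix with 1.0 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def toeplitz_decay(n: int, rho: float) -> list[list[float]]:
    """Assumed decay rule: entry (i, j) equals rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# Three features (anchor plus two proxies) at correlation 0.5.
equi = equicorrelated(3, 0.5)
toep = toeplitz_decay(3, 0.5)

for name, matrix in (("equicorrelated", equi), ("toeplitz", toep)):
    print(name)
    for row in matrix:
        print("  ", row)
```

Under this assumed rule, neighbouring features keep the full correlation while features two positions apart drop to its square, whereas the equicorrelated matrix keeps every off-diagonal entry at the same value.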

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

__init__(**data)

Parameters: data (Any)
Return type: None

Methods

`__init__`(**data)	Create a new model by parsing and validating input data from keyword arguments.
`construct`([_fields_set])
`copy`(*[, include, exclude, update, deep])	Returns a copy of the model.
`dict`(*[, include, exclude, by_alias, ...])
`from_orm`(obj)
`get_correlation_for_class`(class_idx)	Resolve correlation for a specific class.
`is_class_specific`()	Return True if this cluster uses class-specific correlations.
`json`(*[, include, exclude, by_alias, ...])
`model_construct`([_fields_set])	Creates a new instance of the Model class with validated data.
`model_copy`(*[, update, deep])	Returns a copy of the model.
`model_dump`(*[, mode, include, exclude, ...])	Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
`model_dump_json`(*[, indent, ensure_ascii, ...])	Generates a JSON representation of the model using Pydantic's to_json method.
`model_json_schema`([by_alias, ref_template, ...])	Generates a JSON schema for a model class.
`model_parametrized_name`(params)	Compute the class name for parametrizations of generic classes.
`model_post_init`(context, /)	Override this method to perform additional initialization after __init__ and model_construct.
`model_rebuild`(*[, force, raise_errors, ...])	Try to rebuild the pydantic-core schema for the model.
`model_validate`(obj, *[, strict, extra, ...])	Validate a pydantic model instance.
`model_validate_json`(json_data, *[, strict, ...])	Validate the given JSON data against the Pydantic model.
`model_validate_strings`(obj, *[, strict, ...])	Validate the given object with string data against the Pydantic model.
`parse_file`(path, *[, content_type, ...])
`parse_obj`(obj)
`parse_raw`(b, *[, content_type, encoding, ...])
`resolve_anchor_effect_size`()	Convert anchor_effect_size to a numeric effect size.
`schema`([by_alias, ref_template])
`schema_json`(*[, by_alias, ref_template])
`update_forward_refs`(**localns)
`validate`(value)

Attributes

`model_computed_fields`
`model_config`	Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
`model_extra`	Get extra fields set during validation.
`model_fields`
`model_fields_set`	Returns the set of fields that have been explicitly set on this model instance.
`n_cluster_features`
`structure`
`correlation`
`anchor_role`
`anchor_effect_size`
`anchor_class`
`label`

get_correlation_for_class(class_idx)

Resolve correlation for a specific class.

Global mode: return the single correlation value.
Class-specific mode: return mapping value or 0.0 if not specified.

Parameters: class_idx (int)
Return type: float

is_class_specific()

Return True if this cluster uses class-specific correlations.

Return type: bool
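
The behaviour documented for get_correlation_for_class and is_class_specific can be sketched as two standalone functions. This is a minimal illustration of the documented rules, not the library's source: the function names and the `Correlation` alias are hypothetical, and the assumption is that class-specific mode is detected by the correlation value being a dict.

```python
from typing import Union

# Hypothetical alias mirroring the documented type float | dict[int, float].
Correlation = Union[float, dict[int, float]]


def is_class_specific(correlation: Correlation) -> bool:
    """True when correlation is a {class_index: value} mapping (assumed check)."""
    return isinstance(correlation, dict)


def correlation_for_class(correlation: Correlation, class_idx: int) -> float:
    """Global mode: the single value. Class-specific mode: mapping value or 0.0."""
    if isinstance(correlation, dict):
        return float(correlation.get(class_idx, 0.0))
    return float(correlation)


# Pathway only active in class 1, as in the class-level example.
pathway = {0: 0.0, 1: 0.8}
print(is_class_specific(pathway), correlation_for_class(pathway, 1))
print(is_class_specific(0.7), correlation_for_class(0.7, 1))
```

A class index missing from the mapping falls back to 0.0, which matches the note that unlisted classes are treated as an independent cluster.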

model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict. With 'extra': 'forbid', unknown fields are rejected at validation.

resolve_anchor_effect_size()

Convert anchor_effect_size to a numeric effect size.

Return type: float
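
A rough sketch of the conversion, based only on the preset values listed for anchor_effect_size (“small” = 0.5, “medium” = 1.0, “large” = 1.5). The function name `resolve_effect_size` is hypothetical, and the handling of None is an assumption: this page does not say what None resolves to, so the sketch takes an explicit `default` argument rather than guessing the library's behaviour.

```python
from typing import Union

# Preset names and values as listed in the parameter description.
PRESETS = {"small": 0.5, "medium": 1.0, "large": 1.5}


def resolve_effect_size(value: Union[str, float, None], default: float = 1.0) -> float:
    """Map a preset name or custom number to a numeric effect size."""
    if value is None:
        # Assumption: the fallback for None is caller-supplied, not documented.
        return default
    if isinstance(value, str):
        if value not in PRESETS:
            raise ValueError(f"unknown preset: {value!r}")
        return PRESETS[value]
    if value <= 0:
        raise ValueError("custom effect size must be > 0")
    return float(value)


for candidate in ("small", "large", 2.0, None):
    print(candidate, "->", resolve_effect_size(candidate))
```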