biomedical_data_generator.utils.visualization

Plot utilities for correlation analysis.

Functions

plot_all_correlation_clusters(df, *[, ...])

Plot correlation matrix for all corr-cluster features with intra-cluster ordering.

plot_correlation_matrices_per_cluster(df, ...)

Draw one correlation matrix per cluster (cluster_id -> list of column indices).

plot_correlation_matrix(correlation_matrix, *)

Draw a correlation matrix as a heatmap.

plot_correlation_matrix_for_cluster(df, ...)

Slice a cluster via meta, compute its correlation, and plot it.

biomedical_data_generator.utils.visualization.plot_all_correlation_clusters(df, *, correlation_method='spearman', title=None, figsize=(10, 10), vmin=-1.0, vmax=1.0, annot=None, fmt='.2f', draw_cluster_boundaries=True, show=True)

Plot correlation matrix for all corr-cluster features with intra-cluster ordering.

Features are sorted by (cluster_id, feature_index) so that within each cluster, the anchor appears first, followed by features 2, 3, 4, etc.
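The ordering rule can be sketched in plain Python. The column names below (`corr1_anchor`, `corr1_2`, and so on) and the `sort_key` helper are hypothetical and chosen only for illustration; this page does not state how the function identifies cluster membership or the anchor.

```python
import re

# Hypothetical column names: "corr<cluster_id>_anchor" or "corr<cluster_id>_<k>".
# This naming scheme is an assumption for illustration, not the library's contract.
columns = ["corr2_3", "corr1_2", "corr2_anchor", "corr1_10", "corr1_anchor", "corr2_2"]

_PATTERN = re.compile(r"^corr(\d+)_(anchor|\d+)$")


def sort_key(name: str) -> tuple[int, int]:
    """Return (cluster_id, feature_index); the anchor gets index 1 so it sorts first."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a corr-cluster column: {name!r}")
    cluster_id = int(match.group(1))
    suffix = match.group(2)
    feature_index = 1 if suffix == "anchor" else int(suffix)
    return cluster_id, feature_index


# Sorting by the tuple groups clusters together and orders features numerically,
# so "corr1_10" lands after "corr1_2" instead of before it.
ordered = sorted(columns, key=sort_key)
print(ordered)
```

Sorting on a numeric tuple rather than on the raw string is what keeps feature 10 after feature 2 within a cluster.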

Parameters:
  • df (DataFrame) – DataFrame with all features.

  • correlation_method (Literal['pearson', 'kendall', 'spearman']) – Correlation method to use.

  • title (str | None) – Optional plot title.

  • figsize (tuple[int, int]) – Figure size as (width, height) in inches.

  • vmin (float) – Color scale minimum.

  • vmax (float) – Color scale maximum.

  • annot (bool | None) – If True, show numeric values. If None, auto-decide based on size.

  • fmt (str) – Number format for annotations.

  • draw_cluster_boundaries (bool) – If True, draw black lines between clusters.

  • show (bool) – If True, call plt.show().

Returns:

Tuple of (figure, axes).

Return type:

tuple[Figure | SubFigure, Axes]
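As a rough illustration of what `draw_cluster_boundaries` implies, the sketch below finds the positions at which the cluster changes in an already sorted feature list. The helper `cluster_boundaries` is hypothetical; how the function itself locates and draws the lines is not documented here. With Matplotlib, such positions would typically be passed to `Axes.axhline` and `Axes.axvline`, with an offset that depends on how the heatmap cells are drawn.

```python
def cluster_boundaries(cluster_ids: list[int]) -> list[int]:
    """Return every position i where the cluster changes between item i-1 and item i."""
    return [
        i
        for i in range(1, len(cluster_ids))
        if cluster_ids[i] != cluster_ids[i - 1]
    ]


# Cluster id of each feature after sorting: 3 features, then 2, then 4.
ordered_cluster_ids = [1, 1, 1, 2, 2, 3, 3, 3, 3]
boundaries = cluster_boundaries(ordered_cluster_ids)
print(boundaries)
```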

biomedical_data_generator.utils.visualization.plot_correlation_matrices_per_cluster(df, clusters, *, labels_map=None, correlation_method='pearson', vmin=-1.0, vmax=1.0, annot=False, fmt='.2f', show=True)

Draw one correlation matrix per cluster (cluster_id -> list of column indices).

Parameters:
  • df (DataFrame) – DataFrame with all features.

  • clusters (Mapping[Any, list[int]]) – Mapping cluster_id -> list of column indices in df.

  • labels_map (Mapping[Any, str] | None) – Optional mapping cluster_id -> cluster label for titles.

  • correlation_method (Literal['pearson', 'kendall', 'spearman']) – Correlation method to use.

  • vmin (float) – Color scale minimum.

  • vmax (float) – Color scale maximum.

  • annot (bool) – If True, draw numeric values for small matrices (p <= 25).

  • fmt (str) – Number format for annotations.

  • show (bool) – If True and a new figure is created here, call plt.show().

Returns:

Mapping cluster_id -> (fig, ax) tuple for each plotted correlation matrix.

Return type:

Mapping from cluster_id to a (Figure, Axes) tuple

Notes:

  • Computation is delegated to compute_correlation_matrix, keeping computation separate from plotting (separation of concerns).

  • If you have a meta object instead of an index mapping, pass meta.corr_cluster_indices.
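To make the shape of the `clusters` argument concrete, here is a minimal, plotting-free sketch that slices columns by position for each cluster and computes one Pearson matrix per cluster. The `pearson` and `correlation_per_cluster` helpers are stand-ins written for this example; they are not the library's `compute_correlation_matrix`, whose signature is not shown on this page.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_per_cluster(columns, clusters):
    """Map each cluster_id to the correlation matrix of its selected columns."""
    out = {}
    for cluster_id, indices in clusters.items():
        selected = [columns[i] for i in indices]  # slice by column position
        out[cluster_id] = [[pearson(a, b) for b in selected] for a in selected]
    return out


columns = [
    [1.0, 2.0, 3.0, 4.0],  # column 0
    [2.0, 4.0, 6.0, 8.0],  # column 1: perfectly correlated with column 0
    [4.0, 3.0, 2.0, 1.0],  # column 2
    [1.0, 3.0, 2.0, 4.0],  # column 3
]
# Same shape as the `clusters` parameter: cluster_id -> list of column indices.
clusters = {"A": [0, 1], "B": [2, 3]}
matrices = correlation_per_cluster(columns, clusters)
print(matrices["A"][0][1], matrices["B"][0][1])
```

The result is keyed by cluster_id in the same way as the documented return value, which holds (fig, ax) tuples rather than matrices.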

biomedical_data_generator.utils.visualization.plot_correlation_matrix(correlation_matrix, *, title=None, ax=None, vmin=-1.0, vmax=1.0, annot=False, fmt='.2f', labels=None, show=True)

Draw a correlation matrix as a heatmap.

Parameters:
  • correlation_matrix (ndarray[tuple[Any, ...], dtype[float64]]) – Square correlation matrix of shape (p, p).

  • title (str | None) – Optional plot title.

  • ax (Axes | None) – Optional Matplotlib Axes to draw on (created if None).

  • vmin (float) – Color scale minimum.

  • vmax (float) – Color scale maximum.

  • annot (bool) – If True, draw numeric values for small matrices (p <= 25).

  • fmt (str) – Number format for annotations.

  • labels (Sequence[str] | None) – Optional tick labels (length p). If omitted, generic ‘feature’ axis labels are used.

  • show (bool) – If True and a new figure is created here, call plt.show().

Returns:

The Figure and Axes used.

Return type:

Tuple of (Figure, Axes)
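The input constraints above (a square (p, p) matrix, `labels` of length p, annotations only for p <= 25) can be expressed as a small pre-check. This is a sketch of one reading of the parameter descriptions; `check_inputs` and `will_annotate` are hypothetical helpers, and whether the function itself validates its inputs or raises on a mismatch is not documented here.

```python
ANNOT_MAX_P = 25  # size limit quoted in the `annot` description


def check_inputs(matrix, labels=None):
    """Validate that `matrix` is square (p, p) and `labels`, if given, has length p."""
    p = len(matrix)
    if any(len(row) != p for row in matrix):
        raise ValueError("correlation_matrix must be square, shape (p, p)")
    if labels is not None and len(labels) != p:
        raise ValueError(f"expected {p} labels, got {len(labels)}")
    return p


def will_annotate(annot, p):
    """Assumed rule: numbers are drawn only when requested and the matrix is small."""
    return bool(annot) and p <= ANNOT_MAX_P


matrix = [
    [1.0, 0.8, -0.1],
    [0.8, 1.0, 0.2],
    [-0.1, 0.2, 1.0],
]
p = check_inputs(matrix, labels=["a", "b", "c"])
print(p, will_annotate(True, p), will_annotate(True, 40))
```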

biomedical_data_generator.utils.visualization.plot_correlation_matrix_for_cluster(df, meta, cluster_id, *, correlation_method='pearson', anchor_first=True, natural_sort_rest=True, title=None, ax=None, vmin=-1.0, vmax=1.0, annot=False, fmt='.2f', show=True)

Slice a cluster via meta, compute its correlation, and plot it.

Returns the numeric correlation matrix in the plotted column order.

Parameters:
  • df (DataFrame) – DataFrame with all features.

  • meta (Any) – Meta object with cluster information.

  • cluster_id (int) – ID of the cluster to plot.

  • correlation_method (Literal['pearson', 'kendall', 'spearman']) – Correlation method to use.

  • anchor_first (bool) – If True, anchor features are placed first in the cluster frame.

  • natural_sort_rest (bool) – If True, non-anchor features are sorted naturally.

  • title (str | None) – Optional plot title.

  • ax (Axes | None) – Optional Matplotlib Axes to draw on (created if None).

  • vmin (float) – Color scale minimum.

  • vmax (float) – Color scale maximum.

  • annot (bool) – If True, draw numeric values for small matrices (p <= 25).

  • fmt (str) – Number format for annotations.

  • show (bool) – If True and a new figure is created here, call plt.show().

Returns:

The computed correlation matrix as a 2D NumPy array.

Return type:

2D NumPy array (ndarray)
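The effect of `anchor_first` and `natural_sort_rest` on column order can be sketched as follows. The helpers `natural_key` and `order_cluster_columns`, the column names, and the handling of `anchor_first=False` are assumptions made for this example; the function's actual ordering logic is not shown on this page.

```python
import re


def natural_key(name: str) -> list:
    """Split digit runs from text so that 'corr1_10' sorts after 'corr1_2'."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", name)
    ]


def order_cluster_columns(names, anchors, anchor_first=True, natural_sort_rest=True):
    """Order one cluster's columns: anchors first (optional), rest naturally sorted (optional)."""
    anchor_set = set(anchors)
    if anchor_first:
        head = [n for n in names if n in anchor_set]
        rest = [n for n in names if n not in anchor_set]
    else:
        # Assumption: without anchor_first, anchors are treated like any other column.
        head, rest = [], list(names)
    if natural_sort_rest:
        rest = sorted(rest, key=natural_key)
    return head + rest


# Hypothetical column names for a single cluster.
names = ["corr1_10", "corr1_2", "corr1_anchor", "corr1_3"]
plotted_order = order_cluster_columns(names, anchors=["corr1_anchor"])
print(plotted_order)
print(sorted(names))  # plain lexicographic order, for comparison
```

A plain string sort would place `corr1_10` before `corr1_2`; the natural key compares the digit runs as integers, which gives the order a reader would expect on the plot axes.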