Note

This page is reference documentation. It explains only the class signature, not how to use the class. Please refer to the user guide for the big picture.

biolearn.data_library.GeoData

class biolearn.data_library.GeoData(metadata, dnam=None, rna=None, protein_alamar=None, protein_olink=None)

Represents genomic data with a focus on metadata and methylation data.

GeoData facilitates organizing and accessing metadata and methylation data.

metadata

A pandas DataFrame where rows represent different samples and columns represent different data fields.

Type:

DataFrame

dnam

A pandas DataFrame where columns represent different samples and rows represent different methylation sites.

Type:

DataFrame
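The two attributes are oriented differently: samples index the rows of metadata but the columns of dnam. A minimal sketch of that layout using plain pandas follows; the sample IDs, site IDs, and field names are made up for illustration and are not taken from any biolearn dataset.

```python
import pandas as pd

# Hypothetical sample IDs and fields; real datasets define their own.
metadata = pd.DataFrame(
    {"age": [34, 61], "sex": ["F", "M"]},
    index=["sample_1", "sample_2"],  # rows = samples
)

# Hypothetical methylation site IDs; values are methylation levels.
dnam = pd.DataFrame(
    {"sample_1": [0.12, 0.85, 0.40], "sample_2": [0.10, 0.91, 0.38]},
    index=["cg0000001", "cg0000002", "cg0000003"],  # rows = methylation sites
)

# The same samples appear as metadata rows and as dnam columns.
samples_match = list(metadata.index) == list(dnam.columns)
print(samples_match)
```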

__init__(metadata, dnam=None, rna=None, protein_alamar=None, protein_olink=None)

Initializes the GeoData instance.

Parameters:
  • metadata (DataFrame) – Metadata associated with genomic samples.

  • dnam (DataFrame, optional) – Methylation data associated with genomic samples. Default is None.

copy()

Creates a deep copy of the GeoData instance.

Returns:

A new instance of GeoData with copies of the metadata and dnam DataFrames.

Return type:

GeoData
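As an illustration of the constructor and copy() together, the sketch below builds a small instance and checks that editing the copy leaves the original unchanged. It uses only the signatures listed on this page; the data values are hypothetical, and the import is guarded so the snippet still runs where biolearn is not installed.

```python
import pandas as pd

try:
    from biolearn.data_library import GeoData
except ImportError:  # biolearn not installed; the library calls are skipped
    GeoData = None

# Hypothetical data: two samples, two methylation sites.
metadata = pd.DataFrame({"age": [34, 61]}, index=["sample_1", "sample_2"])
dnam = pd.DataFrame(
    {"sample_1": [0.12, 0.85], "sample_2": [0.10, 0.91]},
    index=["cg0000001", "cg0000002"],
)

copy_is_independent = None
if GeoData is not None:
    original = GeoData(metadata, dnam=dnam)
    duplicate = original.copy()
    # Because the copy is deep, this edit should not reach the original.
    duplicate.metadata.loc["sample_1", "age"] = 99
    copy_is_independent = bool(original.metadata.loc["sample_1", "age"] == 34)
```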

quality_report(sites=None)

Generates a quality control report for the genomic data, optionally restricted to specified methylation sites. The report includes a detailed section giving the percentage of missing values for each methylation site.

Parameters:

sites (list, optional) – A list of methylation site identifiers to include in the report. If None, all sites are included.

Returns:

An object containing detailed methylation data, a summary, and a detailed section for missing percentages per site.

Return type:

QualityReport
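To make "missing percentage per site" concrete, the following sketch computes that quantity directly with pandas on a hypothetical methylation matrix. It shows the idea only and is not the library's implementation; the structure of the actual QualityReport object is not described here.

```python
import pandas as pd

nan = float("nan")

# Hypothetical methylation matrix: rows = sites, columns = samples.
dnam = pd.DataFrame(
    {
        "sample_1": [0.12, nan, 0.40],
        "sample_2": [0.10, 0.91, nan],
        "sample_3": [0.11, nan, 0.39],
        "sample_4": [0.13, 0.88, 0.41],
    },
    index=["cg0000001", "cg0000002", "cg0000003"],
)

sites = ["cg0000002", "cg0000003"]  # optional filter; None keeps every site
selected = dnam if sites is None else dnam.loc[sites]

# Share of samples with a missing value at each site, as a percentage.
missing_pct = selected.isna().mean(axis=1) * 100
print(missing_pct)
```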

classmethod from_methylation_matrix(matrix)

Creates a GeoData instance from a methylation matrix which can be either a DataFrame directly or a path to a CSV file.

Parameters:

matrix (Union[str, DataFrame]) – Methylation matrix as a DataFrame or the path to the CSV file containing the matrix.

Returns:

An instance of GeoData with the methylation data loaded and metadata initialized.

Return type:

GeoData
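The two accepted input forms can be sketched as follows. The CSV layout used here (site IDs in the first column, one column per sample) is an assumption rather than something stated on this page, and the data values are hypothetical. The import is guarded so the snippet runs without biolearn installed.

```python
import os
import tempfile

import pandas as pd

try:
    from biolearn.data_library import GeoData
except ImportError:  # biolearn not installed; the library calls are skipped
    GeoData = None

# Hypothetical matrix: rows = methylation sites, columns = samples.
matrix = pd.DataFrame(
    {"sample_1": [0.5, 0.25], "sample_2": [0.75, 0.125]},
    index=["cg0000001", "cg0000002"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "matrix.csv")
    matrix.to_csv(csv_path)  # assumed layout: site IDs in the first column
    csv_written = os.path.exists(csv_path)

    if GeoData is not None:
        # Form 1: pass the DataFrame directly.
        from_frame = GeoData.from_methylation_matrix(matrix)
        # Form 2: pass the path to a CSV file holding the same matrix.
        from_path = GeoData.from_methylation_matrix(csv_path)
```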

save_csv(folder_path, name)

Saves the GeoData instance to CSV files according to the DNA Methylation Array Data Standard V-2410.

Parameters:
  • folder_path (str) – The directory where the files will be saved.

  • name (str) – The base name for the saved files.

Returns:

None

classmethod load_csv(folder_path, name, series_part='all', validate=True)

Loads a GeoData instance from CSV files according to the DNA Methylation Array Data Standard V-2410.

Parameters:
  • folder_path (str) – The directory where the files are located.

  • name (str) – The base name for the files.

  • series_part (str or int) – “all” to load and concatenate all methylation parts; otherwise, an integer specifying the part number to load. Default is “all”.

  • validate (bool) – Whether to validate metadata-omics consistency. Default is True.

Returns:

A GeoData instance with metadata, methylation data, RNA, and protein data loaded.

Return type:

GeoData

Examples using biolearn.data_library.GeoData

“Deconvolution Example”
