Note

This page is reference documentation. It explains only the class signature, not how to use the class. Refer to the user guide for the big picture.

biolearn.data_library.DataLibrary

class biolearn.data_library.DataLibrary(library_file=None, cache=None)

Manages a collection of data sources for biomarker research.

The DataLibrary class is responsible for loading, storing, and retrieving data sources. Data sources are defined in a library file, and further sources can be added at runtime with load_sources. Currently, DNA methylation data from GEO (Gene Expression Omnibus) is supported.

__init__(library_file=None, cache=None)

Initializes the DataLibrary instance with an optional library file and cache mechanism.

Parameters:
  • library_file (str, optional) – The path to the library file. If None, the default biolearn library file is loaded.

  • cache (object, optional) – An object that adheres to the caching interface used in the caching module. If None, the default cache is used. This cache is used by all data sources returned from the library.

load_sources(library_file)

Loads data sources from the given library file and appends them to the current set of data sources.

Parameters:

library_file (str) – The file path of the library file to load data sources from.

get(source_id)

Retrieves a data source by its identifier.

Parameters:

source_id (str) – The identifier of the data source to retrieve.

Returns:

The data source with the given identifier if found, otherwise None.
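Because get returns None rather than raising when an identifier is unknown, callers may want to fail early. The following is a minimal sketch of that pattern, not part of biolearn: require_source and _StubLibrary are hypothetical names, and the stub exists only so the example runs without biolearn installed. It relies on nothing beyond the documented contract that get returns the source or None.

```python
class _StubLibrary:
    """Hypothetical stand-in that mimics only the documented get() contract."""

    def __init__(self, sources):
        # Map of source identifier -> source object (assumed storage layout).
        self._sources = dict(sources)

    def get(self, source_id):
        # Like the documented method: the source if found, otherwise None.
        return self._sources.get(source_id)


def require_source(library, source_id):
    """Return the data source, raising KeyError instead of passing None along."""
    source = library.get(source_id)
    if source is None:
        raise KeyError(f"No data source with id {source_id!r}")
    return source


# Placeholder identifiers and values, chosen only for illustration.
library = _StubLibrary({"example_id": "example source object"})
found = require_source(library, "example_id")

try:
    require_source(library, "missing_id")
except KeyError as err:
    missing_message = str(err)
```

Since require_source depends only on a get method, the same helper should work unchanged when passed a real DataLibrary instance.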

lookup_sources(organism=None, format=None)

Looks up data sources based on the specified organism and/or format.

Parameters:
  • organism (str, optional) – The organism to filter the data sources by.

  • format (str, optional) – The format to filter the data sources by.

Returns:

A list of the data sources that match every criterion that was specified.
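The filtering described above can be sketched in plain Python. This is an illustration under stated assumptions, not biolearn's implementation: SketchSource, its organism and format attributes, the lookup function, and all sample values are hypothetical. It assumes that a criterion left as None is ignored and that, when both criteria are given, a source must match both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSource:
    """Hypothetical record; real data sources may expose different attributes."""
    id: str
    organism: str
    format: str


def lookup(sources, organism=None, format=None):
    # A criterion left as None places no restriction on that field (assumed).
    # The parameter name `format` mirrors the documented signature.
    return [
        s for s in sources
        if (organism is None or s.organism == organism)
        and (format is None or s.format == format)
    ]


# Placeholder entries, chosen only for illustration.
sources = [
    SketchSource("a", "human", "methylation"),
    SketchSource("b", "mouse", "methylation"),
    SketchSource("c", "human", "other"),
]

human_ids = [s.id for s in lookup(sources, organism="human")]
both_ids = [s.id for s in lookup(sources, organism="human", format="methylation")]
all_ids = [s.id for s in lookup(sources)]
```

With the real class, the equivalent call would be lookup_sources(organism=..., format=...) on a DataLibrary instance; the accepted organism and format values depend on the loaded library file.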

Examples using biolearn.data_library.DataLibrary

Training an ElasticNet model
