Local Data Loading

This example loads a dataset from local CSV files.

Load up a local data file

from biolearn.data_library import GeoData
from biolearn.util import get_test_data_file

# Files formatted as described in the standard https://bio-learn.github.io/methylation-standard.html will load correctly.
# The loader searches for files named [name]_metadata.csv and [name]_methylation_part0.csv.

file_path = get_test_data_file("")
data = GeoData.load_csv(file_path, "example")

data.dnam
            GSM3074480  GSM3074481  GSM3074482  GSM3074483
cpgSite
cg00000029       0.395       0.499       0.420       0.423
cg00000109       0.821       0.868       0.877       0.841
cg00000155       0.878       0.890       0.894         NaN
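
Before calling the loader, it can help to confirm that the folder contains files matching the naming pattern described in the comments above. The following is a minimal sketch using only the Python standard library. The helper names `expected_files` and `missing_files` are hypothetical and are not part of Biolearn; the only thing taken from this page is the `[name]_metadata.csv` / `[name]_methylation_part0.csv` pattern.

```python
from pathlib import Path
import tempfile


def expected_files(folder, name):
    """Return the two paths named by the pattern in the comments above.

    Hypothetical helper, not a Biolearn function.
    """
    folder = Path(folder)
    return [
        folder / f"{name}_metadata.csv",
        folder / f"{name}_methylation_part0.csv",
    ]


def missing_files(folder, name):
    """Return the names of expected files that are absent from the folder."""
    return [p.name for p in expected_files(folder, name) if not p.is_file()]


# Demonstration: a temporary folder holding only the methylation file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example_methylation_part0.csv").write_text("cpgSite,GSM3074480\n")
    demo_missing = missing_files(tmp, "example")
    print(demo_missing)
```

In the demonstration only the metadata file is reported as absent. Given that this page describes metadata as loaded "if available", a missing metadata file is presumably less serious than a missing methylation file, though this sketch does not distinguish the two.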


Metadata is loaded if available

data.metadata
            Sex   Age Disease_State
SampleID
GSM3074480    1  25.2          None
GSM3074481    1  29.5         COVID
GSM3074482    2  39.7          None


You can now use it like any other Biolearn dataset

Quality Report Summary

------------------------------------------------
Sample Count: 4
Methylation Sites: 3
Missing Methylation Data: 1 (8.33%)
Samples With High Deviation: 0 (0.00%)
Methylation Sites With Over 20% of Reads Missing: 1 (33.33%)

Notes:
------------------------------------------------
- Your data set includes methylation sites that have over 20% of reads missing. Default imputation may replace the values for all reads from this site with a gold standard.
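
The percentages in the summary can be checked by hand against the methylation values printed earlier on this page. The sketch below reproduces that arithmetic in plain Python. It is not Biolearn's implementation; the variable names are illustrative, and the 20% threshold is taken from the report text.

```python
import math

# Values copied from the data.dnam output shown earlier
# (one row per methylation site, one value per sample).
dnam = {
    "cg00000029": [0.395, 0.499, 0.420, 0.423],
    "cg00000109": [0.821, 0.868, 0.877, 0.841],
    "cg00000155": [0.878, 0.890, 0.894, math.nan],
}


def is_missing(value):
    """Treat NaN as a missing read."""
    return math.isnan(value)


# Share of all reads that are missing: 1 of 12.
total_reads = sum(len(row) for row in dnam.values())
missing_reads = sum(is_missing(v) for row in dnam.values() for v in row)
missing_pct = 100 * missing_reads / total_reads

# Sites where more than 20% of reads are missing: cg00000155 has 1 of 4 (25%).
sites_over = [
    site
    for site, row in dnam.items()
    if sum(is_missing(v) for v in row) / len(row) > 0.20
]
site_pct = 100 * len(sites_over) / len(dnam)

print(f"Missing Methylation Data: {missing_reads} ({missing_pct:.2f}%)")
print(
    "Methylation Sites With Over 20% of Reads Missing: "
    f"{len(sites_over)} ({site_pct:.2f}%)"
)
```

With four samples per site, a single missing read already puts a site at 25%, which is why one NaN is enough to trigger the note about default imputation.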
