Metamarker: differential correlation network methodology and software for metabolomic data analysis

Arafat, Arshad (2021) Metamarker: differential correlation network methodology and software for metabolomic data analysis. Masters thesis, Memorial University of Newfoundland.

PDF (English) - Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Biomarkers are measurable substances within an organism whose levels indicate disease progression. Metabolomics is a newer approach to understanding the human body, following in the footsteps of other "omics" techniques (genomics, proteomics, transcriptomics). It is the scientific study of low-molecular-weight intracellular compounds called metabolites. With advances in technology, it is now easier to extract different sets of metabolites from various biological samples such as cells, tissues, and biofluids. Metabolomic data analysis is a complex workflow that requires sophisticated data processing and statistical analysis. Various tools have been developed for it, including tools for data cleaning and preprocessing, modeling, validation, and result visualization. Most of these tools are built for comprehensive studies rather than specifically for metabolomic biomarker discovery, so their capacity for that task is, in most cases, limited. The modeling techniques they commonly use are also inadequate: many provide basic analysis methods rather than more advanced machine learning techniques, whereas high-throughput metabolomic datasets require compound analysis techniques. This thesis designed and developed a software tool that covers the general metabolomic biomarker research workflow. The platform offers analysis techniques ranging from basic to advanced, interactive visualizations, detailed result analysis, and comparison modules, and a first version has been released. It is designed so that users do not have to switch between different tools during a study, since it provides the features commonly used throughout the workflow.
Some of the software's significant features are outlier handling for uploaded datasets, analysis of a dataset with principal component analysis (PCA) or partial least squares discriminant analysis (PLS-DA), and comparison of different models. The software makes the study process fast and convenient. For biomarker discovery studies, we employed a differential correlation network analysis model, which is advantageous for finding key metabolites that influence disease through their interactions.
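
As an illustration of the general idea behind differential correlation network analysis (not the thesis's own implementation, which this record does not describe), the following minimal Python sketch compares metabolite-to-metabolite Pearson correlations between a control group and a case group using Fisher's z-transformation, and keeps the pairs whose correlation differs markedly as network edges. The function name, the threshold value, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def differential_correlation(control, case, z_threshold=1.96):
    """Hypothetical helper: build a differential correlation network.

    control, case: arrays of shape (n_samples, n_metabolites).
    Returns the matrix of z-scores for the change in correlation,
    a boolean adjacency matrix, and the degree of each metabolite.
    """
    n_ctrl, n_case = control.shape[0], case.shape[0]

    # Pearson correlation between metabolites (columns) within each group.
    r_ctrl = np.corrcoef(control, rowvar=False)
    r_case = np.corrcoef(case, rowvar=False)

    # Fisher z-transformation; clip so that |r| = 1 does not give infinity.
    z_ctrl = np.arctanh(np.clip(r_ctrl, -0.999999, 0.999999))
    z_case = np.arctanh(np.clip(r_case, -0.999999, 0.999999))

    # Standard error of the difference between two independent z values.
    se = np.sqrt(1.0 / (n_ctrl - 3) + 1.0 / (n_case - 3))
    z_diff = (z_case - z_ctrl) / se
    np.fill_diagonal(z_diff, 0.0)  # a metabolite is not compared with itself

    # An edge marks a pair whose correlation changes between the groups.
    adjacency = np.abs(z_diff) > z_threshold
    degree = adjacency.sum(axis=1)
    return z_diff, adjacency, degree


# Synthetic data (assumption): 200 samples per group, 5 metabolites.
rng = np.random.default_rng(0)
control = rng.normal(size=(200, 5))
case = rng.normal(size=(200, 5))
# In the case group only, metabolite 1 closely tracks metabolite 0.
case[:, 1] = case[:, 0] + 0.1 * rng.normal(size=200)

z_diff, adjacency, degree = differential_correlation(control, case)
ranking = np.argsort(-degree)  # metabolites ordered by number of edges
```

In a sketch like this, metabolites with many edges are those whose relationships with other metabolites change most between the groups, which is one plausible way to shortlist candidate biomarkers. A real analysis would also need multiple-testing correction, which is omitted here.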

Item Type: Thesis (Masters)
Item ID: 14944
Additional Information: Includes bibliographical references (pages 88-98).
Keywords: Metabolomics, Machine Learning, Data Visualization, Differential Correlation Network, Biomarker Discovery
Department(s): Science, Faculty of > Computer Science
Date: March 2021
Date Type: Submission
Digital Object Identifier (DOI):
Library of Congress Subject Heading: Biochemical markers--Research; Computer software--Design; Machine learning.
