Patra, Pranjal (2015) Regulation expression pathway analysis (REPA): a novel method to facilitate biological interpretation of high throughput expression profiling data. Masters thesis, Memorial University of Newfoundland.
- Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
In the past decade, gene expression profiling has seen great advances and the emergence of new techniques. As these techniques have grown in popularity, so has the amount of data they generate. Many approaches have addressed the task of analyzing these data to build a global picture and identify the biological pathways relevant to a study. These approaches (collectively termed enrichment analysis) have also grown in sophistication and accuracy, making them the default step following a gene expression profiling experiment. However, enrichment analysis approaches do not provide pointers to likely regulators in their results. In this project, we built a system called Regulation Expression Pathway Analysis (REPA) to facilitate the biological interpretation of results from high-throughput gene expression profiling experiments. In particular, we provide researchers with the gene sets that were most active in the biological phenomenon under study, together with their likely regulators. Users can input gene expression profile data from their experiments into REPA and obtain a list of disturbed gene sets and the inferred transcription factors that possibly regulate them. To build this system, we first processed transcription factor binding data from the ENCODE project to quantify the strength of regulation that each transcription factor has on each gene set. We then built a gene expression enrichment analysis system that analyzes gene expression profiling data and lists the most active gene sets. Finally, we combined the results of the previous two steps to arrive at a more complete picture that tells users not only which gene sets are most active, but also which regulators most likely control them.
|Item Type:|Thesis (Masters)|
|Additional Information:|Includes bibliographical references (pages 105-127).|
|Keywords:|Transcription factor binding data, Gene set analysis, Pathway analysis, Bioinformatics|
|Department(s):|Science, Faculty of > Computer Science|
|Library of Congress Subject Heading:|Gene expression--Data processing; Bioinformatics; DNA microarrays--Data processing|