Gene prediction by combining outputs from ExonHunter and SGP2

Kuai, Yujing (2009) Gene prediction by combining outputs from ExonHunter and SGP2. Masters thesis, Memorial University of Newfoundland.

PDF (English) - Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Gene prediction has recently become a critical research area in computational biology. This thesis presents our research on predicting genes in human DNA sequences. We present two algorithms that predict human genes by combining the outputs of two chosen gene finders: one uses combination methods and the other applies cross-species comparative sequence analysis. Based on these algorithms, a user-friendly gene finder can be developed to predict human genes accurately and thus help discover the genetic causes of incurable human diseases.

Combination methods and cross-species comparative sequence analysis are two approaches that have become increasingly helpful. This thesis first summarizes and classifies the main algorithms used in each approach. Specifically, we study two gene finders that use comparative sequence analysis and three that apply combination methods. Their architectures and experiments are reviewed separately, and overall comparisons are made. According to our survey, many current gene finders predict genes with good accuracy, but either the methods they apply have limitations or the gene finders are difficult for biologists and medical researchers to use. To address these two disadvantages, we develop two algorithms that combine the outputs of a gene finder using combination methods and a gene finder using cross-species comparative sequence analysis. Using comparisons with the genomes of Mus musculus and Canis familiaris, the algorithms are tested first on the HMR195 dataset and then on the sequence between the markers D3S1259 and D3S3659 on human chromosome 3p25. The results show that, to some extent, our algorithms improve on the performance of the gene finder using either comparative sequence analysis or combination methods alone, and that each algorithm has its own advantages in predicting different genetic information. Additionally, our work shows a promising prospect for developing a gene finder with a friendlier interface.

Item Type: Thesis (Masters)
Item ID: 8827
Additional Information: Includes bibliographical references (leaves 101-115).
Department(s): Science, Faculty of > Computer Science
Date: 2009
Date Type: Submission
Library of Congress Subject Heading: Computational biology--Methodology; Gene mapping--Computer simulation; Genomics--Methodology
