Ma, Yue (2005) An anchor-based model for global multiple alignment of whole genome sequences. Masters thesis, Memorial University of Newfoundland.
- Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
With the benefit of advanced biotechnology, large numbers of whole genome sequences have been compiled. Aligning whole genome sequences is a fundamentally different problem from aligning short sequences, and it has recently attracted intensive research. We propose an anchor-based model for global multiple alignment of whole genome sequences. The model has three main phases. First, an enhanced suffix array method finds anchors. Next, an exact chaining algorithm, based on dynamic programming and the longest common subsequence idea, computes an anchor chain for the weighted anchors. Finally, a progressive multiple alignment method closes the gaps between the anchors. The proposed chaining procedure is based on evolutionary theory and can align whole genome sequences not only of close homologs but also of distant species. Combined with the exact suffix array approach, this model can compute partially accurate solutions and generate a high-quality alignment result in terms of computation and biology.
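The chaining phase described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it is a simplified pairwise version (two sequences rather than many), using a quadratic dynamic program in the spirit of a weighted longest common subsequence. The `Anchor` type, its fields, the `chain_anchors` function, and the use of match length as the weight are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Anchor(NamedTuple):
    """A hypothetical exact match shared by sequences A and B."""
    start_a: int   # start position of the match in sequence A
    start_b: int   # start position of the match in sequence B
    length: int    # match length (assumed to be at least 1)
    weight: float  # anchor score; the weighting scheme is an assumption


def chain_anchors(anchors: List[Anchor]) -> List[Anchor]:
    """Return a maximum-weight chain of collinear, non-overlapping anchors.

    Anchor p may precede anchor q only if p ends before q starts in
    both sequences. Runs in O(n^2) time for n anchors.
    """
    if not anchors:
        return []

    # Any valid predecessor starts strictly earlier in A, so sorting by
    # start_a guarantees predecessors appear before their successors.
    ordered = sorted(anchors, key=lambda a: (a.start_a, a.start_b))
    n = len(ordered)
    best = [a.weight for a in ordered]  # best chain weight ending at i
    prev = [-1] * n                     # back-pointer to rebuild the chain

    for i in range(n):
        cur = ordered[i]
        for j in range(i):
            p = ordered[j]
            collinear = (p.start_a + p.length <= cur.start_a
                         and p.start_b + p.length <= cur.start_b)
            if collinear and best[j] + cur.weight > best[i]:
                best[i] = best[j] + cur.weight
                prev[i] = j

    # Trace back from the end of the heaviest chain.
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(ordered[end])
        end = prev[end]
    chain.reverse()
    return chain


if __name__ == "__main__":
    example = [
        Anchor(0, 0, 5, 5.0),
        Anchor(10, 3, 4, 4.0),    # conflicts with its neighbours in B or A
        Anchor(6, 6, 3, 3.0),
        Anchor(12, 12, 6, 6.0),
        Anchor(2, 20, 10, 10.0),  # heavy, but cannot be chained with the rest
    ]
    for anchor in chain_anchors(example):
        print(anchor)
```

In a three-phase pipeline of the kind the abstract outlines, the regions between consecutive anchors of the returned chain are the gaps that a subsequent alignment step would close.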
Item Type: Thesis (Masters)
Additional Information: Bibliography: leaves 76-83.
Department(s): Science, Faculty of > Computer Science
Library of Congress Subject Heading: Genomes--Mathematical models; Nucleotide sequence--Mathematical models.