An algorithm for finding optimal descent trees in genealogies conditional on the observed data

Li, Qiong (2010) An algorithm for finding optimal descent trees in genealogies conditional on the observed data. Masters thesis, Memorial University of Newfoundland.

[img] [English] PDF - Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Download (4MB)


There are many diseases caused by genes such as cystic fibrosis, hereditary spherocytosis and duchenne muscular dystrophy. Identifying the actual disease-causing genes is important since it may help prevent or avoid genetic disorder. Finding out which genes contribute to diseases is also helpful for understanding why some individuals are more inclined to have physical diseases than others. To do this, we should determine regions of chromosomes that are likely to contain the particular genes responsible for a given human disease. Genetic linkage analysis is developed as a statistical method that allows us to determine these regions of chromosomes. -- Geneticists use pedigrees because they offer many advantages for genetic mapping regardless of the incidence of the genetic disease. One of this advantages is that study of pedigrees is quite powerful if the disease is rare, nevertheless there are many other aspects, like genetic homogeneity, the patterns of transmission, etc. These advantages make the study of pedigrees attractive. However, utilizing such pedigrees in genetic analysis is a computationally challenging task. Time complexity of some algorithms for genetic analysis is exponential in the size of the pedigree. Therefore, it is desirable to find a potentially optimal subpedigree that connects the individuals with the disorder. The aim of this thesis is to study the methods of finding optimal subpedigrees conditional on the observed data in a large pedigree. -- In chapter 1, we first provide background of finding optimal subpedigrees in a large original pedigree problem, in particular, one method to find such subpedigrees that uses graph theory techniques. Then, we introduce some terminologies in graph theory related to the Steiner tree problem. At the end of this chapter, some basic ideas about statistical genetics are provided. -- Chapter 2 concentrates on the description of constructing subpedigrees in a large pedigree based on the Steiner tree problem in graph theory. Two pieces of software, PedHunter and Miniped, that use the Steiner tree algorithm in constructing optimal subpedigrees are introduced. The study in this chapter also enables us to consider the most likely descent trees conditional on the observed data. -- Chapter 3 is devoted to the study of algorithms for finding the most likely descent trees. We first present an algorithm to find the probability of every edge in all possible descent trees, and then reformulate this problem into a directed Steiner tree problem in graph theory. Furthermore, we provide an approximation algorithm to solve this directed Steiner tree problem. -- The final chapter summarizes the results in this thesis and points out some problems for future study.

Item Type: Thesis (Masters)
Item ID: 8698
Additional Information: Includes bibliographical references (leaves 51-57).
Department(s): Science, Faculty of > Mathematics and Statistics
Date: 2010
Date Type: Submission
Library of Congress Subject Heading: Gene mapping--Mathematical models; Genetic genealogy--Mathematical models; Trees (Graph theory)

Actions (login required)

View Item View Item


Downloads per month over the past year

View more statistics