Zhou, Jiayi (2015) A robust algorithm to compare chemical structures. Masters thesis, Memorial University of Newfoundland.
- Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
The recognition of chemical similarities between molecules plays an important role in chemical science, especially in chemistry, biology and pharmaceuticals. Traditional methods of structure recognition are time-consuming, usually involving a lot of experimentation and computational effort. The goal of this research is to create an algorithm that compares two chemical structures automatically from basic information. The algorithm requires only the atoms' Cartesian coordinates, atom types and connectivity information as input. It uses a novel method to pair the atoms in the two structures such that the best superimposition is achieved. A similarity score is computed based on this best superimposition. The algorithm can also be used to search a large set of molecules for a structure similar to a query molecule. An application is developed to display the two structures being compared and provide a 3D image of their best superimposition based on the auto-pairing of the atoms. Run-time analysis reveals that the traditional time complexity does not describe the run-time of the algorithm well. Linear regression indicates that the run-time is strongly influenced by the number of triplets (each consisting of 3 atoms joined by 2 bonds) matched between the two structures. Testing of the algorithm on an in-house dataset of 737 structures as well as a larger NCI-sourced database demonstrates its utility.
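To make the notion of a triplet concrete, the following is a minimal sketch, not the thesis's implementation, of how triplets (3 atoms joined by 2 bonds) could be enumerated from connectivity information alone. The function name `enumerate_triplets`, the integer atom indices and the two example bond lists are assumptions introduced here for illustration.

```python
from itertools import combinations


def enumerate_triplets(bonds):
    """Return every (end, center, end) triplet: 3 atoms joined by 2 bonds.

    `bonds` is an iterable of (atom_index, atom_index) pairs. This is an
    illustrative input format, not the one used in the thesis.
    """
    # Build an undirected adjacency map from the bond list.
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Each unordered pair of neighbors of a center atom forms one triplet.
    triplets = []
    for center in sorted(neighbors):
        for left, right in combinations(sorted(neighbors[center]), 2):
            triplets.append((left, center, right))
    return triplets


# Hypothetical heavy-atom skeletons, given as index pairs.
ethanol_bonds = [(0, 1), (1, 2)]            # C0-C1-O2 chain
isobutane_bonds = [(0, 1), (0, 2), (0, 3)]  # central carbon 0 with 3 neighbors

print(enumerate_triplets(ethanol_bonds))
print(len(enumerate_triplets(isobutane_bonds)))
```

An atom with d bonded neighbors is the center of d(d-1)/2 triplets, so the triplet count grows with branching as well as with the number of atoms, which may help explain why atom count alone is a poor predictor of run-time.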
|Item Type:|Thesis (Masters)|
|Additional Information:|Includes bibliographical references (pages 96-99).|
|Keywords:|chemical similarity, superimposition, 3D|
|Department(s):|Science, Faculty of > Computer Science|
|Library of Congress Subject Heading:|Molecular structure--Mathematical models; Structural bioinformatics; Three-dimensional modeling; Computer algorithms|