Fault detection and root cause diagnosis using sparse principal component analysis (SPCA)

Rahoma, Abdalhamid Ahmad (2021) Fault detection and root cause diagnosis using sparse principal component analysis (SPCA). Doctoral (PhD) thesis, Memorial University of Newfoundland.

PDF - Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Data-based methods are widely used in the process industries for fault detection and diagnosis (FDD). Among them, multivariate statistical methods such as Principal Component Analysis (PCA), Projection to Latent Structures (PLS), and Independent Component Analysis (ICA) are the most widely used. These methods are generally successful in detecting process faults; however, their diagnosis of the root cause is not always accurate. The primary goal of this thesis is to improve the fault diagnosis ability of PCA-based methods. In PCA, each Principal Component (PC) is a linear combination of all the variables, which makes it difficult to identify the root cause from the violation of a PC. Sparse Principal Component Analysis (SPCA) is a variant of PCA that produces a sparse description of the PCA loading matrix, making it more suitable for fault diagnosis. The present research aims to devise novel strategies for finding a sparse description of the loading matrix that is better aligned with process fault detection and diagnosis. The thesis also looks into improving the fault diagnosis of PCA using clustering methods. The thesis is divided into three major tasks. First, a novel fault detection and diagnosis method is proposed based on the SPCA approach. This approach incorporates a new criterion based on Fault Detection Rates (FDRs) and False Alarm Rates (FARs) into the SPCA algorithms of Zou et al. (2006). The objective is to find the appropriate Number of Non-Zero Loadings (NNZL) for each sparse principal component (SPC), so that the resulting model gives low FARs and high FDRs. A comparison between PCA and four SPCA-based methods for FDD is also carried out, using a continuous stirred tank heater (CSTH) as a benchmark system. The results indicate that the shortcomings of PCA can be overcome using SPCA with the novel NNZL criterion. The FDR-FAR SPCA approach gives the highest FDR for the squared prediction error (SPE) statistic (93.8%).
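To make the SPE statistic and the FDR/FAR figures quoted above concrete, the following is a minimal sketch of ordinary PCA monitoring in Python with NumPy. It is not the thesis code: the function names, the synthetic six-variable data, the bias fault on one variable, and the use of an empirical 99% quantile of the training SPE as the control limit are all assumptions made for illustration.

```python
import numpy as np


def fit_pca_monitor(X_train, n_components, quantile=0.99):
    """Fit a PCA monitoring model on fault-free data (illustrative helper)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    # Rows of Vt are the principal directions of the scaled data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T  # loading matrix, shape (n_vars, n_components)
    model = {"mean": mean, "std": std, "P": P, "limit": None}
    # Assumption: empirical quantile of training SPE as the control limit.
    model["limit"] = np.quantile(spe_statistic(model, X_train), quantile)
    return model


def spe_statistic(model, X):
    """Squared prediction error: squared norm of the PCA residual per sample."""
    Z = (X - model["mean"]) / model["std"]
    residual = Z - Z @ model["P"] @ model["P"].T
    return np.sum(residual ** 2, axis=1)


def detection_rates(model, X_normal, X_faulty):
    """Return (FDR, FAR): alarm fractions on faulty and on normal test data."""
    far = float(np.mean(spe_statistic(model, X_normal) > model["limit"]))
    fdr = float(np.mean(spe_statistic(model, X_faulty) > model["limit"]))
    return fdr, far


# Synthetic stand-in for process data: 2 latent factors driving 6 variables.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.8, 0.6, 0.4, 0.2, 0.9],
                   [0.2, 0.5, 0.9, 1.0, 0.7, 0.3]])


def simulate(n):
    return rng.normal(size=(n, 2)) @ mixing + 0.1 * rng.normal(size=(n, 6))


X_train = simulate(500)
X_normal = simulate(500)
X_faulty = simulate(500)
X_faulty[:, 3] += 2.0  # hypothetical sensor bias fault on variable 3

model = fit_pca_monitor(X_train, n_components=2)
fdr, far = detection_rates(model, X_normal, X_faulty)
print(f"FDR = {fdr:.3f}, FAR = {far:.3f}")
```

Because every column of the loading matrix `P` is dense here, an SPE or PC violation does not by itself point to variable 3; that is the diagnosis gap that sparse loadings are meant to narrow.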
The second task focuses on developing statistical methods to decide which loading elements of SPCA should be non-zero. Rather than using heuristics, the proposed methods use the distribution of the loading elements to decide whether an element should be set to zero. Two SPCA algorithms are proposed to find the NNZL of each PC and the positions of the non-zero loadings. The first algorithm is based on bootstrapping of the data, and the second is based on Iterative Principal Component Analysis (IPCA). The proposed methods are implemented on a CSTH process to compare their performance with PCA- and other SPCA-based methods for fault detection and diagnosis. The results reveal that the proposed approaches have superior performance in fault detection, as well as in diagnosing the root cause of a fault. The Bootstrap-SPCA and Sparse-IPCA methods give the highest FDRs for fault 1 for the SPE statistic (99.3% and 95.76%, respectively).
As the third task, this research combines the k-means clustering algorithm with the PCA algorithm to improve fault detection and diagnosis. PCA has the advantage of detecting faults without the need for data labelling, while clustering is able to separate data from different fault groups into distinct clusters. Combining the two algorithms gives better detection and diagnosis of faults and eliminates the need for data labelling. The performance of the proposed method is demonstrated in simulated and large-scale industrial case studies.
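The bootstrap idea from the second task can be illustrated with a short Python sketch. The abstract does not give the exact decision rule, so the rule used here is an assumption: a loading is set to zero when its bootstrap percentile interval contains zero. The function names, the 95% interval, and the synthetic two-block data are likewise hypothetical.

```python
import numpy as np


def pca_loadings(X, n_components):
    """Loadings of standardized data via SVD, shape (n_vars, n_components)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Vt[:n_components].T


def bootstrap_sparse_loadings(X, n_components, n_boot=200, alpha=0.05, seed=0):
    """Zero out loadings whose bootstrap interval contains zero (assumed rule)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reference = pca_loadings(X, n_components)
    samples = np.empty((n_boot,) + reference.shape)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        P = pca_loadings(X[idx], n_components)
        # PCA signs are arbitrary: flip each column to match the reference.
        signs = np.sign(np.sum(P * reference, axis=0))
        signs[signs == 0] = 1.0
        samples[b] = P * signs
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1 - alpha / 2, axis=0)
    keep = (lower > 0) | (upper < 0)  # interval excludes zero
    sparse = np.where(keep, reference, 0.0)
    norms = np.linalg.norm(sparse, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for an empty column
    return sparse / norms, keep


# Synthetic data: variables 0-3 follow one factor, variables 4-5 another.
rng = np.random.default_rng(1)
f1 = rng.normal(size=(500, 1))
f2 = rng.normal(size=(500, 1))
X = np.hstack([np.repeat(f1, 4, axis=1), np.repeat(f2, 2, axis=1)])
X = X + 0.3 * rng.normal(size=X.shape)

sparse_P, keep = bootstrap_sparse_loadings(X, n_components=2)
print(keep.astype(int))  # 1 marks a loading retained as non-zero
```

The sketch aligns only the signs of the bootstrap loadings; it does not handle components swapping order between resamples, which matters when eigenvalues are close. The retained positions in each column indicate which variables a violated component points to.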

Item Type: Thesis (Doctoral (PhD))
URI: http://research.library.mun.ca/id/eprint/15080
Item ID: 15080
Additional Information: Includes bibliographical references.
Department(s): Engineering and Applied Science, Faculty of
Date: June 2021
Date Type: Submission
Digital Object Identifier (DOI): https://doi.org/10.48336/42ah-0r31
Library of Congress Subject Heading: Fault location (Engineering); Principal components analysis.