Data-driven approaches for risk assessment in the chemical processing industry: leveraging textual and numerical data

Kamil, Mohammad Zaid (2023) Data-driven approaches for risk assessment in the chemical processing industry: leveraging textual and numerical data. Doctoral (PhD) thesis, Memorial University of Newfoundland.

Full text not available from this repository.


Chemical process industries are accident-prone because they handle hazardous materials and involve complex interactions among process operations. Industries, including chemical processing industries, are transitioning to digitalization, which offers higher productivity through better management of process operations. Continuous encouragement to adopt digitalization in process industries while ensuring operational safety has led to new opportunities and challenges. The opportunities stem from the underlying digital changes, which open new avenues for data generation and collection, whereas the challenges lie in translating the data into meaningful information. Two data types play a key role in meeting this challenge. First, structured (numerical) data characterize the behavior of process systems. Second, unstructured data from accident investigation reports are used to learn lessons from past incidents. Conventional risk analysis techniques are incapable of dealing with this evolving challenge. Risk evaluation for process operations during this transition requires advanced technologies. This thesis proposes new approaches for safety 4.0, which is the introduction of industry 4.0 technologies such as artificial intelligence and automation to monitor risk. The approaches integrate artificial intelligence with data-driven models. These advanced techniques address a widely recognized knowledge gap in the literature and serve as an important tool for safety 4.0. The thesis develops approaches to gain insights from operational (contemporary) and textual (historical) data. First, a framework is developed to introduce a learning-based likelihood model. Structured data are used to model the topology of the Bayesian network (BN) and to learn its parameters. Learning from data distinguishes the model and allows changes in operational data to be reflected in the model output.
A novel methodology is introduced to utilize field data on microbiologically influenced corrosion (MIC) in the likelihood model. Second, unstructured data in textual form are transformed into an objective risk assessment by employing natural language processing (NLP). A novel methodology is developed to gain insights from corrosion investigation reports for assessing the risk of MIC in pipelines. The methodology attempts to give a new dimension to risk assessment by developing a cause-effect scenario from the textual data. A named entity recognition (NER) model is trained to gain insights; the findings are then transformed into a risk estimation BN model and evaluated using a risk matrix. Third, unstructured data are used to develop a generalized causation model. A systematic approach comprising NER, interpretive structural modeling (ISM), and BN is proposed to gain insights from unstructured data. The output is a generalized causation model for oil and refining accidents that lead to fire and explosion. A hierarchical BN model is developed for fire and explosion from the CSB database to identify commonalities among different incidents. Finally, this thesis looks into the integration of structured and unstructured data. A methodology for integrating both data types is proposed to provide a comprehensive picture, since insights from multiple sources are key for robust risk analysis. The proposed methodology gains insights from unstructured data using a co-occurrence network. These insights are integrated with contemporary data, and the dependence of each factor is established using ISM. The resulting digraph from the ISM is mapped into a generalized hybrid BN model. Industrial and simulated datasets are used to test and verify the effectiveness of the developed model in predicting adverse events. This thesis develops important tools for enhanced data-driven prediction of adverse events.
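
As a rough illustration of the ISM step mentioned in the abstract, the sketch below computes a reachability matrix from direct influences among factors and partitions the factors into hierarchy levels, which is the kind of digraph that could then be mapped to a BN. It is a minimal textbook-style sketch, not the thesis's implementation: the factor names, the influence matrix, and the function names `reachability` and `ism_levels` are all hypothetical and do not come from the thesis.

```python
from typing import List

def reachability(adj: List[List[int]]) -> List[List[int]]:
    """Transitive closure of the direct-influence matrix (each factor reaches itself)."""
    n = len(adj)
    reach = [[1 if i == j or adj[i][j] else 0 for j in range(n)] for i in range(n)]
    for k in range(n):              # Warshall's algorithm
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach

def ism_levels(reach: List[List[int]], names: List[str]) -> List[List[str]]:
    """Level partitioning: a factor is placed at the current level when its
    reachability set is contained in its antecedent set."""
    remaining = set(range(len(names)))
    levels = []
    while remaining:
        level = []
        for i in remaining:
            reach_set = {j for j in remaining if reach[i][j]}
            antecedent_set = {j for j in remaining if reach[j][i]}
            if reach_set <= antecedent_set:
                level.append(i)
        levels.append(sorted(names[i] for i in level))
        remaining -= set(level)
    return levels

# Hypothetical factors and direct influences (illustrative assumptions only):
# corrosion -> leak, leak -> fire, ignition -> fire
names = ["corrosion", "leak", "ignition", "fire"]
adj = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
levels = ism_levels(reachability(adj), names)
print(levels)  # top level (outcome) first, root causes last
```

In this toy case the outcome sits at the top level and root causes at the bottom. In a BN built from such a hierarchy, edges would run from lower-level factors toward higher-level ones.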

Item Type: Thesis (Doctoral (PhD))
Item ID: 16277
Additional Information: Includes bibliographical references -- Restricted until July 18, 2024
Keywords: data-driven, automation learning, safety 4.0, natural language processing
Department(s): Engineering and Applied Science, Faculty of
Date: October 2023
Date Type: Submission
Digital Object Identifier (DOI):
Library of Congress Subject Heading: Natural language processing (Computer science); Chemical engineering--Safety measures; Chemical engineering--Risk management; Chemical engineering--Automation; Chemical engineering--Risk assessment; Chemical engineering--Accidents; Data mining; Engineering economy; Industrial safety; Industry 4.0
