Data-driven diagnosis for cyber-physical systems
Publication date
2025-09-30
Document type
Dissertation
Author
Advisor
Referee
Granting institution
Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg
Exam date
2025-09-24
Organisational unit
Publisher
Universitätsbibliothek der HSU/UniBw H
Part of the university bibliography
✅
File(s)
Language
English
Keyword
Diagnosis
Anomaly detection
AI
MLOps
Cyber-physical systems
Modular neural networks
Abstract
Rapid recovery from failures in safety-critical systems requires accurate and timely diagnosis, a task made increasingly challenging by the growing complexity of modern cyber-physical systems. These systems generate large amounts of data describing operational metrics, sensor readings, and performance indicators across various subsystems. The complexity is further driven by multiple discrete system modes, intricate interactions between subsystems, and external influencing factors. The system that motivated this work is the Environmental Control and Life Support System of the International Space Station's Columbus module. Given the volume of data, in this case thousands of signals, even differentiating nominal from abnormal system states becomes a non-trivial task. Identifying the root causes of abnormal behavior is more complex still and often requires system models whose creation demands substantial domain expertise.
This thesis presents a novel approach to these fault diagnosis challenges that minimizes the need for extensive prior knowledge. The research focuses on developing a method that detects anomalies at the subsystem level, which is an aggregated level between individual sensors and the overall system, and leverages the detected anomalies for system diagnosis.
The main contributions of this thesis are as follows: (i) A novel neural network architecture specifically designed for detecting anomalies in subsystems of cyber-physical systems. (ii) A new graph-based diagnostic algorithm that uses basic causal relationships between subsystems and the output of the anomaly detection model to identify root causes of failures. (iii) A methodology that combines the components above into a comprehensive diagnostic framework. (iv) An implementation of these methods using state-of-the-art machine learning operations tools that addresses the challenges typically involved in deploying and maintaining machine learning models in production environments.
The proposed framework was evaluated through a set of experiments using both simulated and real-world datasets. The results provide evidence that the proposed approach can identify subsystem-level anomalies and find the subsystems that caused the system failure.
This thesis makes contributions to the fields of anomaly detection, fault diagnosis, and machine learning operations in the context of cyber-physical systems. It provides a largely data-driven solution for the diagnosis challenge in complex technical systems, which traditionally requires extensive manual and labor-intensive modeling. The findings have implications for various domains including aerospace, manufacturing, and other critical infrastructure systems.
Version
Published version
Access right on openHSU
Open access
