  • Publication
    Open Access
    Data-driven diagnosis for cyber-physical systems
    (Universitätsbibliothek der HSU/UniBw H, 2025-09-30)
    Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg
    Rapid recovery from failures in safety-critical systems requires accurate and timely diagnosis, a task that is increasingly challenging due to the growing complexity of modern cyber-physical systems. These systems generate large amounts of data describing operational metrics, sensor readings, and performance indicators across various subsystems. The complexity is also driven by various discrete system modes, complex interactions between subsystems, and external influence factors. The example of such a complex system that motivated this work is the Environmental Control and Life Support System of the International Space Station's Columbus module. Given the volume of data, in this particular case thousands of signals, merely differentiating nominal from abnormal system states becomes a non-trivial task. Identifying the root causes of abnormal behavior is even more complex and often requires system models whose creation demands substantial domain expertise. This thesis presents a novel approach to these fault diagnosis challenges that minimizes the need for extensive prior knowledge. The research focuses on developing a method that detects anomalies at the subsystem level, which is an aggregated level between individual sensors and the overall system, and leverages the detected anomalies for system diagnosis. The main contributions of this thesis are as follows: (i) A novel neural network architecture specifically designed for detecting anomalies in subsystems of cyber-physical systems. (ii) A new graph-based diagnostic algorithm that uses basic causal relationships between subsystems and the output of the anomaly detection model to identify root causes of failures. (iii) A methodology that combines the components above into a comprehensive diagnostic framework. (iv) An implementation of these methods using state-of-the-art machine learning operations tools that addresses the challenges typically involved in deploying and maintaining machine learning models in production environments. The proposed framework was evaluated through a set of experiments using both simulated and real-world datasets. The results provide evidence that the proposed approach can identify subsystem-level anomalies and find the subsystems that caused the system failure. This thesis makes contributions to the fields of anomaly detection, fault diagnosis, and machine learning operations in the context of cyber-physical systems. It provides a largely data-driven solution for the diagnosis challenge in complex technical systems, which traditionally requires extensive, labor-intensive manual modeling. The findings have implications for various domains including aerospace, manufacturing, and other critical infrastructure systems.
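The idea in contribution (ii), combining causal relationships between subsystems with anomaly flags, can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not spell out. It assumes a simple heuristic: an anomalous subsystem is a root-cause candidate if none of its direct causes is also anomalous. The function name and the subsystem names are hypothetical.

```python
def root_cause_candidates(causal_edges, anomalous):
    """Return anomalous subsystems that have no anomalous direct cause.

    causal_edges: iterable of (cause, effect) pairs between subsystems.
    anomalous: collection of subsystems flagged by an anomaly detector.
    Assumed heuristic for illustration only, not the thesis's method.
    """
    # Build a map from each subsystem to the set of its direct causes.
    parents = {}
    for cause, effect in causal_edges:
        parents.setdefault(effect, set()).add(cause)

    flagged = set(anomalous)
    # Keep a flagged subsystem only if none of its direct causes is flagged.
    return sorted(s for s in flagged if not (parents.get(s, set()) & flagged))


# Hypothetical subsystem graph: pump -> cooling_loop -> cabin_air <- fan
edges = [
    ("pump", "cooling_loop"),
    ("cooling_loop", "cabin_air"),
    ("fan", "cabin_air"),
]
print(root_cause_candidates(edges, {"pump", "cooling_loop", "cabin_air"}))
```

In this example the anomaly propagates downstream from `pump`, so only `pump` is reported. The heuristic returns nothing when all flagged subsystems lie on a causal cycle, so a practical method would need additional handling for feedback loops.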
  • Publication
    Metadata only
    A supervised AI-based toolchain for anomaly detection, diagnosis, and reconfiguration for the life-support system of the COLUMBUS module of the ISS
    (Springer Nature, 2025-08-19)
    Myschik, Stephan; Geier, Christian; Creutzenberg, Martin; Grashorn, Philipp; Hoppe, Tobias; Ernst, Hauke
    This paper focuses on the development and implementation of a diagnosis toolchain to identify faults and recommend actions for the system operators of the environmental control and life support system of the COLUMBUS module on the International Space Station. We present a comprehensive framework that uses different aspects of artificial intelligence to efficiently identify the interventions the system operator needs to stabilize the system in case of emergencies and defects. Time-series methods such as machine learning and statistical analysis are used for anomaly detection to identify potentially critical situations early and issue the corresponding warnings. Diagnostic functionality identifies the causes of anomalies, integrating expert knowledge and pattern recognition algorithms to achieve accurate diagnostic results. Localizing the affected system parts is crucial because fault propagation can obscure the origin of anomalies. A vital aspect of the AI system is determining possible reconfiguration measures according to the behavior of the system, offering operators several options for continued operation in the event of damage. Based on the diagnostic results, the system identifies suitable reconfiguration measures to restore normal operation or minimize potential damage. An additional supervision module based on qualitative system models is then used to monitor and assess the effects of these interventions. An MLOps platform facilitates the integration of the framework into existing processes, providing an agile solution for fast and reliable development, scaling, and standardized integration interfaces. The successful integration of the AI toolchain at Airbus Defence and Space exemplifies this implementation’s effectiveness, significantly reducing development times and enhancing the process’s reliability and efficiency.
  • Publication
    Open Access
    A model learning perspective on the complexity of cyber-physical systems
    (Universitätsbibliothek der HSU/UniBw H, 2025-05-27)
    Plambeck, Swantje; Benndorf, Gesa
    A large palette of models and their corresponding learning algorithms has been applied to time series observed from cyber-physical systems (CPSs). For some use cases, simple linear methods are sufficient, while for others, even sophisticated machine learning approaches fail to extract subtle patterns in system behavior. To date, the literature has not examined this phenomenon adequately and lacks a comprehensive analysis linking the characteristics of CPSs with the suitability of different models and learning algorithms. In this work, after examining the complexity of multiple real-world and artificial CPS use cases, we identify several key aspects that distinguish them: 1) the number of system variables, 2) the degree of interdependence between the discrete-event part and the continuous part of the system, and 3) the number of unobserved system inputs. By analyzing the approaches successfully applied in the respective use cases, we distill preferred techniques for addressing systems of different complexity levels.
  • Publication
    Metadata only
    Using modular neural networks for anomaly detection in cyber-physical systems
    Autonomously detecting anomalous behavior based on system observations is a fundamental task for Cyber-Physical Systems (CPS). Due to the high system complexity and large number of subsystems in modern CPS, rule- or knowledge-based approaches for anomaly detection are increasingly being replaced by Machine Learning (ML) approaches that leverage historical CPS data. Typically, ML approaches learn a system model from the CPS data and identify anomalous behavior based on the distance between the real CPS behavior and the behavior predicted by the model. However, most classical ML approaches for anomaly detection are monolithic, meaning a single ML model is fitted on a global CPS observation, which makes them vulnerable to spurious correlations and confounders that originate at the CPS subsystem level. We hence propose a modular approach toward anomaly detection in CPS, specifically a novel Modular Neural Network (MNN) architecture. Our architecture not only models the behavior of individual CPS subsystems in individual MNN modules, but also builds the dependencies between the CPS subsystems into the MNN architecture. We thereby avoid confounding effects and spurious correlations, enabling us to identify and localize anomalies within the CPS at the subsystem level. We benchmark our MNN architecture against monolithic Neural Networks and MNN architectures that do not explicitly model CPS subsystem dependencies, using a real-world dataset of an industrial robot with different anomalies. We show that by modeling real-world dependencies into an MNN architecture, we can improve the performance of autonomous anomaly detection in CPS.
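The general principle described above, flagging behavior whose distance from the model prediction is too large, can be sketched as follows. This is a minimal illustration and not the paper's MNN architecture: the predictions are assumed to come from some already-fitted model, and the threshold rule (mean plus `k` standard deviations of the residuals on nominal data) and all function names are assumptions made for this example.

```python
import statistics


def fit_threshold(nominal_observed, nominal_predicted, k=3.0):
    """Derive a residual threshold from nominal (anomaly-free) data.

    Assumed rule for illustration: mean + k * population standard
    deviation of the absolute residuals.
    """
    residuals = [abs(o - p) for o, p in zip(nominal_observed, nominal_predicted)]
    return statistics.mean(residuals) + k * statistics.pstdev(residuals)


def flag_anomalies(observed, predicted, threshold):
    """Mark each sample whose absolute residual exceeds the threshold."""
    return [abs(o - p) > threshold for o, p in zip(observed, predicted)]


# Nominal data: observations stay close to the model's prediction of 1.0.
threshold = fit_threshold([1.0, 1.1, 0.9, 1.0], [1.0, 1.0, 1.0, 1.0])

# New data: the second sample deviates strongly from the prediction.
print(flag_anomalies([1.05, 2.0], [1.0, 1.0], threshold))
```

A monolithic detector would apply this check to one global model of the whole system, whereas the modular approach in the abstract applies it per subsystem.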
  • Publication
    Open Access
    End-to-end MLOps integration: a case study with ISS telemetry data
    (UB HSU, 2024-03)
    Geier, Christian; Creutzenberg, Martin; Pfeifer, Jann; Turk, Samo
    Kubeflow integrates a suite of powerful tools for Machine Learning (ML) software development and deployment, typically showcased independently. In this study, we integrate these tools within an end-to-end workflow, a perspective not extensively explored previously. Our case study on anomaly detection using telemetry data from the International Space Station (ISS) investigates the integration of various tools (Dask, Katib, PyTorch Operator, and KServe) into a single Kubeflow Pipelines (KFP) workflow. This investigation reveals both the strengths and limitations of such integration in a real-world context. The insights gained from our study provide a comprehensive blueprint for practitioners and contribute valuable feedback for the open source community developing Kubeflow.