CareFree - AI-based System Health Assessment Scaled for Industrial Use
Integrating human knowledge and AI techniques for better diagnosis and effective predictive maintenance of high-tech systems.
Dutch industry is at the forefront of innovation in high-tech systems, such as those for the professional printing industry, where ever-higher throughput and product quality are offered at ever-lower cost. To meet their clients’ demands, the companies that build these systems need to guarantee carefree, high-performance workflows with no unexpected downtime and with short maintenance and repair times. With the CareFree project, TNO Embedded Systems Innovation (ESI) and the industrial partner Canon Production Printing (CPP, formerly Océ) want to demonstrate the potential of AI to do just that. The overall goal of the project is to build AI that is fueled with engineering knowledge and data insights, providing novel and more efficient tools to manage the maintenance of professional printers.
A challenge for this goal is that the diagnosis and prognosis of high-tech systems (like professional printers) are very complex. Together, diagnosis and prognosis are meant to ensure the lasting performance of the overall system, and not merely to prevent the breakdown of its parts. We face control loops that interweave on various timescales to optimize system-level behavior. As these loops respond to differences in tasks and circumstances, but also to increasing wear and tear, they realize adaptive behavior that can mask indicators of failures or future breakdowns.
Given this complexity, we at TNO ESI concluded that we require Hybrid AI to accomplish our goal. Specifically, we require an AI system that can perform probabilistic and causal reasoning. To this end, we decided to implement Bayesian networks. The networks are generated in a multi-step process that compiles them from network fragments. These fragments represent diagnostic models for components and their relations, which we extract from system specifications and other sources, such as FMECAs (Failure Mode, Effects and Criticality Analyses). While some of the probabilistic parameters of the Bayesian networks are bound to functions, others, like failure rates, are learned from data.
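To make the idea of compiling a network from fragments concrete, the following is a minimal sketch in plain Python, not the project's actual tooling. All names (make_component, make_symptom, compile_network, the heater/fan example and the counts) are hypothetical. It assumes binary nodes, a noisy-OR function as an example of a parameter "bound to a function", and a Laplace-smoothed frequency as an example of a failure rate "learned from data". Inference is done by brute-force enumeration, which only works for very small networks.

```python
from itertools import product

def estimate_failure_rate(failures, observations):
    """Assumed learning rule: Laplace-smoothed relative frequency."""
    return (failures + 1) / (observations + 2)

def make_component(name, failure_rate):
    """Fragment for a root-cause node: no parents, prior = failure rate."""
    return {name: ((), lambda: failure_rate)}

def make_symptom(name, causes, leak=0.01, strength=0.9):
    """Fragment for a symptom node whose CPT is a noisy-OR function."""
    def cpt(*cause_failed):
        p_absent = 1.0 - leak
        for failed in cause_failed:
            if failed:
                p_absent *= 1.0 - strength
        return 1.0 - p_absent  # P(symptom present | cause states)
    return {name: (tuple(causes), cpt)}

def compile_network(*fragments):
    """Merge fragments into one network: node -> (parents, cpt function)."""
    net = {}
    for fragment in fragments:
        net.update(fragment)
    return net

def joint(net, assignment):
    """Probability of one complete True/False assignment of all nodes."""
    p = 1.0
    for node, (parents, cpt) in net.items():
        p_true = cpt(*(assignment[parent] for parent in parents))
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def posterior(net, query, evidence):
    """P(query = True | evidence) by enumerating all assignments."""
    nodes = list(net)
    numerator = denominator = 0.0
    for values in product([False, True], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        p = joint(net, assignment)
        denominator += p
        if assignment[query]:
            numerator += p
    return numerator / denominator

# Hypothetical example: two components and one shared symptom.
net = compile_network(
    make_component("heater", estimate_failure_rate(3, 198)),
    make_component("fan", estimate_failure_rate(9, 198)),
    make_symptom("temp_error", causes=["heater", "fan"]),
)
print(posterior(net, "heater", {"temp_error": True}))
```

Observing the symptom raises the probability of each possible cause above its prior, which is the basic diagnostic step. A real system would use a dedicated inference engine rather than enumeration.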
With this hybrid approach we were able to reach valid diagnoses for several test scenarios. This confirmed our choice and motivated us to carry on with our research, which currently focuses on a variety of topics. Firstly, we are looking for a means to model control loops, as a Bayesian network is a directed acyclic graph and therefore cannot contain loops. Secondly, we want to be able to diagnose operational degradation, rather than only full system failures. Thirdly, with an eye on interventions, we are investigating how information gathered from a series of “active tests” can effectively be combined (since such an active test changes the system). Fourthly, we are interested in test strategies: which tests should be performed to give maximally discriminative diagnostic evidence at the lowest cost? Finally, we want to perform data-driven optimization, where the models are customized on the basis of printer usage.
Within this project, several important achievements were made. Firstly, we have demonstrated the feasibility of AI-based diagnosis within the industrial application domain, deploying prototype Bayesian networks for the diagnosis of modules of a Canon printer. We also developed a technology that automates the creation of the Bayesian network's diagnostic reasoning engine from system-level descriptions. This technology demonstrated that a Bayesian network is a valid system-level reasoner and that the approach scales towards the diagnosis of large industrial systems.
Rather than a static diagnosis, we have defined an initial process for an iterative diagnosis in which evidence is added incrementally, resulting in convergence on a limited set of plausible root causes. At each step, an entropy-based calculation determines the best subsequent test, in the sense that it provides the most discriminative information for diagnosis at the lowest possible cost.
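An entropy-based choice of the next test can be sketched as follows. This is an illustration under stated assumptions, not the project's actual calculation: it assumes a single root cause, tests with a binary fail/pass outcome, and a score defined as expected entropy reduction divided by test cost. The cause names, test names, likelihoods and costs are invented for the example.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a distribution over root causes."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def outcome_posterior(prior, p_fail, failed):
    """Return (P(outcome), posterior over causes) for one test outcome.

    p_fail[c] is the assumed probability that the test fails when
    cause c is the true root cause.
    """
    weights = {
        c: prior[c] * (p_fail[c] if failed else 1.0 - p_fail[c])
        for c in prior
    }
    total = sum(weights.values())
    if total == 0.0:  # outcome cannot occur; contributes nothing
        return 0.0, dict(prior)
    return total, {c: w / total for c, w in weights.items()}

def expected_gain(prior, p_fail):
    """Expected entropy reduction from performing the test."""
    expected_posterior_entropy = 0.0
    for failed in (True, False):
        p_outcome, post = outcome_posterior(prior, p_fail, failed)
        expected_posterior_entropy += p_outcome * entropy(post)
    return entropy(prior) - expected_posterior_entropy

def best_test(prior, tests):
    """Pick the test with the highest expected gain per unit cost."""
    return max(
        tests,
        key=lambda name: expected_gain(prior, tests[name][0]) / tests[name][1],
    )

# Hypothetical current belief over root causes and candidate tests.
prior = {"heater": 0.5, "fan": 0.3, "sensor": 0.2}
tests = {
    # name: (P(test fails | cause), cost)
    "measure_heater": ({"heater": 0.95, "fan": 0.05, "sensor": 0.05}, 2.0),
    "visual_check": ({"heater": 0.5, "fan": 0.5, "sensor": 0.5}, 1.0),
}
print(best_test(prior, tests))
```

In an iterative diagnosis, the posterior after the observed outcome becomes the prior for the next round, and the loop stops once the entropy falls below a chosen threshold or no remaining test is worth its cost.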
To deal with the challenge of timescales, we have developed an initial approach to model performance aspects and to integrate them in the Bayesian network. This entails behavior over time and typically models the effects of wear and tear in a machine. Furthermore, we have defined an approach based on a capability or functional hierarchy, specifically to model (control) loops, and demonstrated its correctness by comparing it with a time expansion and subsequent absorption approach on the Bayesian networks. The latter is by definition correct but impractical for engineering purposes.
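As a rough illustration of what modeling behavior over time involves, the sketch below unrolls a single wear state over discrete time steps and updates the belief about it from noisy observations (standard forward filtering). It does not reproduce the hierarchy-based approach or the absorption step described above. The states, transition probabilities and sensor model are assumptions made up for this example.

```python
def predict(belief, transition):
    """Advance the belief by one time step through the wear model."""
    return {
        nxt: sum(belief[cur] * transition[cur][nxt] for cur in belief)
        for nxt in belief
    }

def correct(belief, likelihood):
    """Condition the belief on one observation via Bayes' rule."""
    weights = {s: belief[s] * likelihood[s] for s in belief}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def filter_wear(prior, transition, sensor_model, observations):
    """Belief over the wear state after a sequence of observations."""
    belief = dict(prior)
    for obs in observations:
        belief = predict(belief, transition)
        belief = correct(belief, sensor_model[obs])
    return belief

# Hypothetical wear model: parts only get worse over time.
TRANSITION = {
    "good":   {"good": 0.95, "worn": 0.05, "failed": 0.0},
    "worn":   {"good": 0.0,  "worn": 0.90, "failed": 0.10},
    "failed": {"good": 0.0,  "worn": 0.0,  "failed": 1.0},
}
# Hypothetical sensor model: P(observation | wear state).
SENSOR = {
    "ok":       {"good": 0.9, "worn": 0.5, "failed": 0.05},
    "degraded": {"good": 0.1, "worn": 0.5, "failed": 0.95},
}
PRIOR = {"good": 1.0, "worn": 0.0, "failed": 0.0}

belief = filter_wear(PRIOR, TRANSITION, SENSOR,
                     ["ok", "ok", "degraded", "degraded"])
print(belief)
```

Each time step adds a copy of the state to the unrolled model, which is why full time expansion grows quickly and becomes hard to handle in engineering practice.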
Finally, we have defined modeling templates for specific situations in the domain, for example to model the effects of a short-to-ground: contrary to normal failure effects, which only operate downstream, a short-to-ground also has an effect upstream. Bringing everything together, we built a demonstrator to show the feasibility and working principles of the proposed Bayesian iterative diagnosis of industrial systems.
- L. Barbini and M. Borth, “Probabilistic Health and Mission Readiness Assessment at System-Level”, PHM_CONF, vol. 11, no. 1, Sep. 2019.
- Jos Hegge, Sr Project Manager, TNO, e-mail: email@example.com