SRE Ottawa hosted seminar #334 on 29 September 2021 and had the pleasure of welcoming Chris Hobbs, Software Developer, Certified Systems/BlackBerry QNX. Chris is a previous winner of the Colin Chabot Memorial Award, and at this webinar he presented ideas on Handling Intellectual Debt in Reliable Systems.
Intellectual debt is a term for the lack of understanding of the principles behind a process or an event; that is, we know something works but we don't know why it works. One introductory example Chris cited during the webinar is the drug Aspirin. It was discovered in 1897, but an explanation of how it works was not given until 1995, so an intellectual debt was carried through all the intervening years. This debt, like debt in any form, has consequences. Without fully comprehending the mechanism behind a function, we either accept the risk of using it as it is or hold back its further enhancement. Both outcomes are undesirable.
In the context of Reliability Engineering, intellectual debt can arise in the following instances: a product fails and the root cause is unknown, a complex system stops working after a software update, or a predictive algorithm correctly predicts an equipment failure without explaining the influencing factors. While reliability tools such as RCA and FRACAS can help with the first two cases, the issue becomes acute with machine learning applications. Though the algorithms used to improve Product Reliability and Asset Management are not as sophisticated as those used in general AI systems, the decisions driven by their outputs should be assessed appropriately to keep intellectual debt from accruing. The correlation models fitted to the data should be audited for their predictions, and engineers should be able to explain them using theory. The more critical the effect of decisions based on these models, the more rigorous the audit should be. Models that compute values from their interaction with other models should have control limits that trigger examination. But can all processes be explained by a foundational theory?
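As a rough illustration of the control-limit idea mentioned above, the following minimal sketch flags model outputs for examination when they drift outside limits derived from a baseline period. It is not taken from the presentation: the function names, the residual values and the choice of mean plus or minus three standard deviations are all assumptions made for the example.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, upper) limits as mean +/- k sample standard deviations.

    The multiplier k=3.0 is an assumed, conventional choice, not a
    recommendation from the seminar.
    """
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - k * spread, centre + k * spread


def flag_for_review(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical residuals (predicted minus observed) from a validation period.
baseline_residuals = [0.2, -0.1, 0.0, 0.3, -0.2, 0.1, -0.3, 0.0]
limits = control_limits(baseline_residuals)

# Hypothetical residuals seen once the model is in use.
new_residuals = [0.1, -0.2, 1.5, 0.0]
flagged = flag_for_review(new_residuals, limits)
print(flagged)  # indices an engineer would be asked to examine
```

A flagged index does not explain anything by itself; it only marks the point at which an engineer should step in and look for the underlying cause.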
Learn more about how to handle intellectual debt in reliable systems from Chris Hobbs' in-depth presentation. A recorded video is available to SRE Ottawa members. Check out the membership page for details.