Proposing the Use of Hazard Analysis for Machine Learning Data Sets


  • H. Glenn Carter, U.S. Army DEVCOM AvMC
  • Alexander Chan, U.S. Army DEVCOM AvMC
  • Christopher Vinegar, U.S. Army DEVCOM AvMC
  • Jason Rupert, Modern Technology Solutions, Inc



Keywords: machine learning, data assurance, data governance


The importance of data for artificial intelligence is not in dispute. The behavior of a data-driven machine learning model is determined by its data set, or as the old adage states: “garbage in, garbage out (GIGO).” The machine learning community is still debating which techniques are necessary and sufficient to assess the adequacy of data sets, but it agrees that some techniques are necessary. Most of the techniques under consideration focus on the volume of attributes: observed attribute counts are compared against anticipated counts, without considering the safety concerns associated with those attributes. This paper explores those techniques for identifying instances of too little data and incorrect attributes. They are important; however, for safety-critical applications, the assurance analyst also needs to understand the safety impact of specific attributes being absent from the machine learning data sets. To provide that information, this paper proposes a new technique the authors call data hazard analysis, an approach to qualitatively analyzing the training data set in order to reduce the risk associated with GIGO.
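The distinction drawn above, between checking attribute counts and asking what a missing attribute means for safety, can be illustrated with a minimal sketch. This is not the authors' data hazard analysis. The attribute names, the anticipated counts, the severity labels, and the `attribute_gaps` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical severity labels, ordered from most to least severe.
SEVERITY_ORDER = ["catastrophic", "hazardous", "major", "minor"]


def attribute_gaps(samples, expected):
    """Return attributes whose observed count is below the anticipated count.

    samples:  iterable of sets of attribute labels, one set per data sample.
    expected: dict mapping attribute -> (anticipated minimum count,
              assumed severity if the attribute is under-represented).
    """
    counts = Counter(attr for sample in samples for attr in sample)
    gaps = []
    for attr, (minimum, severity) in expected.items():
        observed = counts.get(attr, 0)
        if observed < minimum:
            gaps.append((attr, observed, minimum, severity))
    # Most severe gaps first; within a severity, largest shortfall first.
    gaps.sort(key=lambda g: (SEVERITY_ORDER.index(g[3]), g[1] - g[2]))
    return gaps


# Toy data set: each sample is tagged with the attributes it contains.
samples = [{"day", "clear"}, {"day", "rain"}, {"night", "clear"}]

# Assumed expectations; a real analysis would derive these from the
# system's operating conditions and its safety assessment.
expected = {
    "day": (2, "minor"),
    "night": (2, "hazardous"),
    "fog": (1, "catastrophic"),
    "rain": (1, "major"),
}

gaps = attribute_gaps(samples, expected)
for attr, observed, minimum, severity in gaps:
    print(f"{attr}: {observed}/{minimum} samples, severity if missing: {severity}")
```

A count-only check would report "fog" and "night" as equally short of their targets. Attaching an assumed severity to each attribute lets the analyst rank the gaps by safety impact instead of by volume alone.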

Author Biographies

H. Glenn Carter, U.S. Army DEVCOM AvMC

Mr. Carter is an Engineering Supervisor in the Army’s DEVCOM AvMC Airworthiness organization. His background includes a decade of Army research into modular system approaches, software strategies, and model-based engineering and tools. He also pursues innovative airworthiness approaches. For four years his efforts have included airworthiness for systems that use machine learning in the form of trained algorithms, where the traditional software qualification process is insufficient to establish safety of flight.

Alexander Chan, U.S. Army DEVCOM AvMC

Mr. Chan is currently a Computer Software Engineer in the Army’s DEVCOM AvMC Airworthiness organization. His background includes quality assurance. For the past two years he has focused on developing criteria for the acceptable qualification of AI/ML in flight-critical applications.

Christopher Vinegar, U.S. Army DEVCOM AvMC

Mr. Vinegar is currently a Computer Software Engineer in the Army’s DEVCOM AvMC Airworthiness organization. His background includes 20+ years of software process, design, and development in the automotive industry. He has worked for the government for the past two years and runs the AvMC Systems Readiness Directorate AI/ML working group, which focuses on the qualification of AI/ML for use in flight-critical applications.

Jason Rupert, Modern Technology Solutions, Inc

Between 2011 and 2021 Mr. Rupert provided contractor software airworthiness functional support to the Army's DEVCOM AvMC Airworthiness organization. Two years ago Mr. Rupert began his work on AI/ML airworthiness certification/qualification, specifically assessing the possibility of certifying/qualifying AI/ML for use on manned and unmanned flight safety critical applications. In that role he has collaborated with colleagues from all branches of the US Military and various communities of practice, e.g., SAE G-34 and Safety.







How to Cite

Carter, H., Chan, A., Vinegar, C., & Rupert, J. (2023). Proposing the Use of Hazard Analysis for Machine Learning Data Sets. Journal of System Safety, 58(2), 30–39.