Most complex aerospace systems have many text reports on safety, maintenance, and associated issues. The Aviation Safety Reporting System (ASRS) spans several decades and contains over 700 000 reports. The Aviation Safety Action Plan (ASAP) contains over 12 000 reports from various airlines. Problem categorizations have been developed for both ASRS and ASAP to enable identification of system problems. However, repository volume and complexity make human analysis difficult. Multiple experts are needed, and they often disagree on classifications. Even the same person has classified the same document differently at different times due to evolving experiences. Consistent classification is necessary to support tracking trends in problem categories over time. A decision support system that performs consistent document classification quickly and over large repositories would be useful. We discuss the results of two algorithms we have developed to classify ASRS and ASAP documents. The first is Mariana--a support vector machine (SVM) with simulated annealing, which is used to optimize hyperparameters for the model. The second method is classification built on top of nonnegative matrix factorization (NMF), which attempts to find a model that represents document features that add up in various combinations to form documents. We tested both methods on ASRS and ASAP documents with the latter categorized two different ways. We illustrate the potential of NMF to provide document features that are interpretable and indicative of topics. We also briefly discuss the tool that we have incorporated Mariana into in order to allow human experts to provide feedback on the document categorizations.
IEEE Transactions on Systems, Man, and Cybernetics, Part C
support vector machines,simulated annealing,nonnegative matrix factorization,decision support systems,decision support system,automatic speech recognition,testing,health and safety,support vector machine
Document classification,Content analysis,Aviation Safety Reporting System,Computer science,Document management system,Support vector machine,Decision support system,Artificial intelligence,Non-negative matrix factorization,Action plan,Machine learning