Discovery of Recurring Anomalies in Text Reports

This paper describes the results of a research and development effort conducted at NASA Ames Research Center to develop new text mining algorithms that discover anomalies in free-text reports on the health and safety of two aerospace systems. We discuss two problems of significant importance to the aviation industry. The first is the automatic discovery of anomalies in an aerospace system through analysis of the tens of thousands of free-text problem reports written about that system. The second is the automatic discovery of recurring anomalies, i.e., anomalies that may be described in different ways by different authors, at varying times and under varying conditions, but that concern the same part of the system. The intent of recurring anomaly identification is to identify project or system weaknesses or high-risk issues. The discovery of recurring anomalies is a key goal in building safe, reliable, and cost-effective aerospace systems.

Additional Info

Field Value
Maintainer Ashok Srivastava
Last Updated March 31, 2025, 21:17 (UTC)
Created March 31, 2025, 21:17 (UTC)
accessLevel public
accrualPeriodicity irregular
bureauCode {026:00}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://data.nasa.gov/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
harvest_object_id 7560d844-2039-4459-89a3-cb405cd4b06c
harvest_source_id 61638e72-b36c-4866-9d28-551a3062f158
harvest_source_title DNG Legacy Data
identifier DASHLINK_159
issued 2010-09-22
landingPage https://c3.nasa.gov/dashlink/resources/159/
modified 2020-01-29
programCode {026:029}
publisher Dashlink
resource-type Dataset
source_datajson_identifier true
source_hash 104927d9b1a85c4c78c9db96c75d0ae35a4f19c93d7024b8540abe4def48502d
source_schema_version 1.1