ARC Code TI: sequenceMiner

sequenceMiner was developed to detect and describe anomalies in large sets of high-dimensional symbol sequences. It works by performing unsupervised clustering (grouping) of sequences, using the normalized longest common subsequence (LCS) as a similarity measure, and then analyzing outliers in detail to detect anomalies. sequenceMiner uses a new hybrid algorithm for computing the LCS that has been shown to outperform existing algorithms by a factor of five. It also includes new algorithms for outlier analysis that provide comprehensible indicators of why a particular sequence was deemed an outlier. This gives analysts a coherent description of the anomalies identified in a sequence and of how that sequence differs from more 'normal' sequences.
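The similarity measure described above can be illustrated with a minimal sketch. It uses the textbook dynamic-programming LCS, not sequenceMiner's hybrid algorithm, and it assumes the normalization divides the LCS length by the geometric mean of the two sequence lengths; the listing itself does not state the exact formula. The function names (lcs_length, normalized_lcs, least_similar) are hypothetical and are not taken from the sequenceMiner code. The last function is a simplified stand-in for outlier detection that just picks the sequence with the lowest mean similarity to the others.

```python
from math import sqrt
from typing import Hashable, List, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence (textbook DP, O(len(a) * len(b)))."""
    prev = [0] * (len(b) + 1)  # DP row for the previous symbol of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Similarity in [0, 1]; assumes normalization by sqrt(len(a) * len(b))."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


def least_similar(sequences: List[Sequence[Hashable]]) -> int:
    """Index of the sequence with the lowest mean similarity to all others.

    A simplified stand-in for outlier detection, not sequenceMiner's
    clustering-based analysis. Expects at least two sequences.
    """
    n = len(sequences)
    means = [
        sum(normalized_lcs(sequences[i], sequences[j]) for j in range(n) if j != i)
        / (n - 1)
        for i in range(n)
    ]
    return min(range(n), key=means.__getitem__)


if __name__ == "__main__":
    data = ["ABCD", "ABCE", "ABDD", "XYZW"]
    print(normalized_lcs(data[0], data[1]))
    print(least_similar(data))
```

Identical sequences score 1.0 under this normalization and sequences with no shared symbols score 0.0, so a sequence that shares little ordered structure with the rest of the set stands out with a low mean similarity.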

Additional Info

Field Value
Maintainer Dennis Koga
Last Updated July 14, 2025, 20:53 (UTC)
Created March 31, 2025, 23:45 (UTC)
accessLevel public
accrualPeriodicity irregular
bureauCode {026:00}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://data.nasa.gov/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
harvest_object_id 9ab10fdf-19d8-42d5-a28b-4e13ef6b8d2e
harvest_source_id 61638e72-b36c-4866-9d28-551a3062f158
harvest_source_title DNG Legacy Data
identifier OCIO-Fitara-137
issued 2015-01-07
modified 2020-01-29
programCode {026:046}
publisher Ames Research Center
resource-type Dataset
source_datajson_identifier true
source_hash 38136fe4afee5946f3acb824a0ab5bdc7fb6e1a5fd274b814c2ff9b8c773494e
source_schema_version 1.1
theme {Management/Operations}