MULTI-LABEL ASRS DATASET CLASSIFICATION USING SEMI-SUPERVISED SUBSPACE CLUSTERING

MOHAMMAD SALIM AHMED, LATIFUR KHAN, NIKUNJ OZA, AND MANDAVA RAJESWARI

Abstract. Text classification has been studied extensively, and much of that work focuses on one particular characteristic of text data: its multi-label nature. This characteristic arises because a document may be associated with multiple classes at the same time. As a consequence, traditional binary and multi-class classification techniques perform poorly on multi-label text data. In this paper, we propose a text classification technique that takes this characteristic into account and performs well on such data. Our multi-label text classification approach extends our previously formulated [3] multi-class text classification approach, SISC (Semi-supervised Impurity based Subspace Clustering). We call the new classification model SISC-ML (SISC Multi-Label). Empirical evaluation on the real-world, multi-label NASA ASRS (Aviation Safety Reporting System) data set shows that our approach outperforms state-of-the-art text classification and subspace clustering algorithms.
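
To make the multi-label characteristic described in the abstract concrete, the following is a minimal sketch of how documents with several simultaneous classes are often represented as a binary indicator matrix. It does not implement SISC or SISC-ML. The report IDs, label names, and the helper `to_indicator_matrix` are hypothetical and invented purely for illustration; they are not taken from the ASRS data set.

```python
# Hypothetical report IDs and labels, invented for illustration only.
reports = {
    "report_1": {"weather", "runway incursion"},
    "report_2": {"equipment failure"},
    "report_3": {"weather", "equipment failure", "communication"},
}


def to_indicator_matrix(label_sets):
    """Return (sorted label names, {doc_id: 0/1 row}) for a dict of label sets."""
    # Collect every label that appears in any document, in a stable order.
    labels = sorted(set().union(*label_sets.values()))
    # One row per document: 1 where the document carries the label, else 0.
    rows = {
        doc_id: [1 if label in assigned else 0 for label in labels]
        for doc_id, assigned in label_sets.items()
    }
    return labels, rows


labels, rows = to_indicator_matrix(reports)

# Documents whose row contains more than one 1 cannot be described by a
# single class value, which is what a multi-class classifier would output.
multi_label_docs = [doc_id for doc_id, row in rows.items() if sum(row) > 1]

print(labels)
for doc_id, row in rows.items():
    print(doc_id, row)
print("multi-label documents:", multi_label_docs)
```

In this toy setup, any row with more than one nonzero entry marks a document that a single-label classifier could only partially describe, which is the situation the abstract identifies as the source of poor performance.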

Additional Info

Field Value
Maintainer Elizabeth Foughty
Last Updated March 31, 2025, 21:28 (UTC)
Created March 31, 2025, 21:28 (UTC)
accessLevel public
accrualPeriodicity irregular
bureauCode {026:00}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://data.nasa.gov/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
harvest_object_id 44cd8f87-60c0-49b4-b102-c7c5a4f3f27f
harvest_source_id 61638e72-b36c-4866-9d28-551a3062f158
harvest_source_title DNG Legacy Data
identifier DASHLINK_244
issued 2010-10-13
landingPage https://c3.nasa.gov/dashlink/resources/244/
modified 2020-01-29
programCode {026:029}
publisher Dashlink
resource-type Dataset
source_datajson_identifier true
source_hash 83b1cebcec7cdc1fe56cbb0f353320d4a60530023375491e4ee51d39e451f2d4
source_schema_version 1.1