A Local Scalable Distributed EM Algorithm for Large P2P Networks

This paper describes a local, distributed expectation maximization (EM) algorithm for learning the parameters of Gaussian mixture models (GMMs) in large peer-to-peer (P2P) environments. The algorithm can be used for a variety of well-known data mining tasks in distributed environments, such as clustering, anomaly detection, target tracking, and density estimation, which are necessary for many emerging P2P applications in bioinformatics, web mining, and sensor networks. Centralizing all or some of the data to build global models is impractical in such P2P environments because of the large number of data sources, the asynchronous nature of P2P networks, and the dynamic nature of the data and the network. The proposed algorithm takes a two-step approach. In the monitoring phase, an efficient local algorithm checks whether the model ‘quality’ is acceptable. The outcome of this check then serves as feedback: when the model is outdated, the algorithm samples data from the network and rebuilds the GMM. We present thorough experimental results to verify our theoretical claims.
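The monitor-then-rebuild loop described above can be illustrated with a small, centralized sketch. This is not the paper's local distributed algorithm: it runs on a single node, uses a one-dimensional GMM, and assumes that model ‘quality’ is measured as the average log-likelihood of a fresh sample compared against a fixed threshold. The function names (`fit_gmm`, `avg_log_likelihood`, `monitor_and_rebuild`) and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def avg_log_likelihood(data, model):
    """Mean log-likelihood of data under a GMM given as [(weight, mean, var), ...]."""
    total = 0.0
    for x in data:
        p = sum(w * gauss_pdf(x, m, v) for w, m, v in model)
        total += math.log(max(p, 1e-300))  # guard against log(0) on underflow
    return total / len(data)


def fit_gmm(data, k, iters=50, seed=0):
    """Plain (centralized) EM for a 1-D GMM with k components."""
    rng = random.Random(seed)
    mean_all = sum(data) / len(data)
    var_all = max(sum((x - mean_all) ** 2 for x in data) / len(data), 1e-6)
    # Initialize means from random data points, equal weights, pooled variance.
    model = [(1.0 / k, m, var_all) for m in rng.sample(data, k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            ps = [w * gauss_pdf(x, m, v) for w, m, v in model]
            s = sum(ps) or 1e-300
            resp.append([p / s for p in ps])
        # M-step: re-estimate weight, mean, and variance per component.
        new_model = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            mj = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vj = sum(r[j] * (x - mj) ** 2 for r, x in zip(resp, data)) / nj
            new_model.append((nj / len(data), mj, max(vj, 1e-6)))
        model = new_model
    return model


def monitor_and_rebuild(model, sample, k, threshold):
    """Keep the model if its quality on the sample is acceptable, else refit.

    Returns (model, rebuilt_flag). The threshold test is an assumed stand-in
    for the paper's local monitoring algorithm.
    """
    if avg_log_likelihood(sample, model) >= threshold:
        return model, False
    return fit_gmm(sample, k), True
```

In this sketch, a caller would fit an initial model, record a threshold slightly below its average log-likelihood, and then call `monitor_and_rebuild` on each new sample; the model is refit only when the sample no longer matches it, which mirrors the feedback loop at a very coarse level.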

Additional Info

Field Value
Maintainer Ashok Srivastava
Last Updated April 1, 2025, 02:36 (UTC)
Created April 1, 2025, 02:36 (UTC)
accessLevel public
accrualPeriodicity irregular
bureauCode {026:00}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://data.nasa.gov/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
harvest_object_id 90e7ea9c-b514-41c2-b89c-d37ccd9599a0
harvest_source_id 61638e72-b36c-4866-9d28-551a3062f158
harvest_source_title DNG Legacy Data
identifier DASHLINK_168
issued 2010-09-22
landingPage https://c3.nasa.gov/dashlink/resources/168/
modified 2020-01-29
programCode {026:029}
publisher Dashlink
resource-type Dataset
source_datajson_identifier true
source_hash 634a48bfafb49d9b675c077c1d5978c1ad9b85587094917e8fa9e580085e311b
source_schema_version 1.1