ESTIMATING PARAMETERS AND THE MIXTURE COMPONENT NUMBER OF A GMM IN THE PRESENCE OF UNOBSERVED DATA
Abstract
The Gaussian Mixture Model (GMM) is a powerful approach for modeling
heterogeneous data that stem from multiple populations.
However, in certain situations, part of the dataset is unobservable owing
to censoring, that is, the value of a measurement or observation is only
partially known. For example, the sensors
on smartphones cannot measure WiFi Received Signal Strength
Indication (RSSI) values below a fixed threshold (-100 dBm on typical
smartphones). In such cases, RSSI values less than or equal to -100 dBm
are all reported as -100 dBm. In this paper, a novel method is
proposed to estimate the number of components of the GMM and its
parameters in the presence of censored data by applying the
Expectation-Maximization (EM) algorithm and the Sum of Weighted Real elements in
Logarithm of Characteristic Functions (SWRLCF). Experimental results
on artificial data show that the proposed method outperforms existing
approaches when the collected data are censored.
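
The censoring effect described above can be illustrated with a minimal sketch. It is not the proposed method: it only simulates RSSI readings from a hypothetical two-component Gaussian mixture and clips them at the -100 dBm threshold. The mixture weights, means, standard deviations, and the function name simulate_censored_rssi are assumptions chosen for illustration, not values from this paper.

```python
import numpy as np

def simulate_censored_rssi(n=10_000, threshold=-100.0, seed=0):
    """Draw RSSI samples from an assumed 2-component GMM and censor them."""
    rng = np.random.default_rng(seed)
    # Hypothetical mixture parameters (illustrative only, units: dBm).
    weights = np.array([0.6, 0.4])
    means = np.array([-85.0, -98.0])
    stds = np.array([4.0, 3.0])
    # Pick a component for each sample, then draw from that Gaussian.
    labels = rng.choice(2, size=n, p=weights)
    true_rssi = rng.normal(means[labels], stds[labels])
    # Censoring: every value at or below the threshold is reported as the threshold.
    observed = np.maximum(true_rssi, threshold)
    censored = true_rssi <= threshold
    return true_rssi, observed, censored

true_rssi, observed, censored = simulate_censored_rssi()
print(f"censored fraction: {censored.mean():.3f}")
print(f"mean of true values:     {true_rssi.mean():.2f} dBm")
print(f"mean of observed values: {observed.mean():.2f} dBm")
```

Because the censored samples pile up at the threshold, the mean of the observed values is biased upward relative to the mean of the true values. This bias is why estimators that ignore censoring can misjudge both the parameters and the number of components.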