A robust technique to transform time series with missing data into a zero mean series
Date
2016-11-05
Authors
Adikaram, K.K.L.B.
Jayantha, P.A.
Publisher
University of Peradeniya, Sri Lanka
Abstract
In knowledge mining, feature scaling plays a crucial role. Two feature scaling methods are widely used in machine learning algorithms: mean normalization and standardization. Both methods produce a zero mean and consider only the dependent variable (y), with no involvement of the independent variable (x). Thus, regardless of the original regression, these approaches treat all the data as a y = c series. However, when some data points are missing or have been removed (as outliers), these approaches destroy the original regression. In this research, a novel method for standardizing data based on linear regression (y = mx + c) was introduced. The proposed transformation was given by $y_{new} = y_T - x \times y_T / x_T + c$, where $y_T = y - y_r$, $x_T = x - x_r$, $(x_r, y_r)$ is any selected reference point, and c is any constant. The distinctive feature of the proposed method is that the transformation uses both the independent and the dependent variables, so missing or removed data do not influence the result. When there was no noise and there were no outliers, the new method transformed the data into a y = c series even with multiple missing values. When c = 0, results showed that the transformation produces a zero mean series from any data set, even with noise, outliers and missing values.
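The transformation stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It reads the formula literally as y_new = y_T - x * (y_T / x_T) + c, which is one interpretation of the abstract's notation. The function name `transform_series`, the sample data, and the choice to reject points where x equals x_r (where x_T would be zero) are all assumptions made for illustration.

```python
import numpy as np


def transform_series(x, y, x_ref, y_ref, c=0.0):
    """Apply y_new = y_T - x * (y_T / x_T) + c element-wise.

    y_T = y - y_ref and x_T = x - x_ref, following the abstract.
    Hypothetical helper: the name and error handling are assumptions.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_t = x - x_ref
    y_t = y - y_ref
    # The formula divides by x_T, so a point at x == x_ref is undefined here.
    if np.any(x_t == 0.0):
        raise ValueError("x must differ from x_ref for every transformed point")
    return y_t - x * (y_t / x_t) + c


# Noise-free line y = 2x + 5 with the samples at x = 3, 5, 6 missing.
x = np.array([1.0, 2.0, 4.0, 7.0, 8.0])
y = 2.0 * x + 5.0

# Reference point (0, 5) lies on the line; with c = 0 every output is 0.
flat = transform_series(x, y, x_ref=0.0, y_ref=5.0, c=0.0)
print(flat)
```

Under this reading, a noise-free line with slope m maps every point to the same constant, c - m * x_ref, regardless of which x values are missing. The example uses x_ref = 0 so that the constant is exactly c.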
Keywords
Robust technique, Data, Zero mean series
Citation
Proceedings of the Peradeniya University International Research Sessions (iPURSE) – 2016, University of Peradeniya, p. 291