Maximising the Value of Missing Data

Atai Winkler

Keywords

missing data; imputation; gaps; holes; data mining; empty data

How to Cite

Atai Winkler. (2014). Maximising the Value of Missing Data. Global Journal of Computer Science and Technology, 14(C3), 41–48. Retrieved from https://gjcst.com/index.php/gjcst/article/view/1191

Abstract

The subject of missing values in databases and how to handle them has received very little attention in the statistics and data mining literature [1, 2, 3], and even less, if any at all, in the marketing literature. The usual attitude of practitioners is "we'll just have to ignore records with missing values". On the other hand, a few very advanced theoretical solutions have been developed, some of which have been applied, particularly to clinical trials data. These solutions can only be applied to small databases, not to the very large databases held by many companies on their customers. This paper describes a new method for imputing missing values in such very large databases. Two particular features of the method are that it can handle all combinations of variable type (continuous, ordinal and categorical) and that all the missing values in the database are imputed in one run of the software. It is based on the k-nearest neighbours method, a well-known method in data mining. The paper concludes by presenting the results of a study of this method when used to impute the missing values in a real set of data. This paper is only concerned with missing data, i.e. data that are not known but which have real values. It does not address the problem of empty data, i.e. data that are not known but which cannot have real values.
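
The abstract does not give the details of the paper's algorithm, so the following is only a minimal sketch of the general idea: k-nearest-neighbour imputation over mixed variable types, filling every missing value in one pass. The function names (`mixed_distance`, `knn_impute`), the range-scaled distance, and the choice of mean, median and mode as the fill rules are assumptions made for illustration, not the author's method. The brute-force neighbour search shown here would also not scale to the very large databases the paper targets.

```python
from collections import Counter
from statistics import mean, median_low


def mixed_distance(a, b, kinds, ranges):
    """Average per-variable dissimilarity over variables observed in both records.

    Hypothetical helper: continuous and ordinal variables use a range-scaled
    absolute difference, categorical variables use a 0/1 mismatch.
    """
    total, used = 0.0, 0
    for j, kind in enumerate(kinds):
        if a[j] is None or b[j] is None:
            continue  # skip variables missing in either record
        if kind == "categorical":
            total += 0.0 if a[j] == b[j] else 1.0
        else:  # continuous, or ordinal coded as integer ranks
            total += abs(a[j] - b[j]) / ranges[j] if ranges[j] else 0.0
        used += 1
    return total / used if used else float("inf")


def knn_impute(rows, kinds, k=3):
    """Return a copy of rows with each None replaced using the k nearest donors."""
    # Observed range of each numeric variable, used to scale differences.
    ranges = []
    for j, kind in enumerate(kinds):
        if kind == "categorical":
            ranges.append(None)
        else:
            observed = [r[j] for r in rows if r[j] is not None]
            ranges.append(max(observed) - min(observed) if observed else 0)

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, kind in enumerate(kinds):
            if row[j] is not None:
                continue
            # Donors are other records in which variable j is observed.
            # Distances use the original rows, so imputed values never feed
            # back into later imputations within the same run.
            donors = sorted(
                (mixed_distance(row, other, kinds, ranges), idx)
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            values = [rows[idx][j] for d, idx in donors[:k] if d != float("inf")]
            if not values:
                continue  # no comparable donor: leave the value missing
            if kind == "continuous":
                filled[i][j] = mean(values)
            elif kind == "ordinal":
                filled[i][j] = median_low(values)
            else:
                filled[i][j] = Counter(values).most_common(1)[0][0]
    return filled


# Made-up example data: (age, satisfaction rank, segment); None marks a missing value.
kinds = ["continuous", "ordinal", "categorical"]
rows = [
    [25, 1, "A"],
    [27, 1, "A"],
    [26, None, "A"],
    [60, 3, "B"],
    [62, 3, "B"],
    [None, 3, "B"],
]
filled = knn_impute(rows, kinds, k=2)
```

In this sketch `None` stands for missing data in the abstract's sense, a value that exists but is unknown. Empty data, where no real value can exist, would need a separate marker and should not be passed to the imputer.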


This work is licensed under a Creative Commons Attribution 4.0 International License.
