Maximising the Value of Missing Data
Article PDF (English)

Keywords

missing data; imputation; gaps; holes; data mining; empty data

How to Cite

Atai Winkler. (2014). Maximising the Value of Missing Data. Global Journal of Computer Science and Technology, 14(C3), 41–48. Retrieved from https://gjcst.com/index.php/gjcst/article/view/1191

Abstract

The subject of missing values in databases and how to handle them has received very little attention in the statistics and data mining literature [1, 2, 3], and even less, if any at all, in the marketing literature. The usual attitude of practitioners is "we'll just have to ignore records with missing values". On the other hand, a few very advanced theoretical solutions have been developed, some of which have been applied, particularly to clinical trials data. These solutions can only be applied to small databases, not to the very large databases held by many companies on their customers. This paper describes a new method for imputing missing values in such very large databases. Two particular features of the method are that it can handle all combinations of variable type (continuous, ordinal and categorical) and that all the missing values in the database are imputed in one run of the software. It is based on the k-nearest neighbours method, a well-known method in data mining. The paper concludes by presenting the results of a study of this method when used to impute the missing values in a real set of data. This paper is only concerned with missing data, i.e. data that are not known but which have real values. It does not address the problem of empty data, i.e. data that are not known but which cannot have real values.
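The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of k-nearest-neighbours imputation over mixed variable types, not the paper's method. It assumes a Gower-style distance (range-scaled absolute difference for continuous and ordinal variables, simple mismatch for categorical ones), ordinal values coded as integer ranks, and missing values marked with `None`. The names `gower_distance` and `knn_impute` and the toy data are hypothetical.

```python
from collections import Counter

def gower_distance(a, b, kinds, ranges):
    """Mean per-variable dissimilarity over variables known in both records."""
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # skip variables missing in either record
        if kind == "categorical":
            total += 0.0 if x == y else 1.0
        else:  # continuous, or ordinal coded as integer ranks
            total += abs(x - y) / rng if rng else 0.0
        used += 1
    return total / used if used else float("inf")

def knn_impute(rows, kinds, k=3):
    """Fill every None in one pass, using only the originally observed values."""
    ranges = []
    for j, kind in enumerate(kinds):
        if kind == "categorical":
            ranges.append(None)
        else:
            vals = [r[j] for r in rows if r[j] is not None]
            ranges.append(max(vals) - min(vals) if vals else 0)
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, kind in enumerate(kinds):
            if row[j] is not None:
                continue
            # Candidate donors: other records where variable j is observed.
            donors = [(gower_distance(row, other, kinds, ranges), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for dist, v in donors[:k] if dist != float("inf")]
            if not nearest:
                continue  # no comparable donor; leave the value missing
            if kind == "continuous":
                result[i][j] = sum(nearest) / len(nearest)        # mean
            elif kind == "ordinal":
                result[i][j] = sorted(nearest)[len(nearest) // 2]  # upper median
            else:
                result[i][j] = Counter(nearest).most_common(1)[0][0]  # mode
    return result

# Toy customer records: age (continuous), spend band (ordinal), region (categorical).
kinds = ["continuous", "ordinal", "categorical"]
rows = [
    [25, 1, "north"],
    [27, 1, "north"],
    [60, 3, "south"],
    [62, 3, "south"],
    [26, None, "north"],
    [61, 3, None],
    [None, 1, "north"],
]
filled = knn_impute(rows, kinds, k=2)
```

This brute-force version compares every incomplete record with every other record, so its cost grows quadratically with the number of records. A method aimed at very large databases would need a more efficient neighbour search, which this sketch does not attempt.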
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2014 Authors and Global Journals Private Limited