Chapter 4 Data representation, visual and analytical techniques for demystifying temporal missing data
Missing data carry an air of mystery that leaves analysts discombobulated throughout the exploration loop of transformation, visualization, and modeling. Handling missing values involves decisions with many degrees of freedom, which makes the process tedious and unwieldy. The challenge of missingness is rooted in seeing what is not there. The aim of this work is to clear that air of mystery from missing data, focusing on temporal contexts from a data-centric perspective. A new sparse representation facilitates efficient indexing of runs of missings in time, with supporting operations and visual methods. This places missing data in the spotlight and lets them speak for themselves. For cases where too many missings are scattered across variables and observations over time, missing data polishing strategies are formulated. These equip analysts with tidy tools to iteratively remove missings from rows and columns while keeping the temporal nature intact. The accompanying software is the R package mists.
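To make the idea of indexing runs of missings concrete, the following is a minimal sketch of a run-length view of missingness in a regularly spaced series. It is written in Python purely for illustration and does not reproduce the mists representation or its API; the function name `missing_runs` and the example series are hypothetical.

```python
from itertools import groupby
from math import isnan


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and isnan(value))


def missing_runs(values):
    """Return (start_index, run_length) for each maximal run of missing values."""
    runs = []
    position = 0
    # groupby splits the series into consecutive stretches of observed
    # and missing values; only the missing stretches are recorded.
    for missing, group in groupby(values, key=is_missing):
        length = sum(1 for _ in group)
        if missing:
            runs.append((position, length))
        position += length
    return runs


# Hypothetical series observed at ten consecutive time points.
series = [1.2, None, None, 3.4, 5.0, None, 2.1, None, None, None]
runs = missing_runs(series)
print(runs)  # [(1, 2), (5, 1), (7, 3)]
```

Storing only the start and length of each run keeps the representation compact when missings cluster in time, and the run lengths themselves become data that can be summarized or plotted.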