Chapter 1 Introduction
This thesis makes three original contributions to the analysis of temporal data. All three are grounded in an exploratory data analysis pipeline for time-indexed data. The research begins with a new technique for visualising data using a calendar layout that fits neatly into a pipeline workflow. It embeds plots in the familiar calendar format, and is most useful when the data relates to human activity. The second contribution is a new data abstraction which streamlines transformation, visualisation, and modelling for temporal data analysis. This “tsibble” data object is the infrastructure on which time series pipelines are built. The tsibble representation exposes the need for a conceptual framework for handling temporal missing values in a data-centric workflow. This leads to the third contribution of the thesis: exploratory and explanatory tools for understanding patterns of missing values in time.
Time series analysis has traditionally assumed that the entry point to data analysis is data in a model-ready format, which provides little organisation or conceptual oversight of how one should get wild data into a tamed state. This mindset is related to a long-held belief that exploratory data analysis is a highly ad hoc area of statistics, impossible to teach or formalise. However, the tidyverse framework, originating in Wickham (2014), fundamentally overturns this thinking. Data plots and data wrangling, both of which the “tidy data” conceptualisation supports, can be formally described using an abstract grammar. The grammars of graphics and of data manipulation, as implemented in the ggplot2 (H. Wickham, Chang, et al. 2018) and dplyr (H. Wickham, François, et al. 2018) R packages respectively, form the core of the tidyverse suite of tools. My contributions extend the tidyverse way of thinking to the temporal domain by providing tidy tools that support a fluent workflow in temporal data analysis.