GRU-DF: An RNN Model with Dynamic Imputation for Missing Values in Multivariate Time Series
In collaboration with Harvard Medical School, this thesis develops a predictive model to help doctors determine how quickly patients with multiple sclerosis will deteriorate. Our data consist of a multivariate time series provided by Harvard Medical School. Our prediction target is a patient's score on the Expanded Disability Status Scale (EDSS), a measure of disability caused by multiple sclerosis. Given the sequential nature of our data, the EDSS score acts as both a target and a feature (in the form of lagged observations). As is often the case with clinical datasets, however, many variables in this dataset are missing a significant number of values; unfortunately, our target, EDSS, is chief among them. To tackle this challenge, we developed a new variation on Gated Recurrent Units (GRU) that leverages the predictive power of the recurrent neural network itself to impute missing values. We call this model Dynamic Fill GRU, or GRU-DF for short. Furthermore, and of particular interest to the doctors at Harvard Medical School, we seek to determine how the length of patient observation time affects our ability to accurately predict disease course at least 5 years into the future.
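To illustrate the general idea of letting a recurrent model fill its own missing inputs, the following is a minimal NumPy sketch, not the thesis's implementation. It assumes that missing values are marked with NaN, that a linear output head predicts the next input vector from the hidden state, and that a missing entry is replaced by the previous step's prediction for that feature. The class name `DynamicFillGRU`, the weight names, and the zero initial guess are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class DynamicFillGRU:
    """Hypothetical sketch: a GRU cell whose own one-step-ahead
    prediction fills in missing inputs. Weights are untrained."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        d = n_features + n_hidden
        # Standard GRU parameters: update gate, reset gate, candidate state.
        self.Wz = rng.normal(0.0, 0.1, (n_hidden, d))
        self.Wr = rng.normal(0.0, 0.1, (n_hidden, d))
        self.Wh = rng.normal(0.0, 0.1, (n_hidden, d))
        self.bz = np.zeros(n_hidden)
        self.br = np.zeros(n_hidden)
        self.bh = np.zeros(n_hidden)
        # Assumed linear head mapping the hidden state to the next input.
        self.Wo = rng.normal(0.0, 0.1, (n_features, n_hidden))
        self.bo = np.zeros(n_features)
        self.n_features = n_features
        self.n_hidden = n_hidden

    def step(self, x, h):
        """One standard GRU update."""
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh + self.bz)
        r = sigmoid(self.Wr @ xh + self.br)
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]) + self.bh)
        return (1.0 - z) * h + z * h_tilde

    def forward(self, X):
        """X has shape (T, n_features); np.nan marks a missing value.
        Returns the filled inputs and the per-step predictions."""
        T = X.shape[0]
        h = np.zeros(self.n_hidden)
        # Assumption: before any prediction exists, missing values become 0.
        x_hat = np.zeros(self.n_features)
        filled = np.empty_like(X, dtype=float)
        preds = np.empty_like(X, dtype=float)
        for t in range(T):
            missing = np.isnan(X[t])
            # Keep observed values; substitute the model's last prediction
            # wherever the observation is missing.
            x_t = np.where(missing, x_hat, X[t])
            h = self.step(x_t, h)
            x_hat = self.Wo @ h + self.bo  # prediction for the next step
            filled[t] = x_t
            preds[t] = x_hat
        return filled, preds


# Toy series with gaps: 3 visits, 2 features.
X = np.array([[1.0, np.nan],
              [np.nan, 2.0],
              [3.0, np.nan]])
model = DynamicFillGRU(n_features=2, n_hidden=4)
filled, preds = model.forward(X)
```

In this sketch the imputation happens inside the recurrence, so a filled value at step t depends on everything the model has seen up to step t-1. A trained version would also need a loss that scores predictions only where observations exist; that part is omitted here.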
Berretta Magarinos, Matias Bartolome, "GRU-DF: An RNN Model with Dynamic Imputation for Missing Values in Multivariate Time Series" (2019). ETD Collection for Fordham University. AAI13884530.