Missing values are a common problem when applying classification algorithms to real-world medical data. This is especially true for trauma patients, where the emergent nature of the cases makes it difficult to collect all of the relevant data for each patient. Standard methods for handling missingness first learn a model to estimate missing data values, and subsequently train and evaluate a classifier using data imputed with this model. Recently, several proposed methods have demonstrated the benefits of jointly estimating the imputation model and classifier parameters. However, these methods make assumptions that limit their utility with many real-world medical datasets. For example, the assumption that data elements are missing at random ...