Towards a Computable Data Corpus of Temporal Correlations between Drug Administration and Lab Value Changes
Background The analysis of electronic health records for an automated detection of adverse drug reactions is an approach to solve the problems that arise from traditional methods like spontaneous reporting or manual chart review. Algorithms addressing this task should be modeled on the criteria for a standardized case causality assessment defined by the World Health Organization. One of these criteria is the temporal relationship between drug intake and the occurrence of a reaction or a laboratory test abnormality. Appropriate data that would allow for developing or validating related algorithms is not publicly available, though. Methods In order to provide such data, retrospective routine data of drug administrations and temporally corresponding laboratory observations from a university clinic were extracted, transformed and evaluated by experts in terms of a reasonable time relationship between drug administration and lab value alteration. Result The result is a data corpus of 400 episodes of normalized laboratory parameter values in temporal context with drug administrations. Each episode has been manually classified whether it contains data that might indicate a temporal correlation between the drug administration and the change of the lab value course, whether such a change is not observable or whether a decision between those two options is not possible due to the data. In addition, each episode has been assigned a concordance value which indicates how difficult it is to assess. This is the first open data corpus of a computable ground truth of temporal correlations between drug administration and lab value alterations. Discussion The main purpose of this data corpus is the provision of data for further research and the provision of a ground truth which allows for comparing the outcome of other assessments of this data with the outcome of assessments made by human experts. It can serve as a contribution towards systematic, computerized ADR detection in retrospective data. With this lab value curve data as a basis, algorithms for detecting temporal relationships can be developed, and with the classification made by human experts, these algorithms can immediately be validated. Due to the normalization of the lab value data, it allows for a generic approach rather than for specific or solitary drug/lab value combinations.