Underestimated prevalence of heart failure in hospital inpatients: a comparison of ICD codes and discharge letter information.
- Comprehensive Heart Failure Center (CHFC), Department of Internal Medicine I, Würzburg University Hospital, Am Schwarzenberg 15, 97078, Würzburg, Germany.
- Chair of Computer Science VI, University of Würzburg, Würzburg, Germany.
- Service Center Medical Informatics, Würzburg University Hospital, Würzburg, Germany.
- Comprehensive Heart Failure Center (CHFC), Department of Internal Medicine I, Würzburg University Hospital, Am Schwarzenberg 15, 97078, Würzburg, Germany. Stoerk_S@ukw.de.
Heart failure is the predominant cause of hospitalization and among the leading causes of death in Germany. However, accurate estimates of its prevalence and incidence are lacking. Reported figures originate from different information sources and are compromised by factors such as economic considerations and documentation quality.
We implemented a clinical data warehouse that integrates various information sources (structured parameters, plain text, data extracted by natural language processing) and enables reliable approximations of the true number of heart failure patients. The performance of ICD-based diagnosis in detecting heart failure was compared across the years 2000-2015 with (a) advanced definitions based on algorithms that integrate various sources of the hospital information system, and (b) a physician-based reference standard.
Applying these methods for detecting heart failure in inpatients revealed that relying on ICD codes resulted in a marked underestimation of the true prevalence of heart failure, ranging from 44% in the validation dataset to 55% (single year) and 31% (all years) in the overall analysis. Percentages changed over the years, indicating secular changes in coding practice and efficiency. Search and permutation algorithms markedly improved performance, from an F1 score of 81% for the initial expert-specified query to 86% for the computer-optimized query; alternatively, queries could be optimized for precision or sensitivity, depending on the search objective.
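For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, sensitivity, and the F1 score can be computed when the cases flagged by a query are compared against a reference standard. The names `Confusion` and `score_query` and the case identifiers are hypothetical and chosen for illustration only; this is not the authors' implementation, and the numbers are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # flagged by the query and confirmed by the reference standard
    fp: int  # flagged by the query but not confirmed by the reference standard
    fn: int  # confirmed by the reference standard but missed by the query

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def sensitivity(self) -> float:
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and sensitivity.
        p, s = self.precision(), self.sensitivity()
        return 2 * p * s / (p + s) if (p + s) else 0.0


def score_query(flagged: set[str], reference: set[str]) -> Confusion:
    """Compare the cases a query flags with the reference-standard cases."""
    return Confusion(
        tp=len(flagged & reference),
        fp=len(flagged - reference),
        fn=len(reference - flagged),
    )


# Illustrative, made-up case identifiers (assumption, not study data).
reference_cases = {"c1", "c2", "c3", "c4", "c5"}
query_cases = {"c1", "c2", "c3", "c9"}

result = score_query(query_cases, reference_cases)
print(f"precision={result.precision():.2f}")
print(f"sensitivity={result.sensitivity():.2f}")
print(f"F1={result.f1():.2f}")
```

Because F1 is the harmonic mean of the two components, a query tuned only for sensitivity (few missed cases) or only for precision (few false alarms) can score lower on F1 than a balanced query, which is why the choice of target depends on the search objective.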
Estimating prevalence of heart failure using ICD codes as the sole data source yielded unreliable results. Diagnostic accuracy was markedly improved using dedicated search algorithms. Our approach may be transferred to other hospital information systems.
Data warehouse; Electronic health records; Heart failure; ICD coding; Information extraction