Title
Could Patient Self-reported Health Data Complement EHR for Phenotyping?
Abstract
Electronic health records (EHRs) have been used as a valuable data source for phenotyping. However, EHR-based phenotyping suffers from inherent data quality issues such as missing data. As patient self-reported health data become increasingly available, it is useful to know how the two data sources compare with each other for phenotyping. This study addresses this research question. We used self-reported diabetes status for 2,249 patients treated at Columbia University Medical Center and the well-known eMERGE EHR phenotyping algorithm for Type 2 diabetes mellitus (DM2) to conduct the experiment. The eMERGE algorithm achieved high specificity (0.97) but low sensitivity (0.32) among this patient cohort. About 87% of the patients with self-reported diabetes had at least one ICD-9 code, one medication, or one lab result supporting a DM2 diagnosis, implying the remaining 13% may have missing or incorrect self-reports. We discuss the tradeoffs in both data sources and in combining them for phenotyping.
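The sensitivity and specificity figures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the self-reported status is treated as the reference label and the EHR algorithm's output as the prediction; the function name `sensitivity_specificity` and the toy cohort are hypothetical and are not data or code from the study.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) of `predicted` against `reference`.

    Both arguments are equal-length sequences of booleans, one per patient.
    """
    if len(reference) != len(predicted):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # both say diabetes
    fn = sum(1 for r, p in pairs if r and not p)      # reference only
    tn = sum(1 for r, p in pairs if not r and not p)  # neither says diabetes
    fp = sum(1 for r, p in pairs if not r and p)      # prediction only
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort (NOT data from the study): True = diabetes.
self_report = [True, True, True, False, False, False, False, False]
ehr_flag = [True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(self_report, ehr_flag)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A low sensitivity paired with a high specificity, as reported for the eMERGE algorithm, corresponds to many reference-only cases (`fn`) and few prediction-only cases (`fp`) in this layout.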
Year
2014
Venue
AMIA
Field
Data source, Diabetes status, Data quality, Research question, Self report, Type 2 Diabetes Mellitus, Medical physics, Missing data, Bioinformatics, Cohort, Medicine
DocType
Conference
Volume
2014
ISSN
1942-597X
Citations
1
PageRank
0.39
References
0
Authors
3
Name            Order   Citations   PageRank
Daniel Fort     1       1           0.39
Adam Wilcox     2       215         35.66
Chunhua Weng    3       1           0.39