Title
Algorithms for rapid digitalization of prescriptions
Abstract
Prescription data are invaluable for healthcare research and intelligence, yet extracting them is challenging because the information is embedded in the unstructured, non-grammatical text of prescription images. Moreover, text extraction from images is hard in itself, particularly for handwritten text. While piecemeal solutions exist, they are either limited to a small set of entities of interest or have very low accuracy and do not scale. In this paper, we present two algorithms: the C-Cube algorithm for digitization of computer-printed prescriptions and the 3-Step Filtering algorithm for handwritten prescriptions. A brute-force approach would match every word received from an optical character reader (OCR) against all possible entries in the database, which is inefficient and imprecise. The premise of our algorithms is to apply pattern intelligence to select a much smaller set of words (from the words returned by the OCR) as potential entities of interest. We rigorously tested the two algorithms on a corpus of more than 10,000 prescription images, taking the brute-force technique as the baseline methodology. Regarding latency, we found that the C-Cube and the 3-Step Filtering algorithms were 588 and 231 times faster, respectively, than the brute-force approach. In terms of accuracy, the F-score of the C-Cube algorithm was 90% higher than that of the brute-force approach, whereas the F-score of the 3-Step Filtering algorithm was 8,600% higher. The algorithms are decidedly faster and more accurate than the brute-force approach. These attributes make them suitable for implementation in real-time environments as well as for use in batch mode for various applications. We expect the algorithms to play a significant role in the digitalization of healthcare information and briefly discuss a few applications.
© 2021 The Author(s). Published by Elsevier B.V. on behalf of Zhejiang University and Zhejiang University Press Co. Ltd.
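The contrast the abstract draws between brute-force matching and matching only a pre-selected subset of OCR words can be illustrated with a small sketch. This is not the C-Cube or 3-Step Filtering algorithm, whose pattern rules the abstract does not specify. The drug list `DRUG_DB`, the similarity threshold, and the "word followed by a dosage token" filter are all hypothetical choices made only for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical drug database; a real one would hold many thousands of entries.
DRUG_DB = ["paracetamol", "amoxicillin", "metformin", "atorvastatin"]

# Assumed pattern for a dosage token such as "500mg" or "2.5 ml".
DOSE = re.compile(r"^\d+(\.\d+)?\s*(mg|ml|mcg|g)$", re.IGNORECASE)


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(word, threshold=0.8):
    """Return the closest database entry, or None if below the threshold."""
    score, entry = max((similarity(word, e), e) for e in DRUG_DB)
    return entry if score >= threshold else None


def brute_force(words):
    """Baseline: compare every OCR word against every database entry."""
    matches = (best_match(w) for w in words)
    return [m for m in matches if m is not None]


def filtered(words):
    """Illustrative filter: only match words followed by a dosage token."""
    candidates = [w for w, nxt in zip(words, words[1:]) if DOSE.match(nxt)]
    return brute_force(candidates), len(candidates)


# Made-up OCR output with two misspelled drug names.
ocr_words = ["Dr", "Sharma", "Clinic", "Paracetmol", "500mg", "twice",
             "daily", "Amoxicilin", "250mg", "after", "food"]

if __name__ == "__main__":
    print(brute_force(ocr_words))   # all 11 words are compared to the database
    print(filtered(ocr_words))      # only the pre-selected candidates are compared
```

In this toy input both routes find the same two drugs, but the filtered route sends 2 words to the matcher instead of 11. The saving grows with the size of the database and the number of words per prescription.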
Year: 2021
DOI: 10.1016/j.visinf.2021.07.002
Venue: VISUAL INFORMATICS
Keywords: Artificial intelligence, Image processing, Healthcare technology, Pattern intelligence
DocType: Journal
Volume: 5
Issue: 3
ISSN: 2468-502X
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name, Order, Citations, PageRank
Mehul Gupta, 1, 0, 0.34
Kabir Soeny, 2, 0, 0.34