A Deep learning-based system for de-identification of personal health information on mobile devices

Musila, Daniel Mutiso
Strathmore University
Communication in healthcare has evolved from older technologies such as pagers to present-day smartphones. The change has been driven largely by the ability of smartphones to exchange information with greater speed and efficiency, which helps manage rising patient numbers, the complexity of cases and the multiple disciplines involved in modern medicine. Instant messaging services such as WhatsApp offer a channel that meets most of these needs. This communication often involves the exchange of patient clinical data containing Protected Health Information (PHI). Laws and policies have been enacted in many jurisdictions to safeguard patient confidentiality through strict management of PHI. In the normal course of care provision, healthcare professionals and organizations are expected to maintain the full confidentiality and integrity of the data and to protect it against unauthorized exposure. Whenever patient data needs to be shared with external parties for research use, informed consent must be obtained from the data subjects, and the external parties' activities must be overseen by a relevant review board. The widespread use of smartphones and popular instant messaging applications in modern healthcare, however, presents security and data protection challenges that need urgent attention. De-identification of the data offers a way to address these concerns, allowing clinical data containing PHI to be shared among healthcare providers and researchers with minimized risk. Deep learning de-identification systems demonstrate superior performance over other approaches. They are generally deployed on high-end workstations in medical facilities and research centres, or on cloud-based infrastructure. However, on-premises deployments carry infrastructure, connectivity and cost implications, while cloud de-identification services may involve transmitting sensitive data across jurisdictions, potentially breaching data residency regulations.
Meanwhile, smartphone use worldwide continues to grow rapidly, and mobile processors are becoming more powerful and versatile. Deep learning models can be deployed on Android-based smartphones to perform complex tasks such as de-identification of PHI. This is in line with the growing interest and research in edge computing, in which computations are carried out as close to the data source as possible, as an alternative to cloud computing. Concretely, this research proposes a mobile-based de-identification system in which the deep learning model is optimized and embedded in a smartphone application that performs the de-identification. Specifically, Long Short-Term Memory (LSTM) artificial neural networks will be used to develop a deep learning model, which will then be ported to the Android operating system and embedded in a mobile de-identification application.
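As an illustration of the de-identification step itself, the following minimal Python sketch shows how per-token labels, such as those an LSTM sequence tagger might produce, could be turned into redacted text. It is not taken from the thesis: the BIO tagging scheme, the label names (NAME, DATE, HOSPITAL), the function name `redact` and the sample sentence are all assumptions made for illustration, and the tags are written by hand in place of real model output.

```python
from typing import List, Optional


def redact(tokens: List[str], tags: List[str]) -> str:
    """Replace each BIO-tagged PHI span with a single [TYPE] placeholder.

    Hypothetical helper: the tag scheme and label names are assumptions.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    out: List[str] = []
    open_type: Optional[str] = None  # PHI type of the span currently open
    for token, tag in zip(tokens, tags):
        if tag == "O":
            # Non-PHI token: keep it and close any open span.
            out.append(token)
            open_type = None
            continue
        prefix, _, phi_type = tag.partition("-")
        # Emit one placeholder per span: on B-, or on an I- tag that
        # does not continue the span that is currently open.
        if prefix == "B" or phi_type != open_type:
            out.append(f"[{phi_type}]")
        open_type = phi_type
    return " ".join(out)


# Hand-written tags standing in for the output of a trained tagger.
tokens = ["Patient", "John", "Doe", "seen", "on", "12/03/2021",
          "at", "Nairobi", "Hospital"]
tags = ["O", "B-NAME", "I-NAME", "O", "O", "B-DATE",
        "O", "B-HOSPITAL", "I-HOSPITAL"]
redacted = redact(tokens, tags)
print(redacted)  # Patient [NAME] seen on [DATE] at [HOSPITAL]
```

In a mobile deployment of the kind proposed, the hand-written tags would come from the on-device model, so that only the redacted text would need to leave the phone.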
A Thesis Submitted to the School of Computing and Engineering Science in partial fulfillment of the requirements for the award of Master of Science in Computing and Information Systems Degree.
Protected health information, De-identification, Neural network, Smartphone