Health data

Health data is any data “related to health conditions, reproductive outcomes, causes of death, and quality of life”[1] for an individual or population. Health data includes clinical metrics along with environmental, socioeconomic, and behavioral information pertinent to health and wellness. A plurality of health data are collected and used when individuals interact with health care systems. This data, collected by health care providers, typically includes a record of services received, conditions of those services, and clinical outcomes or information concerning those services.[2]

Historically, most health data have been sourced from this framework. The advent of eHealth and advances in health information technology, however, have expanded the collection and use of health data—but have also engendered new security, privacy, and ethical concerns. The increasing collection and use of health data by patients is a major component of digital health.


Health data are classified as either structured or unstructured. Structured health data are standardized and easily transferable between health information systems.[3] For example, a patient’s name, date of birth, or a blood-test result can be recorded in a structured data format. Unstructured health data, unlike structured data, are not standardized.[3] Emails, audio recordings, or physician notes about a patient are examples of unstructured health data. While advances in health information technology have expanded collection and use, the complexity of health data has hindered standardization in the health care industry.[2] As of 2013, it was estimated that approximately 60% of health data in the United States were unstructured.[3]
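The distinction can be illustrated with a minimal sketch. It assumes a hypothetical `BloodTestResult` record type and a hypothetical `to_transferable` helper; neither corresponds to any real health data standard, and the field names and sample values are invented for illustration. The structured record has named, typed fields that serialize predictably for another system, while the unstructured note is free text that a receiving system cannot reliably interpret without further processing.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class BloodTestResult:
    """Hypothetical structured record: every field has a name and a type."""
    patient_name: str
    date_of_birth: date
    test_name: str
    value: float
    unit: str


def to_transferable(record: BloodTestResult) -> str:
    """Serialize the record to JSON so another system can parse it field by field."""
    payload = asdict(record)
    # Dates are not JSON-native, so store them as ISO 8601 strings.
    payload["date_of_birth"] = record.date_of_birth.isoformat()
    return json.dumps(payload, sort_keys=True)


# Structured: standardized fields (illustrative values only).
structured = BloodTestResult(
    patient_name="Jane Doe",
    date_of_birth=date(1980, 5, 17),
    test_name="white blood cell count",
    value=6.2,
    unit="10^9/L",
)

# Unstructured: a free-text physician note with no fixed schema.
unstructured = "Pt reports feeling tired for two weeks. WBC came back around 6.2, will recheck."

encoded = to_transferable(structured)
decoded = json.loads(encoded)  # the receiving system recovers each field by name
```

A receiving system can read `decoded["value"]` directly, whereas extracting the same number from the free-text note would require text processing that may fail on different phrasing.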


Health informatics, a field of health data management, superseded medical informatics in the 1970s.[4] Health informatics, which is broadly defined as the collection, storage, distribution, and use of health data, differs from medical informatics in its use of information technology.[4]

Individuals are the origin of all health data, yet the most direct, if often overlooked, source is the informal personal collection of data. Examples include an individual checking off on a personal calendar that they’ve taken their medication, or an individual tallying the amount of sleep they’ve gotten over the last week.

Prior to recent technological advances, most health data were collected within health care systems. As individuals move through health care systems, they interact with health care providers, and this interaction produces health information. These touch points include clinics/physician offices, pharmacies, payers/insurance companies, hospitals, laboratories, and senior homes. Information is also collected through participation in clinical trials, health agency surveys, medical devices, and genomic testing. This information, once recorded, becomes health data. This data typically includes a record of services received, conditions of those services, and clinical outcomes resulting from those services.[2] For example, a blood draw may be a service received, a white blood cell count may be a condition of that service, and a reported measurement of white blood cells may be an outcome of that service. Information frequently collected and found in medical records also includes administrative and billing data, patient demographic information, progress notes, vital signs, medications, diagnoses, immunization dates, allergies, and lab results.[5]

Recent advances in health information technology have expanded the scope of health data and fostered the eHealth paradigm, which has in turn expanded the collection, use, and philosophy of health data. EHealth, a term coined in the health information technology industry,[6] has been described in academia as

an emerging field [at] the intersection of medical informatics, public health and business, referring to health services and information delivered or enhanced through the Internet and related technologies. In a broader sense, the term characterizes not only a technical development, but also a state-of-mind, a way of thinking, an attitude, and a commitment for networked, global thinking, to improve health care … using information and communication technology.[6]

From the confluence of eHealth and mobile technology emerged mHealth, which is considered a subsector of eHealth.[7] mHealth has been defined as

medical and public health practice supported by mobile devices … . mHealth involves the use and capitalization on a mobile phone’s core utility of voice and short messaging service (SMS) as well as more complex functionalities and applications including general packet radio service (GPRS), third and fourth generation mobile telecommunications (3G and 4G systems), global positioning system (GPS), and Bluetooth technology.[7]

The emergence of eHealth and mHealth has expanded the definition of health data by creating new opportunities for patient-generated health data (PGHD).[8] PGHD has been defined as “health-related data—including health history, symptoms, biometric data, treatment history, lifestyle choices, and other information—created, recorded, gathered, or inferred by or from patients or their designees … to help address a health concern.”[8] MHealth allows patients to monitor and report PGHD outside of a clinical setting. For example, a patient could use a blood monitor interfaced with her or his smartphone to track and distribute PGHD.

PGHD, mHealth, eHealth, and other technological developments such as telemedicine constitute a new digital health paradigm. Digital health describes a patient-centric health care system in which patients manage their own health and wellness with new technologies that gather and assess their data.[9]

Data has become increasingly valuable in the 21st century and new economies have been shaped by who controls it[10]—health data and the health care industry are unlikely to be an exception. An increase in PGHD has led some experts to envision a future in which patients have greater influence over the health care system.[11] Patients may use their leverage as data producers to demand more transparency, open science, clearer data use consent, more patient engagement in research, development, and delivery, and greater access to research outcomes.[11][12] Put another way, it is foreseeable that “health care will be owned, operated, and driven by consumers.”[11] Moreover, some large technology companies have entered the PGHD space. One example is Apple’s ResearchKit. These companies may use their newfound PGHD leverage to enter and disrupt the health care market.[11]


Health data can be used to benefit individuals, public health, and medical research and development.[13] The uses of health data are classified as either primary or secondary. Primary use is when health data is used to deliver health care to the individual from whom it was collected.[14] Secondary use is when health data is used outside of health care delivery for that individual.[14]
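The primary/secondary distinction can be expressed as a simple rule. The sketch below is illustrative only: the `DataUse` type, its field names, and the `classify_use` function are hypothetical and are not drawn from the cited sources. It encodes the definition given above, under which a use is primary only when the data supports care delivery for the same individual from whom it was collected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUse:
    """Hypothetical description of one use of a piece of health data."""
    data_subject: str       # individual from whom the data was collected
    beneficiary: str        # individual or group the use is meant to serve
    is_care_delivery: bool  # True if the use is part of delivering health care


def classify_use(use: DataUse) -> str:
    """Return "primary" or "secondary" following the definitions in the text."""
    if use.is_care_delivery and use.beneficiary == use.data_subject:
        return "primary"
    return "secondary"


# A clinician reviewing a patient's own lab result to treat that patient.
treatment = DataUse("patient-1", "patient-1", is_care_delivery=True)

# The same lab result included in a public health study.
research = DataUse("patient-1", "study-cohort", is_care_delivery=False)

treatment_kind = classify_use(treatment)
research_kind = classify_use(research)
```

Under this rule, the same record can be put to both kinds of use; the classification depends on the purpose and the beneficiary, not on the data itself.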

Digitization and health information technology have expanded the primary and secondary uses of health data. Over the last decade the U.S. health care system widely adopted electronic health records (EHRs)—an inevitable shift given EHR benefits over paper systems.[15][16] EHRs have expanded the secondary uses of health data for quality assurance, clinical research, medical research and development, public health, and big data health analytics, among other fields.[17][18][19][13] Personal health records (PHRs), while less popular than EHRs,[20] have expanded the primary uses of health data. PHRs can incorporate both patient- and provider-reported health data, but are managed by patients.[20] While a PHR system can be standalone, integrated EHR-PHR systems are considered the most beneficial.[20] Integrated EHR-PHR systems expand the primary uses of health data by giving individuals greater access to their health data—which can help them monitor, evaluate, and improve their own health.[20] This is an important aspect of the digital health paradigm.

Security and privacy

In the United States, prior to the Health Insurance Portability and Accountability Act (HIPAA) of 1996, there were no comprehensive federal policies that regulated the security or privacy of health data.[21] HIPAA regulates the use and disclosure of protected health information (PHI) by specified entities, including health providers, health care clearinghouses, and health plans.[21] HIPAA implementation, delayed by federal-level negotiations, became broadly effective in 2003.[21]

While HIPAA established health data security and privacy protections in the U.S., gaps in protection persisted. The emergence of new health information technologies exacerbated these gaps.[21][22] In 2009, the Health Information Technology for Economic and Clinical Health (HITECH) Act was passed. The legislation aimed to close the existing gaps in HIPAA by extending HIPAA regulations to more entities, including business associates or subcontractors that store health data.[21] In 2013, an Omnibus Rule implementing final provisions of HITECH was issued by the U.S. Department of Health and Human Services.[21]

Despite these legislative amendments, security and privacy concerns persist as health care technologies advance and grow in popularity.[23] A 2018 study published in Social Indicators Research reported that electronic health record breaches affected 173,398,820 individuals (over 173 million) in the United States between October 2008, the start of the period covered by the data, and September 2017, when the statistical analysis took place.[24]

Ethical considerations

There are important ethical considerations for the collection and secondary use of health data. While discussions on the ethical collection and use of health data typically focus on research, potential data misuse by non-research organizations should not be overlooked.[25] It has been argued that the collection and use of health data for any non-clinical purpose “is ethically sound only if there is (or could reasonably arise) a question to be answered; the methodology (design, data collected, etc) will answer the question; and the costs, including both communal health care resources and any risks and burden imposed on the participants, justify the benefits to society.”[25]


  1. “health data”. McGraw-Hill Concise Dictionary of Modern Medicine. McGraw-Hill. 2002.
  2. Tzourakis, Melissa (1996). Richard Y. Wang (ed.). The Healthcare Industry and Data Quality (PDF). International Conference on Information Quality.
  3. Unstructured Data in Electronic Health Record (EHR) Systems: Challenges and Solutions (Report). Datamark Incorporated. Oct 2013. Retrieved 19 October 2017.
  4. Hovenga, Evelyn J. S. (2010). Health Informatics: An Overview. IOS Press. ISBN 9781607500926.
  5. “What information does an electronic health record (EHR) contain? | FAQs | Providers & Professionals”. Retrieved 2018-03-16.
  6. Eysenbach, G (2001-06-18). “What is e-health?”. Journal of Medical Internet Research. 3 (2): E20. doi:10.2196/jmir.3.2.e20. ISSN 1438-8871. PMC 1761894. PMID 11720962.
  7. mHealth: New horizons for health through mobile technologies (PDF) (Report). Global Observatory for eHealth series – Volume 3. 2011. ISBN 978 92 4 156425 0. Retrieved 23 October 2017.
  8. Office of the National Coordinator for Health Information Technology, Department of Health and Human Services. “Patient-Generated Health Data White Paper”. Retrieved 2017-10-23.
  9. Center for Devices and Radiological Health. “Digital Health”. Retrieved 2017-11-07.
  10. “Data is giving rise to a new economy”. The Economist. Retrieved 2017-11-07.
  11. “Disruptive Innovation and Transformation of the Drug Discovery and Development Enterprise – National Academy of Medicine”. National Academy of Medicine. 2016-07-20. Retrieved 2017-11-06.
  12. Forum on Drug Discovery, Development, and Translation; Board on Health Sciences Policy; Health and Medicine Division; National Academies of Sciences, Engineering, and Medicine (2017-02-15). Real-World Evidence Generation and Evaluation of Therapeutics: Proceedings of a Workshop. doi:10.17226/24685. ISBN 9780309455626. PMID 28211655.
  13. Raghupathi, Wullianallur; Raghupathi, Viju (2014-12-01). “Big data analytics in healthcare: promise and potential”. Health Information Science and Systems. 2 (1): 3. doi:10.1186/2047-2501-2-3. ISSN 2047-2501. PMC 4341817. PMID 25825667.
  14. Safran, Charles; Bloomrosen, Meryl; Hammond, W. Edward; Labkoff, Steven; Markel-Fox, Suzanne; Tang, Paul C.; Detmer, Don E. (2007-01-01). “Toward a National Framework for the Secondary Use of Health Data: An American Medical Informatics Association White Paper”. Journal of the American Medical Informatics Association. 14 (1): 1–9. doi:10.1197/jamia.m2273. ISSN 1067-5027. PMC 2329823. PMID 17077452.
  15. Gunter, Tracy D; Terry, Nicolas P (2005). “The Emergence of National Electronic Health Record Architectures in the United States and Australia: Models, Costs, and Questions”. Journal of Medical Internet Research. 7 (1): e3. doi:10.2196/jmir.7.1.e3. PMC 1550638. PMID 15829475.
  16. “Office-based Physician Electronic Health Record Adoption”. Retrieved 2017-10-19.
  17. Hersh, William R. (June 2007). “Adding value to the electronic health record through secondary use of data for quality assurance, research, and surveillance”. The American Journal of Managed Care. 13 (6 Part 1): 277–278. ISSN 1936-2692. PMID 17567224.
  18. Pakhomov, Serguei; Weston, Susan A.; Jacobsen, Steven J.; Chute, Christopher G.; Meverden, Ryan; Roger, Véronique L. (June 2007). “Electronic medical records for clinical research: application to the identification of heart failure”. The American Journal of Managed Care. 13 (6 Part 1): 281–288. ISSN 1936-2692. PMID 17567225.
  19. Botsis, Taxiarchis; Hartvigsen, Gunnar; Chen, Fei; Weng, Chunhua (2010-03-01). “Secondary Use of EHR: Data Quality Issues and Informatics Opportunities”. Summit on Translational Bioinformatics. 2010: 1–5. ISSN 2153-6430. PMC 3041534. PMID 21347133.
  20. Tang, Paul C.; Ash, Joan S.; Bates, David W.; Overhage, J. Marc; Sands, Daniel Z. (2006-03-01). “Personal Health Records: Definitions, Benefits, and Strategies for Overcoming Barriers to Adoption”. Journal of the American Medical Informatics Association. 13 (2): 121–126. doi:10.1197/jamia.m2025. ISSN 1067-5027. PMC 1447551. PMID 16357345.
  21. Goldstein, Melissa M.; Pewen, William F. (2013-11-01). “The HIPAA Omnibus Rule: Implications for Public Health Policy and Practice”. Public Health Reports. 128 (6): 554–558. doi:10.1177/003335491312800615. PMC 3804103. PMID 24179268.
  22. Ren, Y.; Werner, R.; Pazzi, N.; Boukerche, A. (February 2010). “Monitoring patients via a secure and mobile healthcare system”. IEEE Wireless Communications. 17 (1): 59–65. doi:10.1109/MWC.2010.5416351. ISSN 1536-1284.
  23. Arora, Shifali; Yttri, Jennifer; Nilsen, Wendy (2014). “Privacy and Security in Mobile Health (mHealth) Research”. Alcohol Research: Current Reviews. 36 (1): 143–151. ISSN 2168-3492. PMC 4432854. PMID 26259009.
  24. Koczkodaj, Waldemar W.; Mazurek, Mirosław; Strzałka, Dominik; Wolny-Dominiak, Alicja; Woodbury-Smith, Marc (2018). “Electronic Health Record Breaches as Social Indicators”. Social Indicators Research. 141 (2): 861–871. doi:10.1007/s11205-018-1837-z.
  25. Wade, Derick (2007-06-28). “Ethics of collecting and using healthcare data”. BMJ. 334 (7608): 1330–1331. doi:10.1136/bmj.39247.679329.80. ISSN 0959-8138. PMC 1906611. PMID 17599978.
