In the next 3 years, the average volume of new healthcare data for each patient is predicted to exceed 1 terabyte; how can we harness this to improve global health outcomes?

Every second of every day, the volume of patient data stored in silos – from national health systems, wearable devices, and the life sciences sector – is growing. The pace of this data accumulation is so rapid that in the next 3 years, the average volume of new healthcare data for each patient is predicted to exceed 1 terabyte: a volume of data equivalent to 1,300 filing cabinets full of paper.

As recognised in the UK government’s recent ‘Data Saves Lives’ policy paper, when gathered, connected and analysed at scale, patient health data can improve health outcomes for billions of people. When this data is integrated with other data – including social, personal, activity and medical device data – a holistic, 3D picture of population health can be captured and leveraged. However, achieving such benefits necessitates a new, cooperative, secure approach to the management and sharing of patient data. This is no mean feat, but it is possible, and it’s undoubtedly too important to delay.

The possibilities offered by global-scale patient health data

In 2019, Ernst & Young produced a report on NHS patient record data, detailing billions of pounds of savings that initiatives based on this “treasure trove of information” could deliver for the service. These included operational and workforce efficiencies within the NHS, faster diagnosis enabling early interventions and improved patient outcomes, and enhanced safety and pharmacovigilance.

The report also outlined an array of wider advantages that connected, interoperable data could deliver to health systems and to the advancement of public health on a global scale. As echoed by the authors of a recent analysis in the Journal of Big Data, access to data from billions of patients would fuel a rapid acceleration in the design, development and launch of new drugs and therapies, as well as allow for much greater personalisation of treatment and health management plans.

For example, suicide accounted for almost 45,000 deaths in the United States in 2016 and approximately 1.4% of all global deaths in 2017. In 2018, a study published in the American Journal of Psychiatry detailed how researchers used data from speciality mental health, primary care, self-reported questionnaires and death certificates for 2,960,929 individuals across 7 health systems to build and validate a multi-modal model to predict suicide attempt and suicide death following an outpatient visit.

This model substantially outperformed existing suicide risk prediction tools, leading the study authors to conclude that models derived from mass data across multiple sources had an important role in national suicide prevention strategies. The researchers’ access to quality patient data directly enabled the accurate identification of individuals at the highest risk of suicide, so that support could be adjusted and/or targeted.

Examining existing barriers to progress

Today, the huge potential of patient data is far from being effectively harnessed on a global scale. Progress is being held back by several formidable barriers, the first of which is the disjointed and dispersed nature of the vast majority of patient records. Patients’ data is locked up in silos and stored in a myriad of formats that cannot be easily translated. As a result, accessing data from multiple sources is complex, slow, expensive and error-prone, and it is hampered further by the mistaken perception that technology cannot integrate data anonymised in line with the UK and European GDPR.

In addition, a high level of apprehension exists in the public consciousness regarding the risks of shared public health data. Patients and providers often express concerns about the risk of this invaluable information being misused, hacked or stolen, dampening their enthusiasm to support or fund large-scale data-sharing initiatives.

Transitioning to a data-first healthcare model

If patient data is to be successfully shared, consolidated, and analysed at a population level, we must focus on technologies that enable safe, single-point access to records. These technologies must minimise the movement of data and apply anonymisation or pseudonymisation as appropriate to maximise security and privacy.

With a clear, unified and holistic view of patient information, health journeys can be tracked in real time, and local and global population trends can be identified. These trends could inform research, decision making and treatment outcomes in a way that benefits the whole healthcare ecosystem – similar to the impressive progress observed in Australia following its Electronic Health Record (EHR) unification project.

In his introduction to the 2022 Goldacre Report, Professor Ben Goldacre emphasised that raw patient information has “phenomenal potential”, and when “shaped, checked and curated into shape” and “housed and managed securely”, its full power can be unlocked. This power can be used in pursuing several of the UN’s Good Health and Wellbeing targets, including a reduction in maternal and infant mortality, the end of epidemics, and the development of vaccines and medicines.

Data may be one of healthcare’s biggest challenges. Still, every step we take towards joining the patient data dots in a secure, sustainable and scalable way takes us one step closer to laying the firmest foundations for the future of global health.


Written by Dr Petros Kotsidis, Chief Data Officer, FITFILE

