What can Big Data do for your health?

December 15, 2020

It is hard to find anyone who has never heard the term “Big Data”. It has changed the way we handle, manage and analyse data. But what is it, actually, and what can it do for our health?

5 V’s of Big Data

Big Data is not just a large amount of data; it goes much further. According to the Spanish Agency for Data Protection and the Spanish Association for the Promotion of Information Security, Big Data is the set of technologies, algorithms and systems used to collect data at a scale and variety not reached until now, and to extract valuable information through advanced analytical systems supported by parallel computing [1]. To describe the dimensions it covers, the so-called 5 V’s of Big Data are often used: Volume, Variety, Velocity, Veracity and Value. Let’s see what each of them means:

  • Volume: when we think of Big Data, a huge amount of data naturally comes to mind. This first dimension refers to that sheer quantity.

  • Variety: the data in a Big Data environment differs in nature and type because of the diversity of data sources. It is usually classified as structured (“ordered” data that can be stored in the tables of a relational database) or unstructured (free text, images, etc.).

  • Velocity: the flow of data, in addition to being massive, is constant and is generated at unprecedented speed thanks to new technologies.

  • Veracity: given the three previous characteristics, it is necessary to maintain a workflow that verifies and improves the quality of this data, since quality has a direct impact on subsequent analytics and decision-making. This is probably the biggest challenge that Big Data presents.

  • Value: the last dimension of Big Data refers to the capacity of this data, treated correctly, to generate value through its exploitation.
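To make the Variety dimension a little more concrete, here is a minimal Python sketch of the same measurement held first as a structured record and then inside unstructured free text. Everything in it is an assumption made for illustration: the `BloodPressureReading` type, the `extract_blood_pressure` helper, the sample note and the regular expression are hypothetical and far simpler than what real clinical text requires.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BloodPressureReading:
    """Structured data: fixed, typed fields that map onto a table row."""
    patient_id: str
    systolic_mmhg: int
    diastolic_mmhg: int


# Unstructured data: the same information buried in free text.
# This note is invented for the example.
NOTE = "Patient seen today. BP 142/91 mmHg, reports mild headache."


def extract_blood_pressure(patient_id: str, note: str) -> Optional[BloodPressureReading]:
    """Pull a 'BP <systolic>/<diastolic>' pattern out of a free-text note.

    Returns None when the pattern is absent. A single regular expression is
    only a toy stand-in for the structuring step, not a robust method.
    """
    match = re.search(r"\bBP\s+(\d{2,3})\s*/\s*(\d{2,3})\b", note)
    if match is None:
        return None
    return BloodPressureReading(patient_id, int(match.group(1)), int(match.group(2)))


reading = extract_blood_pressure("P001", NOTE)
print(reading)
```

Once the value sits in a structured record it can be stored, queried and aggregated like any other column, which is what makes the data usable for analytics.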

Big Data applications in the healthcare sector

Data has always been an important pillar of the healthcare sector. Thanks to it, doctors can decide on the most appropriate treatment based on the patient’s medical history, hospitals can manage their resources, and pharmaceutical companies can calculate the effectiveness of a new drug from the results of their clinical studies.

Thanks to new technologies, not only is more data generated, but we are also able to process and analyze it faster. In the health sector specifically, a large amount of data about our health is generated continuously, including medical records, medical images, genetic analyses, and even data collected through wearables and other devices. All of this opens up a huge range of possibilities that did not exist until now.

The applications of Big Data in the health sector are innumerable and can improve every one of its areas, from patient care to disease research, health cost management and the monitoring of infectious diseases, among others. As far as patients’ health is concerned, being able to process and analyze all this data will allow us to offer precision, preventive, and personalized medicine. Below are four examples of how Big Data can improve our health.

Predictive analytics

Feeding data about our health and lifestyle into Artificial Intelligence (AI) models can revolutionize medicine as we know it, since it gives doctors the ability to anticipate events, improving the response to diseases and even preventing them.

Real-time monitoring and alerts 

The use of the right tools, together with the data we generate every minute, offers doctors the ability to “monitor” our health from a distance, with an alert system in case something goes wrong, thus avoiding unnecessary visits to the doctor. In this type of application, wearables play a fundamental role, since they can monitor our vital signs in real time 24 hours a day.
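As a rough illustration of how such an alert system might work, the following Python sketch flags a patient when a wearable reports an out-of-range heart rate for several consecutive samples. The `VitalSample` type, the `find_alerts` function, the thresholds and the sample readings are all hypothetical; the limits are illustrative values, not clinical guidance.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class VitalSample:
    minute: int          # minutes since monitoring started
    heart_rate_bpm: int  # reading reported by the wearable


def find_alerts(
    samples: Iterable[VitalSample],
    low: int = 40,       # illustrative lower limit, not clinical guidance
    high: int = 120,     # illustrative upper limit, not clinical guidance
    sustained: int = 3,  # consecutive out-of-range samples needed to alert
) -> List[int]:
    """Return the minutes at which an alert would be raised.

    An alert fires once per episode, at the moment the heart rate has been
    out of range for `sustained` consecutive samples. Requiring a sustained
    run avoids alerting on a single noisy reading.
    """
    alerts: List[int] = []
    run = 0
    for sample in samples:
        if sample.heart_rate_bpm < low or sample.heart_rate_bpm > high:
            run += 1
            if run == sustained:
                alerts.append(sample.minute)
        else:
            run = 0
    return alerts


# Invented readings, one per minute.
readings = [72, 75, 130, 128, 135, 140, 80, 125]
stream = [VitalSample(minute, bpm) for minute, bpm in enumerate(readings)]
alerts = find_alerts(stream)
print(alerts)
```

In a real deployment the samples would arrive as a continuous stream and the alert would be routed to a clinician; the fixed list here only keeps the sketch self-contained.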

Cancer therapies

Cancer is a very complex disease for which there is still no universal cure. Because of this complexity, initiatives are emerging that promote advanced data analysis to discover the best combinations of treatments, taking into account patients’ clinical, genetic and lifestyle information, among other factors. For this, a system that interconnects hospital databases is essential, so that researchers can access data from a greater number of patients.


Telemedicine

The term telemedicine refers to the ability to offer medical services at a distance through the use of technology. Telemedicine has been around for years, but the arrival of new technologies such as video conferencing, smartphones, and wearables, along with the ability to store and process all the data they produce, has made it a viable and effective service. Additionally, keeping patients away from hospitals reduces costs and improves their quality of life.


Just as the applications of Big Data in the healthcare sector are innumerable, so are the challenges that come with them. One of the main problems we find when trying to analyze medical data, as we have seen in other entries in this blog, is its lack of structure, which prevents it from being used in applications as beneficial to our health and quality of life as the ones named above. It is estimated that between 65% and 85% of health data is in unstructured text format, so it has gone unused so far for lack of tools that structure it and make it actionable. For this reason, at IOMED we try to solve this problem by using advanced Natural Language Processing (NLP) techniques to make all this previously unexplored information actionable. Do you want to join us in the Big Data revolution?

[1] Agencia Española de Protección de Datos y Asociación Española para el Fomento de la Seguridad de la Información. Código de buenas prácticas en protección de datos para proyectos Big Data. 2017.


Mónica Arrúe

Data Engineer