01-02-2021 By: Jan Henderyckx

Stop rolling the dice on your pandemic response – adopt data centricity

Leaders worldwide must make tough decisions when facing the COVID-19 pandemic. They are overwhelmed with data that can beneficially influence these decisions, but can they trust the insights they are served? Anyone who wants to make fact-based decisions faces the same question: can I rely on this data? Fighting a pandemic involves making decisions with far-reaching impacts.

Shutting down an economy and asking people to remain in their homes for several weeks or even months is not a decision that should be informed by gut feeling alone. You simply must base decisions on facts that originate from data that most closely represents reality. As Lord Kelvin said, “to measure is to understand”; capturing the data and generating insights about the state of COVID-19 infections and the capacity of healthcare systems to take care of patients are core aspects in fighting the pandemic.

We observe that governments are struggling with the same questions as business leaders:
• Can we capture the relevant data?
• Can we trust the data that is being captured?
• Can we turn the captured data into insights?
• Can we act upon these insights?
• Is it ethically and morally acceptable to make decisions based on the data?
• Are we able to properly steer all aspects of the data value chain?

These questions represent the key components to consider when adopting a data-centric business model. A good way to measure the level of data excellence in an organization is by analyzing answers to the six questions shown in the figure below.


The ability to correctly address these individual key components and their overall orchestration determines how successful you will be at harvesting sustainable value from the data.
Let’s look into some practical examples of each of the different challenges within the key components that have surfaced when dealing with the COVID-19 pandemic, and consider what lessons government and business leaders can draw from these challenges.

Can you APPLY the insights?
Do you have someone in your organization who can ensure that the insights are applied across your decision-making processes?
When making policy decisions on the duration of a lockdown and the level of restrictions to be put in place, it is important to have correct data on the COVID-19 reproduction rate. This represents the average number of people to whom an infected person transmits the virus, which is a good indicator of the level of control you have over the spread of the virus.

The people that have decision-making powers must typically rely on a team of experts that can perform the analysis and suggest what measures could be taken to achieve the desired outcome. Therefore, most leaders have put in place a taskforce to provide them with relevant and factual inputs that enable them to make the right decisions. The key is to develop the right balance between policy execution and insight gathering.
Assuming the decision can be made and put into action, the question remains: how can you get the right insights?

Can you get INSIGHTS out of the data?
Here, the challenge is to determine which data points are required and find a way to translate them into an actionable insight. You can apply artificial intelligence and machine learning to derive patterns and obtain insights from the data points; however, any approach will require an understanding of the use case and the problem that one is trying to solve.

Since the beginning of the twentieth century, the spread of infectious diseases has been modeled using the SIR (Susceptible-Infected-Removed) model, developed by Ronald Ross, William Hamer, and others. This is a basic model which indicates whether a disease is under control or still spreading. The model is sensitive to small variations in the way the virus behaves. More complex models incorporate many additional parameters and require insight into the way the real world is behaving.
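
To make this concrete, here is a minimal Python sketch of a discrete-time SIR model. It is an illustration under simplifying assumptions, not the model used by any particular taskforce: the function names, the one-day time step, and the parameter values are hypothetical choices made for this example. Here `beta` stands for the transmission rate and `gamma` for the removal rate; in the basic SIR model their ratio is the basic reproduction number.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Simulate a basic SIR model in one-day steps.

    Returns a list of (susceptible, infected, removed) tuples,
    one per day, starting with day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contacts between susceptible and infected people.
        new_infections = beta * s * i / population
        # A fixed fraction of infected people is removed (recovered or deceased) each day.
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


def basic_reproduction_number(beta, gamma):
    """Basic reproduction number of the basic SIR model: beta / gamma."""
    return beta / gamma


def peak_infected(history):
    """Largest number of simultaneously infected people in a simulation run."""
    return max(infected for _, infected, _ in history)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the model's sensitivity.
    for beta in (0.30, 0.33):
        run = simulate_sir(1_000_000, 10, beta, 0.10, 200)
        print(beta, basic_reproduction_number(beta, 0.10), round(peak_infected(run)))
```

Running the simulation with two nearby values of `beta` is one simple way to see how a small change in the behavior of the virus shifts the projected peak.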

These complex models should not be built by data scientists alone; they require that you combine mathematical skills with knowledge of the behavior of both the virus and the people, since both influence the accuracy of the model. Having the right mix in your team is essential to guaranteeing the quality of the model – and to making sure it provides insights that you can apply.

Can you CAPTURE the data?
A model is useless unless it is fed with relevant data points. A crucial question is therefore: what data points do I already have, and what can and should be captured to gain the desired insight?
The fact that we are living in a digital era greatly improves our ability to capture data. There are numerous digital data points you can tap into, such as mobile phone location data and data automatically obtained from medical devices. However, a lot of data is still obtained manually.

Another challenge for many organizations is that they don’t have a clear view of the data that is available internally and externally. This leads to untapped potential or loss of effectiveness by duplication of effort.
For the SIR model to be reliable, the data points must measure the number of infections, the number of recovered people, and the number of deaths. Testing is one way to determine who is infected. But if you only test those already hospitalized, the time between early symptoms and hospital admission can be up to several weeks, meaning you are pretty much flying blind when it comes to understanding the actual number of infected persons in the overall population.
Is it possible to close this time gap? A company in the US is using data from its digital IoT-sensor thermometer and has shown that analyzing body temperatures could be a far better indicator of the overall number of infections. Its US HealthWeather™ Map plots the level of atypical influenza-like temperature patterns per zip code.

Another example of digitally capturing the behavior of large numbers of people is the use of mobile devices to perform contact tracing. Apple and Google have partnered to enable the use of Bluetooth technology to help governments and health agencies reduce the spread of the virus. They have put user privacy and security central to the design of their framework.
The key question is what data can be captured to achieve the optimal outcome. There is often a multitude of technical possibilities and data sources available that can enhance your ability to gain insight.

Can you TRUST your data?
Capturing data might seem easy, but you must ensure that your data points are correct. Common issues include incorrect data entry and captured data lacking context.
Manual data capture often introduces issues such as the timeliness of the data. In some countries, for example, the reporting of the number of COVID-19 cases is performed daily, but there have been several instances where timing has led to spikes in reported cases. Monday morning’s results are often influenced by the inclusion of Saturday’s and Sunday’s cases, thus making the day-to-day trend analysis less trustworthy and valuable.
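
One common way to dampen such reporting artifacts is to look at a trailing seven-day average instead of the raw daily figures. The Python sketch below illustrates the idea; the function name and the case counts are made up for this example and do not come from any real reporting system.

```python
def trailing_average(daily_counts, window=7):
    """Return the trailing mean over `window` days.

    The first value corresponds to the first day for which a full
    window of data is available.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_total = 0
    for index, count in enumerate(daily_counts):
        running_total += count
        # Drop the day that has just fallen out of the window.
        if index >= window:
            running_total -= daily_counts[index - window]
        # Only report once a full window has been seen.
        if index >= window - 1:
            averages.append(running_total / window)
    return averages


if __name__ == "__main__":
    # Hypothetical two weeks of reported cases, Monday to Sunday:
    # nothing is reported on the weekend and Monday absorbs the backlog.
    reported = [300, 100, 100, 100, 100, 0, 0] * 2
    print(trailing_average(reported))
```

Because every seven-day window contains exactly one weekend and one Monday catch-up, the averaged series no longer shows the artificial Monday spike. The underlying delay in the data remains, so the smoothed trend is still a lagging view of reality.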

When data points are not defined and measured in the same way, they become hard to compare. In the case of COVID-19, the policy for testing a population is not uniform throughout the globe. Some countries only test people when they are admitted to a hospital, while others perform a broader screening of the population. These differences in testing methods create a set of data points that are hard to compare.

It is therefore essential that the data points are given enough context to allow for meaningful conclusions, such as the applied testing methods and tested population.

Furthermore, the registration of the number of people who have died is also open to interpretation. Underlying medical conditions might have been the actual cause of death, or no testing may have been performed on the deceased at all.
The issue of having data that is hard to compare goes even to the core of the definition of the actual measurement. One clear example is the Chilean interpretation of “recovered” being “anyone that is no longer contagious”. This means that they count the deceased as recovered.
So, can you draw conclusions, take the right actions, and measure the effects if your underlying data cannot be trusted? History is full of examples where serious negative consequences have arisen from relying on the assumption that data points are accurate and have a comparable context. It is therefore necessary that data points are given enough context to allow for meaningful conclusions.

Are you ALLOWED to use the data?
With the current state of technology and the proliferation of data-capturing devices, the urge to track individuals in a very detailed manner is not uncommon. The ethical question that needs to be addressed is whether you are willing to sacrifice privacy for well-being.

It is vital that this is not treated as an either-or dilemma; measures should be implemented in such a way that the two objectives coexist. The sentiment around COVID-19-related risks should not be (ab)used to pass measures that are disproportionate to the results. The massive deployment of tracking solutions is often flawed and introduces new risks such as security breaches or changes in human behavior, resulting in an increased risk of contamination, rather than providing better control over the pandemic.
It is for this reason that your decision-making process should always be consistent with your legal and ethical boundaries.

Can you STEER your data value chain?
It is not enough to excel in one of the key components of the data-centric way of working; they all need to fit together. If the leader of a country wants a factual basis for lifting national lockdown restrictions, the insight into the COVID-19 reproduction rate must be trustworthy. This can only be the case if all elements in the data value chain can contribute.

During the COVID-19 pandemic, the US Centers for Disease Control and Prevention (CDC) decided to hire a chief data officer (CDO). Did they realize that applying a data-centric way of working requires the right level of coordination to be truly efficient and effective? One of the responsibilities listed is to facilitate a data governance and standards structure through the implementation and oversight of an appropriate governance program; if the CDC sees the importance of doing this, shouldn’t your business?
Focusing on your challenge or your goal is pivotal for orchestrating your data initiatives. Appointing a CDO is a good step but it will not be enough if your organization is not acting in a data-centric way.

Data enables effective, fact-based decisions
The COVID-19 pandemic demonstrates very clearly that it is essential for leaders to be able to make fact-based decisions, but the issues faced by government leaders are encountered daily by their counterparts in business.
Setting up a data-centric organization is not just a matter of capturing as much data as possible and hiring a data scientist.

The real challenge is being able to define what insight enables your decision-making, allowing you to face problems and drive towards your goals.
A compliant and sustainable data-centric organization can:
• Capture the relevant data
• Trust the data that is being captured
• Turn the captured data into insights
• Act upon these insights
• Ensure that it is ethically and morally acceptable to make decisions on the data
• And properly steer all aspects of the data value chain.

Such organizations don’t rely on gut instinct and rolls of the dice.

Jan Henderyckx

Jan Henderyckx is a leading consultant, speaker, and author with more than 25 years of experience in information architecture and databases. He is a partner at BearingPoint and has given presentations, run workshops, and served as moderator at many international conferences and user group meetings around the world. As a consultant to numerous companies, ranging from Fortune 500 firms to small businesses, Jan has seen the inner workings of a wide range of organizations. This experience, combined with his expertise in information architecture, comes in handy if you are struggling to fit information architecture into your organization.

In 2009, together with IBM, Jan set up the Belgian chapter of the IBM Data Governance Council. He is the driving force behind this initiative, which focuses on creating a platform where people involved in the topic can share experiences and best practices.

Jan is a speaker at the annual Datawarehousing & Business Intelligence Summit conference.
