Governing data
In the race to control COVID-19, the world is facing new demands for data: to track the spread of the virus, trace contacts, and develop vaccines. Health systems need partnerships and data sharing arrangements with different stakeholders. Governments need data to fight the virus, so they must cooperate with the private sector, nongovernmental groups, and multinational organizations. As more data are collected and travel across borders, they hold great promise for fighting the disease, but also carry more risks, including to privacy. All these challenges require new rules and governance. How are institutions organizing themselves to face these new demands?
What is the data governance ecosystem?
The COVID-19 pandemic has thrown into sharp relief the need for better institutions and data governance to manage the ever-increasing flow of data around the world. The role of data governance is twofold: first, to control risks by ensuring the security, integrity, and protection of data and systems; and second, to capture value by establishing rules and technical standards to enable data to be more effectively transferred, combined, and exchanged.
These layers of governance create trust in how data are produced, collected, processed, and used. Data governance goes beyond good data management. It also establishes norms and rules about rights, principles, and obligations around use of data.
These principles, strategies, policies, laws, regulations, and standards for governing data are developed by institutions and actors in the data ecosystem. This ecosystem extends beyond governments to nongovernmental actors, including civil society organizations (CSOs), the private sector, academia, and others who have a stake and therefore a role to play in how data should be governed.
Adopting a collaborative, coordinated, and mutually reinforcing multistakeholder approach to data governance helps bring the social contract to life. Public sector entities, nongovernment actors, and others can work together to develop the infrastructure, rules, and standards to get the most value out of data in an equitable, transparent, and accountable manner. In this way, institutions are a key driver of data for development.
For these objectives to be achieved, the institutions and actors participating in the data governance ecosystem must have the capacity, resources, and incentives to perform their roles and harness the value of data.
What do institutions do?
Institutions are primarily responsible for developing and implementing data governance frameworks. They can take many forms, including state and nonstate actors. What they look like varies widely between countries. Within the public sector, key players include National Statistical Offices (NSOs), departments, agencies, and units that make and implement policy and lead change; data protection authorities; cybersecurity agencies; and standard-setting bodies.
To effectively design and implement data governance frameworks, institutions need to undertake many functions. These can be arranged into four clusters: strategic planning; rule making and implementation; compliance; and learning and evidence. Which institutions carry out these functions varies widely across the world. The optimal setup depends on local conditions and organizational structures.
Strategic planning institutions
To maximize the use of data for development, a top-down strategic approach is required to embed the use of data in decision making. One of the main aspects of this strategic planning is developing a data governance strategy for the whole of government and nongovernment entities.
The other main functions of strategic planning institutions are developing plans for ethically making the most of data, and devising domestic institutional arrangements. This can include identifying government entities that need to be created or appointed, as well as key performance indicators to measure how well institutions are achieving results. In some cases, this takes the form of setting up a government data entity. This can be a separate agency or a unit embedded within the government.
Lower-income countries are less likely to have a data governance entity and are more likely to embed it in another department.
No low-income countries have a data governance entity, compared to 53 percent of high-income countries.
Percentage of countries with a data governance entity
Source: WDR 2021 team calculations based on the Data Governance System and Services (DGSS) data.
Note: A data governance entity is a dedicated institution in charge of data governance or data, including both separate agencies and units that are part of another institution. In progress/planned includes partial focus on data governance under Open Data initiative. Established means established based on legislation.
Other countries take a decentralized approach, with a network of ministries, departments, and agencies sharing responsibility for data governance, sometimes via a National Statistical Office.
In either case, it is essential to ensure high-level leadership of the agenda and capitalize where possible on a “data champion” who can steer reform efforts and promote a culture of better data use across the whole of government. Countries that are at the forefront of leveraging greater value from data through better data governance frequently have strong advocates of the value of data in positions of power.
Public sector reform efforts can be supported by international and regional organizations that seek to harmonize data flow across borders. The APEC Cross-Border Privacy Rules (CBPR) System is a government-backed data protection certification that implements the APEC Privacy Framework. The Framework and CBPR are designed to support regulatory convergence through cross-border rules and enforcement cooperation, while allowing countries to adopt national level data protection legislation.
Rule-making and implementation institutions
Safety and transparency are important parts of data governance, so rule-making institutions are needed to legislate and regulate the use of data. The laws and regulations they create act as both safeguards and enablers. Safeguards protect data rights, security, and the integrity of data throughout their life cycle. Enablers govern how data are managed, accessed, and shared to maximize their impact.
These institutions are also the ones that set standards to make data universally understandable. And finally, they can provide guidance and clarification for the participants who are required to follow their rules.
Within the public sector, rule-making institutions can be sector specific. For example, a ministry of digital economy may propose a data classification regulation, or a telecommunications or banking sector regulator may develop specific rules for the use of call data records (CDRs) or financial data, respectively. Nongovernmental organizations can also set standards for data interoperability and cybersecurity, for instance.
Institutions that enforce compliance
Once rules are in place, they need to be followed. Compliance institutions are responsible for enforcing rules; investigating complaints; and auditing, arbitrating, and remedying any breaches of these rules. In the data protection context, enforcement institutions are responsible for ensuring compliance.
They may take the form of a national data protection authority, though in countries that do not have the resources to create a new entity, existing institutions may undertake these functions. Lower-income countries are less likely to have an independent and active data protection authority, even where the national data protection law provides for its creation. This affects individuals’ ability to enforce their rights under the law. Even where such an authority exists, it is important to ensure alternative review and redress mechanisms through the courts.
Only 24 percent of low-income countries have established data protection authorities, compared to 81 percent of high-income countries.
Percentage of countries with a data protection authority
Source: WDR 2021 team calculations based on the Data Governance System and Services (DGSS) data.
Institutions that promote learning and evidence-based policy making
The final cluster of institution functions is evaluation and learning. There are two broad categories: backward-looking monitoring and evaluation, and forward-looking learning and risk management.
Institutions and actors focused on backward-looking monitoring and evaluation play a critical role in measuring whether the objectives of data governance strategies and policies have been achieved, and in using these insights to improve the quality of decision making and strategic planning in future cycles. These institutions promote learning and evidence-based policy making. They can be national agencies, such as the US Government Accountability Office (GAO), which audits activities of the US federal government.
The forward-looking learning and risk management function is critical in an area such as data governance because new issues and risks emerge so quickly. These entities (often CSOs and academic institutions) can play a crucial role in helping governments proactively address these issues before societal risks materialize, using tools such as anticipatory governance and horizon scanning.
Many independent organizations also identify and help fill gaps in policy making or public service delivery. For example, during the COVID-19 pandemic, Johns Hopkins University spotted and filled a gap in tracking case data, making its database available globally as a reliable evidence-based tool for policy makers tackling the virus.
What do institutions need to be effective?
All the different players in the data governance ecosystem need the right characteristics and resources to fulfil their functions effectively.
Technical capacity
Institutions often play dual roles in data governance: helping to develop and implement data governance frameworks, but also producing and consuming data. To see the potential value in data, we need to understand how to use and analyze them. Institutions therefore need to be sufficiently resourced and data literate.
Some roles will require specialized skills. For example, employees of a Computer Security Incident Response Team (CSIRT) will require technical competencies in cybersecurity, while officers within a data protection authority will often benefit from a combination of legal enforcement skills and an understanding of the technological processes that enable data transactions.
As users of data, civil servants and nongovernment actors require resources and training to allow them to use and analyze data effectively to harness their potential value. Beyond investing in the technical capability and data literacy of staff, institutions need to invest in appropriate digital infrastructure to enable the transition to a data-driven way of working.
Data literacy is particularly weak among government institutions in low-income and middle-income countries. There is often a cap on salaries in the public sector, which limits these institutions’ ability to compete with the private sector for talent. Often, local CSOs or private sector entities can reinforce the training and capacity building of civil servants through stand-alone workshops or ongoing engagement. Where local resources are lacking, international nonprofit organizations can help, such as through the GovLab Academy’s coaching programs, workshops, courses, and project clinics aimed at supporting better data use to solve public problems. In addition to improving data skills, nonprofit initiatives such as the Data Pop Alliance’s Open Algorithms (OPAL) pilot projects aim to support multistakeholder approaches to better data use by bringing together public sector, private sector, and civil society representatives.
A culture of innovation
Even when the most talented people have access to the best technical infrastructure and useful data, innovation can sometimes be prevented by a more intangible barrier: culture. For data to improve the lives of the poorest people, they need to be used in new and imaginative ways. Politics and siloed decision making can often prevent countries from extracting the maximum value of data for development. Data governance reforms must be paired with change management and collaborative leadership techniques for them to work.
On an individual level, staff can be motivated to innovate through bonuses, salaries, and the autonomy to make decisions at lower management levels. Shifts in data culture can be incentivized through hackathons and competitions. For example, Morocco’s Ministry of Economy, Finance and Administrative Reform awards an annual Emtiaz prize to support competition between public sector entities and service providers developing innovative e-services.
Autonomy
To be able to govern the data ecosystem, institutions need to be truly autonomous, both financially and legally. This is crucial for institutions to make the best decisions, free from political or commercial influence.
In addition to functional autonomy, some institutions require formal independence. This is particularly important for entities that play a role in enforcing laws and regulations, investigating complaints, or providing redress to claimants. Independence is critical in the public sector (for example, for a data protection authority or sectoral regulator). It is equally important for accountability functions within the private sector to be independent from decision makers in corporate suites (such as a data protection officer reporting directly to the Board of Directors).
How to make the data governance ecosystem work
For the ecosystem to function well, institutions must work together. And both individual institutions and the data governance ecosystem as a whole need to be seen as legitimate. There are several ways to accomplish this. Each of them strengthens the social contract between data users.
Build public trust
The public are part of the data governance ecosystem, both as data participants and by holding governments to account. Transparency helps establish public trust in the integrity of institutions, and opportunities for public scrutiny can build this trust further. For example, the UK Connected Health Cities project in Manchester convenes a citizens’ jury to hear expert evidence before approving an approach for the project.
Be inclusive
As the overseers of data governance, institutions are often responsible for ensuring that data users are inclusive. Marginalized groups tend to get excluded from traditional data collection methods. Involving local communities is one way to tackle this problem. For instance, in the Amazon Basin, a wide-ranging regional initiative in citizen science is pooling indigenous, local, and international knowledge and efforts to study and protect Amazon freshwater systems.
Take a collaborative, multistakeholder approach to decision making
With so many data users and participants involved, a multistakeholder approach needs to be built into the data management and governance systems.
In Tunisia, the government’s decision to adopt a collaborative leadership approach to drafting its latest open data decree was an important shift from its previously unsuccessful efforts that had resulted in siloed and fragmented initiatives and limited results. By convening more than 50 officials from across the Tunisian public administration and several CSOs, the government was able to gather diverse views on the best-fit options to include in the decree. This collaborative process was led by a unit in the Prime Minister’s Office, thereby endowing the effort with high-level support and ownership.
Data can be costly to create. This can often motivate hoarding to gain relative power. Without a culture of data creation and sharing, we will not unlock the full potential of data.
Sometimes the best way to share and reuse data is to send them through a data intermediary, which can clean and package them in a consistent way. Intermediaries can establish trust by facilitating secure data transfer between the government and other actors in a national system.
Data intermediaries can often be crucial in low-income and middle-income countries that have gaps in their data management frameworks or weak enforcement mechanisms. They can enhance the use of data for development by linking together otherwise siloed datasets and data users. For instance, the nongovernmental organization DataEthics.eu, a collaborative effort across academia and civil society, has developed a series of data ethics principles designed using a European legal and value-based framework for voluntary adaptation and use by European Union (EU) data providers, data intermediaries, and data users.
Coordinate between stakeholders
With so many institutional functions being undertaken by different stakeholders, coordination is essential to make the network function smoothly.
Brazil’s Central Data Governance Committee, established in 2019, is tasked with steering Brazil’s transition to a data-driven public sector by promoting data sharing among federal agencies and integrating citizens’ information in a single platform (the Citizen Base Register). The committee was created as a separate entity by presidential decree to ensure high-level collaboration and coordination of data governance activities.
Coordination can also yield huge direct benefits for development by avoiding duplication and facilitating secure data sharing. Different ministries and government agencies collect, manage, and use a wealth of data sources, including tax returns, results from social programs, research, fuel consumption statistics, health data, immigration flows, geospatial maps, land management, crop inventories, and business program activities. Bringing these data sources together increases their potential to help those who need them most by improving policy decisions and the efficiency of service delivery. To keep these efficiency gains from creating risks for users, appropriate safeguards must be in place to ensure the security and integrity of data and to protect personal data and the associated rights of data subjects.