The social contract for data
Data are a powerful weapon for fighting poverty in low- and middle-income countries. They can tell governments which programs and policies are working, and which aren’t. They can identify who is lacking access to crucial services such as water, transport, education and internet connections. And they can even be used as an input by businesses to create new sources of economic value.
But data themselves won’t help lift people out of poverty. It’s the people using them who generate the insights that can turn into action to improve development outcomes.
There are three main groups of people using data who can support development. Governments use data about populations to identify needs and improve programs and policies to address them. Civil society, academia, and individuals use data to monitor and analyze the effects of government policies and to access services. And the private sector uses data to grow its businesses, boosting wider economic growth.
The data created through these pathways have enormous potential. They can be used and reused multiple times, and in many more ways than originally intended, without being depleted. When multiple data sources are combined, they become far more powerful than they were alone. This is particularly true when data from different pathways are combined to offer new insights, as when private firms use government census data to target services, or when governments access traffic data from transportation companies to inform road planning.
But this isn’t happening enough. Data are currently not reaching their full potential for use in development. The World Development Report (WDR) 2021 explains this problem and introduces an important part of the solution: a social contract for data.
A social contract isn’t a new idea. Our existing laws are one type of social contract: an agreement between people in a society to follow a set of rules. And there are already laws that govern data creation and protection, such as the EU’s GDPR, rooted in the US Fair Information Practice Principles developed in the 1970s.
But the social contract for data set out in the WDR 2021 goes further. In order to maximize the potential of data, we need to do more with them. Different data-creating and data-using parties need to collaborate, and have a way to do so safely. At the same time, the opportunities that data create should be accessible to all people in all countries, and not just confined to a few.
Gathering personal data from individuals also requires their cooperation, so we need to build a trust environment. The more people trust the system, the more likely they are to share personal information, and the more useful the data become.
The social contract enables safer collaboration, so that we can get more value out of data than we do at the moment. With better collaboration between data users, we can better help the poorest communities and advance development goals.
The three elements of a social contract for data: Value, Equity and Trust
Value
Economic and social value comes from sharing, reusing, and combining data sources to generate greater insight
Equity
Data capture, infrastructure, and trade need to include poorer communities and countries equitably
Trust
Personal data and data infrastructure must be protected from misuse to avoid discrimination and cybercrime
Value
The social value of data
Data don’t have intrinsic value. Their value comes from the insight they can provide. When data production and analysis focus on insights that can help the most vulnerable, they advance development goals.
For example, José lives in Chile and suffers from diabetes. Local public health services don’t have the capacity to focus on monitoring and preventing these kinds of long-term illnesses, so he struggled to manage the condition and relied on his family for support. The telehealth service Accuhealth now gives José access to remote health monitoring through medical sensors and predictive data analysis based on his specific profile. This is more cost efficient than treating serious complications, and improves outcomes for patients (Guardian, World Bank).
The economic value of data
Data also have an economic value. Many companies have digitized their operations, and use data as an input to enhance decision making. The innovations created by this explosion of data can boost company profitability, which ultimately contributes to wider economic growth.
Many businesses also use data at the core of their services, which can address imbalances in accessing services such as health care and financial products.
The income of farmers in India is traditionally at the mercy of the rain. Crop harvests can be variable and unpredictable year to year. Now farmers can use a data-driven platform that combines satellite data, artificial intelligence, and machine learning to estimate future crop yields. They can then use this information as evidence of their profitability, and share it with financial institutions to access loans.
The synergy of sharing
The real power of data for development comes from creative ways to combine and reuse sources. Overlaying data sources from the private sector, public sector, civil society, and academia can improve granularity, timeliness, and coverage of datasets and enhance decision making.
People living in Longido in Tanzania suffer from some of the highest poverty levels in the country. But the typical method, which maps a sample of household surveys to census data, reports poverty at less than 50% of the true rate (Belghith et al. 2019 / World Bank Map O.2 in report). When these survey data are combined with satellite data, the resolution increases, splitting Tanzania into 169 districts rather than 20 and showing the true extent of poverty in the town of Longido.
The plan starts with data
For the social contract to work, there needs to be an appetite for data. They need to be seen as a key foundation of decision making. Data are being collected 24 hours a day, but many of them are used only once and then stored (or lost) without being reused. We also need to shift toward making open access and data reuse the default. Safety is paramount, but withholding data is not the best way to protect them.
Speaking the same language
For data to be shareable across the world, they need to use a universal language. Common standards for definitions and classifications can help smooth the process of overlaying multiple datasets. For example, the System of National Accounts is an internationally agreed standard for measuring size and growth of country economies.
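As a minimal sketch of why shared classifications matter, the Python below overlays two small tables that happen to use the same region codes. Everything in it is an assumption made for illustration: the `overlay` function, the region codes, the indicator names, and the values are invented placeholders, not figures or methods from the report.

```python
def overlay(*datasets):
    """Merge datasets that are keyed by the same classification code.

    Each dataset is a dict mapping a code to a dict of indicators.
    Only codes present in every dataset are kept (an inner join).
    """
    shared_codes = set(datasets[0])
    for dataset in datasets[1:]:
        shared_codes &= set(dataset)

    merged = {}
    for code in sorted(shared_codes):
        row = {}
        for dataset in datasets:
            row.update(dataset[code])
        merged[code] = row
    return merged


# Hypothetical tables: placeholder region codes and invented values.
survey = {"R01": {"poverty_rate": 0.31}, "R02": {"poverty_rate": 0.12}}
satellite = {"R01": {"night_lights": 4.2}, "R03": {"night_lights": 9.8}}

combined = overlay(survey, satellite)
print(combined)
```

Only region R01 appears in both tables, so it is the only row that survives the overlay. If the two sources used different coding schemes for the same places, none of the rows would match without a manual mapping step, which is the friction that common standards remove.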
International and regional organizations can aid countries that lack the technical capacity for data production, and can help to harmonize data collection and systemization. The 50x2030 Initiative to Close the Agricultural Data Gap is an example of a multi-partner initiative that seeks to transform agricultural data systems across 50 developing countries by 2030. This regional approach creates a transferable output, while creating national ownership.
Equity
Accurate representation
Representing an entire population in a dataset is very difficult, and it’s often those who are disadvantaged that might be missed out. Lack of digital connection, official ID, formal address, or phone line are just some of the reasons people might be invisible to certain types of data capture.
Gender equality is another issue when it comes to gaps in data. Fifty-four of the Sustainable Development Goals indicators are gender specific, and only 10 of these have data widely available.
These gaps create challenges for policy makers, for example when estimating the population of a slum or distributing emergency funds during the COVID-19 pandemic. Marginalized people need better representation, and data systems must be designed with these challenges in mind.
Equal benefit
As part of the social contract, program and policy interventions can help ensure that the poor benefit equitably from data’s potential.
Millicent lives in Kibera, a slum that one-quarter of a million people call home in Nairobi, Kenya. The slum has long been studied by external organizations, inducing a feeling of survey fatigue in the residents, without giving them anything back from the data collected.
Map Kibera aimed to change that by completing an open-source, community-driven mapping project that harnesses local wisdom. Millicent was one of the local residents who mapped out the area, recording points of interest based on what was important to her. The resulting map is available not just open source online, but also as a paper version in Kibera for residents to use and add to (Map Kibera).
The infrastructure imbalance
As the quantity of digital data has exploded, the need for strong infrastructure has expanded too. Often there is an imbalance between rural and urban areas within countries, and between richer and poorer countries.
Inequality between people – Access to broadband networks
Sustainable Development Goal 9.c calls for “universal and affordable access to the internet in least developed countries by 2020.” But there’s still a long way to go. The UN Broadband Commission targets for 2025 are 75% access to broadband worldwide, 65% in developing economies, and 35% in the least developed economies.
When it comes to mobile internet, developments in technology make equal access a moving target: in 2018, 92% of the world’s population lived within range of a 3G signal, but only 80% lived within range of 4G, and far fewer will be able to access the new 5G technology.
Policies can proactively level the playing field by incentivizing private sector investment in broadband and mobile internet in places that are least connected.
Inequality between countries – Data storage and transfer
Data analysis isn’t just about internet access. Low- and middle-income countries often lack the national-level infrastructure required to take part in the data-driven economy: internet exchange points to exchange data, colocation centers to store them, and cloud platforms to process them.
When these facilities aren’t available at a country level, they can be shared at a regional level, as long as regulations are harmonized and there are adequate high-speed connections between neighboring countries.
Leveling the playing field
Economies of scale mean that data-driven businesses grow particularly fast, leading to concentration of market power. This makes it harder for platform companies from lower-income economies to reach critical mass and compete with global players.
At the same time, the virtual character of data-driven enterprises makes it difficult for governments to capture normal tax revenues from sales and profits that foreign companies make in their domestic markets. This affects how the economic value from these businesses is distributed across countries.
Economic policies need to address competition and tax issues to help lower-income countries gain more equitably from data-driven businesses.
Trust
Trust enhances participation
The social contract is all about people. If people do not trust in the social contract, they won’t participate. This can be a very important issue when dealing with highly sensitive data, which can also be the most valuable to policy makers.
Unfortunately, survivors of violence against women and girls (VAWG) are at greater risk of violence after reporting it – a fact that makes data collection on VAWG a very delicate process. Survivors need to be given protection and treated with utmost privacy and confidentiality when telling their stories, because without them we can’t fully see the extent of the problem.
This is why trust is a key element of the social contract for data. When we build a trust environment between data collectors and data subjects, this fosters greater participation, which provides greater visibility to all members of a population. And this increased representation enhances the power of the data system to help everyone.
Safeguards against personal harm
Examples of deliberate misuse of data to cause harm include discrimination, and politically or commercially motivated surveillance. The social contract requires safeguards to protect people and countries from these types of harm.
Personal data protection is grounded in a human-rights framework. Protection is usually sought by requiring consent before data are collected, but this is not enough. It would take the average person 76 working days per year to read all the privacy policies they are presented with online (Madrigal, 2012). This means that users cannot truly give informed consent.
Safeguards against nonpersonal harm
We also need to protect systems and infrastructure from cyber threats. One recent study estimated the annual cost of cybercrime in the US as between US$57 billion and US$110 billion (Council of Economic Advisers, 2018).
Nonpersonal data are protected through intellectual property (IP) law. However, many low- and middle-income countries do not have sufficient IP laws in place to protect data adequately.
Enablers increase the value of data safely
Trust isn’t just about preventing harm that comes from misuse. It’s also about enabling greater use of data, to gain the maximum value from them safely. A key part of this value comes from data sharing between stakeholders. To allow data to be shared securely, they must be sufficiently and consistently anonymized.
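One way to make "sufficiently anonymized" testable before a dataset is shared is a k-anonymity check: every combination of indirectly identifying attributes (quasi-identifiers) must be shared by at least k records. The report does not name this technique, so treat the Python below as an illustrative sketch under that assumption. The function names, field names, and records are all hypothetical.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values (0 if there are no records)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical records: invented age bands, districts, and diagnoses.
records = [
    {"age_band": "20-29", "district": "D1", "diagnosis": "A"},
    {"age_band": "20-29", "district": "D1", "diagnosis": "B"},
    {"age_band": "30-39", "district": "D2", "diagnosis": "A"},
]

print(is_k_anonymous(records, ["age_band", "district"], k=2))
```

Here the single record in district D2 forms a group of one, so the check fails for k = 2. The data holder would need to coarsen or suppress that record before sharing. Applying the same check with the same threshold to every release is one way to keep anonymization consistent, although k-anonymity alone does not protect against every re-identification risk.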
Often, data gathered by the private sector can have huge value for the public sector, such as mobile phone records used for contact tracing to control the spread of COVID-19. Within a social contract for data, the government can incentivize private companies to share their data, while requiring them to protect the end user’s sensitive information.
Governance is key to making the social contract work
How to build a social contract
Data governance is how we make the ideas behind the social contract tangible. The social contract for data isn’t simply a list of rules, but a way of approaching data that makes sure they benefit those who need them most. It is fundamentally built on people working toward a common goal.
However, bringing it to life starts with laws and regulations. This is not only to protect users from the potential harms of data, but also to inspire greater good to come out of them, by enhancing transparency, regulating market competition, and mandating collaboration.
A national social contract – The integrated national data system
It is important to start nationally. Countries are responsible for balancing out inequities between their citizens, protecting their rights through the trust environment, and seeking to gain the most value out of available resources by fostering collaboration between data users. The WDR 2021 sets out a detailed framework for how governments can work together with institutions, academia, the private sector, civil society, and international organizations in the Integrated National Data System.
An international social contract
International data governance is also important to ensure that lower- and higher-income countries benefit equitably from the global data economy. Global policies can help make data more shareable by harmonizing technical standards, both digitally and across physical infrastructure. International laws can protect people, countries, and systems from cybersecurity threats, while enhancing cross-border cooperation. And international organizations play a large role in making data collection and analysis happen.
With the right governance framework in place, we can realize the full potential of data to improve equality and help the world’s poorest people, while keeping everyone safe.