Exploring the Potential of Data Stewardship in the Migration Space

July 08, 2022
Astha Kapoor
Suha Mohamed
Shefali Girish
Photo Credit: Kentoh / Shutterstock.com

Editor's Note: This paper is part of The Dialogue on Tech and Migration (DoT.Mig) series. For related pieces, see "The EU AI Act Proposal: Europe’s Opportunity to Safeguard the Rights of People on the Move" and "Digital Wallets and Migration Policy: A Critical Intersection."

Key Takeaways

Available and accessible data on migrant and refugee populations is not only critical for providing timely support; it can also help policymakers, civil society organizations and community groups design refugee-sensitive policies and interventions going forward. This data can come from a variety of sources and can be characterised as either personal (relating to individual or community attributes, e.g. name, location, address, ID) or non-personal (de-identified or anonymised personal data, or data that does not relate to an individual). It is captured from traditional sources (e.g. official statistics, surveys) and non-traditional sources (e.g. social media data, call detail records, satellite imagery) and has been leveraged to improve international migration governance.

Most data on refugee and migrant groups is collected passively, by virtue of their use of social platforms and through the digital traces left behind while using phones and other technologies. Another set is collected through more official channels by governments, where refugees or migrants provide their data at a point of registration for protection, asylum or a visa. At either of these junctures, migrants have little to no visibility into how corporations or states use this data. Of particular concern is how this data can be used covertly by governments to paint refugees and migrants as ‘security threats’ or to build algorithmic systems that may render autonomous decisions on asylum applications and offer applicants no recourse or mechanisms for accountability. Therefore, while big data has been lauded as a silver bullet, leveraging even de-identified digital traces of migrant and refugee populations presents a host of ethical and privacy concerns. Without governance mechanisms, it also opens up possibilities for misuse of this data by both state and non-state actors, with consequent threats to the safety, health and wellbeing of migrant populations.

In this context, data can be regarded as a tool of both exploitation and empowerment. It is increasingly deployed to surveil and control migrant and refugee populations globally, but new models of data stewardship can help empower communities by providing the timely advice, guidance and support necessary to navigate the data economy, use data meaningfully to create value, and prevent harm. Stewards play the role of intermediaries who are equipped to spread awareness and safeguard digital rights but who, beyond this, can create opportunities for migrant and refugee communities to mobilise and gain collective negotiating power around data. The following text unpacks how stewardship can be considered as a pathway for more responsible data collection and governance, in ways that are empowering and enhance the agency of migrants and refugees.

1. What does data stewardship mean exactly?

Data stewardship is a broad paradigm that explores potential structures (legal, technical and social) that can unlock data for societal value while providing individuals and communities with greater control, transparency and the ability to make informed decisions around their data. Our current data economy is characterised by many inequities that resemble those of our offline world. Key among these is that data is siloed by companies and governments. This status quo offers individuals and communities little insight into how they can derive value from data and, beyond this, leaves them with little to no bargaining power in the face of data harms.

Stewards can be structured with varying levels of community participation and around different types of data. For instance, data cooperatives enable members to deliberate and decide on questions of collection, access and use of data, often through direct voting, while data trusts empower a board of trustees, who have a fiduciary responsibility to represent the best interests of the community, to make data decisions. Data collaboratives involve a range of stakeholders who co-define rules for data, and personal data stores provide individuals with granular control through consent-dashboard-like interfaces. All these structures serve as intermediaries in which refugee and migrant communities can repose their trust to safeguard their data rights.

2. What are the data-related challenges in the current migration data landscape?

As mentioned above, there is also a lack of data sources on refugees and migrants, which makes it challenging to develop adequate policies. For instance, UNHCR has age information for only 56% of the refugee population. This makes the risks and vulnerabilities that refugees and migrants face invisible to decision-makers, who are in turn unable to design responsive policies. Incomplete information affects not only offline decisions but also, increasingly, AI-driven ones, which are linked to existing data sets and have the potential to cause serious harm if the quantity and quality of data is suspect. For instance, 7,000 students were deported from the UK because of a faulty English testing program. While not all data-driven decision-making is problematic, there is a need to address the lack of data, which can result in ad hoc decision-making and poorly timed responses that lead to exclusion and increased vulnerabilities for refugees and migrants.

Simultaneously, refugees and migrants are often subject to data-extractive relationships with governments and private companies, which knowingly or unknowingly follow a techno-solutionist approach and collect significant personal, biometric and mobility data about them. Along with the lack of information, there is an inherent power imbalance between migrants and refugees on one side and the authorities on the other. As a result, people are unable to question or resist this demand for data, even though they may realise that not all the data collected is required or helpful. Of course, states are empowered to collect data on non-citizens, e.g. to decide on visa applications or at border checks, but they also have to respect basic human rights and adopt a rights-based approach to data, especially since the right to life is intrinsically connected to the right to privacy in many jurisdictions.

This is especially important to understand as there are also more insidious deployments of technology through firms like Palantir, which mine data without explicit consent and share it with immigration authorities. In all of this, refugees and migrants have little to no say on what data is being collected and how it is being used. In situations like these, stewards could play two roles to start rebalancing power. First, they could help identify data harms and explain them to refugee and migrant communities. Second, stewards who already possess data on migrants, particularly civil society organizations, can act as bulwarks against extractive data practices and adopt more representative or delegated consent processes when this data has to be shared. Where active decision-making around data usage and sharing may be burdensome for refugee or migrant communities to take part in, stewards that are structured to have a duty of care can represent these interests. Through pre-negotiated governance mechanisms and consultations with migrant groups, the steward will then be best placed to exercise the responsibility to share data conditionally, for a specified purpose and with technical safeguards in place.

The data landscape for refugees and migrants is thus a double-edged sword, with poor data resources on one side and overcollection and surveillance on the other. This makes the landscape fairly complex, with harms at both the individual and collective level. There is a need to reimagine refugee and migrant data rights such that individuals and communities are protected and empowered.

How does data stewardship solve these challenges?

Whether the problem is a lack of data or its abundance, it is clear that migrants and refugees need to participate more in decision-making with regard to their data rights. Data stewardship offers mechanisms for bottom-up, community-driven approaches to data governance that can bring migrant and refugee communities together to deliberate on data-related decisions, prioritise their needs and concerns, and negotiate with authorities as required. Implementing stewardship can take a variety of forms. It requires both the support of top-down regulation or legislation and grassroots engagement with data-holding organizations that work closely with refugees and migrants, to explore the potential of instituting a range of mechanisms.

3. How do stewards ensure a participatory approach and enhance the agency of migrants? How do data stewards frame governance, legal and technical mechanisms to enhance community agency in their decision-making power?

Models of stewardship, when structured to be community-led, are participatory by design, which means engaging communities in decision-making on data collection, processing and sharing. They also enable migrants to collectivise their data journeys and negotiate with technology platforms and, going forward, perhaps with governments as necessary. The negotiating power of the data steward is anchored in being a representative of the community of refugees and migrants, and in being legally and otherwise empowered to represent community interests. Levels of participation will vary across models, depending on the needs of the community and how much its members choose to engage. For instance, data cooperatives may have members vote on every decision, whereas data trusts empower trustees, providing members with the option of delegation.

The governance of stewardship models, whether cooperatives, trusts or collaboratives, is anchored in the idea of fiduciary responsibility. This responsibility is enforced through the organisation type or through contracts. For instance, if an entity is registered as a cooperative, it is codified to have a certain responsibility to its members. Similarly, trust law gives trustees a fiduciary responsibility to the trust's beneficiaries. In other cases, the terms of responsibility can be defined through contracts. These mechanisms ensure trust in the process of stewardship and incentivise people to participate by demonstrating the value of both the data and the process.

There is a broader question of incentives for all stakeholders. Why would migrants and refugees want to sign on to yet another intermediary, and how would these institutions garner the trust of the community? Why would governments be willing to negotiate with data stewards? Why would private sector companies that provide services to migrant and refugee populations engage with data stewards? These are evolving questions, but data stewards are already in use. One example is the data cooperative Driver's Seat, through which gig workers collect data on rides to enhance their own income and negotiate with app companies such as Uber, Lyft and Grab. Simultaneously, driver data is made available to municipalities to facilitate evidence-based decision-making. This model demonstrates ways to build incentives for all stakeholders and can be applied to the question of refugees and migrants as well. That said, it is important to acknowledge that this is an evolving space where political will, the policy environment and international agreements will have to align to create an enabling environment for data stewardship for refugees and migrants at scale.

Further, there is no one-size-fits-all approach when it comes to data stewardship for refugees and migrants. There may be circumstances, such as the ongoing Ukraine crisis, where smart deployment of data has been shown to be helpful to refugees: data wallets are being used to store IDs and other sensitive information, and granular data being made available to the government is enabling action. This is a ripe case for data stewardship, but these mechanisms need to be in place before a crisis begins. Elsewhere, the United Nations refugee agency shared improperly collected data on Rohingya refugees with the Bangladesh government, which in turn shared it with Myanmar to verify people for possible repatriation. Mechanisms of data stewardship could prevent this kind of misuse and the resultant harm. Both situations described above need models of stewardship: in one case to facilitate the movement of refugees and allow governments to better understand the ongoing situation and respond accordingly, in the other to safeguard the privacy of refugees and migrants and ensure that their data is not revealed without their consent.

4. What are the pilot projects or use cases in this context?

There are several use cases for data cooperatives for migrants and refugees. For instance, in the case of immigrant health, data on access to health services is limited, siloed and under-utilised. A health data cooperative may allow migrants to contribute, store and manage their health-related information, and to make it available for research, for access to commercial goods and to government bodies. Similar use cases may be seen in the context of employment and access to credit. The ILO already recommends a cooperative structure for refugees and migrants to improve access to markets and livelihoods, negotiate for housing and so on, and it is natural to explore the application of these structures to the data economy.

These models may be most relevant to explore in the context of internal migrant communities or long-term refugee groups. In India, for instance, there are periodic waves of migration every 6-8 months, in which workers travel from rural parts of the country to the city for economic opportunities. In March 2020, at the peak of the Covid-19 pandemic, state-wise lockdowns and nationwide transport restrictions imposed a significant burden on this community. The lack of accessible data from the public sector exacerbated the situation and further hindered the interventions required to plan transport services and provide basic relief. In response to this gap, the civil society organisation Jan Sahas developed the Migrants Resilience Collaborative (MRC) in collaboration with other nonprofit, philanthropic and private sector actors. Jan Sahas plays the role of a steward in this instance and makes aggregated data from over 10 million migrant families available for timely interventions. Decision-making at the collaborative takes place through a steering committee model and includes representation from former migrant workers, along with other philanthropic partners and research organizations. The MRC model demonstrates how stewardship may be conceived and where it may be most practical to implement.

In terms of models, given the legal and jurisdictional complexities of the refugee and migrant landscape, as well as new data protection regulations, models of data stewardship are still evolving. There are efforts such as Big Data For Migration, which seeks to accelerate the ethical use of data to inform migration policies and programs. It is focused on plugging data gaps on migrant and refugee issues but not necessarily on participative data governance, which is a possible next step.

5. What are the main debates about data stewardship relevant to the migration space?

As mentioned above, data stewardship is a new concept in general and especially novel in its application to migrants and refugees. Therefore, a few debates need to be resolved in this context. The first concerns the form of data stewardship relevant for migrants and refugees, and the specific use cases for each model. For example, are collaboratives more effective in unlocking the public value of data, whereas cooperatives can be deployed to help migrants negotiate better at the collective level, and personal data stores to safeguard individual data rights? There are also questions about demonstrating value to migrants and refugees such that they are willing to engage with data stewards, and about ensuring that managing and safeguarding data rights doesn’t become another point of hassle and confusion for what is already an overwhelmed and precarious community of people. Another big question, which plagues data stewardship across sectors, is that of financial sustainability and the ways in which it can be achieved. It is clear that data stewards should be structured to prevent data harms, but are there models for financial sustainability which can be deployed to ensure stewards can remain true to their purpose?

More and more states use big data analysis to make sense of, control and manage the movement of people. However, the design, testing and deployment of AI tools for migration often view people in isolation from their rights, an approach that can insert dangerous bias into the use of AI and harm migrants and refugees. Most countries are still evolving their AI policies, so AI is currently used without adequate oversight or regulation. In this context, the use of AI-driven technologies requires more collaboration with refugees and migrants to build ethical models that safeguard their interests. This will help move the conversation beyond mere legal compliance and towards the well-being of the community.

Models of data stewardship can facilitate this, but they need to be tested against the particular issues facing migrants and refugees and must answer the complex questions of incentives, value, sustainability and scale.

6. What are the insights for policy stakeholders to advance the discussion on data stewards for refugees and migrants?

To institute data stewardship, multiple stakeholders, such as international cooperation agencies, policymakers and civil society organizations, need to come together.

Inter-agency collaboratives need to consider how data rights can be mainstreamed in the conversation on refugees and migrants. Efforts such as the Global Migration Group are invested in ‘data and research’, but these efforts need to move from just insights derived from data to the data rights of migrants. It is imperative to advocate for a special category of migrant data rights to be inserted into refugee policy and implementation documents. Alexander Beck, the UNHCR’s Senior Data Protection Officer, has also highlighted that “Data protection is part and parcel of refugee protection.” The Office of the United Nations High Commissioner for Human Rights (OHCHR) has also issued guidelines for a human rights approach to data. These efforts need to be amplified and brought to the fore in every discussion on migrant and refugee rights.

With policymakers, the task of inserting the data rights of migrants is more expansive. Data is not a vertical issue but something that runs across asylum seeking, social media, healthcare and employment, so it needs to be embedded in the different agendas pertaining to refugee and migrant issues. To this end, policymakers must develop policies and regulations that enable data stewardship, covering areas such as data portability, interoperability and the ability to delegate consent. They should also recognize models of stewardship such as cooperatives, trusts and collaboratives, so that these can be implemented and regulated.

Civil society organizations working with migrants and refugees should also begin to consider ways in which they can build the capacities of the community to evolve models of data stewardship that can be stacked with other rights-based work. For this, civil society organizations will have to build a certain data consciousness, engage with data experts to understand points of collection and use and the related harms and opportunities, and communicate this to migrant and refugee communities. Thereafter, it is critical to consult with communities and co-design mechanisms of data stewardship that can help achieve their data goals (such as minimisation). However, there are outstanding challenges of funding, stakeholder buy-in and incentives, which need to be resolved through a multistakeholder approach that can embed data stewardship across organizations working on refugee and migrant issues.

In summary, data stewardship is evolving, and there is a need to understand its use and implications in the migration and refugee space. This can happen only through investment in pilots, policies and partnerships.


About DoT.Mig

The DoT.Mig In Brief paper series is part of The Dialogue on Tech and Migration (DoT.Mig).

DoT.Mig provides a learning platform to connect the dots between digital technologies and their use in, and impact on, migration policy, as well as to connect relevant stakeholders. The DoT.Mig In Brief paper series highlights debates and concepts relevant to navigating the emerging field of Tech and Migration.

DoT.Mig is a forum by the Migration Strategy Group on International Cooperation and Development (MSG). The MSG is an initiative by the German Marshall Fund of the United States, the Bertelsmann Foundation, and the Robert Bosch Stiftung.

The views expressed in this publication are the views of the authors alone and do not necessarily reflect those of the partner institutions.