#IDETECT: how technology and collaboration between innovators can help ensure “no one is left behind”.
As part of its innovation efforts, IDMC has launched #IDETECT, the ‘Internal Displacement Event Tagging Extraction and Clustering Tool’ challenge on the UN Unite Ideas platform.
We welcome data scientists to join the challenge to help IDMC paint a more comprehensive picture of internal displacement.
The current picture
Over the past 20 years, we have reported on displacement situations in 169 countries and territories around the world. In 2015 alone, IDMC monitored conflict-induced displacement in 52 countries and one disputed territory and obtained data on approximately 700 new incidents of disaster-related displacement in 127 countries. Even so, we still do not cover all situations of displacement, or as many as we aim to cover. At present, the incomplete picture looks like this:
As part of our 2015-2020 strategy, we aim to increase our coverage and improve the accuracy of our estimates by leveraging innovative technologies, tools and working methods, in line with our mandate and as requested by the UN General Assembly.
The Unite Ideas Challenge - collaboration between innovators
On 31 January, we launched the #IDETECT challenge on the UN Unite Ideas platform. It asks participants to analyse “big data” in order to detect disaster and conflict-related displacement reported in the news and on social media. After mining one or more very large datasets, such as news collected by The GDELT Project and the European Media Monitor or posts on social media platforms, IDMC will use Natural Language Processing (NLP) to filter and extract displacement-related data for subsequent human validation and supervised machine learning. These techniques have already proven effective in addressing similar challenges, such as disease detection and surveillance.
The Unite Ideas challenge is not only a way to explore how innovative technologies can improve data collection, but also a significant shift in the way we approach projects. Unite Ideas provides a platform for collaboration between UN agencies, international organisations, academia, civil society and innovators with the goal of “harnessing the power of data analytics and visualization to uncover new knowledge”. By submitting #IDETECT to the platform, we will crowdsource the solution and involve data scientists and innovators worldwide in the process. This will allow us to evaluate and compare several potential tools and solutions rather than just one or two. In addition, the open source code will be accessible to everyone as a public good.
The figure above shows a graphical representation of the challenge we are proposing to the data science community.
The GDELT Project monitors news media in over 100 languages. Its database provides the input for the #IDETECT NLP tool. #IDETECT will be able to analyse any source of human-written text and will enable IDMC to:
- filter out documents that do not report on internal displacement or human mobility
- tag documents based on the themes IDMC monitors as contained in the Global Internal Displacement Database, which will be used as a training dataset
- extract displacement facts reported in the documents
- visualise, compare and analyse the extracted facts in a simple way
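To make the first three steps in the list above more concrete, here is a minimal Python sketch. It is not the #IDETECT tool or any submitted solution: it uses plain keyword matching and one regular expression instead of trained NLP models, and the keyword lists, the pattern and every function name are assumptions made up for this illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical keyword lists, chosen only for illustration.
DISPLACEMENT_TERMS = ("displaced", "evacuated", "fled", "forced to leave")
THEME_TERMS = {
    "disaster": ("flood", "earthquake", "cyclone", "storm", "wildfire"),
    "conflict": ("fighting", "clashes", "armed", "violence", "attack"),
}

# A number, a unit of people, and a displacement verb, e.g. "12,000 people were displaced".
FACT_PATTERN = re.compile(
    r"(?P<count>\d[\d,]*)\s+"
    r"(?P<unit>people|persons|families|households|residents)\s+"
    r"(?:were\s+|have\s+been\s+)?"
    r"(?P<term>displaced|evacuated|fled)",
    re.IGNORECASE,
)


@dataclass
class Fact:
    count: int
    unit: str
    term: str


def is_relevant(text: str) -> bool:
    """Step 1: keep only documents that mention a displacement term."""
    lowered = text.lower()
    return any(term in lowered for term in DISPLACEMENT_TERMS)


def tag_themes(text: str) -> List[str]:
    """Step 2: tag a document with every theme whose keywords it contains."""
    lowered = text.lower()
    return sorted(
        theme
        for theme, terms in THEME_TERMS.items()
        if any(term in lowered for term in terms)
    )


def extract_facts(text: str) -> List[Fact]:
    """Step 3: pull out 'N people displaced' style statements."""
    return [
        Fact(
            count=int(match.group("count").replace(",", "")),
            unit=match.group("unit").lower(),
            term=match.group("term").lower(),
        )
        for match in FACT_PATTERN.finditer(text)
    ]


def process(documents: List[str]) -> List[Dict[str, object]]:
    """Run filter, tag and extract over a list of documents."""
    results = []
    for doc in documents:
        if not is_relevant(doc):
            continue  # filtered out as non-relevant
        results.append(
            {"text": doc, "themes": tag_themes(doc), "facts": extract_facts(doc)}
        )
    return results
```

Substring matching of this kind is crude (it would, for instance, match "fled" inside "baffled"), which is one reason the challenge calls for NLP and supervised machine learning with human validation rather than fixed keyword rules.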
Once in use, #IDETECT will allow us to cast a much broader net and to detect thousands of new displacements each year, information that we will make available to the international community. It will also improve the accuracy and reliability of our figures as we will be able to compare, triangulate and validate figures from different sources such as local and international media, UN agencies and civil society.
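As a rough illustration of what comparing and triangulating figures could involve, the sketch below takes the figures that several sources report for the same event, uses the median as a central estimate, and flags sources that deviate strongly from it for human review. The function name, the source labels and the 25% tolerance are assumptions made for this example, not a description of IDMC's validation method.

```python
from statistics import median
from typing import Dict, List, Tuple


def triangulate(reports: Dict[str, int], tolerance: float = 0.25) -> Tuple[float, List[str]]:
    """Return the median figure and the sources deviating from it by more than `tolerance`.

    `reports` maps a source label to the figure it reported for one event.
    `tolerance` is a relative deviation (0.25 means 25%); the default is arbitrary.
    """
    if not reports:
        raise ValueError("at least one report is required")
    central = median(reports.values())
    outliers = sorted(
        source
        for source, figure in reports.items()
        # A central estimate of zero gives no meaningful relative deviation, so nothing is flagged.
        if central and abs(figure - central) / central > tolerance
    )
    return central, outliers
```

Flagged sources are not necessarily wrong; they simply mark where a human analyst would want to look more closely.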
Innovation at IDMC
#IDETECT is only one part of a broader innovation effort we want to engage in over the coming years. In 2016, we began working on a new and improved version of our disaster-related displacement risk model. The new model is based on an analysis of hundreds of thousands of reported disaster events since 1970 across more than 90 countries, as well as simulated displacement projections for rarely occurring major hazards that must be accounted for but for which there is little or no existing empirical data.
In the future, IDMC will work with partners and use a range of techniques to detect displacement and estimate patterns from proxy indicators when no direct observational data is available. For example, by analysing satellite imagery, IDMC will estimate the scale of displacement based on the number of homes destroyed or the extent of land inundated by the construction of a dam. In other cases, IDMC may detect the scale, scope, patterns and duration of displacement based on analysis of anonymous mobile phone or (inter)national financial transaction data.
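One simple way to picture a proxy estimate of this kind is to multiply the number of destroyed homes counted in imagery by an average household size. The sketch below does only that; the function name and the household size used in the usage note are hypothetical, and a real estimate would need locally sourced household data and some treatment of uncertainty.

```python
def estimate_displaced(homes_destroyed: int, avg_household_size: float) -> int:
    """Estimate people displaced as destroyed homes times average household size.

    Both inputs are assumptions supplied by the analyst; this is a
    back-of-the-envelope proxy, not an observed figure.
    """
    if homes_destroyed < 0:
        raise ValueError("homes_destroyed must not be negative")
    if avg_household_size <= 0:
        raise ValueError("avg_household_size must be positive")
    return round(homes_destroyed * avg_household_size)
```

For example, `estimate_displaced(1200, 4.5)` gives 5400, where 4.5 is an invented household size used purely for illustration.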
#IDETECT is an important step towards enabling IDMC to exploit the full potential of new technologies. In today’s world many internally displaced people remain invisible: technology can help ensure that “no one is left behind”.
The #IDETECT challenge is open to the general public, private entities and academic organisations. Submission deadline: Friday, 28 April 2017.