Foresight: using machine learning to predict refugee displacement

Description

The Foresight model provides strategic displacement forecasts and scenario analysis. It aims to accurately forecast the total number of forced displacements from a given country 1-3 years into the future.

Context

Foresight was established by the Danish Refugee Council (DRC) and IBM to significantly scale up predictive analytics capacity in the humanitarian sector. The aim was to enhance the accuracy of scenario-building and forward-looking analysis to ensure better strategic planning in the humanitarian sector and thus improved outcomes for vulnerable populations.

Technical details & Operations

The model is based on a theoretical framework focusing on the root causes or pre-disposing factors of displacement. The different dimensions and associated indicators have been grouped into 5 categories:

Economy (e.g. unemployment, GDP, poverty)
Violence (e.g. civilian fatalities, number of conflict events)
Governance (e.g. corruption, access to public services, democracy)
Environment (e.g. food security, natural hazard events)
Social/population (e.g. presence of vulnerable groups, urbanization, population size)

The categories were identified based on DRC experience in the field and on standard groupings used to describe fragility, e.g. by the OECD and the State Fragility Index.

All data is derived from open sources and from the monitoring program, over 25 years old, run by the DRC, through which thousands of migrants on the move are interviewed. The main data sources are the World Bank development indicators, ACLED, UCDP, EMDAT, UN agencies (UNHCR, WFP, FAO), IDMC, etc. In total, the system aggregates data from 18 sources and contains 148 indicators.

The machine learning model employed is an ensemble. An ensemble model leverages several constituent models to generate independent forecasts that are then aggregated. Here, two gradient boosted trees are employed to generate the point forecasts. The model hyperparameters were determined by means of a grid search. Each year-ahead forecast has a separate model. In other words, we train a set of ensemble models for y(t + h) = f(x(t)), where h = 0, 1, 2, 3. The associated confidence intervals were generated by an empirical bootstrap method, where the source error distributions were generated from a retrospective analysis.

Model training data was limited to data since 1995. Based on this training data, the model learns how the different indicators relate to displacement and uses this to forecast how future displacement will evolve.
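The forecasting setup described above (one ensemble per horizon h, fitted on pairs of x(t) and y(t + h), with intervals from resampled retrospective errors) can be sketched in a few lines. This is a minimal illustration and not the Foresight implementation: the data is synthetic, the boosted trees are depth-1 stumps written in plain Python, the two members are aggregated by simple averaging, the hyperparameters are fixed by hand rather than found by grid search, and the five-year holdout used for the retrospective errors is an assumption. All function names are hypothetical.

```python
import random


def avg(values):
    return sum(values) / len(values)


def fit_stump(X, r):
    """Find the one-feature threshold split that best fits residuals r."""
    best = (float("inf"), 0, float("inf"), avg(r), avg(r))  # fallback: constant
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [ri for row, ri in zip(X, r) if row[j] <= thr]
            right = [ri for row, ri in zip(X, r) if row[j] > thr]
            lm, rm = avg(left), avg(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]


def fit_gbt(X, y, n_rounds=50, lr=0.1):
    """Gradient boosting for squared error with depth-1 trees (stumps)."""
    base = avg(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of squared loss
        j, thr, lm, rm = fit_stump(X, resid)
        stumps.append((j, thr, lm, rm))
        pred = [p + lr * (lm if row[j] <= thr else rm) for p, row in zip(pred, X)]

    def predict(row):
        return base + lr * sum(lm if row[j] <= thr else rm for j, thr, lm, rm in stumps)

    return predict


def fit_ensemble(X, y):
    """Two boosted models with different (assumed) settings, averaged."""
    members = [fit_gbt(X, y, n_rounds=30, lr=0.2), fit_gbt(X, y, n_rounds=60, lr=0.1)]
    return lambda row: avg([m(row) for m in members])


def train_direct(X, y, horizons=(0, 1, 2, 3), n_holdout=5):
    """Fit one ensemble per horizon h on pairs (x(t), y(t + h))."""
    models, errors = {}, {}
    for h in horizons:
        Xh, yh = X[: len(X) - h], y[h:]
        # Retrospective analysis: fit on early years, record errors on later ones.
        retro = fit_ensemble(Xh[:-n_holdout], yh[:-n_holdout])
        errors[h] = [yt - retro(xt) for xt, yt in zip(Xh[-n_holdout:], yh[-n_holdout:])]
        models[h] = fit_ensemble(Xh, yh)  # final model uses all available pairs
    return models, errors


def bootstrap_interval(point, errors, n_boot=2000, level=0.9, rng=None):
    """Empirical bootstrap: resample past errors around the point forecast."""
    rng = rng or random.Random(0)
    draws = sorted(point + rng.choice(errors) for _ in range(n_boot))
    lo = draws[int((1 - level) / 2 * n_boot)]
    hi = draws[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Synthetic yearly series: [conflict events, unemployment %] and displacement.
rng = random.Random(42)
c, u = 50.0, 10.0
X, y = [], []
for _ in range(26):
    c = max(0.0, c + rng.gauss(0, 8))
    u = max(0.0, u + rng.gauss(0, 1))
    X.append([c, u])
    y.append(1000 + 30 * c + 50 * u + rng.gauss(0, 50))

models, errors = train_direct(X, y)
forecasts = {}
for h, model in models.items():
    point = model(X[-1])  # forecast h years ahead from the latest indicators
    forecasts[h] = (point, *bootstrap_interval(point, errors[h]))
```

Training a separate model per horizon avoids feeding forecasts back in as inputs, at the cost of fitting each horizon on slightly fewer pairs. With only a handful of retrospective errors per horizon, as here, the resulting intervals are coarse; a longer back-testing window would give smoother error distributions.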

Deployment & Impact

The model now covers 26 displacement-producing countries. It can predict next year's displacement with an average margin of error as low as 6% in the country where it performs best (Guatemala). Of the more than 150 forecasts made so far across the 26 countries, approximately 50% have a margin of error of 10% or below and two-thirds have a margin of error of 15% or below. The model is generally more accurate than the planning figures used in Humanitarian Response Plans.
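The text does not define "margin of error". A minimal sketch, assuming it means the absolute percentage error of a forecast against the observed figure, shows how shares such as "10% or below" could be computed. The forecast and actual values below are invented for illustration and are not Foresight results; the function names are hypothetical.

```python
def pct_error(forecast, actual):
    """Absolute percentage error (assumed definition of margin of error)."""
    return abs(forecast - actual) / actual * 100


def share_within(errors, threshold):
    """Fraction of forecasts whose error is at or below the threshold."""
    return sum(e <= threshold for e in errors) / len(errors)


# Illustrative (forecast, actual) pairs, not real data.
pairs = [
    (105_000, 100_000),
    (88_000, 100_000),
    (228_000, 200_000),
    (51_000, 50_000),
    (130_000, 100_000),
]
pct_errors = [pct_error(f, a) for f, a in pairs]

within_10 = share_within(pct_errors, 10)
within_15 = share_within(pct_errors, 15)
print(f"{within_10:.0%} of forecasts within 10%, {within_15:.0%} within 15%")
```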

The model has so far been used in the Danish Refugee Council (DRC) annual strategic planning cycle. This includes providing country offices with forecasts of displacement to inform the contextual analysis, as well as scenario-based forecasts of how displacement can potentially unfold. This helps DRC be better prepared for contextual developments and engage in mitigation efforts. DRC is further exploring the potential of linking Foresight with an anticipatory financing mechanism.

It has also been used by external partners to inform strategic planning. For example, the model has been used in the HNO process for Central America and by OCHA CERF in funding allocation decision-making.

The model is hosted on an online platform, where it is possible to access the underlying data, see the forecasts for the different countries and develop scenario-based forecasts of displacement.
