National Labour Office – Profiling the unemployed with machine learning


For the National Labour Office, we used machine learning to analyze anonymized personal records of unemployed people in Hungary, in order to discover how demographic features relate to a person's chances of staying unemployed, obtaining a job, or retiring.

Four demographic traits:

  • geographical location (NUTS Level 3)
  • sex
  • age group
  • level of education

were used to study how long an individual stayed in the unemployment support system and how likely the beneficiary was to end up in a particular employment status on leaving the benefit system.

After performing principal component analysis (PCA) in the space of demographic features and unemployment status, we used a random forest meta-estimator, which fits a number of randomized decision trees on various sub-samples of the dataset. A linear combination of age and education level turned out to be the main driving factor for the duration of the various employment statuses.
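The two steps described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the code used in the project: the records are randomly generated stand-ins, the column names (region, sex, age_group, education, months_in_system) are hypothetical, and treating the region as a plain integer code is a simplification.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in records; column names and value ranges are hypothetical.
records = pd.DataFrame({
    "region": rng.integers(0, 20, n),     # integer code standing in for a NUTS 3 region
    "sex": rng.integers(0, 2, n),
    "age_group": rng.integers(0, 6, n),
    "education": rng.integers(0, 5, n),
})
# Made-up target: months spent in the unemployment support system.
records["months_in_system"] = rng.integers(1, 36, n)

# Step 1: PCA in the joint space of demographic features and the status variable.
scaled = StandardScaler().fit_transform(records)
pca = PCA(n_components=2).fit(scaled)
# Each row holds the weights of one principal component on the original columns.
loadings = pd.DataFrame(pca.components_, columns=records.columns)

# Step 2: random forest regression of duration on the demographic features.
features = records.drop(columns="months_in_system")
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(features, records["months_in_system"])
predicted = forest.predict(features)
```

On real data, a component whose loadings are concentrated on age_group and education would correspond to the kind of linear combination mentioned above; the random data here carries no such structure.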

Random forests were also used to rank the importance of the variables in this regression problem, since the method provides such a ranking in a natural way.
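As a rough sketch of how such a ranking can be obtained, scikit-learn's RandomForestRegressor exposes impurity-based importances through its feature_importances_ attribute. The data below is synthetic and the column names are hypothetical, so the resulting order says nothing about the actual study.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 400

# Synthetic stand-in features; names and value ranges are hypothetical.
X = pd.DataFrame({
    "region": rng.integers(0, 20, n),
    "sex": rng.integers(0, 2, n),
    "age_group": rng.integers(0, 6, n),
    "education": rng.integers(0, 5, n),
})
# Made-up duration target, generated independently of the features.
y = rng.normal(loc=12.0, scale=4.0, size=n)

forest = RandomForestRegressor(n_estimators=50, random_state=1).fit(X, y)

# Impurity-based importances, one per feature, sorted from most to least important.
ranking = pd.Series(forest.feature_importances_, index=X.columns)
ranking = ranking.sort_values(ascending=False)
print(ranking)
```

Impurity-based importances tend to favor features with many distinct values (here, the region code), so a permutation-based check may be a useful complement.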


 
