How #BigData helped Santa save Christmas

Jan 9th, 2017

Gary Allemann, MD at Master Data Management

“Every year Santa delivers millions of presents to millions of children around the world.”

When that was written, around a hundred years ago, the world was a much smaller and less complex place.

Santa’s systems, the Workshop and the Mail Room at the North Pole, were geared to maximise elven efficiency.

The elves had a well-defined system to sort through the children’s mail – using their well-developed sense of smell to separate the requests of good children from those of naughty children. The choice of gift was much simpler – a small selection of cars or guns for boys, dolls for girls – with the occasional pony thrown in for the really well behaved.

Over the last hundred years, Santa has invested heavily in automation and mechanisation. This means that a smaller number of elves are able to deal with a higher volume of requests.

But around 20 years ago, two trends emerged that threatened to overwhelm his operation.

  1. The world’s population has grown exponentially. There are now many more children requesting gifts.
  2. The Internet has created a generation of children, the millennials, who are easily bored, who are exposed to a great variety of choice, and who are comfortable with electronic media and instant gratification.

Santa set up an elven committee to define the major problems facing his organisation.

After months of research the elves were able to categorise the technical challenges into four main areas:

  1. Volume: Santa now gets billions, rather than millions, of requests for gifts from children around the world. The elven population has not exploded at the same rate – in fact the population of workers is still roughly the same size as it was in the days of a few million letters. The systems at the Pole were being overwhelmed. Santa had to find a new approach.
  2. Variety: Very few of Santa’s requests now come via post (although the channel cannot be ignored). Santa receives Christmas lists via email, via WhatsApp and via social media, and several million still come in via traditional mail. The elves can no longer filter this mail using smell – a new approach had to be found. Children also now have a much wider range of gifts to choose from. This puts pressure on Santa’s supply chain, as the elves cannot wait until the requests come in before ordering stock and beginning construction. Santa had to find a way to predict which gifts would be popular before the rush.
  3. Velocity: The majority of requests to Santa still come in the twenty-day period before Christmas. But children are much more likely to change their minds as new toys and games are advertised in the days before Christmas. This issue compounds the challenges of variety and volume.
  4. Veracity: Some children send multiple requests, via different channels, hoping that they will receive more than one present. Multiple languages, regional slang, poor spelling and grammar – these are all challenges that Santa and his team face every day. Without quality data Santa risks making the wrong decisions – disappointing the good children, or rewarding the naughty.

In summary? Santa’s existing systems were not equipped to deal with the information age!

The elves proposed to augment the automated processes with machine-driven predictions.

This would mean using statistical models to predict which toys would be popular. By analysing social media and advertising, Santa’s manufacturing team are able to identify the broad categories of toys, and even specific items, that will be in high demand in December. This allows the team to get a head start on manufacturing, without waste.
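To make the idea of a statistical prediction concrete, here is a minimal sketch that fits a straight-line trend to weekly social media mention counts and extrapolates one week ahead. The toy names, the counts and the function name forecast_next are all invented for illustration. A real demand model would be far richer than a linear trend.

```python
def forecast_next(counts):
    """Fit a least-squares line to weekly counts and extrapolate one week.

    Assumes at least two weekly observations.
    """
    n = len(counts)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    # A forecast below zero mentions makes no sense, so clamp at zero.
    return max(0.0, slope * n + intercept)


# Hypothetical weekly mention counts per toy (illustrative numbers only).
weekly_mentions = {
    "robot dog": [100, 200, 300, 400],
    "spinning top": [90, 80, 70, 60],
}

# Rank toys by forecast demand, highest first.
ranked = sorted(
    weekly_mentions,
    key=lambda toy: forecast_next(weekly_mentions[toy]),
    reverse=True,
)
print(ranked)
```

Toys at the top of the ranking would be the candidates for early manufacturing.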

The elves had to improve the process for filtering requests from good and bad children. Once again, social media sources give a good indication of character. The elves supplement this with data from phone calls, emails and other communication sources. The elves must analyse and aggregate the sentiment of each child’s parents, teachers, friends and family in order to place each child into the category – from very good to very naughty – that will determine whether they get exactly what they asked for, or a lump of coal.
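One way the aggregation step might look, as a sketch only: it assumes each source’s sentiment about a child has already been scored between -1.0 (very negative) and +1.0 (very positive). The category boundaries, the label names and the function name categorise are assumptions made for illustration, not Santa’s actual rules.

```python
from statistics import mean

# Hypothetical lower bounds for each category on a -1.0 to +1.0 scale.
CATEGORIES = [
    (0.6, "very good"),
    (0.2, "good"),
    (-0.2, "borderline"),
    (-0.6, "naughty"),
]


def categorise(sentiment_by_source):
    """Average the per-source sentiment scores and map them to a category."""
    average = mean(sentiment_by_source.values())
    for lower_bound, label in CATEGORIES:
        if average >= lower_bound:
            return label
    return "very naughty"


# Illustrative scores for one child; the sources follow the text above.
child = {"parents": 0.9, "teachers": 0.7, "friends": 0.8, "family": 0.8}
print(categorise(child))
```

A simple average treats every source equally. Weighting teachers more heavily than friends, for example, would be a design decision for the elves.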

The elves must use machine learning techniques to read through the mountains of mail, paper and electronic, and match the gifts requested to those available on the production line. In some cases, compromises must be made. Children may be allocated gifts that are very similar, but not identical, to those they requested. Santa’s recommendation engine ensures that each child gets the gift most appropriate to their request, depending on their segmentation.
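The matching of a misspelt or loosely worded request to the nearest available gift can be sketched with simple fuzzy string matching. This is a stand-in for the machine learning the text describes, not an implementation of it. The catalogue contents, the cutoff value and the function name allocate_gift are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical production-line catalogue.
CATALOGUE = ["red bicycle", "blue bicycle", "train set", "teddy bear"]


def allocate_gift(requested, catalogue=CATALOGUE, cutoff=0.6):
    """Return the catalogue item closest to the request, or None.

    The cutoff is the minimum similarity (0 to 1) needed to count as a match.
    """
    cleaned = requested.lower().strip()
    matches = get_close_matches(cleaned, catalogue, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(allocate_gift("Red Bycicle"))
print(allocate_gift("xylophone"))
```

A request with no sufficiently close match returns None, which would flag it for an elf to review by hand.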

Finally, Santa’s team use sophisticated data cleansing and matching engines to identify and filter duplicate requests, even across channels.
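As a rough illustration of cross-channel matching, the sketch below normalises names (case, accents, spacing) and treats two requests as duplicates when the town matches and the names are nearly identical. The record fields, the 0.85 threshold and the function names are assumptions for the example. Production matching engines use far more sophisticated rules.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def is_same_child(a, b, threshold=0.85):
    """Two requests match if the towns agree and the names are similar."""
    if normalise(a["town"]) != normalise(b["town"]):
        return False
    similarity = SequenceMatcher(
        None, normalise(a["name"]), normalise(b["name"])
    ).ratio()
    return similarity >= threshold


def deduplicate(requests):
    """Keep the first request seen for each child, whatever the channel."""
    kept = []
    for request in requests:
        if not any(is_same_child(request, seen) for seen in kept):
            kept.append(request)
    return kept


# Illustrative requests arriving over different channels.
requests = [
    {"name": "Zoë Müller", "town": "Bern", "channel": "post", "gift": "pony"},
    {"name": "zoe muller", "town": "bern", "channel": "email", "gift": "bicycle"},
    {"name": "Sam Jones", "town": "Leeds", "channel": "whatsapp", "gift": "train set"},
]
unique_requests = deduplicate(requests)
print([r["channel"] for r in unique_requests])
```

Comparing every request against every kept request is quadratic, so at Santa’s volumes the requests would first need to be grouped, for example by town, before the pairwise comparison.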

Without these investments in big data, Santa would be out of business!

Thankfully, Santa was able to meet the challenges posed by the information age. Have you?