Saturday, July 21, 2018

data for good

When there are natural or man-made disasters, and families get separated, the most vulnerable members of these groups (mostly children) become susceptible to heinous crimes like human trafficking. This must be stopped. Fortunately, technology and tools available to us today can make a huge difference in addressing these kinds of problems. In this post we examine how data science and machine learning can be used to help accomplish some of this.

Take, for instance, a simple website with, say, five tabs:

  1. Search - parents or concerned adults post a photograph of the lost child with a name, nationality, description, any identifying characteristics of the individual in question, along with details of where and when last seen, when taken if known, parent contact details, etc.
  2. Found - the organization that found or is keeping these children, or volunteers that have seen these children somewhere, can post entries into this tab - similar details: name, nationality, description, identifying marks, and current location information with contact details. 
  3. Links to charities that are focused on reuniting families torn apart by war and other unfortunate circumstances. The website does not itself touch any money (the focus here is purely on the technology); there are simply hyperlinks that connect directly to charities that help.
  4. Discussion board - people can ask questions and others can answer. Lawyers can volunteer pro bono to represent families in their most desperate hour of need.
  5. News - articles in the news relating to the evolving situation wherever such sad events occur and families are destroyed.

Machine learning and data science can help tie (1) and (2) above together - facial recognition is now advanced enough that faces in photographs can be matched. Facebook routinely does this to auto-tag users in photographs posted on the site, and in other cases active user participation via captchas is used to bring human intelligence to bear on thorny classification problems.
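
To make the matching step concrete, here is a minimal sketch of how entries from the Search and Found tabs might be compared once a face-recognition model has turned each photograph into a numeric "embedding" vector. Everything specific here is an assumption for illustration: the record IDs, the three-number embeddings (real models produce vectors with far more dimensions), the `rank_candidates` helper, and the distance threshold are all hypothetical, and any match would still need review by a person.

```python
import math

def rank_candidates(lost_embedding, found_records, max_distance):
    """Return (distance, record_id) pairs within max_distance, closest first."""
    scored = [
        (math.dist(lost_embedding, embedding), record_id)
        for record_id, embedding in found_records.items()
    ]
    return sorted(pair for pair in scored if pair[0] <= max_distance)

# Hypothetical embedding for a photograph posted under the Search tab.
lost_embedding = [0.10, 0.90, 0.30]

# Hypothetical embeddings for photographs posted under the Found tab.
found_records = {
    "F-001": [0.80, 0.10, 0.50],
    "F-002": [0.12, 0.88, 0.31],
    "F-003": [0.20, 0.70, 0.40],
}

# The threshold is an assumed value; in practice it would be tuned on real data.
candidates = rank_candidates(lost_embedding, found_records, max_distance=0.3)
for distance, record_id in candidates:
    print(f"{record_id}: distance {distance:.3f} (flag for human review)")
```

Smaller distances mean more similar faces, so the closest Found entries would be shown first to the parent or volunteer doing the search.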

Image processing technologies of this kind rely on a technique called convolutional neural networks - these are neural nets that relate adjoining cells in the "input field" (in this case, pixels next to each other) in ways that allow classification of objects within a scene, and potential matching of images.
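
The core operation is easier to see in a toy example. The sketch below slides a small grid of weights (a "kernel") over an image and sums the weighted values of neighbouring pixels at each position. The 4x4 image and the hand-picked edge-detecting kernel are assumptions for illustration only; a real network learns many such kernels from data and stacks them in layers.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum weighted neighbours."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Toy grayscale image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical kernel that responds where brightness rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the response peaks at the vertical edge
```

The output is non-zero only where the dark and bright halves meet, which is the sense in which the network "relates" adjoining pixels to pick out structure in a scene.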

Similarly, if the text descriptions are suitably accurate and use appropriately descriptive language, then Natural Language Processing (NLP) techniques can be applied to identify statistically improbable phrases to more easily match descriptions of lost and found children. 
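
One simple way to approximate this is TF-IDF weighting, which gives rare words more weight than common ones, followed by cosine similarity between descriptions. The sketch below is a minimal version under stated assumptions: the descriptions are invented, the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `best_match`) are hypothetical, and it scores single words rather than whole phrases.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]

def tfidf_vectors(docs):
    """Build one {word: weight} dict per document; rarer words weigh more."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            w: c * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1)
            for w, c in counts.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b[w] for w, weight in a.items() if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_match(query, candidates):
    """Return (index, score) of the candidate most similar to the query."""
    vectors = tfidf_vectors([query] + candidates)
    scores = [cosine(vectors[0], v) for v in vectors[1:]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Invented descriptions, for illustration only.
lost_description = "girl age four, red jacket, crescent-shaped birthmark on left wrist"
found_descriptions = [
    "boy about six wearing blue cap, speaks Spanish",
    "small girl with crescent-shaped birthmark on wrist, red jacket",
    "toddler in green sweater, scar above right eyebrow",
]

best_index, best_score = best_match(lost_description, found_descriptions)
print(found_descriptions[best_index], round(best_score, 2))
```

Here the second Found description wins because it shares distinctive words such as "crescent-shaped" and "birthmark" with the Search entry, while the other two share none.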

In fact, if both the parents looking for their children and the organizations that found or are holding them can point to photographs published in the media (some news articles are widely circulated - one picture of a crying toddler in particular is seared into my brain), matching becomes easiest of all, since both sides are pointing to the same child in the same picture.

Building something with the scale to be able to process images of tens of thousands of children, some of them very small, perhaps even incapable of verbal expression, and storing these in high availability and geographically redundant form probably calls for a cloud-based architecture with a web-based front end. 

Engineers and data scientists from Facebook, Google, Amazon, Microsoft, and other such companies have banded together in the past to address issues like the spread of flu epidemics (Google keyword search tracking), tracking survivors of natural calamities like earthquakes, etc. I wonder if an initiative such as this would see any interest from superstar software engineers at these companies.

Solving the most difficult problems at the leading edge, and pushing the envelope further forward on technical aspects, is no doubt exciting. But what can be more fulfilling than seeing a lost, crying child reunited with her parents, and more importantly saved from the most heinous of crimes?

There is too much wanton cruelty, conflict, subterfuge, and careless negligence at play these days in our brutal world. Some might even say people's lives are more important than politics. The hope is that, where technology abounds, capable, competent people will come together to help solve this meaningful problem. We may not be able to stop wars. We can, as humans, help reduce suffering. If nothing else, someday, when we need it most, someone might do the same for us.