Monday, October 24, 2016

building the next new killer phone

So we want to build the best mobile phone... ever. How would we go about doing this? To think through the problem, we'll go through the basic features users want or like, then all the whiz-bang technology that would make our phone the best out there... for now, at least.

"The future is already here. It is just unequally distributed. " -- William Gibson

So, what do users want? The list below might be the "big 10".

  1. A very high-resolution screen - Apple's Retina display or better would be nice.
  2. High-resolution (12+ megapixels, anyone?) front AND back cameras
  3. A battery that lasts a very, very long time - for Android, making it through one full day is a challenge on most devices even today, even with mild, "energy-sipping" use. This is really one of the most disappointing things for users. Charge quickly, retain charge longer. Apps should sip battery, not guzzle it.
  4. Good-quality network coverage on both the up and down links (some might consider this a feature outside the phone, but of course it relates to the quality of the radio within)
  5. Environmental safety - a device that emits less radiation while remaining completely impervious to being dipped or (gasp!) fully immersed in water. Add shatter-proof (not just shatter-resistant) glass, and you'll have a winner. And yes, the phone shouldn't go supernova, double as an iron, or be usable to boil water just because you use it for a while - a device you can't hold because it might burn your fingers, or worse, will never become popular or liked.
  6. A responsive UX (user experience) whose base OS is upgradeable to new releases with fresh skins, without requiring a complete hardware refresh every year or two.
  7. (Stretch) The ability to upgrade the device piecemeal. This would help isolate and swap out damaged, under-performing, or obsolete components while keeping the device's integrity intact. Besides, it makes the case for easier, more seamless integration with upcoming technologies like AR/VR (Google Daydream, anyone? Oculus Rift?). What would be really cool is if the whole PC vs. phone vs. tablet debate just went away, with the core of the phone BECOMING the PC, with plug-and-play adaptors for screen and keyboard. If your screen were made of foldable material with a provided electronic pen, you'd just carry it everywhere, unroll it onto a table, plug a micro-USB cable into it and your "phone", plug in an unrolled (or IR keyless) keyboard, and voila - you'd have a computer and a tablet. Even better, imagine a portable, cigarette-lighter-sized holographic 3D display: your computer would then be the size of three cigarette lighters - one for the core compute part, one to generate the IR keyboard on any flat surface, and a third to project the 3D holographic display. Sure, maybe you'd carry a special pair of glasses so only you could see the encrypted display, giving you privacy akin to how people can't see through your laptop screen today.
  8. Seamless connectivity options: 
    1. Phone/cell network - GSM for the most part, since this is the most widely used in the world, though of course, Verizon (CDMA) has a wide subscriber base in the US as one of the two largest mobile service providers here.
    2. Same as (1), but for data (4G/LTE and above?)
    3. Wi-Fi connectivity
    4. WiMAX connectivity where available
    5. AirDrop (MANET?), IR, or Bluetooth connectivity for easy file sharing
    6. A micro-USB or other expansion slot, plus the ability to connect to external computing devices to load things onto the phone relatively easily
    7. The ability to ramp up and down in technology seamlessly when communicating with other devices or networks as one roams.
    8. Casting - not sure why they still make dumb (as in not-"smart") TVs. But once we make the leap to all smart TVs, dongles like Chromecast become unnecessary, while the casting function that comes with apps like YouTube is still needed.
  9. Software - lots of apps for free, with intuitive controls (the equivalent of frustration-free packaging for hard goods)
  10. Lots and lots of memory - so you can keep all the rich media you create (remember the high-megapixel camera?) and receive from the friends you interact with. Sure, things are moving to the cloud, but to move this much stuff around you need a good network and a plethora of connectivity options, neither of which is fully in the phone manufacturer's control today.
Things that peeve: 
  1. Why do different phones and laptops have different charging adaptor configurations? Why can't everyone settle on a single micro-USB type? OK, if power ratings are THAT different, there could be one type for all tablets and another for all laptops. This leaves me feeling that manufacturers pursue pointless differentiation to keep you locked in, and this only fuels user frustration - travel in a group with laptops and you carry a big jumble of power cords, because few people can share a charger.
  2. Why does every manufacturer insist on loading their device with proprietary apps or some tweaked version of an OS? Tweaks should be removable. The user shouldn't feel she is renting the device from you - it belongs to her. In some cases, these proprietary additions make performance laggier than you'd otherwise expect.
  3. Why does software get more bloated with time? Memory management grows less efficient, and - what is really irksome - developers require more and more "permissions", especially on Android, that in many cases have no direct bearing on what the app does.
  4. Hardware devices don't seem to be designed to last as long these days. That makes sense from a manufacturer's perspective. After all, if planned obsolescence didn't exist, who'd buy their next generation of devices?
  5. Why do some service providers still insist on selling/leasing locked devices? It's 2016 - get with the program. Making people work harder at jailbreaking their new phones, and risk bricking them in the process, only makes them mad. Walled-garden models work well in some environments - this doesn't seem to be one of them.
I tend to think the best value in phones today comes from buying devices new but one release behind, with the lowest memory configuration (much cheaper), so long as they have the specs you need, the right kind of radios, and expandable memory. But a device that meets the desired features above while avoiding the pet peeves (at the right price point, of course) has a good shot at satisfying users.

Thursday, October 6, 2016

building a smarter search


AI is all the rage these days, and it stands to reason that large companies are trying to incorporate it wherever possible to improve the quality of their products. This holds in the search space too. There are two ways a user's search experience can be improved. One, the user becomes a better searcher, picking better keywords to search with. The second, more interesting way is for the search engine to become more attuned to the user through frequent interactions with her, delivering better results as a direct consequence of those interactions. That requires some intelligence or adaptiveness on the part of the search engine - artificial intelligence, at least in the more advanced use cases.

Sure, this means the search giants need to spend more to deliver better results. But this is a battle to capture users and eyeballs. The company that gets there first increases "stickiness" - a search engine finely attuned to your tastes builds a huge moat around itself. Unless that search context can be transferred from one engine to another, once you've "captured" a user, she is likely to stay with you for good. And the more users you have, the better you'll be able to monetize them. Search goes social at some point, and well, there's no looking back from there.

Gone are the days when search used to be driven simply by the terms in the input. Recall those simpler times? One would use techniques like tf-idf to identify statistically improbable words - words rare across the corpus that occur with regularity in the documents of interest - and return documents based on their goodness of fit with the search criteria. (OK, this is an oversimplification. Techniques such as stemming, synonym detection, etc. could be used to give the search query more power.)
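To make the tf-idf idea concrete, here is a minimal sketch - the corpus is illustrative, and scikit-learn's TfidfVectorizer stands in for a hand-rolled implementation:

```python
# Sketch: tf-idf weights highlight terms that are frequent in one
# document but rare across the corpus ("statistically improbable").
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the phone battery drains fast and the battery swells",
    "the new camera takes sharp photos in low light",
    "the battery in the camera lasts all day",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(corpus)   # one weight row per document
terms = vectorizer.get_feature_names_out()

# Print the highest-weighted terms in each document.
for row in range(tfidf.shape[0]):
    weights = tfidf[row].toarray()[0]
    top = weights.argsort()[::-1][:3]
    print([terms[i] for i in top])
```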

There are different ways of determining goodness of fit, of course - measures like cosine similarity are relatively easy to implement. Large search engines then added ideas such as using the presentation context of the text on particular web pages - its color, its size, and other cues - to determine whether a page was worth including in the search results.
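A toy sketch of cosine similarity as a goodness-of-fit measure, with a made-up three-term vocabulary and invented weights:

```python
# Sketch: cosine similarity between a query vector and document
# vectors over a toy vocabulary ("phone", "battery", "camera").
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two term-weight vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

query = np.array([1.0, 1.0, 0.0])   # asks about phones and batteries
doc_a = np.array([0.9, 0.8, 0.1])   # mostly about phone batteries
doc_b = np.array([0.1, 0.0, 1.0])   # mostly about cameras

print(cosine_sim(query, doc_a))  # close to 1: good fit
print(cosine_sim(query, doc_b))  # close to 0: poor fit
```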

PageRank was a further improvement, where the computed reputation of the pages linking to a target web page gave the target a score. The better the reputation of a page's in-links, the more credibility the page had in search results.
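A small sketch of the idea via power iteration on a hypothetical four-page link graph (the damping factor and iteration count are conventional choices, not anything specific to this post):

```python
# Sketch: PageRank by power iteration on a toy link graph.
import numpy as np

# links[i] lists the pages that page i links to (hypothetical graph).
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, damping = 4, 0.85

rank = np.full(n, 1.0 / n)
for _ in range(50):  # iterate until the scores (roughly) converge
    new_rank = np.full(n, (1.0 - damping) / n)
    for page, outs in links.items():
        for target in outs:
            new_rank[target] += damping * rank[page] / len(outs)
    rank = new_rank

print(rank)  # page 2 scores highest: every other page links to it
```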

Next came the raw material for the filter bubble. You log into the service suite provided by the company running the search function. Google, Microsoft, etc. all offer much more than just search - a full suite of services including email, drive storage in the cloud, a blogging portal (such as this one), and other applications (e.g. documents) designed to make use of cloud-hosted data. A lot of these services require that you log in. And once you log in... they get to see and store your information: the kinds of searches you are doing, the keywords you are looking for, and the links you click on. This is a veritable gold mine of information.

This leads to predictive search - the search toolbar can now predict the kinds of keywords you are looking for. If most of your queries are for PDF documents, with, say, a "filetype:pdf" at the end of the query, the toolbar can suggest that suffix for your typed keywords.
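As a toy illustration of that kind of suggestion - the query history, threshold, and suggest_suffix function are all hypothetical:

```python
# Sketch: suggest a habitual query suffix like "filetype:pdf" when
# the user's history shows they append it often enough.
from collections import Counter

history = [
    "tf-idf tutorial filetype:pdf",
    "pagerank paper filetype:pdf",
    "cosine similarity explained",
    "deep learning survey filetype:pdf",
]

def suggest_suffix(typed, history, threshold=0.5):
    """Append the user's habitual operator suffix if frequent enough."""
    suffixes = Counter(q.split()[-1] for q in history if ":" in q.split()[-1])
    if not suffixes:
        return typed
    suffix, count = suffixes.most_common(1)[0]
    return f"{typed} {suffix}" if count / len(history) >= threshold else typed

print(suggest_suffix("neural network basics", history))
# -> "neural network basics filetype:pdf"
```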

But that is not all. Now that they have your query set and the set of links you clicked, the service can use this data to (a) better tune the results of future queries using the context of keywords typed earlier, and (b) use the content of the clicked links and the time spent on each page (e.g. the user clicks on a link and quickly comes back to the search results) to determine how better to serve your search queries going forward.
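A hedged sketch of the dwell-time piece of (b), with a made-up click-log format: a quick bounce back to the results page suggests a poor result, a long stay a good one.

```python
# Sketch: estimate dwell time per result from a click log, then
# promote or demote results accordingly. Log format is hypothetical:
# (timestamp_in_seconds, event, url).
log = [
    (0,   "click",  "https://example.com/a"),
    (4,   "return", "https://example.com/a"),   # quick bounce: bad sign
    (10,  "click",  "https://example.com/b"),
    (130, "return", "https://example.com/b"),   # long read: good sign
]

dwell, open_clicks = {}, {}
for ts, event, url in log:
    if event == "click":
        open_clicks[url] = ts
    elif event == "return" and url in open_clicks:
        dwell[url] = ts - open_clicks.pop(url)

# Simple relevance signal: longer dwell implies a more useful result.
for url, seconds in dwell.items():
    print("promote" if seconds > 30 else "demote", url, f"({seconds}s)")
```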

Complaints from users were also factored in, negatively affecting the reputation of links offered up in some search results. Some sites were "banned" or sent to "Google hell", meaning their reputation was so badly affected that they would not show in results anymore. A manual negative feedback of sorts.

This leads to the filter bubble. If you log into the search engine's suite of services and run a query, you get different results from what you would have gotten running the same query while logged out. Similarly, different people with different search histories will almost certainly get different sets of results, possibly even on the very first page, even if they search with the very same keywords. In other words, the filter bubble sets the context for the search results that are delivered.

This is where things start to get interesting.
  • Using services like Amazon's Mechanical Turk, which harnesses human intelligence seamlessly to solve small problems, one could index large numbers of images, then use the image captions to deliver selected images in search results based on search keywords. An early example of a large labeled image dataset for training supervised machine learning algorithms is ImageNet, from Stanford's Fei-Fei Li.
  • Google has a video service - YouTube. Videos have captions, which can be auto-generated; videos also have descriptive text, and some have comments. Filtering statistically improbable phrases from any of these, and matching them against search keywords, can deliver relevant video results.
  • As speech processing improves - Siri, Cortana, Google's voice services, etc. are getting better every day, thanks again to AI and NLP technology - audio files can be auto-captioned, and links to relevant audio can be returned in search results too.
  • Other media, such as maps, can also factor into search results in interesting ways. This gives search a local (as in geography) flavor.
  • Microsoft now has (or soon will have) the entire LinkedIn database at its disposal. Search results could now contain links to people... and to jobs where the specified keywords apply. More and more we see the world turning into a huge graph of interconnected data elements.
  • We are already at a point where a generic search engine can deliver better results within a certain domain (like, say, UPS) than the hosted search on that domain's own website - at least, this was true until more and more websites started using Google or Bing search as their within-domain default.
Similar ideas can be applied to other kinds of media as well. But this is all soo... yesterday. So what more can we do? With deep learning, i.e. neural networks with neurons layered deep, one can automatically sort images - clustering them together on common themes. This works well in many, but not all, cases. Deep networks were reported in recent work to have particular output neurons that could recognize cats and dogs, but there were also neurons whose outputs could not easily be mapped to any one major idea. And for some reason, composite objects like cars were not always easy to recognize... go figure. Still, this suggests a way forward: use AI and deep learning to automagically classify images for use in search results - with a diminishing (to zero) human component.
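A minimal sketch of the clustering step, assuming a trained deep network has already mapped each image to an embedding vector (random stand-ins below), with k-means grouping images that share a theme:

```python
# Sketch: cluster image embeddings so thematically similar images
# land together. Embeddings are random stand-ins for the outputs
# of a trained deep network.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
cat_like = rng.normal(loc=0.0, scale=0.1, size=(20, 64))  # one theme
dog_like = rng.normal(loc=1.0, scale=0.1, size=(20, 64))  # another theme
embeddings = np.vstack([cat_like, dog_like])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
print(kmeans.labels_)  # similar embeddings share a cluster label
```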

Are we done? What more can we do? Well, take the results of similar queries and look at the clicked links, especially if you have access to metrics like the time a user spent on particular web pages. The search engine can learn from clicks - similar to the query-by-example (QBE) paradigm from early relational databases - learning what particular users liked, then using those clicks to further dynamically refine the search results.
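A toy sketch of that click-feedback loop (the click counts and URLs are invented): results that past users clicked for the same query get boosted in the ranking.

```python
# Sketch: re-rank results for a query by blending in historical
# click popularity for that same query. All data is illustrative.
click_counts = {  # query -> {url: historical click count}
    "pagerank": {"https://example.com/pr-paper": 40,
                 "https://example.com/pr-blog": 5},
}

def rerank(query, base_results):
    """Re-order base results so heavily-clicked links rise."""
    clicks = click_counts.get(query, {})
    return sorted(base_results, key=lambda url: clicks.get(url, 0),
                  reverse=True)

base = ["https://example.com/pr-blog", "https://example.com/pr-paper"]
print(rerank("pagerank", base))  # the heavily-clicked paper moves up
```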

Statistical text analysis uses probability theory to classify words and phrases as different components of a sentence (e.g. noun, verb, adjective), and potentially also as having different meanings (semantics) based on how the sentences are structured. The same word can have different meanings in different contexts - earlier search would sometimes get confused by this. For instance, not so long ago, Google News grouped together stories about Jordan the country and Michael Jordan the basketball star. Statistical text analysis, applied appropriately, would minimize, and potentially even eliminate, errors of this nature.
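A small illustration using NLTK's statistical part-of-speech tagger - the sentence is contrived, the model names are those NLTK has shipped for some time, and the tags alone don't fully resolve the Jordan ambiguity (named-entity recognition and context would be layered on top):

```python
# Sketch: statistical POS tagging assigns sentence roles to words,
# a first step toward telling Jordan-the-place from Jordan-the-person.
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer model
nltk.download("averaged_perceptron_tagger", quiet=True)  # tagger model

sentence = "Michael Jordan played basketball while touring Jordan."
tokens = nltk.word_tokenize(sentence)
print(nltk.pos_tag(tokens))  # both Jordans tag as proper nouns (NNP);
# disambiguating them needs entity recognition plus surrounding context.
```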

Taking a page from recommender systems: if two users have similar search contexts, pages one user clicked on in a search for certain keywords are likely to carry higher importance in the other user's search for the same keywords. (Users with similar search tastes cluster together in n-dimensional search space - just like birds of a feather.)
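A sketch of that recommender-style idea, with hypothetical users represented as binary click vectors over pages; the most similar neighbor's un-clicked pages become candidates to promote:

```python
# Sketch: users as binary click vectors over pages; similar users
# cluster together, so one can borrow a neighbor's clicks.
import numpy as np

pages = ["p1", "p2", "p3", "p4"]
user_a = np.array([1, 1, 0, 0])  # clicked p1, p2
user_b = np.array([1, 1, 0, 1])  # clicked p1, p2, p4
user_c = np.array([0, 0, 1, 0])  # clicked p3

def similarity(u, v):
    """Cosine similarity between two users' click vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# user_a is far closer to user_b than to user_c, so p4 (clicked by
# user_b but not yet by user_a) is a good page to boost for user_a.
print(similarity(user_a, user_b), similarity(user_a, user_c))
```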

In fact, one could utilize advanced methods to find non-keyword statistically improbable words or phrases in the highest-ranking results for a given set of keywords, and then use these as "phantom keywords" to further improve the quality of the delivered results... results that keep getting better with time, since user clicks serve as a learning input to the user's filter bubble with greater use.
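One way to sketch phantom-keyword extraction (the snippets and the query "pagerank" are illustrative): take the highest tf-idf terms from the top results, excluding the original keywords, and append them to the query.

```python
# Sketch: mine "phantom keywords" from top-ranked result snippets
# and append them to the query for a broader, on-theme search.
from sklearn.feature_extraction.text import TfidfVectorizer

top_results = [  # snippets from the best hits for "pagerank"
    "pagerank scores pages by the reputation of their in-links",
    "the random surfer model underlies the pagerank computation",
]

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(top_results)
terms = vec.get_feature_names_out()

weights = tfidf.sum(axis=0).A1       # aggregate weight per term
phantoms = [terms[i] for i in weights.argsort()[::-1]
            if terms[i] != "pagerank"][:3]
print("expanded query:", "pagerank " + " ".join(phantoms))
```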

It would be nice if users could share filter bubbles without having to share their logins. If this were supported, users joining a project late (say) would be able to run queries of the same high quality as those who had been involved with that particular research area for some time. Search then becomes a social experience. There have already been implementations of local search - the search engine guruji.com was built to cater to Indian audiences, with results more focused on local phenomena such as news and locations. It went bust once Google implemented similar features.

Perhaps in the not-too-distant future there might be a marketplace for search filter bubbles - you buy one that gives you the best, most tailored results for your domain, or you build one for a particular domain and then sell it to others interested in the same. If these filter bubbles can transcend the barriers between search engines, late entrants could slowly wear down the first mover's advantage. If not, it is likely to be winner-take-all (or at the very least, a LOT)... at least for a while. Playing catch-up is no fun.

Lots more areas for advancement remain open... All in all, AI has huge potential to advance the state of search... and to let first movers milk a huge cash cow. After all, as search results get better targeted, so too will micro-targeted ads... generating even more revenue. It pays to be on the bleeding edge... or get run over.