Thursday, September 20, 2012

The biggest bottleneck in Bigdata is the human

Today's WSJ carries an article on Bigdata being the brains behind hiring at companies. There are plenty of Bigdata articles all around, and each one points to a new bottleneck for the industry to overcome. But there is one bottleneck that no one discusses: the ultimate consumer of Bigdata, the human. If we have trouble getting computers to deal with Bigdata, imagine presenting the analysis to a human. We are simply not wired to consume all this analysis. That is where visualization steps in.

So what is visualization of Bigdata? It is the rendering of insights from data analysis using images, animations, selective or progressive disclosure, or charts, figures, clouds, bubbles, and the like. The challenge is not in rendering these visual elements, but in mapping them to data that is sourced from across the internet, parsed by multiple parsers, collated, curated, correlated with multiple streams, and displayed on a canvas hosted in yet another place. On top of that, there is the HTML5 vs. native debate.
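
To make that mapping step a little more concrete, here is a minimal sketch in Python (assuming pandas and matplotlib are available, and using two small synthetic streams as stand-ins for feeds sourced and parsed elsewhere). It collates two streams on a shared key, correlates them, and renders the result on a simple chart canvas for a human viewer.

import pandas as pd
import matplotlib.pyplot as plt

# Two hypothetical streams, stand-ins for data parsed from different sources.
sales = pd.DataFrame({
    "date": pd.date_range("2012-09-01", periods=14, freq="D"),
    "units_sold": [120, 135, 128, 150, 160, 155, 170,
                   165, 180, 175, 190, 185, 200, 210],
})
mentions = pd.DataFrame({
    "date": pd.date_range("2012-09-01", periods=14, freq="D"),
    "social_mentions": [30, 45, 40, 60, 75, 70, 90,
                        85, 100, 95, 110, 105, 120, 130],
})

# Collate/correlate: join the streams on a shared key (the date here).
combined = sales.merge(mentions, on="date")
print("Correlation:",
      round(combined["units_sold"].corr(combined["social_mentions"]), 3))

# Render: a dual-axis chart as the "canvas" the human actually consumes.
fig, ax1 = plt.subplots(figsize=(8, 4))
ax1.plot(combined["date"], combined["units_sold"], color="tab:blue")
ax1.set_ylabel("Units sold")
ax2 = ax1.twinx()
ax2.plot(combined["date"], combined["social_mentions"], color="tab:orange")
ax2.set_ylabel("Social mentions")
fig.autofmt_xdate()
plt.title("Correlated streams, rendered for a human")
plt.tight_layout()
plt.show()

The hard part in practice is everything this sketch assumes away: the sourcing, parsing, and curation happen on different systems before anything reaches the chart.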

Visualization of Bigdata is a field actively targeted by the research community, and there is a strong business case for it as well: it addresses the biggest bottleneck in the Bigdata ecosystem, i.e., the human.

LLM - Not everything can be learned - so let's realign it to our preferences

When I first started researching LLMs it seemed like the technology could simply learn and get to a point where it is self-learning artifici...