Friday, July 3, 2015

Big Data - Problem or Solution?

Information Indigestion.
One of my university professors, who taught us Business Management, used this term twenty-two years ago.
He was quoting from ‘Megatrends 2000’, a book by John Naisbitt and Patricia Aburdene, originally published in 1990.
Very soon, the professor had said, all of us would be suffering from ‘information indigestion’.
And we students just thought of the enormous amount of information that we were then being bombarded with, from the growing number of television channels, books and magazines.

Sadly, that professor died just three or four years later. He never got to see the growth of the Internet, or of the new ailment he had mentioned.
I wonder what he would have thought of Google: a search engine that returns links to so much data, on anything I ask for, that my entire lifespan would not be enough to read even one-hundredth of what it offers.
For instance, I just searched for the term ‘Information indigestion’ on Google. It tells me there are 8,820,000 results.
Imagine! That’s almost 9 million web pages! And, of course, it also proudly proclaims that it took only 0.28 seconds to tell me this.
Now, I must simply trust that Google’s algorithm has ordered all those pages the way I would want, with the most important ones at the top. And so, I click on the first few links.
Keener researchers, I am sure, would spend hours and hours clicking on many other links, collecting, comparing and contrasting the information obtained. Very often, I suspect, they end up confused.
It was Alvin Toffler who popularized the phrase ‘information overload’ in his celebrated work, ‘Future Shock’, published in 1970. Toffler is widely credited with remarkable insight into the future. I remember reading some predictions in that book that seemed bizarre then and have since become real.
According to IBM’s website, “Every day, we create 2.5 quintillion bytes of data [that is, about 2.5 billion gigabytes], so much that 90% of the data in the world today has been created in the last two years alone”!
“This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few. This data is ‘big data’”.
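The unit conversion behind that daily figure is easy to check. Here is a rough sketch in Python, assuming that "2.5 quintillion" means 2.5 × 10^18 bytes; the function name `to_gigabytes` is mine, for illustration only. It works out to about 2.5 billion gigabytes a day, or about 2.3 billion if a gigabyte is counted as 2^30 bytes.

```python
# Assumption: "2.5 quintillion bytes" is read as 2.5 * 10**18 bytes (short scale).
BYTES_PER_DAY = 25 * 10**17


def to_gigabytes(n_bytes, binary=False):
    """Convert a byte count to gigabytes.

    binary=False uses the decimal gigabyte (10**9 bytes);
    binary=True uses the binary gigabyte, or gibibyte (2**30 bytes).
    """
    unit = 2**30 if binary else 10**9
    return n_bytes / unit


decimal_gb = to_gigabytes(BYTES_PER_DAY)
binary_gb = to_gigabytes(BYTES_PER_DAY, binary=True)

print(f"Decimal gigabytes per day: {decimal_gb:,.0f}")
print(f"Binary gigabytes per day:  {binary_gb:,.0f}")
```

Either way, the daily total is counted in billions of gigabytes.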
Like those of many other organizations, IBM’s data scientists are working on storing gigantic volumes of files on database and knowledge-base servers, categorizing them for effective use, and developing models that give us faster access to relevant data, information and knowledge.
Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy.
IBM’s ‘big data’ scientists currently categorize big data, and work on it, along four Vs: volume, velocity, variety and veracity.
We know we are living in critical times, and we are told to generate less garbage and not to pollute the earth.
I am sure the day is not far off when we will be told to generate less data and not to pollute cyberspace.
Every picture we take, every status update we make, and every tweet we tweet is just adding more and more junk to cyberspace!
Even this article. It just added 17 kilobytes.