Is yours bigger than mine? Big data revisited
Google Scholar lists 2,090 publications that contain the phrase 'big data' in their title. And that's just from the first 9 months of 2014! The titles of these articles reflect the interest/concern/fear in this increasingly popular topic:
One paper, Managing Big Data for Scientific Visualization, starts out by identifying a common challenge of working with 'big data':
Many areas of endeavor have problems with big data…while engineering and scientific visualization have also faced the problem for some time, solutions are less well developed, and common techniques are less well understood
The authors then go on to discuss some of the problems of storing 'big data', one of which is:
Data too big for local disk — clearly, not only do some of these data objects not fit in main memory, but they do not even fit on local disk on most workstations. In fact, the largest CFD study of which we are aware is 650 gigabytes, which would not fit on centralized storage at most installations!
Wait, what!?! 650 GB is too large for storage? Oh yes, that's right. I forgot to mention that this paper is from 1997. My point is that 'big data' has been a problem for some time now and will no doubt continue to be a problem.
I understand that having a simple, user-friendly label like 'big data' helps with the discussion, but it remains such an ambiguous and highly relative term. It's relative because whether you deem something to be 'big data' or not might depend heavily on the size of your storage media and/or the speed of your networking infrastructure. It's also relative in terms of your field of study; a typical set of 'big data' in astrophysics might be much bigger than a typical set of 'big data' in genomics.
Maybe it would help to use big data™ when talking about any data that you like to think of as big, and then use BIG data for those situations where your future data acquisition plans cause your sys admin to have sleepless nights.