In an effort to put big data’s growth into perspective, UC San Diego computer science and engineering professor Larry Smarr offered this analogy: place one grain of rice on the first square of a chessboard, then double the number of grains on each successive square.
“By the time you get to the 64th [and final] square, you now have a pile of rice that’s as big as Mount Everest,” said Smarr, “which is 1,000 times the production of rice on the planet.”
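A quick back-of-the-envelope calculation shows why the analogy lands roughly where Smarr puts it. The sketch below assumes a grain mass of about 25 milligrams and annual world rice production of roughly 500 million tonnes; both figures are outside estimates, not numbers from the talk.

```python
# Back-of-the-envelope check of the chessboard analogy. The grain mass
# (~25 mg) and annual world rice production (~500 million tonnes) are
# rough outside assumptions, not figures from the talk.
GRAIN_MASS_KG = 25e-6          # assumed mass of a single grain of rice
ANNUAL_RICE_TONNES = 500e6     # assumed annual world rice production

grains = 2**64 - 1             # grains on all 64 squares, doubling each square
mass_tonnes = grains * GRAIN_MASS_KG / 1000

print(f"Total grains on the board: {grains:.2e}")      # ~1.8e19 grains
print(f"Total mass: {mass_tonnes:.2e} tonnes")          # ~4.6e11 tonnes
print(f"Multiple of annual production: {mass_tonnes / ANNUAL_RICE_TONNES:,.0f}x")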
Such wonderment at big data’s impact on scientific inquiry and everyday lives was the theme of a March 12 seminar and panel discussion, “Big Data: A Conversation with the Experts,” presented by UC San Diego Extension and recorded by UCSD-TV for future airing.
“Never in our history have we had a sustained period of this kind of exponential growth [in computer science],” said Smarr, the seminar’s lead speaker and director of the California Institute for Telecommunications and Information Technology (Calit2). “It keeps getting bigger and bigger. … What we’re talking about is something humanity has never tried to deal with before.”
Amid intermittent references to petabytes and exabytes, computer shorthand for quadrillions and quintillions of bytes of digital information, the session dealt with the gathering, storage, and analysis of massive amounts of computerized data, both unstructured and multi-structured.
The 90-minute seminar was held in Atkinson Hall at UC San Diego’s Qualcomm Institute.
Another speaker, Dr. Michael Norman, director of the campus’ San Diego Supercomputer Center (SDSC), spoke glowingly of its massive supercomputer, launched nearly two years ago and dubbed “Gordon.”
Rated as one of the world’s fastest large-capacity supercomputers, Gordon shines as a big data repository. Named for its massive amounts of flash-based memory, Gordon crunches data from far-flung fields – climatology, Wall Street, food production, big industry, physics, biological science, government – the list goes on and on.
Norman outlined the three defining characteristics of big data: 1) volume, the sheer amount of data; 2) velocity, the speed at which it is generated; and 3) variety, the range of forms it takes.
Michael Zeller, founder and chief executive officer of San Diego-based Zementis, a leading big data firm, added a fourth “V” to the list: value.
“Is there big hype?” asked Zeller. “Absolutely. But big data transcends all typical boundaries, and we’re just now scratching the surface of big data’s ultimate value in business opportunities. The hype has alerted the executive level about big data. Cutting through the noise is a bigger issue.”
Stefan Savage, professor of computer science and engineering at UC San Diego, pointed out big data’s inherent security risks, notably the recent massive data leaks from Target and, more troubling, from the National Security Agency.
His admonition: “Security is very much a data-driven field. The goal is to understand the environment better, faster, and more efficiently than your adversaries.”