While most non-techies have never heard of Google’s Bigtable, they’ve probably used it. It is the database that runs Google’s Internet search, Google Maps, YouTube, Gmail, and other products you’ve likely heard of. It’s a big, powerful database that handles lots of different data types.
Now Google is opening up Bigtable to business customers as a cloud-based service that it says is less pricey than competing NoSQL databases such as Amazon’s (AMZN) DynamoDB. NoSQL, or “not only SQL,” databases differ from SQL databases in that they handle many unstructured data types, not just the structured row-and-column data of the relational world.
And, to make sure Google Cloud Bigtable can work with lots of data without a ton of prep work, it will support the HBase application programming interface (“API”). HBase, an open-source NoSQL database, is used by tons of companies, from eBay to Pinterest, to handle their data troves.
Miles Ward, head of global solutions for Google (GOOG), claimed that Google Cloud Bigtable, which the company is making available in beta test form, performs 2.5 times faster than Cassandra or other NoSQL databases. And, because of that HBase API support, it can work with tools like Hadoop, Spark, Storm, and BigQuery, which are popular in the IT shops at companies including Walmart.
SunGard is using the new Google service as it competes to build a consolidated audit tool for the U.S. financial markets. It went that route because of Bigtable’s ability to deal with huge volumes of data. For this application, the data set would amount to 35 petabytes for seven years’ worth of information, said Neil Palmer, CTO of SunGard Consulting Services. “Rather than get a data center and buy hardware up front we can scale up and down with Bigtable.”
Qubit is testing Bigtable along with its existing HBase implementations and is very pleased, said Emre Baran, CTO of the London-based web analytics and personalization company.
Baran said Bigtable is easier to use than HBase, which needs hands-on tweaking and management when dealing with big data loads. “With Google you take your code, tell it how many nodes and Google guarantees performance,” he said.
As for pricing, Google has separated storage from computing. Each compute node will cost 65 cents per hour, with a minimum of three nodes overall. Fast solid-state storage will be 17 cents per gigabyte per month, with cheaper magnetic disk storage coming soon at 2.6 cents per gigabyte per month. That means it’s affordable for companies to let data lie in Google storage for months on end for minimal cost and then just pay for processing by the minute.
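To make those numbers concrete, here is a rough back-of-the-envelope sketch in Python. Only the four prices and the three-node minimum come from Google's announced beta pricing; the function names, the sample workload, and the idea that compute and storage can be totted up separately like this are illustrative assumptions, not Google's billing logic.

```python
# Prices quoted for the Cloud Bigtable beta (in dollars).
NODE_PRICE_PER_HOUR = 0.65       # per compute node, per hour
MIN_NODES = 3                    # smallest cluster Google will run
SSD_PRICE_PER_GB_MONTH = 0.17    # solid-state storage
HDD_PRICE_PER_GB_MONTH = 0.026   # magnetic disk storage (announced as coming soon)


def compute_cost(nodes, hours):
    """Estimated cost of running `nodes` compute nodes for `hours` hours."""
    if nodes < MIN_NODES:
        raise ValueError(f"a cluster needs at least {MIN_NODES} nodes")
    return nodes * hours * NODE_PRICE_PER_HOUR


def storage_cost(gigabytes, months, ssd=True):
    """Estimated cost of keeping `gigabytes` of data stored for `months` months."""
    rate = SSD_PRICE_PER_GB_MONTH if ssd else HDD_PRICE_PER_GB_MONTH
    return gigabytes * months * rate


if __name__ == "__main__":
    # Hypothetical workload: 1,000 GB parked for three months,
    # then crunched by a minimum-size cluster for 10 hours.
    print(f"SSD storage:  ${storage_cost(1000, 3):.2f}")
    print(f"Disk storage: ${storage_cost(1000, 3, ssd=False):.2f}")
    print(f"Compute:      ${compute_cost(3, 10):.2f}")
```

Under these assumptions, parking a terabyte on solid-state storage for three months comes to about $510, or $78 on magnetic disk, while the 10-hour processing run on three nodes adds $19.50.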
Amazon’s DynamoDB is a great product, Baran said, but the practice on the e-commerce giant’s cloud service is to charge per database read or write operation, and that gets expensive fast with his type of application.
Bigtable, which has been in production at Google for a decade, is the granddaddy of the NoSQL world, albeit a “grandpa that can run circles around its grandkids,” Baran said.
Google’s technology has a ton of credibility for large-scale workloads. What isn’t clear is how well the company will do selling that technology into business accounts. Google, after all, is a consumer-focused Internet search and advertising company at its heart, and it has launched and pulled back lots of projects: Google Glass, Google Wave, Google Reader, etc.
Baran, who used to work at Google, said even he hesitated to jump aboard the Bigtable bandwagon until he heard about the HBase API support. That means even if Google did shut Cloud Bigtable down, it would be trivial to move his company’s data somewhere else, he said. But, given that Google itself depends on Bigtable, there is little or no possibility that it would shutter this service, he noted.
The technology is impressive, agreed Gartner Research Director Nick Heudecker. But many still wonder how serious Google is about courting business users and workloads. So far, the vendor’s approach has been “if we build it they will come,” he said. And it’s not clear how well that’s working.