Friday, September 23, 2011

EMC Puts Together 1,000-Node Hadoop Care Package

Cloud Computing: EMC Puts Together 1,000-Node Hadoop Care Package — EMC, well, at least its Greenplum unit has put together a thousand-node multi-tenant analytics platform to accelerate the development and testing of the disruptive but persnickety open source Apache Hadoop software. That’s 1,000-odd hardware nodes or 10,000 nodes when you count virtual machines, along with 24PT of physical storage. In a word, that’s huge. Greenplum wants to ensure that Hadoop turns into a really serious enterprise-ready Big Data tool for sifting through mounds of unstructured data to unlock their secrets and make predictions but contends that nobody really knows how to deploy the stuff in part because everybody’s Hadoop is different. So it wants to come up with a formula others can follow and overturn the long-standing Hadoop dogma that one can’t have separate compute and storage on the same nodes. Greenplum says it increases efficiency and requires less hardware, pointing to Yahoo’s 40,000 Hadoop servers, which only offer 10%-15% utility. It thinks it can get 80%.

No comments:

Post a Comment