This post was originally published to Big Data Bingo.
Architecting Hadoop and Big Data Analytics for the Enterprise
At Cisco Live in Melbourne (week of March 4th), NetApp and Cisco are demonstrating a joint reference architecture for Hadoop based on NetApp storage and Cisco servers and networking. The joint reference architecture makes Hadoop easier to deploy and provides high availability for Hadoop clusters, which is ideal for enterprises in data-intensive industries that have tight SLAs around their data applications. It expands our already strong alliance with Cisco by offering pre-sized storage, networking and compute for Big Data analytics.
Let’s break out the main components of the reference architecture. Storage comes from NetApp E5400 storage arrays and FAS 2240 storage systems, which provide highly reliable, scalable storage for Hadoop while improving cluster efficiency. Compute and networking come from Cisco UCS C220 M3 servers for the DataNodes, UCS 6296 Fabric Interconnects and Nexus 2232 Fabric Extenders. The result combines highly reliable storage from NetApp with scalable, high-density Cisco servers and fast Cisco interconnects, and is tested with Cloudera (Enterprise Manager). Based on the successful FlexPod model, this joint reference architecture delivers pre-sized, validated components for Hadoop workloads with the benefit of seamless integration with existing FlexPod deployments.
Some of the key innovations include 10 PB of scalability in a single switching domain, more than 1 PB of storage capacity per rack, and hot-swappable spares. Because RAID protects against disk failure, the reference architecture needs only two copies of data to run Hadoop, whereas Hadoop by default keeps three copies for high availability of the data. This not only increases storage efficiency, it also leads to less network congestion and higher performance. The higher reliability means that Hadoop clusters fail less often, jobs restart less often and recovery from downtime is faster.
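To make the two-copies-versus-three point concrete, here is a rough back-of-the-envelope sketch in Python. It is an illustration under stated assumptions, not a sizing tool from NetApp or Cisco: the function names and the 100 TB dataset are hypothetical, RAID parity overhead is ignored (the RAID level is not stated here), and the network estimate assumes the first copy of each block is written on the local node so only the remaining copies cross the network.

```python
def raw_capacity_tb(logical_tb: float, copies: int) -> float:
    """Raw storage consumed when every block is stored `copies` times.

    Assumption: RAID parity overhead is ignored in this sketch.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return logical_tb * copies


def replica_traffic_tb(logical_tb: float, copies: int) -> float:
    """Data sent over the network to create the additional copies.

    Assumption: the first copy lands on the writing node, so only
    (copies - 1) copies travel across the network.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return logical_tb * (copies - 1)


def savings_fraction(baseline: float, reduced: float) -> float:
    """Fraction saved when moving from `baseline` to `reduced`."""
    return 1.0 - reduced / baseline


if __name__ == "__main__":
    logical_tb = 100.0  # hypothetical dataset size, for illustration only
    for copies in (3, 2):
        raw = raw_capacity_tb(logical_tb, copies)
        net = replica_traffic_tb(logical_tb, copies)
        print(f"{copies} copies: raw capacity {raw:.0f} TB, replica traffic {net:.0f} TB")
    saved = savings_fraction(raw_capacity_tb(logical_tb, 3), raw_capacity_tb(logical_tb, 2))
    print(f"raw capacity saved with 2 copies: {saved:.1%}")
```

Under these assumptions, dropping from three copies to two cuts raw capacity by a third and halves the replica traffic generated on ingest, which is the intuition behind the efficiency and congestion claims above. In a stock Hadoop deployment the number of copies is governed by the `dfs.replication` property.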
The joint reference design is anticipated to become a Cisco Validated Design (CVD) later in March; please check the CVD webpage for publication. Solutions based on the reference architecture are planned to be available in summer 2013 via the well-established FlexPod channel, as well as from field sales and partners of both companies.