The IT Corner in NetApp Blogs

“The whole is greater than the sum of its parts.”

― Aristotle

 

Amazon Web Services (AWS) and NetApp recently announced (November 29, 2012) the formation of a partnership to offer customers a hybrid cloud solution that leverages NetApp® data management technology with AWS’ proven and innovative cloud offerings. The result is a cloud solution that offers CIOs, CTOs, and CFOs a sensible, secure, and practical path to adopt the cloud and realize some key benefits: improve IT agility, optimize cost, and focus scarce IT resources on high-value activities to support the business.

Since we announced this partnership, customers and prospects have asked a lot of questions about the solution. To provide more information about this partnership and solution, I’m posting the answers to some of the questions that are most frequently asked.

 

What is NetApp Private Storage for Amazon Web Services?

NetApp Private Storage for AWS consists of dedicated NetApp storage and data management solutions deployed in an AWS-certified colocation facility. The dedicated NetApp storage is accessible to the AWS Elastic Compute Cloud (EC2) and AWS Simple Storage Service (S3) by using high-bandwidth AWS Direct Connect. The solution essentially offers a new tier of storage for AWS users, enabling bidirectional data mobility for existing NetApp customers between on‐premises storage and NetApp storage residing in a Direct Connect colocation facility.
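To make the topology concrete, here is a minimal sketch, in Python, of a check that an application running on an EC2 instance might use to confirm it can reach the dedicated NetApp storage across Direct Connect before mounting it. The IP address, port, and number of attempts below are hypothetical placeholders, not details of the actual solution.

```python
import socket
import time

# Hypothetical private IP of a NetApp NFS export in the Direct Connect
# colocation facility; replace with the address from your own deployment.
NETAPP_NFS_HOST = "10.50.1.20"
NFS_PORT = 2049  # standard NFS port

def check_reachability(host, port, attempts=5, timeout=2.0):
    """Measure TCP connect latency from this EC2 instance to the storage endpoint."""
    latencies = []
    for _ in range(attempts):
        start = time.time()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.time() - start) * 1000.0)
        except OSError as err:
            print(f"connection failed: {err}")
            return None
    return sum(latencies) / len(latencies)

if __name__ == "__main__":
    avg_ms = check_reachability(NETAPP_NFS_HOST, NFS_PORT)
    if avg_ms is not None:
        print(f"average connect latency over Direct Connect: {avg_ms:.1f} ms")
```

Because Direct Connect provides a dedicated, low-latency link, a check like this is mainly a sanity test during initial setup rather than an ongoing monitor.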

 

Why did Amazon partner with NetApp to deliver this solution?

Amazon is looking to move deeper into the enterprise markets where NetApp has a broad and loyal customer base (over 150,000 FAS systems deployed), making NetApp Data ONTAP® the #1 storage operating system in the world.

 

Why did NetApp partner with Amazon to deliver this solution?

The IT industry is in a transformative period where IT is shifting to an “as a service” model. Our customers are interested in leveraging private and cloud resources together. Partnering with the largest public cloud provider, AWS, enables NetApp customers to build hybrid infrastructures that leverage the strengths of private IT resources and public cloud together.

 

What about security—how secure is my data?

Security in the cloud is a huge concern for CIOs; questions about data integrity, protection, and guardianship are important considerations when selecting a cloud provider. The NetApp and AWS solution offers customers the best of both worlds: on-premises enterprise data protection, in the cloud. Because customers own the storage infrastructure, they know where their data is at all times, and they can leverage built-in data protection features to continue to protect their data in their accustomed manner. The only difference is that they now have access to massive compute resources (EC2), which they can use when and how they see fit. Their data is always protected.

 

How is this different from Amazon Simple Storage Service (S3)?

Amazon S3 is designed to provide cloud storage as a service with scale at commodity prices, leveraging the Amazon infrastructure. NetApp Private Storage for Amazon Web Services enables NetApp customers to leverage the same performance, customization, compliance, and control they have with their on-premises NetApp storage, but within the Amazon cloud environment.

 

How is the solution supported?

The customer owns the NetApp hardware in the colocation facility. The colocation provider offers basic unpacking and installation of the equipment (racking), and NetApp support is provided through the customer’s support contract with NetApp. AWS Direct Connect, EC2, and S3 support is provided directly by AWS. In addition, this solution offers basic Elastic Block Storage (EBS) compatibility (via Direct Connect) for AWS EC2 instances. AWS support offers a highly personalized level of service for customers seeking technical help. Customers who do not select AWS support have access to basic support, offered at no additional charge, which includes the Resource Center, product FAQs, discussion forums, and support for health checks.

 

What does it mean to the CIO, the CFO, and the IT operations team and storage administrator?

  • CIOs can confidently implement a cloud strategy to support their business while minimizing risk and meeting data security and compliance requirements.
  • CFOs have a financially viable option to reduce major outlays of capital (capex) and employ a more predictable financial model (opex).
  • IT operations teams and storage administrators can continue to leverage their storage and data management best practices with confidence. Nothing really changes, because they continue to manage the NetApp infrastructure as they did before. Also, AWS is working with major enterprise technology vendors, like BMC Software, CA and others, to integrate their management and operations solutions to make it easy for IT departments to manage and operate emerging hybrid cloud environments.

 

Where can I find more information?

You can find more information about NetApp Private Storage for AWS on the NetApp Field Portal and at the AWS Marketplace. More information about AWS can be found here.

 

So, ready to create your AWS strategy with NetApp? If so, join the conversation at @NetApp on Twitter.

Apple co-founder Woz speaks up. I recently had the good fortune to speak with Steve Wozniak. Steve, affectionately known as “The Woz,” is best known as the co-founder of Apple. He was the brains behind the designs of the company’s original line of computer products, the Apple I and Apple II.

 

He left full-time employment at Apple in the late 1980s. Since then, he’s been involved with a number of technology startups and charitable activities. In 2009, he became the chief scientist at Fusion-io, a NetApp partner that offers technology to accelerate storage performance. I talked to Woz about his personal experiences and what he learned from his days at Apple. Other topics we discussed included innovation, the cloud, how that’s impacting business, the next evolution of computing, flash storage, and his role at Fusion-io.

 

Woz On Being Open To Good Ideas From Anywhere Within Your Company:

“The technical talent people have inside of them has the ability to solve problems. Whereas management will often hold back and take very few risks because they aren’t really sure that it can be solved or it might be too expensive.

“Maybe it’s one person down at your company—one really great engineer that has this idea. So you’ve got to be open to those. And the way to be open to them is like our original HP values, when I worked at Hewlett-Packard: good communication from top to bottom.

“Don’t make it a rule that you can only talk to your boss, who will talk to his boss, who will talk to his boss. The people at the top should just be totally open to talking with anybody—up and down the organization.

“Engineers at the bottom are sometimes very important. They’re the heads that have the ideas that might drive your company with a great product for the next ten years.”

 

Woz Was Lucky:

“Fusion-io had a first mover advantage, but now it’s going to boil down to the sharp brains that thought out the different way that Fusion-io came up with. Are those sharp brains still going to be the leaders in shaping the future? I would predict that they are.

“In the early days of Apple we had almost no risk. It’s a growing market, growing out of absolutely nothing. So when you’re starting out in a brand new field and you have a first mover advantage, everything you touch is gold.”

 

Woz Defines The Cloud:

“The cloud’s a vague term even to me. It can mean different things to different people.

“You don’t know where it is. Cloud computing is a specific hardware organization where resources can be assigned remotely and switched around easily and used more effectively. It saves a lot of physical labor, moving things around, and lets people change their minds easily.

“Nobody should oppose the cloud. In the end the customer’s really going to make a decision in the long term—not the short term—based upon financial cost. And cloud computing has shown a lot of ways that it lowers cost, in terms of total resources that have to be applied to guarantee all the jobs will get done.”

 

What Would Woz Do?

“I would be looking at the cloud as a way that I can quickly assign computing abilities to the members of my company and be more responsive and keep my users in my company just happier about being able to get what they want.

“I would also try to simplify the process of people being assigned equipment that’s in the cloud and maybe get rid of some of the standard bureaucratic overhead. I would try to look at it as an overall cost center but not try to pin it down to individual projects so much.

“You know, don’t complicate things and you have some savings just in direct labor right there.”

 

Woz On His Chief Scientist Title:

“I made it up as really kind of a general category, because my life has become extremely busy since Steve Jobs passed away, with an awful lot more travel, speaking, and opportunities to meet different people. Fusion-io has been flexible with my travel and getting to meet lots of people around the world.

“When I joined Fusion-io I didn’t have as much travel going on, so I attended a lot more meetings. I went on sales calls because it’s important to know the customer. I sat in other general staff meetings and I would propose ideas of how we might get better performance out of our chips, and some engineering ideas. But then what happened was my life got so busy.

“I love this company so much. I speak about them everywhere I go—any chance I get. But I don’t really have time to be working on the planning of the future right now and I really want to. We’re working on ways to bring me in more fully in the near future.”

 

Editor’s note: This is Part 1 of a two-part interview with Woz. Check out Part 2, where Woz discusses topics such as flash and telecommuting.

A journey of a thousand miles begins with a single step. —Lao-tzu

 

Cloud computing is real and it’s here to stay. If there are any lingering doubts about the increasing role and presence of the cloud in the enterprise, here are some facts and figures worth considering about the current and future state of the cloud:

 

  • According to IDC, by 2015, one of every seven dollars spent on packaged software, server, and storage offerings will be through the public cloud model, representing a compound annual growth rate of 27.6% since 2010.
  • According to Gartner’s recent survey of more than 2,000 CIOs, cloud computing (IaaS, PaaS, and SaaS) ranks third (behind analytics and mobile technologies) in their 2013 agenda.
  • 90% of Microsoft’s 2011 R&D budget of $9.6 billion was spent on cloud computing strategy and products, indicating a huge transition in how enterprise customers will deploy and consume IT services and infrastructure.
  • 48% of U.S. federal agencies moved at least one workflow to the cloud following the new requirement that these agencies adopt a cloud-first policy.

 


These impressive facts confirm the shift in momentum toward the cloud, and they should serve as a wakeup call to IT practitioners that they must give the cloud the appropriate level of attention. It’s no longer a question of whether or not to use the cloud, but of how and when.

 

Also, something really interesting is happening as the cloud matures and evolves: It’s turning out to be the biggest gift that ever fell into the laps of IT, because it has the power to transform IT from a cost center to a value enabler and creator. However, there’s still a fog of mystique and vagueness about how to best engage the cloud and how IT should leverage it to support the business. This article offers a roadmap for drafting a strategy for a successful cloud adoption.

 

Why do I need a cloud strategy?

Creating a strategy based on your actual needs and requirements will make it easier to navigate the myriad of options in the evolving cloud ecosystem. The good news is that most IT organizations already have a mandate to evaluate the cloud for their IT needs; and although the natural tendency is to rush in and implement a cloud solution, you need to do your homework and due diligence first in order to understand whether the cloud is a good fit for your environment.

 

How do I go about creating a cloud strategy?

Your guiding principle should be to create value for the enterprise; your goals should be to optimize cost and find areas where the cloud can enable value creation; and your objective should be to make IT more agile and responsive to the needs of the business. My recommendation is to start small. Moving to the cloud can be a daunting and intimidating undertaking. Therefore you need a clear methodology to assess your environment and determine what applications can be easily moved to the cloud. The following seven steps offer a template for creating your cloud strategy.

 

Step 1: Inventory all the applications that IT manages: How many applications do you have? List all applications that are currently running in your environment, including both commercially available and custom applications. The outcome should be a comprehensive list of all the applications managed by IT. Don’t be surprised if there are hundreds of applications on the list.

 

Step 2: Determine whether each application is core or context (separating the wheat from the chaff): Building on step one, map each application on your list to the business process that it supports. If the application supports a key business process, then categorize that application as “core”; if it doesn’t, then categorize that application as “context.” The outcome of this exercise should be two lists: a list of core applications and a list of context applications. In general, about 60 percent of all applications are context.

 

Here are brief definitions of core and context to help you categorize each application.

 

  • Core: Any activity that creates sustainable differentiation in the target market resulting in premium prices or increased volume. Core applications provide true innovation and differentiation. Core management seeks to dramatically outperform all competitors in the domain. An example of a core application could be a unique customer support system.
  • Context: Any activity that does not differentiate the company from the customers' viewpoint in the target market. Context management seeks to meet (but not exceed) appropriate accepted standards in as productive a manner as possible. An example could be a tape backup and archival system.
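To show how the output of steps 1 and 2 might be captured, here is a minimal sketch in Python; the application names, business processes, and core/context labels are invented purely for illustration.

```python
# Hypothetical application inventory from Step 1, with the business process
# each application supports (Step 2). "core" means it differentiates the
# business; "context" means it keeps the lights on.
inventory = [
    {"app": "customer-support-portal", "process": "customer support",    "core": True},
    {"app": "tape-backup-archive",     "process": "data protection",     "core": False},
    {"app": "expense-reporting",       "process": "finance back office", "core": False},
]

core_apps    = [a["app"] for a in inventory if a["core"]]
context_apps = [a["app"] for a in inventory if not a["core"]]

print("Core applications:   ", core_apps)
print("Context applications:", context_apps)
```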

 

Step 3: Identify the stage of the IT assets’ lifecycle: Is the target application due for a hardware or software upgrade any time soon (say, within the next 12 months)? The typical depreciation cycle for the physical assets is 3 to 4 years, so assets that are due for a software or hardware refresh may be perfect candidates for the cloud. The outcome of this exercise is a list of all assets with their retire/refresh dates.

 

The typical asset lifecycle phases are: plan, acquire, deploy, manage, and retire. Focus on assets that are in the “manage” and “retire” stages.
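Continuing the same illustrative approach, the sketch below filters a hypothetical asset list down to the candidates step 3 describes: assets in the manage or retire stage whose refresh date falls within the next 12 months. All names and dates are made up.

```python
from datetime import date, timedelta

# Hypothetical asset records with their scheduled hardware/software refresh dates.
assets = [
    {"asset": "erp-db-cluster",   "stage": "manage", "refresh_date": date(2013, 9, 1)},
    {"asset": "backup-appliance", "stage": "retire", "refresh_date": date(2013, 3, 15)},
    {"asset": "new-crm-servers",  "stage": "deploy", "refresh_date": date(2016, 1, 1)},
]

horizon = date.today() + timedelta(days=365)  # "due within the next 12 months"

candidates = [
    a for a in assets
    if a["stage"] in ("manage", "retire") and a["refresh_date"] <= horizon
]

for a in candidates:
    print(f'{a["asset"]}: refresh due {a["refresh_date"]}, candidate for the cloud')
```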

 

Step 4: Perform a technical analysis of the target applications: Assess the level of customization, amount of intellectual property, number of integration points, network connectivity requirements, and security that each target application contains or requires. You should be able to answer the question: What would it take to move this application to the cloud? The outcome should be a comprehensive document assessing the technical feasibility of moving each workload to the cloud.

 

Step 5: Perform a financial analysis of the target applications: Compare the cost of running the infrastructure on premises versus off premises. Be sure to include direct costs such as staff, data center costs (space, cooling, power, and so on), software, maintenance, and support. The financial analysis should also consider the impact of a capex-centric budgetary process compared to an opex-centric one. These are not subtle changes, and it would be wise to consult with your finance team to make sure that IT is aligned with the company’s financial strategy. At this point you are beginning to build a business case, so naturally the outcome should be a business case outlining the financial pros and cons of moving the targeted IT applications to the cloud. An interesting concept to recognize here is that in the cloud we are no longer talking about “total cost of ownership” but “total cost of operations,” the new TCO.
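As a rough illustration of the step 5 comparison (not a benchmark or a pricing model), the sketch below totals hypothetical three-year costs for one workload on premises versus in the cloud; every figure is an assumption you would replace with your own numbers.

```python
# Hypothetical 3-year comparison for one workload. All numbers are made up
# purely to show the shape of the analysis.
YEARS = 3

on_prem = {
    "hardware_capex":  120_000,          # one-time purchase, depreciated over 3 years
    "software_maint":   15_000 * YEARS,  # annual maintenance and support
    "datacenter_opex":  20_000 * YEARS,  # space, power, cooling
    "staff":            30_000 * YEARS,  # admin time allocated to this workload
}

cloud = {
    "compute_opex":     35_000 * YEARS,  # pay-as-you-go instances
    "storage_opex":     18_000 * YEARS,  # dedicated storage plus colocation fees
    "network_opex":      6_000 * YEARS,  # connectivity and data transfer
    "staff":            15_000 * YEARS,  # reduced operational overhead
}

print("on-premises 3-year total:", sum(on_prem.values()))
print("cloud 3-year total:      ", sum(cloud.values()))
```

Framing both columns as totals over the same period keeps the capex-versus-opex difference visible without hiding it in annualized figures.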

 


Step 6: Evaluate whether the cloud can optimize cost and/or create value: This is a critical step in the process because you are now ready to determine whether the cloud offers an opportunity to optimize cost (expense reduction) or to create value (increase revenue). Remember that the goal is not just to move a workload to the cloud but to create value in the process. For example, take a backup/archive workload, typically considered a necessary expense. If you were to move this workload to the cloud, not only are you reducing cost (no more tape, dedicated hardware, software, or staff), but you are also creating the opportunity to unlock the data and make it available to the business, turning dark or dormant data into an asset that can be used for analytics, dev/test, and other use cases. This is the transformative power of the cloud: transforming an otherwise “context” application into a “core,” and hence differentiating, asset for your company.

 

Step 7: Engage your technical partners, vendors, and peers: After completing the first six steps, you’ll be in a better position to create a sensible strategy and start engaging your technology partners and vendors, based on your specific technical and business requirements. So, are you ready for the journey?

 

In my next blog we’ll talk about moving low-risk, high-impact workloads to the cloud.

"A great place to work is one in which you trust the people you work for, have pride in what you do,

and enjoy the people you work with.” — Robert Levering, Cofounder, Great Place to Work

 

 

I decided to take a break from writing about technology and to write about culture and people instead. As you probably know, NetApp was recently selected as one of the best places to work in the United States, coming in at number 6. (The other company from the Bay Area in the top 10 was Google, at number 1.) I wasn’t surprised at NetApp’s high ranking; I expected it. NetApp has been among the 10 best places to work for many years in a row, and in 2009 we were selected as the best place to work in the United States. I’ve been at NetApp for almost 5 years, and I’m always impressed by the quality of the people I work with: My peers, colleagues, and leaders are truly dedicated to the success of our customers. I don’t need a survey or a report to tell me that I work in a great place, but it’s gratifying to know that others feel the same way I do.

 

What does this recognition mean?

 

This recognition means that NetApp is focusing on its employees, who in turn focus their energy and attention on the customer. I believe that this recognition is a reflection of who we are and how much pride we take in doing our jobs. It’s indicative of our culture, which values innovation, teamwork, diversity, and passion for taking care of our customers. Our customers do come first.

 

 

What makes a great place to work? 

According to the Great Place to Work Institute, a great place to work is one where employees consistently:

 

  • Trust the people they work for.
  • Trust (n). Firm belief in the reliability, truth, ability, or strength of someone or something. Managers are very important in keeping employees motivated. According to the Department of Labor, 64% of working Americans leave at least one job during their careers because they don't feel appreciated. Most employees don’t leave a company; they leave a manager.

 

  • Have pride in what they do.
  • Pride (n). A feeling of pleasure from one's own achievements, from the achievements of those with whom one is associated, or from qualities or possessions. It’s a great time to be a technologist and innovator, and people at NetApp are laser focused on creating solutions that help our customers manage and gain value from their data. NetApp employees report a high degree of job satisfaction, which translates into higher productivity, improved product quality, and greater customer satisfaction.

 

  • Enjoy the people they work with.
  • Enjoy (v). Take delight or pleasure in an activity or occasion. Why wouldn’t we? After all, we spend a significant part of our day and our lives at work.

 

What does it mean to our customers and partners? 

It’s a fact that people do business with people they like. For our customers and partners, this recognition means knowing that the NetApp professionals who help create and deliver value are committed to their success.

 

What’s next for NetApp?

As we continue to grow and innovate, it’s more important than ever to attract and retain top talent. We are hiring people who want to be part of something big, important, and meaningful. This recognition is a great recruiting resource, because talented and interesting people want to work with equally talented and interesting people.

 

Congratulations, NetApp, and keep up the good work!

 

Trust + Pride + Enjoyment = A Great Place to Work.

As big data continues to push and stretch the limits of conventional database and data processing technologies, Hadoop is emerging as an innovative, transformative, and cost-effective solution for tackling big data challenges. Hadoop, the open-source software framework that supports data-intensive distributed applications, was created by Doug Cutting in 2006 while he was at Yahoo!. In this interview, Doug shares his insights about the genesis and future of Hadoop.

 

  • According to some estimates, the total addressable market for big data is about $100B
  • Enterprise Strategy Group research shows that 50% of IT organizations are either doing something with Hadoop or planning to do something with Hadoop in the next 12 to 18 months.
  • One of the most searched terms on the Gartner website is Hadoop. In the last 12 months, those searches spiked over 600%.

 

Did you ever imagine Hadoop was going to get that big?

No, no, not at all. I was very lucky to happen across something that was becoming a big trend in computing. At the time, I thought, “There’s this wonderful technology at Google. I would love to be able to use it but I can’t because I don’t work at Google.” Then I thought, “There are probably a lot of other people who feel that same way, and open source is a great way to get technology to everyone.” I understood from the beginning that by using Apache-style open source, we could build something that would become the standard implementation of that technology. That was the goal from the outset, to build a commodity implementation of GFS and MapReduce.
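For readers who haven't seen the model Doug is referring to, here is a minimal word-count sketch in the MapReduce style, written as a Hadoop Streaming-like mapper and reducer in Python. It only illustrates the map/shuffle/reduce idea; it is not Nutch, Yahoo!, or Cloudera code, and the local sort merely stands in for the framework's shuffle phase.

```python
import sys
from itertools import groupby

def mapper(lines):
    """Map step: emit (word, 1) for every word in the input."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    """Reduce step: sum the counts for each word (input must be sorted by key)."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Locally, the framework's sort/shuffle phase is simulated with sorted().
    mapped = sorted(mapper(sys.stdin))
    for word, total in reducer(mapped):
        print(f"{word}\t{total}")
```

Run locally, for example, with: echo "to be or not to be" | python wordcount.py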

 

Let’s talk about your job as Chief Architect at Cloudera. What do you do at Cloudera?

I have three things that I do, but I don’t really think of any of them as being an architect.
I spend about a third of my time working on software sales as an engineer. I like to keep my hands dirty. Ever since I was in my 20s, I’ve thought that pretty soon I’d be too old to program, and every decade I’ve kept programming. I’m about to turn 50, and we’ll see if I keep on programming through my 50s. I don't know. So that’s about a third of my time. 

I spend roughly another third doing what I call politics, which is largely working at Apache, helping to keep things running smoothly. I’m chair of the board of directors, so it’s my responsibility to put together the agenda for the meetings, to run the meetings, and to try to resolve the issues that come up. 

My third role is as part of the marketing team for Cloudera and for big data and Hadoop generally.

 

When you created Hadoop in 2006, the term “big data” hadn’t even been coined. What was the problem you were trying to solve back then?

Most of the technology that we named Hadoop in 2006 was actually stuff that we’d been building since about 2003 in a project called Nutch. The problem I was trying to solve was a very specific problem of crawling the web, collecting all of these web pages, building indexes for them, and maintaining them.

For the Nutch project we needed distributed computing technology, we needed to store datasets that were much bigger than we could store on one computer, and we needed to have processes that would run and be coordinated across multiple computers. We saw the papers from Google, the GFS paper and the MapReduce paper, and we thought, “That’s the right platform.” So we set about rebuilding that platform. 

The problem we were trying to solve was a very specific problem of building search engines. Then people said, “Hey, this would be useful in a lot of other problems,” although initially, when we turned it into Hadoop, that was at the behest of Yahoo!, who weren’t interested in the search part. They were just interested in the general-purpose distributed platform. So we decided to take that part and call it Hadoop. But Yahoo! was interested in it for web search, for the same problem that we were already using it for. That was what drove the decision by Yahoo! to adopt this technology.

 

Do you think business and IT leaders understand the value and potential of Hadoop? Do they get it?

I think they’re beginning to get it, yes. There’s been a lot of good writing about this trend, and I think people recognize the trends that are leading us here. With hardware becoming more and more affordable, and keeping in mind Moore’s Law, you can afford a huge amount of computation, a huge amount of storage, yet the conventional approaches don’t really let you exploit that. On the other hand, more and more of our business is becoming online business. Businesses are generating vast quantities of data. If you want to have a picture of your business, you need to save that data; and you need to save it affordably and be able to analyze it in a wide variety of ways.

 

In reference to Geoffrey Moore’s “crossing the chasm” theory, do you think that Hadoop has crossed the chasm from being an early adopter project to an “early majority”?

I haven't seen a real chasm that we need to cross or a trough that we need to get through. There’s a lot of attention, a lot of hype, but I believe that the level of adoption is steadily increasing. People’s expectations are reasonably well matched. They understand that it’s a new technology, and they’re cautious about moving to it because it generally involves a big investment in hardware, and you have to train people. So they start with maybe 15-node clusters, maybe not production, exploring the technology. But the next year they double the size of their cluster, or they start another cluster. It appears to be a relatively steady rate of adoption, rather than the classic hype curve, where people assume that it’s going to be much bigger than it is and then they’re disappointed. After that, it actually becomes a stable part of the established platform. With Hadoop, I don’t see that overhang, where the adoption is over-anticipated. Maybe I’m blind to it because I’m right in the middle of it, but it seems like the expectation is that it will be a big part of computing.

 

What can CIOs do to explore what’s possible with Hadoop?

IT should facilitate experiments. It should deploy a test cluster with representative datasets from around the organization. These might just be subsets of full datasets if the full datasets are too big, or they might be slightly out of date. The point is to permit folks to try analyses that weren't possible before, either because the datasets were in different silos or because the systems that hosted them didn't support certain analyses, e.g., machine learning algorithms. If you provide such an experimental playground to the smart folks in a business, they can try out ideas to find things that help the business.
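One concrete way to build the "representative datasets" Doug mentions is to sample a manageable subset of a large file before loading it onto the test cluster. The sketch below uses reservoir sampling; the file names and sample size are placeholders, not recommendations.

```python
import random

def reservoir_sample(lines, k):
    """Keep a uniform random sample of k lines from a stream of unknown length."""
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            sample.append(line)
        else:
            j = random.randint(0, i)
            if j < k:
                sample[j] = line
    return sample

if __name__ == "__main__":
    # Hypothetical paths; replace with your own source data and target file.
    with open("full_dataset.csv") as src:
        subset = reservoir_sample(src, k=100_000)
    with open("test_cluster_subset.csv", "w") as out:
        out.writelines(subset)
```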

 

Providing an experimental playground is important, but you also need Hadoop expertise, which is in very high demand these days. How does Cloudera help bridge that gap?

That’s right; we do a lot of training. Training has become a huge part of our business, because people need to be trained in order for the technology to grow. We have courses in everything from cluster administration to programming in the various components, and we’ve just added a course in data science. There are a couple of generations of people out there in the industry who learned a certain way of doing things, and it can be hard to transition. But people can and do learn the new skills they need. They’re not so different. System administrators who have Linux and UNIX skills tend to be the most successful, because it’s basically built on UNIX. So people who are familiar with that world tend to pick up the new skills pretty quickly. I believe that many companies already have people who can become data scientists, people who know their business problems, who understand the data to analyze in the company, who know a little statistics and can put those skills together and come up with solutions for the company.

 

In the context of big data, when is Hadoop an appropriate platform and when is it not?

It’s appropriate in more and more cases. Originally it was very much a vast computing engine, so it was appropriate for off-line analyses of massive datasets, analyses that businesses couldn't afford to do. Many of our customers were able to keep the last month of their data online and visualize it and analyze it, but with Hadoop, they’re able to take five years of data as the basis of their analysis. That allows them to see a lot more trends with better precision and to do a better job of marketing or selling or manufacturing, whatever they’re trying to improve, but as a batch off-line process. We added Apache HBase to the stack, so that companies are able to keep online key value stores on which they can do batch analyses, and can also update them and look at them in real time. Now with Impala we can do complex queries over multiple tables interactively, and that opens up yet more applications.

 

There are lingering questions about Hadoop’s security and reliability. Why do you think Hadoop is ready for the enterprise?

We’ve been attacking those very problems. We started with authentication and authorization, and now that’s pretty much right across the platform. Now we’re deploying encryption across the platform. We’re not quite all the way there yet, but we’re getting closer to our goal of being able to encrypt data end to end, as well as at rest. This is very much driven by enterprise needs.

In terms of reliability, we’ve spent a lot of time over the past year working on high availability for HDFS, and that’s been out in customers’ hands for six months or so. Now we’re working on snapshots to provide better disaster recovery so that businesses can do replication in multiple data centers more affordably. Those are some of the ways in which enterprise demands are driving the development of software today.

 

What is the role of vendors like Cloudera in helping customers realize the value of Hadoop?

Fundamentally, Cloudera's role is filling the gap between what a bunch of open source projects deliver and what an enterprise wants. The first part of that is a well-tested, integrated software distribution. Then there's support for that distribution: answering questions, helping to resolve difficulties, and promptly supplying bug fixes. Finally, there's our Cloudera Manager software that helps to configure and monitor clusters. Combined, these let enterprises focus on using Hadoop to solve their problems rather than wasting their time figuring out how to install, configure, and debug it.

 

What would you like to share with CIOs about Hadoop’s future and why they should be investing in Hadoop now?

Businesses should be investing in Hadoop because it can help them solve the problems that they have today. In the long term, I think all the trends point to Hadoop becoming a mainstay of enterprise data computing, if not the mainstay. It’s a general-purpose platform that will be able to handle  most of the workloads that businesses are now doing with other systems, as well as new kinds of workloads that weren't possible before and different kinds of analyses that weren’t practical on earlier systems. There’s definitely value in getting started with Hadoop and finding that first application, but the first application should also be something that’s useful.

 

Consider Hadoop and mobile technology: there’s potential to create a massively distributed architecture using mobile technology, with compute, storage, and networking in a smartphone. Can you imagine using Hadoop in a way that’s akin to SETI@home?

Well, the place where Hadoop shines the most is where there are huge amounts of data that needs to be moved around. So the SETI approach of having lots of computers all over the world connected wouldn’t work so well with Hadoop because it’s hard to move the data to all these different places. What tends to work well is to have all the data in one data center and have very fast networks between those nodes and do the computing there. Although cell phones are the new commodity hardware for consumers.

 

Seven billion of them! The potential is there.

Yes and the processors in cell phones are much more cost effective. You need maybe 10 of them to take the place of one traditional CPU, but they still use much less power. We’re already starting to see clusters built with ARM processors, for example, and that will make things much less expensive. But I don't know that we’ll see the data-intensive computing that Hadoop is known for, spread out to 7 billion cellphones. The front end will be cellphones, but the data movement is going to stay in the data center.

 

On a personal level, I’m curious to know what inspires you.  What drives you to solve big challenges? 

I like to think about technologies that will make a difference. I’ve always loved open source because it’s such a tremendous lever. What I look for is a way to find the smallest thing I can do, with the least amount of work that will have the most impact. Where is the leverage point? Hadoop came out of that. We needed to do some vast computing, but I also saw a lot of other workloads that could benefit from this.

 

What interesting books have you read lately?

The last novel I read was Anathem, by Neal Stephenson. It’s a great science fiction book. I gave it to my 12-year-old son and he loved it too.

 

What is the book about?

At the beginning, it’s about a convent of philosophers, a bunch of people who live separate from the rest of society. They think about knowledge—all the big, fundamental questions. They’re philosophers, and they study all the different theories about the world. They don’t settle on any particular theory, but they talk about all of these different philosophies and their merits and the consequences of understanding them. It’s a fun thing to think about all these different belief systems without being dogmatic and saying, “Oh, no, you have to believe this.” And then it turns into a great swashbuckling adventure with aliens and rockets and other things.
