Here’s a little presentation I had to give to my team at work when we first deployed MongoDB and I needed to bring them up to speed.
I discovered this handy little sandbox environment put out by Hortonworks that lets you play with Hadoop and learn a little bit about it. We use Hadoop here at work, but it’s not something I can exactly play with while it’s in production, so I decided I’d stand up my own instance and see what I could do with it.
I downloaded Hadoop from their website over at http://hadoop.apache.org/ but as I started looking into the install instructions, I realized their documentation isn’t all that great. They assume a lot, and it’s not really geared toward someone who isn’t already familiar with the product. It’s one of those scenarios you often see in IT: you need to know about the “thing” before you can learn about the “thing”. If you’re curious and want to try to make heads or tails of the single-node install, you can go here: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
I started poking around on the net, figuring there had to be some sort of tutorial on how to get Hadoop installed, and I found a company called Hortonworks. They do Hadoop. They also have a sandbox you can download that has everything you need pre-installed and configured for you. It imports into VirtualBox, Hyper-V, or VMware, and the download size is about 2 GB.
The other cool thing about this is the included tutorials. It doesn’t really help to download and install something if you don’t know what to do with it after that. There are also links to community-contributed tutorials and partner tutorials from folks like Splunk, Concurrent, Tableau, and Datameer. We have just started using Splunk here at work, so I’m looking forward to digging into that a little bit too. The main Hortonworks page that you use in your install has a button to update the tutorials, which Hortonworks refreshes every so often. I’ve only had this running for a few days, so it’s not time to try that feature out yet.
All in all, this looks pretty sharp and well put together for people who are new to big data and Hadoop. I’d highly recommend this sandbox to get yourself up and running and learning a little Hadoop. Good luck!
Here’s a quick rundown of what I got in the download. Your version may be different by the time you read this.