This post shows how to get a fast kickstart into development with Hadoop (1.2.1). We will use Vagrant (1.2.7) to provide a virtual Hadoop server machine. First, visit the Vagrant homepage and install Vagrant on your system. We also need VirtualBox (4.x) to actually run the VM.
Once that is done, create a new directory and, inside it, a new file called Vagrantfile. This file will contain the configuration for the virtual machine that runs Hadoop.
  config.vm.provider :virtualbox do |vb|
    vb.gui = false
    vb.customize ["modifyvm", :id, "--memory", "2048"]
    vb.customize ["modifyvm", :id, "--cpus", "2"]
  end
end
We use Ubuntu 12.04.2 LTS x64 as the operating system, with 2 GB of RAM and 2 virtual CPU cores. The machine will be reachable from our local machine under the IP 10.10.10.10. Now let's delegate the tedious work of creating and booting a VM to Vagrant by executing vagrant up. Vagrant will load the base image, configure the network and start the VM with VirtualBox. When Vagrant has finished, you can SSH into the machine with vagrant ssh. Opening a secure shell to the VM via its private network IP 10.10.10.10 is possible, too, but so far we don't have a username/password to get access that way. So let's enter the VM with vagrant ssh and install Hadoop. For that I have created a single script that does all the work.
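The Vagrant steps just described can be wrapped in a small shell helper. This is only a sketch: the function name start_vm is made up for illustration, and it assumes you run it from the directory that holds the Vagrantfile. Only the vagrant up, vagrant status and vagrant ssh subcommands are used.

```shell
#!/usr/bin/env bash
# Sketch: boot the VM described by the Vagrantfile in the current directory.
# start_vm is a hypothetical helper name, not part of Vagrant itself.

start_vm() {
  # Stop early when a prerequisite is missing instead of failing halfway.
  command -v vagrant >/dev/null 2>&1 || { echo "vagrant is not installed" >&2; return 1; }
  [ -f Vagrantfile ] || { echo "no Vagrantfile in $(pwd)" >&2; return 1; }

  vagrant up       # fetches the base box on first use, then boots the VM
  vagrant status   # prints the current machine state
}

# Typical use from the project directory:
#   start_vm && vagrant ssh
```

Once you are logged in through vagrant ssh, the installation script that follows does the rest.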
# configure proxy for APT and export it to environment
if [[ $PROXY_HTTP ]]; then
  echo "Acquire::http::Proxy \"$PROXY_HTTP\";" >> /etc/apt/apt.conf.d/01proxy
  export http_proxy=$PROXY_HTTP
fi
if [[ $PROXY_HTTPS ]]; then
  echo "Acquire::https::Proxy \"$PROXY_HTTPS\";" >> /etc/apt/apt.conf.d/01proxy
  export https_proxy=$PROXY_HTTPS
fi

## ========================
## Base configuration
## ========================

# update packages
aptitude update

# install vim
aptitude install -y vim

## ========================
## Hadoop installation
## see http://hadoop.apache.org/docs/stable/single_node_setup.html
## ========================

# install Oracle Java 6 JDK and make it default Java environment
aptitude install -y oracle-java6-installer
update-java-alternatives -s java-6-oracle

# show Java VM information
java -version

# create group hadoop and user $HADOOP_USER
addgroup $HADOOP_GROUP
adduser --ingroup $HADOOP_GROUP --disabled-password --gecos "" $HADOOP_USER

# set password of $HADOOP_USER to $HADOOP_USER
echo -e "$HADOOP_USER\n$HADOOP_USER\n" | passwd $HADOOP_USER

# create an unencrypted rsa keypair for user $HADOOP_USER and add it to its authorized keys
sudo -u $HADOOP_USER ssh-keygen -t rsa -P "" -f /home/$HADOOP_USER/.ssh/id_rsa
sudo -u $HADOOP_USER cp /home/$HADOOP_USER/.ssh/id_rsa.pub /home/$HADOOP_USER/.ssh/authorized_keys

# display Hadoop user credentials
echo "============================================="
echo "Installation finished"
echo "Username: $HADOOP_USER"
echo "Password: $HADOOP_USER"
echo ""
echo "PLEASE CHANGE THE PASSWORD OF THE USER!"
echo "============================================="

# exit
exit 0
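The script refers to $HADOOP_USER, $HADOOP_GROUP, $PROXY_HTTP and $PROXY_HTTPS, but the part that sets them is not shown here. A header along the following lines might do the job. Treat it as a sketch: the user name hduser is a placeholder and not taken from the original script, while the group name hadoop follows the comment in the script.

```shell
#!/usr/bin/env bash
# Hypothetical header for the installation script.
# Each variable keeps a value passed in from the environment and
# falls back to a default otherwise.

HADOOP_USER="${HADOOP_USER:-hduser}"    # placeholder user name (assumption)
HADOOP_GROUP="${HADOOP_GROUP:-hadoop}"  # group name as mentioned in the script comment
PROXY_HTTP="${PROXY_HTTP:-}"            # empty means: no HTTP proxy
PROXY_HTTPS="${PROXY_HTTPS:-}"          # empty means: no HTTPS proxy

# Print the effective settings before anything is installed.
echo "user=$HADOOP_USER group=$HADOOP_GROUP"
echo "http_proxy=${PROXY_HTTP:-none} https_proxy=${PROXY_HTTPS:-none}"
```

With empty proxy variables the two [[ $PROXY_HTTP ]] and [[ $PROXY_HTTPS ]] checks at the top of the script evaluate to false, so the APT proxy configuration is simply skipped.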
We just have to agree to the license of Oracle Java 6 (I could not find a way to bypass this single interactive step). Everything else is done by the script. The individual steps are commented as well as possible. Grab a copy of the script and execute it as root. Wait a while, accept the Java license, and you are done: a fully working Hadoop server for your pocket. To ensure that everything works as intended, let's execute some test commands.