Multi-VM PivotalHD 3.0 (or Hortonworks HDP 2.x) Hadoop Cluster with HAWQ and SpringXD

This project leverages Vagrant and Apache Ambari to create a multi-VM PivotalHD 3.0 or Hortonworks HDP 2.x Hadoop cluster, including HAWQ 1.3 (SQL on Hadoop) and Spring XD 1.2.


The logical structure of the cluster is defined in a Blueprint. A related Host-Mapping defines how the blueprint is mapped onto physical machines. The Vagrantfile script provisions Virtual Machines (VMs) for the hosts defined in the Host-Mapping and, with the help of the Ambari Blueprint API, deploys the Blueprint on the cluster. Vagrant supports both the PivotalHD 3.0 (PHD) and Hortonworks 2.x (HDP) blueprint stacks.

The default All-Services-Blueprint creates four virtual machines — one for Apache Ambari and three for the Pivotal HD cluster where Apache Hadoop® (HDFS, YARN, Pig, Zookeeper, HBase), HAWQ (SQL-on-Hadoop) and SpringXD are installed.
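Under the hood, the provisioning drives Ambari's Blueprint REST API. The sketch below is not taken from this project's scripts; it is a minimal illustration of the two calls involved, assuming the default Ambari address and admin/admin credentials from this README, and using phd-all-services as an arbitrary registration name (the host-mapping file must reference the blueprint by whatever name you register it under):

```bash
# Register the blueprint definition under a chosen name
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @blueprints/phd-all-services-blueprint.json \
  http://10.211.55.100:8080/api/v1/blueprints/phd-all-services

# Instantiate the cluster by posting the host-mapping (cluster creation template),
# which references the blueprint by name and assigns hosts to its host groups
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @blueprints/4-node-all-services-hostmapping.json \
  http://10.211.55.100:8080/api/v1/clusters/CLUSTER1
```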

Prerequisite

  • From a hardware standpoint, you need a 64-bit architecture. The default blueprint requires at least 16GB of physical memory and around 120GB of free disk space (you can get by with only 24GB of disk space, but then you will not be able to install all Pivotal services together).
  • Install Vagrant (1.7.2+).
  • Install VirtualBox or VMware Fusion (note that VMware Fusion requires a paid Vagrant license).

Environment Setup

  • Clone this project
git clone https://github.com/tzolov/vagrant-pivotalhd.git
  • Follow the Packages download instructions to collect all required tarballs and store them inside the /packages subfolder.
  • Edit the Vagrantfile BLUEPRINT_FILE_NAME and HOST_MAPPING_FILE_NAME properties to select the Blueprint/Host-Mapping pair to deploy (see the sketch below). All blueprints and mapping files are in the /blueprints subfolder. By default the 4-node, All-Services blueprint is used.
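For orientation, a minimal shell sketch of these setup steps; the folder names follow the /packages and /blueprints conventions mentioned above, and the grep simply assumes the two properties appear as plain assignments in the Vagrantfile:

```bash
# Clone the project and enter it
git clone https://github.com/tzolov/vagrant-pivotalhd.git
cd vagrant-pivotalhd

# Verify the required tarballs are in place (file names depend on the chosen stack)
ls packages/

# List the available Blueprint / Host-Mapping pairs
ls blueprints/

# Check which pair the Vagrantfile is currently set to deploy
grep -E "BLUEPRINT_FILE_NAME|HOST_MAPPING_FILE_NAME" Vagrantfile
```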

Create Hadoop cluster

From the project's top-level directory run

vagrant up --provider virtualbox

Depending on the blueprint stack, either a PivotalHD or a Hortonworks cluster will be created. The default blueprint/host-mapping pair creates 4 Virtual Machines. When the vagrant up command returns, the VMs are provisioned, the Ambari Server is installed, and the cluster deployment is in progress. Open the Ambari web interface to monitor the deployment progress:

http://10.211.55.100:8080

(username: admin, password: admin)
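If you prefer the command line over the web UI, the deployment progress can also be followed through Ambari's REST API. A minimal sketch, assuming the default address and credentials above and the default CLUSTER1 cluster name:

```bash
# Poll the deployment request status (same admin/admin credentials as the web UI);
# progress_percent reaches 100 when the blueprint deployment has finished
curl -s -u admin:admin \
  "http://10.211.55.100:8080/api/v1/clusters/CLUSTER1/requests?fields=Requests/request_status,Requests/progress_percent"
```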

Vagrant Configuration Properties

The following Vagrantfile configuration properties can be used to customize a cluster deployment. For instructions on how to create a custom Blueprint or Host-Mapping, read the blueprints section.

| Property | Description | Default Value |
| --- | --- | --- |
| BLUEPRINT_FILE_NAME | Blueprint file name to deploy. The file must exist in the /blueprints subfolder. | phd-all-services-blueprint.json |
| HOST_MAPPING_FILE_NAME | Host-Mapping file name to deploy. The file must exist in the /blueprints subfolder. | 4-node-all-services-hostmapping.json |
| CLUSTER_NAME | Cluster name as it will appear in Ambari. | CLUSTER1 |
| VM_BOX | Vagrant box name to use. Tested options are: bigdata/centos6.4_x86_64 (40G disk), bigdata/centos6.4_x86_64_small (just 8G of disk space) and chef/centos-6.6 (CentOS 6.6 box). | chef/centos-6.6 |
| AMBARI_NODE_VM_MEMORY_MB | Memory (MB) allocated for the Ambari VM. | 768 |
| PHD_NODE_VM_MEMORY_MB | Memory (MB) allocated for every PHD VM. | 2048 |
| AMBARI_HOSTNAME_PREFIX | Ambari host name prefix. The suffix is fixed to '.localdomain'. Note: the FQDN must NOT be in the phd[1-N].localdomain range. | ambari |
| DEPLOY_BLUEPRINT_CLUSTER | Set to TRUE to deploy the cluster defined by BLUEPRINT_FILE_NAME and HOST_MAPPING_FILE_NAME. Set to FALSE if you prefer to install the cluster with the Ambari wizard. | TRUE |
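As an illustration only, switching to a different Blueprint/Host-Mapping pair could be scripted roughly as follows. The file names here are placeholders, and the sed expressions assume the properties are simple NAME = "value" assignments in the Vagrantfile; check the actual file before relying on this:

```bash
# Point the Vagrantfile at a different Blueprint / Host-Mapping pair
# (placeholder file names; adjust to the actual files in /blueprints)
sed -i.bak \
  -e 's|^\(BLUEPRINT_FILE_NAME *= *\).*|\1"hdp-all-services-blueprint.json"|' \
  -e 's|^\(HOST_MAPPING_FILE_NAME *= *\).*|\1"4-node-all-services-hostmapping.json"|' \
  Vagrantfile

# Then bring the cluster up as before
vagrant up --provider virtualbox
```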
