This file describes how to use Docker to set up a Hive server on Linux/macOS.
Navigate to the root folder where the needed files are located (3_Hadoop in this case).
Open a terminal in this folder, or navigate to it from an existing terminal using:
$ cd /path/to/folder/3_Hadoop
Next, start the containers by executing the following command:
$ docker-compose up
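The containers can also be started in the background, which frees the terminal. The following is a minimal sketch of that variant, to be run from 3_Hadoop; start_stack is a hypothetical helper name, not part of the project files.

```shell
# Hypothetical helper: start the containers in the background and list them.
start_stack() {
    # -d (detached) returns control to the terminal once the containers are started
    docker-compose up -d || return 1
    # Show the state of every service defined in the compose file
    docker-compose ps
}
```

After defining the function, run start_stack. The logs that would otherwise appear in the foreground can still be followed with docker-compose logs -f.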
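docker-compose up keeps running in the foreground and prints the container logs, so leave that terminal open. Once the containers have started, open a new terminal (also from within 3_Hadoop) and execute the following command to open a shell in the hive-server container: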
$ docker exec -it hive-server /bin/bash
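If docker exec reports that the container is not running, it may still be starting. The sketch below, run on the host, polls until Docker reports the container as running. wait_for_container is a hypothetical helper, and the default of 30 one-second attempts is an arbitrary assumption. A running container does not guarantee that Hive itself is ready to accept queries.

```shell
# Hypothetical helper: poll until the named container reports that it is running.
wait_for_container() {
    name=$1
    tries=${2:-30}   # number of attempts, one second apart (assumed default)
    while [ "$tries" -gt 0 ]; do
        # docker inspect prints "true" once the container is running
        state=$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)
        if [ "$state" = "true" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "container $name is not running" >&2
    return 1
}
```

Example: wait_for_container hive-server && docker exec -it hive-server /bin/bash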
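Inside the hive-server container, navigate to the all_tracks directory, which sits one level above the directory the shell starts in (the ls calls only confirm the location):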
ls
cd ..
ls
cd all_tracks/
To create the all_tracks table, execute the all_tracks_table.hql file using the following command:
hive -f all_tracks_table.hql
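To confirm that the table was created, its column layout can be printed without opening the interactive Hive CLI. This is a minimal sketch to be run inside the container; describe_table is a hypothetical helper, and it assumes the script created the table in the testdb database used later in this file.

```shell
# Hypothetical helper: print the column layout of a table to confirm it exists.
describe_table() {
    db=$1
    table=$2
    # hive -e runs the quoted HQL statements and then exits
    hive -e "USE $db; DESCRIBE $table;"
}
```

Example: describe_table testdb all_tracks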
To load the data from all_tracks.csv into the table that was just created, copy the file into the table's HDFS directory:
hadoop fs -put all_tracks.csv hdfs://namenode:8020/user/hive/warehouse/testdb.db/all_tracks
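It can be worth checking that the file actually arrived in HDFS before querying the table. The sketch below, run inside the container, lists the target directory and tests for the file; check_upload is a hypothetical helper name.

```shell
# Hypothetical helper: list an HDFS directory and check that a file exists in it.
check_upload() {
    dir=$1
    file=$2
    # List the contents of the table directory
    hadoop fs -ls "$dir" || return 1
    # -test -e exits with status 0 only if the path exists
    hadoop fs -test -e "$dir/$file"
}
```

Example: check_upload hdfs://namenode:8020/user/hive/warehouse/testdb.db/all_tracks all_tracks.csv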
To view the data, start the Hive CLI inside the container:
hive
Then run the following HQL queries at the hive> prompt:
show databases;
use testdb;
select * from all_tracks limit 10;
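The same queries can also be run from the host without an interactive session. This is a sketch under the assumption that hive is on the PATH of commands started through docker exec in the hive-server container; run_hql is a hypothetical helper name.

```shell
# Hypothetical helper: run HQL statements in the hive-server container from the host.
run_hql() {
    # hive -e executes the given statements and exits; no -it is needed
    # because the command does not read from the terminal
    docker exec hive-server hive -e "$1"
}
```

Example: run_hql 'use testdb; select count(*) from all_tracks;'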