PiFlow V0.5 Release
Requirements
- JDK 1.8 or newer
- Spark-2.1.0
- Hadoop-2.6.0
- Hive-1.2.1
- Other products you want to use, such as Elasticsearch, Solr, MongoDB, etc.
Configure
- config.properties:

```
#server ip and port
server.ip=10.0.86.191
server.port=8002

#h2 db port
h2.port=50002

#spark and yarn config
spark.master=yarn
spark.deploy.mode=cluster
yarn.resourcemanager.hostname=10.0.86.191
yarn.resourcemanager.address=10.0.86.191:8032
yarn.access.namenode=hdfs://10.0.86.191:9000
yarn.stagingDir=hdfs://10.0.86.191:9000/tmp/
yarn.jars=hdfs://10.0.86.191:9000/user/spark/share/lib/*.jar

#yarn url
yarn.url=http://10.0.86.191:8088/ws/v1/cluster/apps/

#hive config
hive.metastore.uris=thrift://10.0.86.191:9083

#piflow-server.jar path, remember to modify
piflow.bundle=/opt/piflowServer/piflow-server-0.9.jar

#checkpoint hdfs path
checkpoint.path=hdfs://10.0.86.89:9000/piflow/checkpoints/

#debug path
debug.path=hdfs://10.0.88.191:9000/piflow/debug/

#the count of data shown in log, set 0 to show no data
data.show=10
```
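Before starting the server, the example addresses in config.properties must be replaced with your own cluster's. A minimal sketch, assuming a hypothetical master-node address `MY_HOST`; the `printf` line only creates a two-line excerpt of the file for illustration:

```shell
# Hypothetical master-node address -- replace with your own.
MY_HOST=192.168.1.100

# For illustration only: a two-line excerpt of config.properties.
printf 'server.ip=10.0.86.191\nyarn.resourcemanager.hostname=10.0.86.191\n' > config.properties

# Rewrite every occurrence of the sample address to point at your cluster.
sed -i "s/10\.0\.86\.191/${MY_HOST}/g" config.properties
cat config.properties
```

Apply the same substitution to the full config.properties shipped with the release, and adjust the HDFS paths (checkpoint.path, debug.path) and piflow.bundle the same way.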
Run Command
- start: `./start.sh`, or run in the background with `nohup ./start.sh > piflow.log 2>&1 &`
- stop: `./stop.sh`
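The redirection in the background start command sends both stdout and stderr to piflow.log. A small stand-in sketch of the same pattern, with echo commands as placeholders for the server's output:

```shell
# Stand-in for ./start.sh: one line to stdout, one to stderr.
# '> piflow.log 2>&1' redirects stdout to piflow.log, then points
# stderr at the same destination, exactly as in the start command.
(echo "server starting"; echo "warning: example" 1>&2) > piflow.log 2>&1

# Both lines end up in the log file.
cat piflow.log
```

The trailing `&` in the actual start command additionally puts the job in the background, and `nohup` keeps it running after the shell exits.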