Releases: cas-bigdatalab/piflow
PiFlow V1.0 Release
Requirements
- JDK 1.8
- Scala 2.11.8
- Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, build piflow.jar from source)
- Hadoop-2.6.0 (for other Hadoop versions, build piflow.jar from source)
- Hive-1.2.1 (only if you need Hive; set it up and modify config.properties)
config.properties
spark.master=yarn
spark.deploy.mode=cluster
#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
#if you want to use hive, set hive metastore uris
#hive.metastore.uris=thrift://10.0.85.83:9083
#number of data rows shown in the log; set 0 to show none
data.show=10
#monitor the throughput of flow
monitor.throughput=true
#server port
server.port=8001
#h2db port
h2.port=50001
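The file above follows Java-properties-style syntax: one key=value per line, with # marking a comment (note that hive.metastore.uris is commented out by default). As a sanity check before starting the server, a minimal sketch of parsing such a file in Python; the key names come from the listing above, but the parser itself is an illustration, not part of PiFlow:

```python
def parse_properties(text):
    """Parse java-properties-style key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if sep:  # keep only well-formed key=value lines
            props[key.strip()] = value.strip()
    return props

# Excerpt of the config.properties shown above
sample = """
spark.master=yarn
spark.deploy.mode=cluster
#hive.metastore.uris=thrift://10.0.85.83:9083
data.show=10
server.port=8001
"""
props = parse_properties(sample)
print(props["spark.master"])            # yarn
print("hive.metastore.uris" in props)   # False: the line is commented out
```

This makes it easy to verify that a commented-out key (such as hive.metastore.uris) really is inactive before troubleshooting Hive connectivity.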
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
PiFlow V0.9 Release
Requirements
- JDK 1.8
- Scala 2.11.8
- Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, build piflow.jar from source)
- Hadoop-2.6.0 (for other Hadoop versions, build piflow.jar from source)
- Hive-1.2.1 (only if you need Hive; set it up and modify config.properties)
config.properties
spark.master=yarn
spark.deploy.mode=cluster
#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
#if you want to use hive, set hive metastore uris
#hive.metastore.uris=thrift://10.0.85.83:9083
#number of data rows shown in the log; set 0 to show none
data.show=10
#monitor the throughput of flow
monitor.throughput=true
#server port
server.port=8001
#h2db port
h2.port=50001
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
PiFlow V0.8 Release
Requirements
- JDK 1.8
- Scala 2.11.8
- Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, build piflow.jar from source)
- Hadoop-2.6.0 (for other Hadoop versions, build piflow.jar from source)
- Hive-1.2.1 (only if you need Hive; set it up and modify config.properties)
config.properties
spark.master=yarn
spark.deploy.mode=cluster
#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
#if you want to use hive, set hive metastore uris
#hive.metastore.uris=thrift://10.0.85.83:9083
#number of data rows shown in the log; set 0 to show none
data.show=10
#monitor the throughput of flow
monitor.throughput=true
#server port
server.port=8001
#h2db port
h2.port=50001
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
PiFlow V0.7-spark-3.0.0 Release
Requirements
- JDK 1.8
- Scala 2.12.10
- Spark-3.0.0
- Hadoop-3.2.0
config.properties
spark.master=yarn
spark.deploy.mode=cluster
#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
#if you want to use hive, set hive metastore uris
#hive.metastore.uris=thrift://10.0.85.83:9083
#number of data rows shown in the log; set 0 to show none
data.show=10
#monitor the throughput of flow
monitor.throughput=true
#server port
server.port=8001
#h2db port
h2.port=50001
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
PiFlow V0.7 Release
Requirements
- JDK 1.8
- Scala 2.11.8
- Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, build piflow.jar from source)
- Hadoop-2.6.0 (for other Hadoop versions, build piflow.jar from source)
- Hive-1.2.1 (only if you need Hive; set it up and modify config.properties)
config.properties
spark.master=yarn
spark.deploy.mode=cluster
#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
#if you want to use hive, set hive metastore uris
#hive.metastore.uris=thrift://10.0.85.83:9083
#number of data rows shown in the log; set 0 to show none
data.show=10
#monitor the throughput of flow
monitor.throughput=true
#server port
server.port=8001
#h2db port
h2.port=50001
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
PiFlow V0.6 Release
Requirements
- JDK 1.8 or newer
- Spark-2.1.0 (for other Spark versions, build piflow.jar from source)
- Hadoop-2.6.0 (for other Hadoop versions, build piflow.jar from source)
- Hive-1.2.1 (only if you need Hive)
Configure
#server Ip and Port
server.ip=10.0.88.70
server.port=8002
#Spark master and deploy mode
spark.master=yarn
spark.deploy.mode=cluster
#yarn related configurations
yarn.resourcemanager.hostname=10.0.88.70
yarn.resourcemanager.address=10.0.88.70:8032
yarn.access.namenode=hdfs://10.0.88.70:9000
yarn.stagingDir=hdfs://10.0.88.70:9000/tmp/
yarn.jars=hdfs://10.0.88.70:9000/user/spark/share/lib/*.jar
yarn.url=http://10.0.88.70:8088/ws/v1/cluster/apps/
#hive metaStore uris
hive.metastore.uris=thrift://10.0.88.71:9083
#piflow server jar folder, please change this parameter to your path
piflow.bundle=/data/piflow/piflow-server-v0.6/lib/piflow-server-0.9.jar
#hdfs paths for checkpoint, debug, and increment data; please create these folders first
checkpoint.path=hdfs://10.0.88.70:9000/user/piflow/checkpoints/
debug.path=hdfs://10.0.88.70:9000/user/piflow/debug/
increment.path=hdfs://10.0.88.70:9000/user/piflow/increment/
#set 0 if you do not want to show data in the log
data.show=10
#h2 db port
h2.port=50002
Command
./start.sh
./stop.sh
./restart.sh
./status.sh
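After running ./start.sh, a quick way to confirm the server came up is a TCP check against the server.ip and server.port values from the Configure section above. A minimal sketch in Python; the host and port below are the example values from this release's config, and the check itself is an illustration (it only proves the port accepts connections, not that PiFlow is fully healthy):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example values from config.properties above -- replace with your deployment's.
if port_open("10.0.88.70", 8002):
    print("PiFlow server port is reachable")
else:
    print("PiFlow server port is not reachable; check piflow.log")
```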
PiFlow V0.5 Release
Requirements
- JDK 1.8 or newer
- Spark-2.1.0
- Hadoop-2.6.0
- Hive-1.2.1
- Other products you want to use, such as Elasticsearch, Solr, MongoDB, etc.
Configure
config.properties
#server ip and port
server.ip=10.0.86.191
server.port=8002
#h2 db port
h2.port=50002
#spark and yarn config
spark.master=yarn
spark.deploy.mode=cluster
yarn.resourcemanager.hostname=10.0.86.191
yarn.resourcemanager.address=10.0.86.191:8032
yarn.access.namenode=hdfs://10.0.86.191:9000
yarn.stagingDir=hdfs://10.0.86.191:9000/tmp/
yarn.jars=hdfs://10.0.86.191:9000/user/spark/share/lib/*.jar
yarn.url=http://10.0.86.191:8088/ws/v1/cluster/apps/
#hive config
hive.metastore.uris=thrift://10.0.86.191:9083
#piflow-server.jar path, remember to modify
piflow.bundle=/opt/piflowServer/piflow-server-0.9.jar
#checkpoint hdfs path
checkpoint.path=hdfs://10.0.86.89:9000/piflow/checkpoints/
#debug path
debug.path=hdfs://10.0.88.191:9000/piflow/debug/
#the count of data rows shown in the log; set 0 to show no data
data.show=10
Run Command
- start: ./start.sh or nohup ./start.sh > piflow.log 2>&1 &
- stop: ./stop.sh