Releases: cas-bigdatalab/piflow

PiFlow V1.0 Release

13 May 12:23

Requirements

  • JDK 1.8
  • Scala 2.11.8
  • Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, piflow.jar should be built from source)
  • Hadoop-2.6.0 (for other Hadoop versions, piflow.jar should be built from source)
  • Hive-1.2.1 (if you need to use Hive, set it up and modify config.properties)

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://10.0.85.83:9000
  
  #yarn resourcemanager hostname
  yarn.resourcemanager.hostname=10.0.85.83
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://10.0.85.83:9083
  
  #number of data rows to show in the log; set to 0 to show none
  data.show=10

  #monitor the throughput of flow
  monitor.throughput=true

  #server port
  server.port=8001

  #h2db port
  h2.port=50001
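
The config.properties entries above are plain Java-style key=value pairs. A minimal Python sketch for loading and sanity-checking them (the key names come from the sample above; `load_properties` is an illustrative helper, not part of PiFlow):

```python
# Parse a Java-style .properties file into a dict, skipping blank
# lines and '#' comments, then check that the keys PiFlow's sample
# configuration uses are present.
def load_properties(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = """
spark.master=yarn
spark.deploy.mode=cluster
fs.defaultFS=hdfs://10.0.85.83:9000
yarn.resourcemanager.hostname=10.0.85.83
data.show=10
monitor.throughput=true
server.port=8001
h2.port=50001
"""

config = load_properties(sample)
required = ["spark.master", "fs.defaultFS", "server.port", "h2.port"]
missing = [k for k in required if k not in config]
assert not missing, f"missing keys: {missing}"
```

Running the same check against your own config.properties before starting the server catches missing keys early.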

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh
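
After `./start.sh`, one quick way to confirm the server came up is to probe the configured `server.port` (8001 in the sample config; adjust host and port to your deployment). A small sketch, not part of the PiFlow distribution:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: after ./start.sh, check the HTTP port from config.properties.
# port_open("10.0.85.83", 8001)
```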

PiFlow V0.9 Release

30 Nov 12:26

Requirements

  • JDK 1.8
  • Scala 2.11.8
  • Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, piflow.jar should be built from source)
  • Hadoop-2.6.0 (for other Hadoop versions, piflow.jar should be built from source)
  • Hive-1.2.1 (if you need to use Hive, set it up and modify config.properties)

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://10.0.85.83:9000
  
  #yarn resourcemanager hostname
  yarn.resourcemanager.hostname=10.0.85.83
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://10.0.85.83:9083
  
  #number of data rows to show in the log; set to 0 to show none
  data.show=10

  #monitor the throughput of flow
  monitor.throughput=true

  #server port
  server.port=8001

  #h2db port
  h2.port=50001

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh

PiFlow V0.8 Release

30 Sep 12:28
8ebdb32

Requirements

  • JDK 1.8
  • Scala 2.11.8
  • Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, piflow.jar should be built from source)
  • Hadoop-2.6.0 (for other Hadoop versions, piflow.jar should be built from source)
  • Hive-1.2.1 (if you need to use Hive, set it up and modify config.properties)

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://10.0.85.83:9000
  
  #yarn resourcemanager hostname
  yarn.resourcemanager.hostname=10.0.85.83
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://10.0.85.83:9083
  
  #number of data rows to show in the log; set to 0 to show none
  data.show=10

  #monitor the throughput of flow
  monitor.throughput=true

  #server port
  server.port=8001

  #h2db port
  h2.port=50001

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh

PiFlow V0.7-spark-3.0.0 Release

29 Jun 07:57
Pre-release

Requirements

  • JDK 1.8
  • Scala 2.12.10
  • Spark-3.0.0
  • Hadoop-3.2.0

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://10.0.85.83:9000
  
  #yarn resourcemanager hostname
  yarn.resourcemanager.hostname=10.0.85.83
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://10.0.85.83:9083
  
  #number of data rows to show in the log; set to 0 to show none
  data.show=10

  #monitor the throughput of flow
  monitor.throughput=true

  #server port
  server.port=8001

  #h2db port
  h2.port=50001

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh

PiFlow V0.7 Release

30 Apr 08:17

Requirements

  • JDK 1.8
  • Scala 2.11.8
  • Spark-2.1.0, Spark-2.2.0, or Spark-2.3.0 (for other Spark versions, piflow.jar should be built from source)
  • Hadoop-2.6.0 (for other Hadoop versions, piflow.jar should be built from source)
  • Hive-1.2.1 (if you need to use Hive, set it up and modify config.properties)

config.properties

  spark.master=yarn
  spark.deploy.mode=cluster
  
  #hdfs default file system
  fs.defaultFS=hdfs://10.0.85.83:9000
  
  #yarn resourcemanager hostname
  yarn.resourcemanager.hostname=10.0.85.83
  
  #if you want to use hive, set hive metastore uris
  #hive.metastore.uris=thrift://10.0.85.83:9083
  
  #number of data rows to show in the log; set to 0 to show none
  data.show=10

  #monitor the throughput of flow
  monitor.throughput=true

  #server port
  server.port=8001

  #h2db port
  h2.port=50001

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh

PiFlow V0.6 Release

28 Nov 12:54

Requirements

  • JDK 1.8 or newer
  • Spark-2.1.0 (for other Spark versions, piflow.jar should be built from source)
  • Hadoop-2.6.0 (for other Hadoop versions, piflow.jar should be built from source)
  • Hive-1.2.1 (if you need to use Hive)

Configure

  #server Ip and Port
  server.ip=10.0.88.70
  server.port=8002
  
  #Spark master and deploy mode
  spark.master=yarn
  spark.deploy.mode=cluster
  
  #yarn related configurations
  yarn.resourcemanager.hostname=10.0.88.70
  yarn.resourcemanager.address=10.0.88.70:8032
  yarn.access.namenode=hdfs://10.0.88.70:9000
  yarn.stagingDir=hdfs://10.0.88.70:9000/tmp/
  yarn.jars=hdfs://10.0.88.70:9000/user/spark/share/lib/*.jar
  yarn.url=http://10.0.88.70:8088/ws/v1/cluster/apps/
  
  #hive metaStore uris
  hive.metastore.uris=thrift://10.0.88.71:9083

  #piflow server jar folder, please change this parameter to your path
  piflow.bundle=/data/piflow/piflow-server-v0.6/lib/piflow-server-0.9.jar
  
  #hdfs paths for checkpoint, debug, and increment data; please create these folders first
  checkpoint.path=hdfs://10.0.88.70:9000/user/piflow/checkpoints/
  debug.path=hdfs://10.0.88.70:9000/user/piflow/debug/
  increment.path=hdfs://10.0.88.70:9000/user/piflow/increment/

  #set to 0 if you do not want to show data in the log
  data.show=10

  #h2 db port
  h2.port=50002
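
The configuration above asks you to create the checkpoint, debug, and increment HDFS folders before starting. A small sketch that builds the corresponding `hdfs dfs -mkdir -p` commands from the sample values (paths copied from the configuration above; run the printed commands on a node with the Hadoop client installed):

```python
# Build the HDFS mkdir commands for the three folders the V0.6
# configuration expects to exist (values from the sample config).
paths = {
    "checkpoint.path": "hdfs://10.0.88.70:9000/user/piflow/checkpoints/",
    "debug.path": "hdfs://10.0.88.70:9000/user/piflow/debug/",
    "increment.path": "hdfs://10.0.88.70:9000/user/piflow/increment/",
}
commands = [f"hdfs dfs -mkdir -p {p}" for p in paths.values()]
for cmd in commands:
    print(cmd)
```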

Command

  ./start.sh
  ./stop.sh
  ./restart.sh
  ./status.sh

PiFlow V0.5 Release

25 Mar 01:21
3e06a22

Requirements

  • JDK 1.8 or newer
  • Spark-2.1.0
  • Hadoop-2.6.0
  • Hive-1.2.1
  • Other products you want to use, such as Elasticsearch, Solr, MongoDB, etc.

Configure

  • config.properties

    #server ip and port
    server.ip=10.0.86.191
    server.port=8002
    
    #h2 db port
    h2.port=50002
    
    #spark and yarn config
    spark.master=yarn
    spark.deploy.mode=cluster
    yarn.resourcemanager.hostname=10.0.86.191
    yarn.resourcemanager.address=10.0.86.191:8032
    yarn.access.namenode=hdfs://10.0.86.191:9000
    yarn.stagingDir=hdfs://10.0.86.191:9000/tmp/
    yarn.jars=hdfs://10.0.86.191:9000/user/spark/share/lib/*.jar
    yarn.url=http://10.0.86.191:8088/ws/v1/cluster/apps/
    
    #hive config
    hive.metastore.uris=thrift://10.0.86.191:9083
    
    #piflow-server.jar path, remember to modify this to your path
    piflow.bundle=/opt/piflowServer/piflow-server-0.9.jar
    
    #checkpoint hdfs path
    checkpoint.path=hdfs://10.0.86.89:9000/piflow/checkpoints/
    
    #debug path
    debug.path=hdfs://10.0.88.191:9000/piflow/debug/
    
    #the number of data rows shown in the log; set to 0 to show none
    data.show=10
    

Run Command

  • start: ./start.sh or nohup ./start.sh > piflow.log 2>&1 &
  • stop: ./stop.sh