
Flink job that consumes messages from Google Cloud Pubsub

Known Issues:

  1. The Flink (Dataproc) PHS server is not working properly at the moment. Hence you may skip the step that creates a PHS server and check the Flink/YARN job directly from the job server.

  2. The Apache Flink Pub/Sub connector may not work properly in some cases. Hence this GCP Flink Pub/Sub connector may be worth a try.

I forked this GCP connector at https://github.com/cloudymoma/pubsub/tree/da-pom-fix, and the da-pom-fix branch has been working so far.

The following builds the DA-fixed version of the GCP Flink connector and installs it into your local Maven repository for later use:

```shell
cd pubsub/flink-connector
mvn clean package install -DskipTests
```

You can use `--useGcpPubsubConnectors <boolean>` in the makefile to switch between the two connectors.
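The makefile presumably forwards this flag to the job as a program argument. A hypothetical invocation might look like the following; the `FLINK_ARGS` variable name is an assumption, and the exact target and variable come from your own makefile:

```shell
# Hypothetical: pass the connector switch through to the Flink job.
#   true  -> the forked GCP Pub/Sub connector installed above
#   false -> the stock Apache Flink Pub/Sub connector
make run FLINK_ARGS="--useGcpPubsubConnectors true"
```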

Before you start

You need to change the settings in `flink.sh`: your GCP project, region, bucket, cluster names, etc.

Do the same for the makefile.

It is very important that the path where the makefile copies the compiled monolithic jar matches the jar path specified in `flink.sh`.

Also, in the makefile, I copy the Google Cloud Platform service account key file from the path in the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. You may need to change this path according to your own setup. Alternatively, you can grant proper IAM permissions to your cluster's service account instead. More details here
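For reference, `GOOGLE_APPLICATION_CREDENTIALS` is the standard environment variable that Google Cloud client libraries read; a minimal sketch of setting it before the build (the key path below is hypothetical):

```shell
# Point GOOGLE_APPLICATION_CREDENTIALS at your service account key file
# (hypothetical path) so the makefile can copy it during the build.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.gcp/service-account.json"
echo "$GOOGLE_APPLICATION_CREDENTIALS"
```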

  1. Create a PHS server: `./flink.sh flinkhistserver`
  2. Create a Flink job server: `./flink.sh flinkjobserver`
  3. Build the Flink job and upload it to GCS: `make build`
  4. Run the Flink job on the Flink job server: `make run`
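Since the PHS server is currently not working (known issue 1), you can check the submitted job directly with `gcloud` instead. The region value below is hypothetical; use the region configured in your `flink.sh`:

```shell
# List Dataproc jobs in your region to find the submitted Flink job
# (replace us-central1 with your actual region from flink.sh).
gcloud dataproc jobs list --region=us-central1
```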
