zaneli/scalikejdbc-athena

scalikejdbc-athena

Library for using Amazon Athena JDBC Driver with ScalikeJDBC

scalikejdbc-athena was originally developed to work around the fact that the Athena JDBC driver did not support PreparedStatement.
This limitation was likely fixed in Athena JDBC driver version 2.x or later, so in most cases this library is no longer necessary.
However, to maintain compatibility with the previous behavior, in which PreparedStatement was processed via scalikejdbc-athena's custom string replacement, we provide the useCustomPreparedStatement parameter.
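
To illustrate what "custom string replacement" means in general, here is a minimal sketch that substitutes each `?` placeholder with an escaped SQL literal. It is not scalikejdbc-athena's actual implementation: `PlaceholderSketch`, `toLiteral`, and `bind` are hypothetical names, and the sketch deliberately ignores `?` characters that appear inside string literals or comments.

```scala
object PlaceholderSketch {
  // Render a value as a SQL literal; single quotes inside strings are doubled.
  def toLiteral(value: Any): String = value match {
    case null                => "NULL"
    case n: java.lang.Number => n.toString
    case b: Boolean          => b.toString
    case other               => "'" + other.toString.replace("'", "''") + "'"
  }

  // Replace each '?' in the template, in order, with the matching literal.
  def bind(template: String, params: Seq[Any]): String = {
    val parts = template.split("\\?", -1)
    require(parts.length == params.length + 1, "placeholder count must match parameter count")
    val sb = new StringBuilder(parts.head)
    params.zip(parts.tail).foreach { case (param, rest) =>
      sb.append(toLiteral(param)).append(rest)
    }
    sb.toString
  }
}
```

With this approach the driver only ever receives a plain SQL string, which is why it could work with a driver that lacked PreparedStatement support.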

Setup

  • Download Athena JDBC driver
    • This library supports both the Athena JDBC 3.x and 2.x drivers.
      If you encounter problems with a particular version, please feel free to report them.
> mkdir lib
> pushd lib
> curl -L -O https://downloads.athena.us-east-1.amazonaws.com/drivers/JDBC/3.5.1/athena-jdbc-3.5.1-with-dependencies.jar
> popd
  • Add library dependencies to sbt build settings
libraryDependencies ++= Seq(
  "org.scalikejdbc" %% "scalikejdbc" % "4.3.4",
  "com.zaneli" %% "scalikejdbc-athena" % "0.4.0"
)
  • Configure the JDBC driver options in resources/application.conf
# v3 driver
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}

# v2 driver
athena {
  default {
    driver="com.simba.athena.jdbc.Driver"
    url="jdbc:awsathena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider"
    LogPath="logs/application.log"
    LogLevel=3
  }
}

# compat PreparedStatement custom implementation
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    useCustomPreparedStatement="true"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}

If you need to update partitions, for example, set readOnly="false".

Usage

Run query

import scalikejdbc._
import scalikejdbc.athena._

val name = "elb_demo_001"
DB.athena { implicit s =>
  val r = sql"""
          |SELECT * FROM default.elb_logs_raw_native
          |WHERE elb_name = $name LIMIT 10;
         """.stripMargin.map(_.toMap).list.apply()
  r.foreach(println)
}
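
ScalikeJDBC's `sql` interpolator treats `$name` as a bind parameter rather than concatenating it into the SQL text. As a rough illustration of that idea, independent of ScalikeJDBC's real classes, the hypothetical `sketch` interpolator below splits an interpolated string into a `?` template and a parameter list. All names in it (`InterpolationSketch`, `SqlSketchContext`, `sketch`, `example`) are made up for this example.

```scala
object InterpolationSketch {
  // Hypothetical interpolator: the literal parts become the SQL template,
  // and the interpolated values are collected as bind parameters.
  implicit class SqlSketchContext(sc: StringContext) {
    def sketch(args: Any*): (String, Seq[Any]) =
      (sc.parts.mkString("?"), args.toList)
  }

  // Same query shape as the README example, written on a single line.
  def example(name: String): (String, Seq[Any]) =
    sketch"SELECT * FROM default.elb_logs_raw_native WHERE elb_name = $name LIMIT 10"
}
```

Calling `InterpolationSketch.example("elb_demo_001")` yields a template with one `?` and a one-element parameter list, so the value never becomes part of the SQL text at this stage.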

Delete S3OutputLocation after running a query

  • Set S3OutputLocationPrefix instead of S3OutputLocation
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocationPrefix="s3://query-results-bucket/folder"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}

import com.amazonaws.auth.profile.ProfileCredentialsProvider
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.DeleteObjectsRequest

import scalikejdbc._
import scalikejdbc.athena._

import scala.collection.JavaConverters._

val s3Client = AmazonS3ClientBuilder.standard().withCredentials(new ProfileCredentialsProvider()).build()
val regex = """s3://(.+?)/(.+)""".r

DB.athena { implicit s =>
  val r = sql"...".map(_.toMap).list.apply()
  r.foreach(println)

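  // getTmpStagingDir returns e.g. Some("s3://query-results-bucket/folder/${java.util.UUID.randomUUID}")
  s.getTmpStagingDir.foreach {
    case regex(bucketName, path) =>
      // Collect the keys of the query result objects under the staging directory
      val keys = s3Client.listObjects(bucketName, path).getObjectSummaries.asScala
        .map(summary => new DeleteObjectsRequest.KeyVersion(summary.getKey))
      if (keys.nonEmpty) {
        val delReq = new DeleteObjectsRequest(bucketName)
        delReq.setKeys(keys.asJava)
        s3Client.deleteObjects(delReq)
      }
    // Anything that is not an s3:// URI is ignored instead of throwing a MatchError
    case _ => ()
  }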
}
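
The cleanup code above depends on splitting the staging directory URI into a bucket name and a key prefix. The sketch below isolates that step so it can be checked without AWS access. `S3UriSketch`, `stagingDir`, and `parse` are hypothetical helpers, and the prefix-plus-UUID layout is an assumption taken from the comment in the example, not from the library's source.

```scala
import java.util.UUID

object S3UriSketch {
  // Same pattern as in the example: lazy bucket match up to the first '/', then the key.
  private val S3Uri = """s3://(.+?)/(.+)""".r

  // Assumed layout: the configured prefix followed by a per-session UUID.
  def stagingDir(prefix: String, id: UUID = UUID.randomUUID()): String =
    s"${prefix.stripSuffix("/")}/$id"

  // Returns (bucket, key prefix), or None when the URI has no key part.
  def parse(uri: String): Option[(String, String)] = uri match {
    case S3Uri(bucket, key) => Some((bucket, key))
    case _                  => None
  }
}
```

Returning an `Option` makes the "not an S3 URI" case explicit, so the caller can decide whether to skip the cleanup or report an error.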

scalikejdbc-athena is inspired by scalikejdbc-bigquery.
