Library for using Amazon Athena JDBC Driver with ScalikeJDBC
scalikejdbc-athena was originally developed to work around the fact that the Athena JDBC driver did not support PreparedStatement.
This limitation was likely fixed in Athena JDBC driver version 2.x or later, so in most cases this library is no longer necessary.
However, to maintain compatibility with the previous behavior, in which PreparedStatement was processed via scalikejdbc-athena's custom string replacement, we provide the useCustomPreparedStatement parameter.
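To make "custom string replacement" concrete, here is a minimal sketch of the general technique: each `?` placeholder is substituted with an escaped SQL literal before the statement is sent to the driver. It is not scalikejdbc-athena's actual implementation; `PlaceholderSketch`, `toLiteral`, and `bind` are hypothetical names used only for illustration.

```scala
object PlaceholderSketch {
  // Hypothetical helper: render one bound value as a SQL literal.
  def toLiteral(value: Any): String = value match {
    case null      => "NULL"
    case s: String => "'" + s.replace("'", "''") + "'" // double embedded quotes
    case other     => other.toString
  }

  // Replace each `?` that sits outside a single-quoted literal
  // with the next parameter, in order.
  def bind(template: String, params: Seq[Any]): String = {
    val out = new StringBuilder
    var inQuote = false
    var index = 0
    template.foreach {
      case '\'' =>
        inQuote = !inQuote
        out.append('\'')
      case '?' if !inQuote =>
        require(index < params.length, "not enough parameters")
        out.append(toLiteral(params(index)))
        index += 1
      case c =>
        out.append(c)
    }
    require(index == params.length, "too many parameters")
    out.toString
  }
}
```

For example, `PlaceholderSketch.bind("SELECT * FROM t WHERE name = ? LIMIT ?", Seq("o'hara", 10))` produces a statement with the string quoted and escaped and the number inlined. A real implementation would also need to handle dates, binary values, and comments, which this sketch ignores.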
- Download Athena JDBC driver
> mkdir lib
> pushd lib
> curl -L -O https://downloads.athena.us-east-1.amazonaws.com/drivers/JDBC/3.5.1/athena-jdbc-3.5.1-with-dependencies.jar
> popd
- Add library dependencies to sbt build settings
libraryDependencies ++= Seq(
  "org.scalikejdbc" %% "scalikejdbc" % "4.3.4",
  "com.zaneli" %% "scalikejdbc-athena" % "0.4.0"
)
- Configure the JDBC driver options in resources/application.conf
# v3 driver
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}
# v2 driver
athena {
  default {
    driver="com.simba.athena.jdbc.Driver"
    url="jdbc:awsathena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider"
    LogPath="logs/application.log"
    LogLevel=3
  }
}
# compat PreparedStatement custom implementation
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    useCustomPreparedStatement="true"
    S3OutputLocation="s3://query-results-bucket/folder/"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}
If you need to update partitions etc., set readOnly="false"
import scalikejdbc._
import scalikejdbc.athena._
val name = "elb_demo_001"
DB.athena { implicit s =>
  val r = sql"""
    |SELECT * FROM default.elb_logs_raw_native
    |WHERE elb_name = $name LIMIT 10;
  """.stripMargin.map(_.toMap).list.apply()
  r.foreach(println)
}
- Set S3OutputLocationPrefix instead of S3OutputLocation
athena {
  default {
    driver="com.amazon.athena.jdbc.AthenaDriver"
    url="jdbc:athena://AwsRegion={REGION}"
    readOnly="false"
    S3OutputLocationPrefix="s3://query-results-bucket/folder"
    AwsCredentialsProviderClass="DefaultChain"
    LogPath="logs/application.log"
    LogLevel=3
  }
}
- Use aws-java-sdk-s3
import com.amazonaws.auth.profile.ProfileCredentialsProvider
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.DeleteObjectsRequest
import scalikejdbc._
import scalikejdbc.athena._
import scala.collection.JavaConverters._
val s3Client = AmazonS3ClientBuilder.standard().withCredentials(new ProfileCredentialsProvider()).build()
// Splits "s3://bucket/key-prefix" into the bucket name and the key prefix
val regex = """s3://(.+?)/(.+)""".r
DB.athena { implicit s =>
  val r = sql"...".map(_.toMap).list.apply()
  r.foreach(println)
  s.getTmpStagingDir.foreach { // Some("s3://query-results-bucket/folder/${java.util.UUID.randomUUID}")
    case regex(bucketName, path) =>
      // Collect the keys of the objects under the temporary staging directory
      val keys = s3Client.listObjects(bucketName, path).getObjectSummaries.asScala
        .map(summary => new DeleteObjectsRequest.KeyVersion(summary.getKey))
      if (keys.nonEmpty) {
        val delReq = new DeleteObjectsRequest(bucketName)
        delReq.setKeys(keys.asJava)
        s3Client.deleteObjects(delReq)
      }
    case _ => // not an s3:// URI, so there is nothing to delete
  }
}
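
The cleanup above relies on two small steps that can be tried without AWS access: deriving a per-session staging directory from the prefix, and splitting that URI into a bucket and a key prefix. The sketch below illustrates both using only the standard library. The `stagingDir` derivation (prefix, a slash, then a random UUID) is an assumption inferred from the comment in the example above, not a confirmed description of the library's internals; `StagingDirSketch` and its methods are hypothetical names.

```scala
import java.util.UUID

object StagingDirSketch {
  // Same pattern as in the cleanup example: bucket, then the rest as key prefix.
  private val S3Uri = """s3://(.+?)/(.+)""".r

  // Assumed derivation: S3OutputLocationPrefix + "/" + random UUID.
  def stagingDir(prefix: String, id: UUID = UUID.randomUUID()): String =
    prefix.stripSuffix("/") + "/" + id.toString

  // Returns (bucket, keyPrefix), or None when the input is not a usable s3:// URI.
  def bucketAndKeyPrefix(uri: String): Option[(String, String)] = uri match {
    case S3Uri(bucket, keyPrefix) => Some((bucket, keyPrefix))
    case _                        => None
  }
}
```

Returning an `Option` here means a malformed or non-S3 location is skipped instead of raising a `MatchError` during cleanup.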
scalikejdbc-athena is inspired by scalikejdbc-bigquery.