
Configuration

Compile time configuration

The macro needs a connection to the database at compile time to read schema metadata. The connection is configured with the following system properties:

  • sqltyped.url
  • sqltyped.driver
  • sqltyped.username
  • sqltyped.password
  • sqltyped.schema (optional)
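For example, one way to set these properties is in the sbt build (a minimal sketch, assuming sbt and a local MySQL database; any mechanism that sets JVM system properties for the compiling JVM works):

initialize ~= { _ =>
  // Compile time connection used by the macro to read schema metadata
  System.setProperty("sqltyped.url", "jdbc:mysql://localhost:3306/sqltyped")
  System.setProperty("sqltyped.driver", "com.mysql.jdbc.Driver")
  System.setProperty("sqltyped.username", "root")
  System.setProperty("sqltyped.password", "")
}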

Support for multiple compile-time data sources is enabled by bringing a configuration name into scope:

implicit object mydb1config extends ConfigurationName

The above configuration reads the following system properties to establish a connection to the database at compile time:

  • sqltyped.mydb1config.url
  • sqltyped.mydb1config.driver
  • sqltyped.mydb1config.username
  • sqltyped.mydb1config.password
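A statement compiled while that configuration name is in implicit scope is checked against the mydb1config connection (a small sketch; the query itself is just an example):

implicit object mydb1config extends ConfigurationName

// Schema metadata for this statement is read using the sqltyped.mydb1config.* properties
val q = sql("select name from person")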

Runtime configuration

The sql function returns an instance of QueryN, where N is the input arity. Concretely, a query with two arguments, a String and an Int, returns an instance of

trait Query2[R] {
  def apply(i1: String, i2: Int)(implicit conn: java.sql.Connection): List[R]
}

To execute the query with the apply function, an implicit java.sql.Connection must be available.

Defining and executing SQL statements

SQL statements are defined with the sql function.

val q = sql("select age, name from person where age > ?")

To execute the statement, a connection to the database is needed.

implicit val conn = ...
val rows = q(20)
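Any means of obtaining a java.sql.Connection works; for instance a plain JDBC connection (a minimal sketch, the URL and credentials here are hypothetical):

import java.sql.{Connection, DriverManager}

// Runtime connection; this is independent of the compile time configuration
implicit val conn: Connection =
  DriverManager.getConnection("jdbc:mysql://localhost:3306/sqltyped", "root", "")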

A SQL select statement returns its results as records. See Working with records.

rows map (_.show) mkString("\n")
// { name = joe, age = 36 }
// { name = moe, age = 14 }

In addition to the sql function there are a few specialized versions.

  • sqlk

Works like sql but returns generated keys as a result. Useful for some SQL insert statements, as in the sketch below.
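For instance (a sketch; the insert statement and its exact return value are assumptions, not taken from this page):

// Returns the generated key of the inserted row instead of an update count
val insertPerson = sqlk("insert into person(name, age) values (?, ?)")
val newId = insertPerson("jill", 45)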

  • sqlt

Works like sql but tags input arguments too. See Tagging.

  • sqlj

Sqlτyped uses custom SQL parsers to parse the SQL statements. The advantage of custom parsing is that it enables better type inference: a more thorough query analysis is possible than what the JDBC API provides. If the custom parser fails to parse the SQL statement, a fallback to JDBC-based metadata is performed. When that happens, a compilation warning is emitted:

Fallback to JDBC metadata. Please file a bug at https://github.com/jonifreeman/sqltyped/issues

If you submit a bug to the issue tracker there's a good chance that the parser will be enhanced to accept the SQL statement it failed to parse.

The sqlj function can be used to bypass the custom parser completely and use JDBC-based inference directly.

  • sqls

Generates records with symbols instead of strings as keys. This has some advantages, as shapeless supports some case class operations that work with symbol keys. For instance, you can convert a shapeless record with symbol-based keys to a case class and vice versa, as sketched below. However, this comes at a performance cost, so use symbols only when necessary.
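A hypothetical sketch of such a conversion with shapeless' LabelledGeneric (this assumes the record's keys, types and field order match the case class exactly):

import shapeless._

case class Person(name: String, age: Int)

// Each symbol-keyed record is converted to a Person instance
val rows   = sqls("select name, age from person").apply
val people = rows map (r => LabelledGeneric[Person].from(r))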

Parameterizing your own specialized functions

If you want to combine the specialized behaviours, e.g. bypass parsing and fall back directly to JDBC while returning generated keys, you can do so by defining your own function and annotating it with one or more of the sqltyped configuration annotations:

import scala.language.experimental.macros
@jdbcOnly @returnKeys def sqljk(s: String) = macro SqlMacro.sqlImpl

Here's how the existing specialized functions are defined:

@useInputTags def sqlt(s: String) = macro SqlMacro.sqlImpl

@returnKeys def sqlk(s: String) = macro SqlMacro.sqlImpl

@jdbcOnly def sqlj(s: String) = macro SqlMacro.sqlImpl

@useSymbolKeyRecords def sqls(s: String) = macro SqlMacro.sqlImpl

Similarly, you can define parameterized String interpolators:

  import scala.language.experimental.macros
  implicit class DynSQLContext(sc: StringContext) {
    @jdbcOnly def sqlj(exprs: Any*) = macro SqlMacro.dynsqlImpl
  }

There are no specialized String interpolators defined by default.

Connection & transaction management

Sqlτyped does not provide a connection pooling or transaction management solution. Existing solutions can be used instead. The integration is easy: just expose a connection as an implicit value, as in the following Slick example.

import scala.slick.session.Database
val db = Database.forURL("jdbc:mysql://localhost:3306/sqltyped", 
                         driver = "com.mysql.jdbc.Driver", user = "root", password = "")
implicit def conn = Database.threadLocalSession.conn

db withTransaction {
  sql("update person set name=? where id=?").apply("danny", 1)
  sql("update person set name=? where id=?").apply("fanny", 2)
}

db withSession {
  sql("select name from person where id=?").apply(1)
}

Tagging

To enable tagging you need to add an instance of EnableTagging to implicit scope.

implicit val enableTagging = EnableTagging

If a column is a primary or foreign key, its type is tagged. For instance, a column which references 'person.id' is typed as Long @@ "person". That funny little @@ symbol is a type tag from the Shapeless project. It is used to add extra type information to an otherwise simple type and can be used for extra type safety in data access code.

scala> val person = Witness("person")
scala> def findName(id: Long @@ person.T) = sql("select name from person where id=?").apply(id)

scala> sql("select person from job_history").apply map findName

The above code compiles because 'job_history.person' is a foreign key referencing 'person.id'. Thus, its type is Long @@ "person".

Note that input parameters are not tagged (just typed). Otherwise this wouldn't compile:

sql("select name,age from person where id=?").apply(1)

Instead, explicit tagging would have been required:

sql("select name,age from person where id=?").apply(tag[person.T](1))

There's a separate function called sqlt (t as in tagged) which tags input parameters too.

scala> val q = sqlt("select name,age from person where id=?")
scala> q.apply(1)
<console>:31: error: type mismatch;
 found   : Int(1)
 required: shapeless.TypeOperators.@@[Long,String("person")]
              q.apply(1)

scala> q.apply(tag[person.T](1))

Working with records

Introduction

A record is a type-safe list of key-value pairs. Scala does not have a native record system, but records can be emulated with HLists. Before going into the details of how to do that, let's recap how records differ from case classes, a native Scala construct.

Scala provides case classes, which are very nice for representing data.

case class Person(name: String, age: Int)

val p = Person("Joe", 36)

While case classes are nice, they have some limitations:

  • case classes must be named
  • we cannot easily modify the schema of a case class (add or remove fields)

Often those limitations do not matter, but a data structure which lifts those restrictions would be quite nice for representing data read from a database. After all, we do not have a name for each SELECT query, and sometimes we want to add (or remove) something from the query results before further processing.

The above data can be represented as an extensible record too.

record Person = { name: String, age: Int }

val p = Person { name = "Joe", age = 36 }

Now, of course that's not valid Scala syntax. As said, we do not have a first-class record system available. Fortunately, emulating records with HLists turns out to work quite well (compilation failures are sometimes pretty horrendous, though :).

val p = ("name" ->> "Joe") :: ("age" ->> 36) :: HNil

A record field is created with the function '->>'. The resulting type is FieldType[K, V], where K is the singleton type of the String literal representing the key of the field. To read more about singleton types (and records), see Shapeless.

Using records

Basic data access and data modifications are supported.

scala> import sqltyped._
scala> import shapeless._
scala> val p = ("name" ->> "Joe") :: ("age" ->> 36) :: HNil

scala> p.show
res0: String = { name = Joe, age = 36 }
scala> p get "name"
res1: String = Joe

scala> p get "lname"
<console>:19: error: No field String("lname") in record ....
              p get "lname"

scala> val p2 = ("lname" ->> "Doe") :: p
scala> p2.show
res2: String = { lname = Doe, name = Joe, age = 36 }
scala> p2 get "lname"
res3: String = Doe

scala> val p3 = p - "name"
scala> p3.show
res3: String = { age = 36 }

scala> val p4 = p renameField("name", "lname")
scala> p4.show
res4: String = { lname = Joe, age = 36 }

scala> val p5 = p.updateWith("name")(_.length)
scala> p5 get "name"
res5: Int = 3

Advanced use is possible by using HList utilities provided by Shapeless. Like mapping over record fields.

import shapeless.record.{field, FieldType}
object toUpper extends Poly1 {
  implicit def stringToUpper[F] = at[FieldType[F, String]] {
    f => field[F](f.toUpperCase)
  }

  implicit def otherTypes[X] = at[X](identity)
}

scala> val p6 = p map toUpper
scala> p6.show
res6: String = { name = JOE, age = 36 }

We have to define toUpper as a polymorphic function. If a value is of type String, we uppercase it. For other types we just use the identity function.

Referring to records' type (e.g. when passing to functions)

If you need to pass the generated records to a function, you need to define the parameter type. One way to do that is to use the newer shapeless record type syntax, e.g.

def fun(x: Record.`"name" -> Int`.T) = x("name")

However, this becomes unwieldy quickly as the types grow. That's why, for convenience, the query type contains a type member called ReturnType, which is the type of the record generated by the macro:

val query = sql("select name, age from person")

def fun(x: query.ReturnType) = x("name") -> x("age")

Conversions to common formats

JSON

scala> import sqltyped.json4s
scala> implicit val formats = org.json4s.DefaultFormats 
scala> val rows = sql("select id, name as fname, age from person limit 100").apply
scala> JSON.compact(rows)
res0: String = [{"id":1,"fname":"joe","age":36},{"id":2,"fname":"moe","age":14}]

CSV

scala> val rows = sql("select id, name as fname, age from person limit 100").apply
scala> CSV.fromList(rows.values)
res1: String = 
"1","joe","36"
"2","moe","14"

scala> CSV.fromList(rows.values, separator = ";")
res2: String = 
"1";"joe";"36"
"2";"moe";"14"

If the query returns a Date, Time or Timestamp, add a proper Show instance to scope:

implicit object ShowTimestamp extends Show[java.sql.Timestamp] {
  def show(tstamp: java.sql.Timestamp) = tstamp.toString
}

Untyped List

scala> val rows = sql("select name, age from person limit 100").apply
scala> Record.toTupleLists(rows)
res2: List[List[(String, Any)]] = List(List(("name", "joe"), ("age", 36)), List(("name", "moe"), ("age", 14)))

Runtime query building

To infer the types, the SQL statement must be known statically at compile time. But what if we have to build the SQL at runtime? Say, for instance, that some parts of the query depend on user input. It turns out that if we relax the inference requirements a bit, we can add some support for runtime query building.

When the SQL is fully known at compile time we can use full type inference:

sql("select age, name from person where age > ? and name = ?")

// Inferred type will be:
(Int, String) => List[{ age: Int, name: String }]

To support runtime query building we do not try to infer the types of input parameters:

sql"select age, name from person where $whereExpr"

// Inferred type will be:
Seq[Any] => List[{ age: Int, name: String }]

Note that we switched to the string interpolation syntax, which allows the where-part of the SQL statement to be built dynamically. Here the sqlτyped macro parses only the beginning of the statement to infer the return type. Input parameters are not inferred and therefore can't be verified at compile time. Now we can build the rest of the query at runtime:

scala> val where = "age > ?" + " and " + "age < ?"
scala> val orderBy = "age"
scala> val q = sql"select name from person where $where order by $orderBy"
scala> q.apply(Seq(5, 100))
res0: List[String] = List("moe", "joe")

If you find string concatenation lame (which of course it is), you can quickly craft a small library which lets you compose the where-part of the query, as in the example below.

sealed trait Expr {
  def sql: String = this match {
    case Predicate(e, _) => e
    case And(l, r) => "(" + l.sql + " and " + r.sql + ")"
    case Or(l, r) => "(" + l.sql + " or " + r.sql + ")"
  }

  def args: Seq[Any] = this match {
    case Predicate(_, as) => as
    case And(l, r) => l.args ++ r.args
    case Or(l, r) => l.args ++ r.args
  }

  def and(other: Expr) = And(this, other)
  def or(other: Expr)  = Or(this, other)
}

case class Predicate(sqlExpr: String, arguments: Seq[Any]) extends Expr
case class And(l: Expr, r: Expr) extends Expr
case class Or(l: Expr, r: Expr) extends Expr

def pred(sql: String, args: Any*) = Predicate(sql, args)

Query composition is now a lot nicer:

val p1 = pred("age > ?", 15)
val p2 = pred("age < ?", 2)
val p3 = pred("length(name) < ?", 6)

val expr = (p1 or p2) and p3

sql"select name from person where ${expr.sql}".apply(expr.args)

Notes about compilation performance

It is important to understand the effect of compile-time metaprogramming on compilation times so that they do not grow too much. The performance analysis of sqlτyped can be split into two parts.

1. Processing of SQL statements

Set the system property 'sqltyped.enable-timer' and a compilation time breakdown of each SQL statement is printed during compilation. Processing of the first SQL statement is the slowest as the database schema needs to be read. On my current project the timings are roughly as follows (this is just to give an idea; the timings of your project will be different):

  • 1st stmt: ~1000ms
  • Rest: ~10 - 50ms each

To compile 100 SQL statements in this project takes 2s - 6s on my machine.
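One way to set the property is from the sbt build, next to the compile time connection properties (a sketch; the value "true" is an assumption, and any mechanism that sets the JVM system property for the compiling JVM works):

initialize ~= { _ => System.setProperty("sqltyped.enable-timer", "true") }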

2. Records

sqlτyped models rows returned by SQL statements as records. A record is an HList, a data structure where the compiler keeps track of the type of each element. Thus, working with large HLists can put quite a burden on the compiler. I have noticed some nasty compiler performance degradation when the size of the HList grows beyond ~35. To mitigate this, consider:

  • Avoid 'SELECT *' on tables having many columns.
  • Do not select columns you don't need.
  • If you still have a query returning many columns consider moving logic which processes those rows to a separate file. This can help incremental compilation (but won't help with clean compiles).