Lauda

An ETL'ish tool that transfers data between any JDBC-compliant databases.

Lauda keeps its own simple repository in a given database. The repository can be configured and loaded either from the command line or directly in the database. Keeping extract-load definitions in the database makes it easier for database people to manage transfers using a database client. For the same reason, Lauda has a simple database model and no ORM in its source.

Lauda has three concepts for defining an ETL process:

connection: Connection definition for the source and target databases. These definitions are kept in the connections table in the repository. A connection has the following attributes:
- name: Unique name of the connection. Case sensitive.
- url: JDBC url for the connection. Example: jdbc:oracle:thin:@localhost:1521:orcl
- username: User name for the connection.
- password: Password for the connection.
- class_name: Class name of the JDBC driver. Example: oracle.jdbc.driver.OracleDriver
mapping: Mapping definition of the extract and load targets. This data is kept in the mappings table in the repository. A mapping has the following attributes:
- name: Unique name of the mapping. Case sensitive.
- source_connection: Name of the source connection. Lookup to connections.
- target_connection: Name of the target connection. Lookup to connections.
- source_table: Name of the source table. Give a database/schema prefix if the table is in a different database/schema than the connection. For example, the value can be my_schema.table_name or just table_name
- target_table: Name of the target table. Give a database/schema prefix if the table is in a different database/schema than the connection. For example, the value can be my_schema.table_name or just table_name
- source_hint: Optional source query hint. Example: /*+ parallel(8) */
- target_hint: Optional target statement hint. Example: /*+ nologging */
- filter: Filter expression for the source. Example: column_a in ('X', 'Y', 'Z') and column_b=2
- batch_size: Batch select and insert size. Defaults to 1000
- drop_create: Drop the target table before insert. Defaults to 0 (false)
Since some databases do not support booleans, boolean options like drop_create are stored as integers in the repository tables. Give 0 for false and any positive integer for true. In repository data files (see command line options and data file loading) you can use true or false.
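
To make batch_size and the integer-encoded booleans concrete, here is a minimal Python sketch of a batched copy. It is an illustration only, not Lauda's implementation: Lauda works over JDBC, while this sketch uses in-memory SQLite databases, and the names as_bool, copy_in_batches, demo, src and tgt are hypothetical.

```python
import sqlite3


def as_bool(flag):
    """Repository booleans are integers: 0 is false, any positive integer is true."""
    return int(flag) > 0


def copy_in_batches(source, target, select_sql, insert_sql, batch_size=1000):
    """Fetch at most batch_size rows at a time and insert each chunk as one batch.

    Returns a (rows_copied, batches_used) tuple.
    """
    cursor = source.execute(select_sql)
    rows_copied = batches = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(insert_sql, rows)
        target.commit()
        rows_copied += len(rows)
        batches += 1
    return rows_copied, batches


def demo():
    # Hypothetical source and target; both are throwaway in-memory databases.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE src (id INTEGER, name TEXT)")
    source.executemany(
        "INSERT INTO src VALUES (?, ?)", [(i, f"row{i}") for i in range(2500)]
    )
    drop_create = 1  # as it would be stored in the repository table
    if as_bool(drop_create):
        target.execute("DROP TABLE IF EXISTS tgt")
    target.execute("CREATE TABLE tgt (id INTEGER, name TEXT)")
    return copy_in_batches(
        source,
        target,
        "SELECT id, name FROM src",
        "INSERT INTO tgt (id, name) VALUES (?, ?)",
    )
```

With the default batch size of 1000, the 2500 rows in demo() are copied in three batches. A smaller batch size lowers memory use per round trip at the cost of more round trips.
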
column mapping: Column mapping pairs between source and target tables. These mappings are stored in the columns table in the repository. A column mapping has the following attributes:
- mapping: Name of the mapping this pair belongs to. Lookup to mappings.
- source: Name of the source column or an expression. Example: COL_A or to_char(COL_DATE, 'yyyymmdd')
- target: Name of the target column into which the source value is inserted.
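
As a rough picture of how the three concepts fit together, the following Python sketch assembles a source query and a target statement from one mapping and its column pairs. This is an assumption about the general shape of the generated SQL, not Lauda's actual code: where Lauda places hints, and the table and column names used here, are hypothetical.

```python
def build_select(mapping, columns):
    """Build the source query from a mapping and its (source, target) pairs."""
    hint = f" {mapping['source_hint']}" if mapping.get("source_hint") else ""
    # A source entry may be a plain column name or an expression.
    expressions = ", ".join(source for source, _ in columns)
    sql = f"SELECT{hint} {expressions} FROM {mapping['source_table']}"
    if mapping.get("filter"):
        sql += f" WHERE {mapping['filter']}"
    return sql


def build_insert(mapping, columns):
    """Build a parameterized insert for the target table."""
    hint = f" {mapping['target_hint']}" if mapping.get("target_hint") else ""
    targets = ", ".join(target for _, target in columns)
    params = ", ".join("?" for _ in columns)
    return f"INSERT{hint} INTO {mapping['target_table']} ({targets}) VALUES ({params})"


# Hypothetical mapping and column pairs, shaped like the attributes above.
mapping = {
    "source_table": "my_schema.orders",
    "target_table": "orders_copy",
    "source_hint": "/*+ parallel(8) */",
    "target_hint": None,
    "filter": "column_b=2",
}
columns = [("COL_A", "COL_A"), ("upper(COL_NAME)", "NAME_UPPER")]

print(build_select(mapping, columns))
print(build_insert(mapping, columns))
```

Each column pair contributes one select expression and one insert column, in the same order, so the rows fetched by the query line up with the insert parameters.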

Installation

Make sure you have Java 1.8 or later.
Download the archive file from dist.
Extract lauda-[version].tar.gz to the desired location.
Put the required JDBC drivers in the lib folder.

Building from source

Make sure you have sbt installed.
Clone the repository.
Run dist.sh.

Configuration

Use config.yml for configuration.

Command line arguments

Usage: lauda [-hV] [--columns=<columns>]
             [--source-connection=<sourceConnection>]
             [--source-table=<sourceTable>]
             [--target-connection=<targetConnection>]
             [--target-table=<targetTable>] <command>
Loads data between databases https://github.com/bluecolor/lauda
      <command>             Command to exeucte
                            See https://github.
                              com/bluecolor/lauda#command-line-arguments
      --columns=<columns>   Comma seperated list of columns. Optional
  -h, --help                Show this help message and exit.
      --source-connection=<sourceConnection>
                            Source connection name
      --source-table=<sourceTable>
                            Source table name. can be schema.table_name or just
                              table_name
      --target-connection=<targetConnection>
                            Target connection name
      --target-table=<targetTable>
                            Target table name. can be schema.table_name or just
                              table_name
  -V, --version             Print version information and exit.

<command> can be one of the following; some commands take a parameter:

repository.up: Initialize the repository. Uses config.yml
repository.down: Drop the repository. Uses config.yml
repository.import: Import repository data from the data file given as parameter
repository.print.connections: Print connections
repository.print.mappings: Print mappings
repository.print.columns: Print source and target columns of the mapping given as parameter
mapping.delete: Delete a mapping by name, given as parameter
mapping.exists: Check whether the mapping given as parameter exists
mapping.run: Run the mapping given as parameter
mapping.create: Create the target table of the mapping given as parameter
connection.delete: Delete a connection by name
connection.test: Test a JDBC connection by name

Examples

Create repository, using the parameters in config.yml

./lauda.sh repository.up

Reset/drop repository, using the parameters in config.yml

./lauda.sh repository.down

Import mapping and connection definitions. See example data file.

./lauda.sh repository.import examples/repository-seed.yml

Print available connections

./lauda.sh repository.print.connections

Print defined mappings

./lauda.sh repository.print.mappings

Print the source and target columns of the given mapping

./lauda.sh repository.print.columns mapping_name

Delete a mapping definition (its column mappings are deleted too)

./lauda.sh mapping.delete mapping_name

./lauda.sh mapping.delete "Mapping name"

Check if mapping with given name already exists

./lauda.sh mapping.exists mapping_name

Run mapping

./lauda.sh mapping.run mapping_name

Create the target table of a mapping. Does not run the mapping; only creates the target table.

./lauda.sh mapping.create mapping_name

Delete given connection definition

./lauda.sh connection.delete connection_name

./lauda.sh connection.delete "Connection name"

Test the connection

./lauda.sh connection.test connection_name

Generate mapping

  ./lauda.sh mapping.generate table_4 \
    --source-connection oracle_1 \
    --target-connection postgre_2 \
    --source-table source_table_3 \
    --target-table target_table_3

Loading definitions

An ETL definition can be given either directly in the repository, using a database client, or through the command line. With a database client, you can execute a script like this one. For the command line, use the repository.import command and give it a data file like this one. See examples for different options.

Example command line usage:

./lauda.sh repository.import /path/to/data-file.yml

./lauda.sh repository.import ./data-file.yml

Oracle to Oracle transfers

For data transfers between two Oracle databases you can use elo, which uses Oracle database links and is faster.
