Split a given file in several files, for input of batch extraction (ETL) of data

medeiros/split-batch-input-files

split-batch-input-files


This library splits a file into several smaller files so that an ETL mechanism (such as Spring Batch) can read the data in parallel, improving overall performance.

It is intended for use in JVM-based applications.

It is written in Clojure, a functional language.

Todo list:

  • add function to remove generated files
  • improve unit tests using some cool lib
  • add unit tests
  • add file to Leiningen project, in order to generate a lib jar

Clojure style: cheat sheet being adopted:

CheatSheet URL

Installation

$ lein uberjar

Usage

Just call the lib function:

split-file [file pieces]

And your file will be split into pieces.

Examples

(split-file "test-file.csv" 3)
  • test-file.csv file has five lines
  • It will generate:
    • test-file.csv.0: containing 2 lines
    • test-file.csv.1: containing 2 lines
    • test-file.csv.2: containing the remaining fifth line
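
The 2/2/1 distribution in the example above is what you get when each piece holds the ceiling of (total lines / pieces) lines and the last piece takes whatever is left. The following is a minimal sketch of that idea, not the library's actual implementation: the names `chunk-size`, `split-lines` and `split-file-sketch` are hypothetical, and the ceiling rule is an assumption inferred from this single example. Under that rule some inputs produce fewer files than requested (4 lines in 3 pieces gives 2 files).

```clojure
(ns split-sketch
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

;; Hypothetical helper: lines per output file, assuming a ceiling rule.
(defn chunk-size
  [total-lines pieces]
  (long (Math/ceil (/ total-lines (double pieces)))))

;; Hypothetical helper: groups lines into at most `pieces` chunks;
;; the last chunk holds the remainder.
(defn split-lines
  [lines pieces]
  (let [lines (vec lines)]
    (if (empty? lines)
      []
      (partition-all (chunk-size (count lines) pieces) lines))))

;; Hypothetical stand-in for split-file: writes file.0, file.1, ...
;; next to the input and returns the generated paths.
(defn split-file-sketch
  [file pieces]
  (let [lines  (with-open [r (io/reader file)]
                 (vec (line-seq r)))   ; realize lines before the reader closes
        groups (split-lines lines pieces)]
    (doall
     (map-indexed
      (fn [i group]
        (let [out (str file "." i)]
          (spit out (str (str/join "\n" group) "\n"))
          out))
      groups))))
```

In this sketch, `(split-file-sketch "test-file.csv" 3)` on a five-line file writes `test-file.csv.0`, `test-file.csv.1` and `test-file.csv.2`, matching the naming shown above. It reads the whole file into memory, which keeps the code short but may not suit very large inputs.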

License

Copyright © 2019 Daniel Medeiros

Distributed under the MIT License.
