: (etl '(all the financial data))
ETL project of financial data: equity prices, bond yields, risk free rate of return, GDP data, currency prices, and real estate indecies by zip code.
Quandl has this data available in an API, so this project wraps for insertion to a target postgresql database.
See my talk from the February 2017 Clojure PDX meetup for more about the project https://github.com/skilbjo/articles/blob/master/talks/bare_metal_to_aws_lambda.md
- Generic key mapper (xform into postgres column names)
- Generic value mapper (xform into postgres dates)
- Upsert-multi function <<-- this was a hard one
- Come up with a time/now - subtract 5 days
- Look at JVM settings in profile.clj (does jar need profile?) & deploy/bin/run-job
- Model economics / interest_rates appropriately
- For currency, parse out usd currency and put in column (ie, EUR)
- Better naming, any ideas for refactoring
export quandl_api_key=''
export jdbc_db_uri=''
export test_jdbc_db_uri=$jdbc_db_uri
lein trampoline run -m jobs.equities
LEIN_FAST_TRAMPOLINE=true lein trampoline run -m jobs.equities
llein run -m jobs.equities
lein run -m backfill.equities -d 2018-06-30
or, if you need something manually,
insert into dw.equities_fact
(dataset, ticker, date, close) values
('MSTAR','VMMXX','2018-01-02',1.00)
lein deps :tree 2>deps 1>/dev/null && vim deps
lein cljfmt check ; lein cljfmt fix
lein kibit >tmp ; vim tmp && rm tmp
(refresh)
(use '[clojure.tools.namespace.repl :only (refresh)])
(require '[markets-etl.api :as api])
(def ticker "FB")
(def query-params
{:limit 500
:start_date "2018-07-25"
:end_date "2018-07-28"})
(api/query-tiingo! ticker query-params)
git remote add pi-vpn ssh://skilbjo@router.:43/~/deploy/git/markets-etl.git
git remote add pi-home ssh://skilbjo@pi1/~/deploy/git/markets-etl.git
vim .git/hooks/pre-commit
#!/usr/bin/env bash
lein pom 2>&1 dev-resources/pom.xml
In order to edit a jar with vim in alpine linux, install a non-busybox unzip binary. The busybox binary:
BusyBox v1.25.1 (2017-11-23 08:48:46 GMT) multi-call binary.
Usage: unzip [-lnopq] FILE[.zip] [FILE]... [-x FILE...] [-d DIR]
is not the fully featured version. This is:
UnZip 6.00 of 20 April 2009, by Info-ZIP. Maintained by C. Spieler. Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.
Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
install with:
apk add --no-cache unzip
Clojure core functions (map, reduce) vs reducers (r/map, r/reduce):
(defn execute! [cxn data]
(jdbc/with-db-transaction [txn cxn]
(->> data
(map prepare-row)
flatten
(map #(update-or-insert! txn %))
doall)))
(defn execute!' [cxn data]
(jdbc/with-db-transaction [txn cxn]
(->> data
(r/map prepare-row)
(into '())
flatten
(map #(update-or-insert! txn %))
doall)))
@mbp:markets-etl $ lein run -m benchmark.equities
### Traditional Clojure core functions
Evaluation count : 540 in 60 samples of 9 calls.
Execution time mean : 131.794661 ms
Execution time std-deviation : 20.667398 ms
Execution time lower quantile : 108.522897 ms ( 2.5%)
Execution time upper quantile : 190.814936 ms (97.5%)
Overhead used : 2.252589 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 3 (5.0000 %)
low-mild 2 (3.3333 %)
Variance from outliers : 85.8712 % Variance is severely inflated by outliers
### Clojure reducer functions
Evaluation count : 360 in 60 samples of 6 calls.
Execution time mean : 172.229434 ms
Execution time std-deviation : 24.243476 ms
Execution time lower quantile : 146.725239 ms ( 2.5%)
Execution time upper quantile : 235.378479 ms (97.5%)
Overhead used : 2.252589 ns
Found 3 outliers in 60 samples (5.0000 %)
low-severe 3 (5.0000 %)
Variance from outliers : 82.4206 % Variance is severely inflated by outliers
Adding new tickers will cause sort order of actual results to change, which will cause integration tests to fail.
To resove, instead of editing manually, best to just print f/result
or
f/result'
, format using http://repo.tiye.me/mvc-works/edn-formatter/, and
replace in test/fixtures/equities.clj