Commit
Merge pull request #1 from biobricks-ai/fix-encoding
Set input file encoding to latin1
zmughal authored Jan 11, 2024
2 parents 70fa054 + 79147a9 commit 4e94de7
Showing 2 changed files with 7 additions and 5 deletions.
8 changes: 4 additions & 4 deletions dvc.lock
@@ -19,11 +19,11 @@ stages:
 size: 33866
 - path: stages/download.R
 hash: md5
-md5: fde14a12dfe43458e11081702c2d3625
-size: 831
+md5: ec800973d471bab40b8abda8e60f39cc
+size: 1025
 outs:
 - path: brick/invitrodb.parquet
 hash: md5
-md5: 79a0001b906d40f06a0efa2fea6a4bc1.dir
-size: 718228047
+md5: 110a6e966af64a8ad0c686b7bf8c908e.dir
+size: 718228060
 nfiles: 4
4 changes: 3 additions & 1 deletion stages/download.R
@@ -8,7 +8,9 @@ invitrodb = "https://clowder.edap-cluster.com/files/63642290e4b04f6bb140a10d/blo
 options(timeout = 600)
 download.file(invitrodb, destfile = stage, mode = "wb")

-df = readr::read_csv(stage)
+# Need encoding due to the ± character:
+# $ perl -F, -nE 'next unless /[^\x00-\x7F]/; say $F[3]' staging/invitrodb.csv | sort | uniq -c
+df = readr::read_csv( file = stage, locale = readr::locale(encoding = "latin1") )
 out = fs::dir_create("brick/invitrodb.parquet")

 # See
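
The comment in the diff attributes the change to a ± character that is stored as a non-ASCII byte. As a rough illustration of why that matters (readr assumes UTF-8 unless told otherwise), the shell sketch below builds a made-up two-row CSV containing the Latin-1 byte 0xB1, finds it with the same non-ASCII test the Perl one-liner uses, and shows what the byte becomes once decoded as Latin-1 and written out as UTF-8. The sample file and its contents are hypothetical, not taken from invitrodb.csv, and the sketch assumes `perl`, `iconv`, and `od` are installed.

```shell
# Hypothetical sample: row 1 holds the single byte 0xB1 (octal 261), which is "±" in Latin-1.
sample=$(mktemp)
printf 'id,value\n1,5 \261 2\n2,plain\n' > "$sample"

# Count lines containing any byte outside 7-bit ASCII
# (same character class as the Perl one-liner in the commit comment).
non_ascii_lines=$(perl -ne 'print if /[^\x00-\x7F]/' "$sample" | wc -l | tr -d ' ')

# Decode as Latin-1 and re-encode as UTF-8, then dump the affected row as hex bytes.
converted=$(iconv -f ISO-8859-1 -t UTF-8 "$sample" | sed -n '2p')
converted_hex=$(printf '%s' "$converted" | od -An -tx1 | tr -d ' \n')

echo "lines with non-ASCII bytes: $non_ascii_lines"
echo "affected row as UTF-8 hex:  $converted_hex"

rm -f "$sample"
```

In the hex dump the lone 0xB1 shows up as the pair `c2 b1`, the UTF-8 encoding of ±. A bare 0xB1 is not a valid UTF-8 sequence on its own, which is why the reader has to be told the input encoding.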