Update dependencies and scripts for MacOS and Homebrew. #15

Open
wants to merge 14 commits into
base: master
247 changes: 155 additions & 92 deletions README.md
@@ -1,172 +1,235 @@
# paperbackup.py

Create a pdf with barcodes to backup text files on paper.
Designed to backup ASCII-armored GnuPG and SSH key files and ciphertext.
Create a PDF with barcodes to backup text files on paper. Designed to
backup ASCII-armored GnuPG and SSH key files and ciphertext.

## How to use

###### Backup
### Backup

```
gpg2 --armor --export-options export-minimal --export-secret-key "User Name" >key.asc
gpg2 --armor --export-options export-minimal --export-secret-key "User Name" > key.asc
paperbackup.py key.asc
paperrestore.sh key.asc.pdf | diff key.asc -
lpr key.asc.pdf
```

This will print out the public and private key of "User Name". The
private key is still encrypted with it's passphrase, so make sure
you don't lose or forget it.
private key is still encrypted with its passphrase, so make sure you
don't lose or forget it.

See some example output here:
https://github.com/intra2net/paperbackup/raw/master/example_output.pdf
<https://github.com/intra2net/paperbackup/raw/master/example_output.pdf>

###### Restore
### Restore

1. Scan the papers
2. Create one file containing all the pages. zbar supports e.g. PDF, TIFF, PNG, JPG,...
3. `paperrestore.sh scanned.pdf >key.asc`
2. Create one file containing all the pages. zbar supports e.g. PDF,
TIFF, PNG, JPG,...
3. `paperbackup-restore.sh scanned.pdf > key.asc`
4. `gpg2 --import key.asc`

If one or more barcodes could not be decoded, try scanning them again. If that does
not work, type in the missing letters from the plaintext output at the end of the pdf.
If one or more barcodes could not be decoded, try scanning them again.
If that does not work, type in the missing letters from the plaintext
output at the end of the PDF.

## Dependencies

- python 3 https://www.python.org/
- python3-pillow https://python-pillow.org/
- PyX http://pyx.sourceforge.net/
- LaTeX (required by PyX) https://www.latex-project.org/
- python3-qrencode https://github.com/Arachnid/pyqrencode
- enscript https://www.gnu.org/software/enscript/
- ghostscript https://www.ghostscript.com/
- ZBar http://zbar.sourceforge.net/
- [Python 3](https://www.python.org/)
- [python3-pillow](https://python-pillow.org/)
- [python3-qrencode](https://github.com/Arachnid/pyqrencode)
- [enscript](https://www.gnu.org/software/enscript/)
- [ghostscript](https://www.ghostscript.com/)
- [LaTeX (required by PyX)](https://www.latex-project.org/)
- [PyX](http://pyx.sourceforge.net/)
- [ZBar](http://zbar.sourceforge.net/)

### macOS with Homebrew

Some of the dependencies listed above are either outdated, or don't work
on Big Sur, recent versions of Homebrew, or newer versions of Python3.
These issues can be addressed by installing [Homebrew](https://brew.sh/)
and then running the new `paperbackup-homebrew-setup.sh` script to
configure the dependencies.

*Note: Please adjust the script if you'd rather use a LaTeX package
other than the BasicTeX cask, or if you'd prefer to use GraphicsMagick
rather than ImageMagick.*

## Why backup on paper?

Some data, like GnuPG or SSH keys, can be really really important for you, like that your whole
business relies on them. If that is the case, you should have multiple backups at multiple
places of it.
Some data, like GnuPG or SSH keys, can be really really important for
you, like that your whole business relies on them. If that is the case,
you should have multiple backups at multiple places of it.

I also think it is a good idea to use different media types for it. Hard disks, flash based
media and CD-R are not only susceptible to heat, water and strong EM waves, but also age.
I also think it is a good idea to use different media types for it. Hard
disks, flash based media and CD-R are not only susceptible to heat,
water and strong EM waves, but also age.

Paper, if properly stored, has proven to be able to be legible after centuries. It is also
quite resistant to fire if stored as a thick stack like a book.
Paper, if properly stored, has proven to be able to be legible after
centuries. It is also quite resistant to fire if stored as a thick stack
like a book.

So I think it is a good idea to throw a backup on paper into the mix of locations and media
types of your important backups.
So I think it is a good idea to throw a backup on paper into the mix of
locations and media types of your important backups.

Storing the paper backup in a machine readable format like barcodes makes it practical to restore
even large amounts in short order. If the paper is too damaged for the barcodes to be readable,
you still have the printed plaintext that paperbackup produces.
Storing the paper backup in a machine readable format like barcodes
makes it practical to restore even large amounts in short order. If the
paper is too damaged for the barcodes to be readable, you still have the
printed plaintext that paperbackup produces.

## How to properly store the paper

The ISO has some standards for preservation and long term storage of paper:
The ISO has some standards for preservation and long term storage of
paper:

- ISO/TC 46/SC 10 - Requirements for document storage and conditions for
preservation

ISO/TC 46/SC 10 - Requirements for document storage and conditions for preservation
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_tc_browse.htm?commid=48842
<http://www.iso.org/iso/home/store/catalogue_tc/catalogue_tc_browse.htm?commid=48842>

Here's an example of what ISO 16245 describes:
http://www.iso.org/iso/livelinkgetfile-isocs?nodeId=15011261

- Here's an example of what ISO 16245 describes:

<http://www.iso.org/iso/livelinkgetfile-isocs?nodeId=15011261>

## Choice and error resilency of barcodes

Only 2D barcodes have the density to make key backup practical. QR Code and DataMatrix are
the most common 2D barcodes.
Only 2D barcodes have the density to make key backup practical. QR Code
and DataMatrix are the most common 2D barcodes.

Using a common barcode symbology makes sure that there are several independent implementations
of decoders available. This increases the probability that they handle defects and error
correction differently and are able to tolerate different kinds of defects. So if the barcode
Using a common barcode symbology makes sure that there are several
independent implementations of decoders available. This increases the
probability that they handle defects and error correction differently
and are able to tolerate different kinds of defects. So if the barcode
gets damaged, you have several programs you can try.

Several papers comparing QR and DataMatrix come to the conclusion that DataMatrix allows
a higher density and offers better means for error correction. I tested this and came
to the conclusion that the QR code decoding programs available to me had better error
resilency than the ones for DataMatrix.
Several papers comparing QR and DataMatrix come to the conclusion that
DataMatrix allows a higher density and offers better means for error
correction. I tested this and came to the conclusion that the QR code
decoding programs available to me had better error resiliency than the
ones for DataMatrix.

The toughest test I found, other than cutting complete parts from a code, was printing
the code, scanning it, printing the scanned image on a pure black and white printer
and then repeating this several times. While the barcode still looks good to the human
eye, this process slightly deforms the barcode in an irregular pattern.
The toughest test I found, other than cutting complete parts from a
code, was printing the code, scanning it, printing the scanned image on
a pure black and white printer and then repeating this several times.
While the barcode still looks good to the human eye, this process
slightly deforms the barcode in an irregular pattern.

libdmtx was still able to decode a DataMatrix barcode with 3 repetitions of the above
procedure. A expensive commercial library was still able to decode after 5 repetitions.
libdmtx was still able to decode a DataMatrix barcode with 3 repetitions
of the above procedure. An expensive commercial library was still able to
decode after 5 repetitions.

ZBar and the commercial library could still decode a QR code after 7 repetitions.
ZBar and the commercial library could still decode a QR code after 7
repetitions.

A laser printed QR code, completely soaked in dirty water for a few hours, rinsed with
clean water, dried and then scanned, could be decoded by ZBar on the first try.
A laser printed QR code, completely soaked in dirty water for a few
hours, rinsed with clean water, dried and then scanned, could be decoded
by ZBar on the first try.

This is why I chose QR code for this program.

## Encoding and data format

In my tests I found that larger QR codes are more at risk to becoming undecodable due to
wrinkles and deformations of the paper. So paperbackup splits the barcodes at 140 bytes of data.
In my tests I found that larger QR codes are more at risk to becoming
undecodable due to wrinkles and deformations of the paper. So
paperbackup splits the barcodes at 140 bytes of data.

QR codes offer a feature to concatenate the data of several barcodes. As this is not supported
by all programs, I chose not to use it.
QR codes offer a feature to concatenate the data of several barcodes. As
this is not supported by all programs, I chose not to use it.

Each barcode is labeled with a start marker `^<sequence number><space>`. After that the raw
and otherwise unencoded data follows.
Each barcode is labeled with a start marker `^<sequence number><space>`.
After that the raw and otherwise unencoded data follows.
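
The labelling scheme described above can be illustrated with a minimal Python sketch. It is not taken from paperbackup.py: `label_chunks` and `first_seq` are hypothetical names, the numbering is assumed to start at 1, and the 140 limit is assumed to apply to the payload only, not to the marker. For ASCII-armored input, characters and bytes are the same thing.

```python
CHUNK_SIZE = 140  # payload characters per barcode, as stated in the README


def label_chunks(data, chunk_size=CHUNK_SIZE, first_seq=1):
    """Split data into pieces and prefix each with '^<sequence number><space>'.

    Assumption: the payload is copied verbatim after the marker, with no
    further encoding, as the README describes.
    """
    pieces = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [f"^{seq} {piece}" for seq, piece in enumerate(pieces, start=first_seq)]
```

Each returned string would then be the content of one QR code. Because the marker carries the sequence number, the codes can be scanned in any order and sorted afterwards.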

## Plaintext output

paperbackup prints the plaintext in addition to the QR codes. If decoding one or more barcodes
should fail, you can use it as fallback.
paperbackup prints the plaintext in addition to the QR codes. If
decoding one or more barcodes should fail, you can use it as fallback.

To ease entering large amounts of "gibberish" like base64 data, each line is printed with
a checksum. The checksum is the first 6 hexadecimal characters of MD5 sum of the line content.
The MD5 is on the "pure" line content without the line break (e.g. \n or \r\n)
To ease entering large amounts of "gibberish" like base64 data, each
line is printed with a checksum. The checksum is the first 6 hexadecimal
characters of MD5 sum of the line content. The MD5 is on the "pure" line
content without the line break (e.g. `\n` or `\r\n`).

To verify a line checksum use
`echo -n "line content" | md5sum | cut -c -6`
To verify a line checksum use:

If a line is too long to be printed on paper, it is split. This is denoted by a "^" character
at the begin of the next line on paper. The "^" is not included in the checksum.
echo -n "line content" | md5sum | cut -c -6

If a line is too long to be printed on paper, it is split. This is
denoted by a `^` character at the beginning of the next line on paper. The
`^` is not included in the checksum.
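
The same checksum can be computed without `md5sum`, which may help on systems where that tool is missing. The following is a minimal sketch, not code from paperbackup.py: `line_checksum` is a hypothetical name, and UTF-8 encoding is an assumption (for ASCII-armored data it makes no difference).

```python
import hashlib


def line_checksum(line):
    """Return the first 6 hex characters of the MD5 sum of one line.

    Trailing line breaks (\\n or \\r\\n) are stripped first, because the
    checksum covers only the "pure" line content.
    """
    content = line.rstrip("\r\n")
    return hashlib.md5(content.encode("utf-8")).hexdigest()[:6]
```

When checking a line that was split on paper, the leading `^` of the continuation has to be removed and the parts joined before calling the function, since the `^` is not part of the checksummed content.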

## Changing the paper format

The program writes PDFs in A4 by default. You can uncomment the respective lines
in the constants section of the source to change to US Letter.
The program writes PDFs for A4 paper by default. You can uncomment the
respective lines in the constants section of the source to change to US
Letter.

## Similar projects

###### paperbackup with reportlab backend https://github.com/tuxlifan/paperbackup
### [paperbackup with reportlab
backend](https://github.com/tuxlifan/paperbackup)

Should behave the same as this paperbackup but with using reportlab instead of PyX/LaTeX for PDF generation.
Any discrepancies should be filed at https://github.com/tuxlifan/paperbackup/issues
Should behave the same as this paperbackup but with using reportlab
instead of PyX/LaTeX for PDF generation. Any discrepancies should be
filed at https://github.com/tuxlifan/paperbackup/issues

###### PaperBack http://ollydbg.de/Paperbak/ and https://github.com/Rupan/paperbak/
### [PaperBack](https://github.com/Rupan/paperbak/)
See also: <http://ollydbg.de/Paperbak/>

Although it is GPL 3, this original version **PaperBack** (program 8-character name PaperBak) is for Windows only (but in 2018 a crossplatform, backwards-compatible, command line version paperbak-cli has been published, see next entry). It uses it's own proprietary barcode type. That allows it to produce much more dense code, but in case of a problem with decoding you are on your own.
Although it is GPLv3, this original version **PaperBack** (program
8-character name PaperBak) is for Windows only (but in 2018 a
cross-platform, backwards-compatible, command line version paperbak-cli
has been published, see next entry). It uses its own proprietary
barcode type. That allows it to produce much more dense code, but in
case of a problem with decoding you are on your own.

###### paperback-cli https://git.teknik.io/scuti/paperback-cli
### [paperback-cli](https://git.teknik.io/scuti/paperback-cli)

Paperback-cli is the crossplatform, backwards-compatible, command line version of Oleh Yuschuk's PaperBack. https://github.com/Wikinaut/paperback-cli is a copy on github. See discussion on https://github.com/Rupan/paperbak/issues/1 for further programs.
Paperback-cli is the cross-platform, backwards-compatible, command line
version of Oleh Yuschuk's PaperBack.
<https://github.com/Wikinaut/paperback-cli> is a copy on github. See
discussion on <https://github.com/Rupan/paperbak/issues/1> for further
programs.

###### ColorSafe https://github.com/colorsafe/colorsafe
### [ColorSafe](https://github.com/colorsafe/colorsafe)

A data matrix for printing on paper. Inspired by PaperBak, ColorSafe is written with modern methods and technologies and is cross-platform. It aims to allow a few Megabytes of data (or more) to be stored on paper for a worst case scenario backup, for extremely long-term archiving, or just for fun. With best practices, ColorSafe encoded data can safely withstand the viccissitudes of technology changes over long periods of time.
A data matrix for printing on paper. Inspired by PaperBak, ColorSafe is
written with modern methods and technologies and is cross-platform. It
aims to allow a few megabytes of data (or more) to be stored on paper
for a worst case scenario backup, for extremely long-term archiving, or
just for fun. With best practices, ColorSafe encoded data can safely
withstand the vicissitudes of technology changes over long periods of
time.

###### Twibright Optar http://ronja.twibright.com/optar/
### [Twibright Optar](http://ronja.twibright.com/optar/)

Uses the not-so-common Golay code to backup 200KB per page. So it offers a much higher
density than paperbackup.py, but is probably more affected by defects on the paper.
GPL 2 and designed for Linux.
Uses the not-so-common Golay code to backup 200KB per page. So it offers
a much higher density than paperbackup.py, but is probably more affected
by defects on the paper. It is licensed under GPLv2 and designed for
Linux.

###### Paperkey http://www.jabberwocky.com/software/paperkey/
### [Paperkey](http://www.jabberwocky.com/software/paperkey/)

It is designed to reduce the data needed to backup a private GnuPG key. It does not help you
to print and scan the data. So it could be used in addition to paperbackup.py.
It is designed to reduce the data needed to backup a private GnuPG key.
It does not help you to print and scan the data. So it could be used in
addition to paperbackup.py.

###### asc2qr.sh https://github.com/4bitfocus/asc-key-to-qr-code
### [asc2qr.sh](https://github.com/4bitfocus/asc-key-to-qr-code)

Very similar to paperbackup.py. But it only outputs .png images without ordering information.
So you have to arrange printing and ordering yourself.
Very similar to paperbackup.py. But it only outputs PNG images without
ordering information, so you have to arrange printing and ordering
yourself.

## License

MIT X11 License
[MIT X11 License](https://opensource.org/licenses/MIT)
<!-- SPDX-License-Identifier: MIT -->

*NB: The upstream author marked this as "MIT X11" (there's an MIT
license, and an X11 license, but there's no SPDX identifier for an "MIT
X11" license) but included the MIT license in the
[LICENSE](https://github.com/intra2net/paperbackup/blob/master/LICENSE)
file. Please contact
[upstream](https://github.com/intra2net/paperbackup) if you're confused
about which license(s) apply.*
Comment on lines +226 to +235

i don't think upstream is ambiguous at all here actually. there's a LICENSE file right here:

https://github.com/intra2net/paperbackup/blob/master/LICENSE

sure, it says "x11" here, but that's probably an oversight.

it looks like a plain old expat to me, so maybe you'd like to clarify that way instead of introducing more ambiguity. :)

in other words:

Suggested change
[MIT X11 License](https://opensource.org/licenses/MIT)
<!-- SPDX-License-Identifier: MIT -->
*NB: The upstream author marked this as "MIT X11" (there's an MIT
license, and an X11 license, but there's no SPDX identifier for an "MIT
X11" license) but included the MIT license in the
[LICENSE](https://github.com/intra2net/paperbackup/blob/master/LICENSE)
file. Please contact
[upstream](https://github.com/intra2net/paperbackup) if you're confused
about which license(s) apply.*
[MIT "Expat" License](https://opensource.org/licenses/MIT)
<!-- SPDX-License-Identifier: MIT -->

18 changes: 18 additions & 0 deletions paperbackup-homebrew-setup.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash

set -e

brew install --cask basictex
brew install \
enscript ghostscript gnu-sed imagemagick libqrencode pillow python3

python3 -m pip install --upgrade pip
python3 -m pip install --upgrade setuptools

# the homebrew zbar package interferes with the compilation of zbar-py
brew uninstall zbar --force

CFLAGS="-I$(brew --prefix)/include" LDFLAGS="-L$(brew --prefix)/lib" pip3 install qrencode pyx zbar-py

# (re)install the zbar package to provide the zbarimg binary
brew install zbar
Comment on lines +1 to +18

isn't such a recipe supposed to go upstream in homebrew directly?

21 changes: 13 additions & 8 deletions paperrestore.sh → paperbackup-restore.sh
@@ -1,33 +1,38 @@
#!/bin/bash
Review comment:

i would argue for rewriting this script in plain POSIX instead of trying to guess where the right utilities are.

#!/usr/bin/env bash

# restore data backed up with paperbackup.py

# give one file containing all qrcodes as parameter

SCANNEDFILE=$1
SCANNEDFILE="$1"

if [ "$(uname)" = "Darwin" ]; then
PATH="/usr/local/bin:$PATH"
alias sed="gsed"
fi

if [ -z "$SCANNEDFILE" ]; then
echo "give one file containing all qrcodes as parameter"
exit 1
fi

if ! [ -f "$SCANNEDFILE" ]; then
echo "$SCANNEDFILE is not a file"
if [ ! -f "$SCANNEDFILE" ]; then
echo "$SCANNEDFILE is not a file" > /dev/stderr
Review comment:

Suggested change
echo "$SCANNEDFILE is not a file" > /dev/stderr
echo "$SCANNEDFILE is not a file" >&2

exit 1
fi

if [ ! -x "/usr/bin/zbarimg" ]; then
echo "/usr/bin/zbarimg missing"
which zbarimg > /dev/null 2>&1 || {
echo "zbarimg not found in PATH" > /dev/stderr
exit 2
fi
}
Comment on lines +24 to +27

in this case i wouldn't bother autodetecting or even warning, you'll have a good error just by calling zbarimg without a path prefix lower down.


# zbarimg ends each scanned code with a newline

# each barcode content begins with ^<number><space>
# so convert that to \0<number><space>, so sort can sort on that
# then remove all \n\0<number><space> so we get the original without newlines added

/usr/bin/zbarimg --raw -Sdisable -Sqrcode.enable "$SCANNEDFILE" \
zbarimg --raw -Sdisable -Sqrcode.enable "$SCANNEDFILE" \
| sed -e "s/\^/\x0/g" \
| sort -z -n \
| sed ':a;N;$!ba;s/\n\x0[0-9]* //g;s/\x0[0-9]* //g;s/\n\x0//g'
Comment on lines 36 to 38

now this is the tricky bit of course. is there a way this could be ported to BSD sed while still being compatible with gnu sed? maybe it's just a matter of:

Suggested change
| sed -e "s/\^/\x0/g" \
| sort -z -n \
| sed ':a;N;$!ba;s/\n\x0[0-9]* //g;s/\x0[0-9]* //g;s/\n\x0//g'
| sed "s/\^/\x0/g" \
| sort -z -n \
| sed ':a;N;$!ba;s/\n\x0[0-9]* //g;s/\x0[0-9]* //g;s/\n\x0//g'

... it's been a while...
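
For comparison, the reassembly step can also be sketched without sed, which sidesteps the GNU/BSD difference entirely. This is a minimal Python sketch and not part of the PR: `reassemble` is a hypothetical name, and it assumes, like the shell pipeline, that the payload never contains a `^<digits><space>` sequence of its own. Unlike the sed version, it does not leave a trailing newline after the last chunk.

```python
import re


def reassemble(zbar_output):
    """Rebuild the original text from `zbarimg --raw` output.

    Each barcode starts with '^<number><space>' and zbarimg ends each
    scanned code with a newline, so one trailing newline per code is dropped.
    """
    # re.split with a capture group yields: [prefix, seq, payload, seq, payload, ...]
    parts = re.split(r"\^(\d+) ", zbar_output)
    chunks = []
    for i in range(1, len(parts), 2):
        payload = parts[i + 1]
        if payload.endswith("\n"):
            payload = payload[:-1]  # newline added by zbarimg, not part of the data
        chunks.append((int(parts[i]), payload))
    chunks.sort(key=lambda chunk: chunk[0])  # numeric sort, like `sort -z -n`
    return "".join(payload for _, payload in chunks)
```

The function takes the complete zbarimg output as one string, so codes scanned out of order are handled by the numeric sort on the sequence number.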

2 changes: 1 addition & 1 deletion paperbackup-verify.sh
@@ -1,4 +1,4 @@
#!/usr/bin/bash
#!/usr/bin/env bash
Review comment:

same here, just make sure the script is posix.


# USAGE: paperbackup-verify.sh backup.pdf
# where backup.pdf should be the pdf created with paperbackup.py