Steps for installing and running initial commands.
- Follow instructions in the ami3 repo for installing ami3 and maven:
- Make sure ami is in PATH according to installation instructions:
The build puts the executable under <your dir>/ami3/target/appassembler, so set your PATH to include <your dir>/ami3/target/appassembler/bin
- Test that ami is working. If necessary, git pull and rebuild with maven to get the latest version (run from inside the main ami3 directory).
ami --help
git pull
mvn clean install -Dmaven.test.skip=true
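The PATH step above can be sketched in shell as follows. This is a minimal sketch, not part of the ami3 instructions: AMI3_HOME, AMI_BIN and add_to_path are hypothetical names, and the default checkout location ($HOME/ami3) is an assumption that should be changed to wherever ami3 was actually cloned.

```shell
# Assumed location of the ami3 checkout; AMI3_HOME is a hypothetical variable name.
AMI3_HOME="${AMI3_HOME:-$HOME/ami3}"
AMI_BIN="$AMI3_HOME/target/appassembler/bin"

# Append a directory to PATH only if it is not already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

add_to_path "$AMI_BIN"
export PATH

# Report whether the ami launcher can now be found.
if command -v ami >/dev/null 2>&1; then
  echo "ami found at $(command -v ami)"
else
  echo "ami not found; check that $AMI_BIN exists and that the maven build succeeded"
fi
```

Putting these lines in a shell startup file would make the setting persist across sessions; whether that is desirable depends on the local setup.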
- Create a subdirectory containing the downloaded PDF files (for example, one called batterypapers), then make a new project.
ami -p batterypapers makeproject --rawfiletypes pdf
- Extract the PDFs, placing each in its own folder
ami -p batterypapers pdfbox
- Produce a subdirectory in each PDF folder containing the separate images
ami -p batterypapers image
- Produce images scaled up, and with sharpened text
ami -p batterypapers --inputname raw image --scalefactor 2.0
ami -p batterypapers --inputname raw image --sharpen sharpen8
- Test selecting the lines within a figure that correspond to data
ami -p batterypapers --inputname raw pixel
- One-line command to do OCR with tesseract, manually pointing to where tesseract is installed (in this case /opt/local/bin/tesseract):
ami -p batterypapers --inputname raw ocr --html true --tesseract=/opt/local/bin/tesseract --scalefactor 2.0
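Since the tesseract location differs between machines, it may help to look it up rather than type it by hand. The following is a small sketch under that assumption; find_tool and TESSERACT_BIN are hypothetical names, and it only works when tesseract is already on PATH.

```shell
# find_tool is a hypothetical helper: print the full path of a program,
# or return a non-zero status if it is not on PATH.
find_tool() {
  command -v "$1" 2>/dev/null
}

if TESSERACT_BIN="$(find_tool tesseract)"; then
  # Print the option in the form used by the ocr command on this page.
  echo "use: --tesseract=$TESSERACT_BIN"
else
  echo "tesseract not found on PATH; install it or pass its full path by hand"
fi
```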
- Complete one line command:
ami -p <targetDir> --inputname raw -v pdfbox filter -sdm image --posterize 4 ocr --html true --tesseract=/opt/local/bin/tesseract --scalefactor 2.0
- Using ami section to extract table and figure information:
ami -p liion -v --forcemake section --extract tab fig --summary fig tab
- Make summary files of various parts of the paper in XML:
ami -vvv -p . --output sections/body/methods summary --glob **/PMC*/sections/*_body/*_methods/**/*_p.xml --flatten
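The project steps earlier on this page, from makeproject through pixel, can be chained in a small wrapper script. This is only a sketch: PROJECT, DRY_RUN, run and run_pipeline are hypothetical names, while the ami commands themselves are copied unchanged from the steps above. By default it only prints the commands, so it is safe to try before ami is installed.

```shell
#!/bin/sh
# Project directory holding the downloaded PDFs (example name from this page).
PROJECT="${PROJECT:-batterypapers}"
# Hypothetical switch: 1 prints the commands only, anything else executes them.
DRY_RUN="${DRY_RUN:-1}"

# Echo a command, then run it unless in dry-run mode.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@" || { echo "step failed: $*" >&2; return 1; }
  fi
}

# Run the steps in order, stopping at the first failure.
run_pipeline() {
  run ami -p "$PROJECT" makeproject --rawfiletypes pdf &&
  run ami -p "$PROJECT" pdfbox &&
  run ami -p "$PROJECT" image &&
  run ami -p "$PROJECT" --inputname raw image --scalefactor 2.0 &&
  run ami -p "$PROJECT" --inputname raw image --sharpen sharpen8 &&
  run ami -p "$PROJECT" --inputname raw pixel
}

run_pipeline
```

Setting DRY_RUN=0 in the environment before running the script would execute the steps for real.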