pdf2html

INFO

pdf2html is a tool that extracts the text from a PDF file and writes it to an HTML file.

It is based on the ExtractText utility from the Apache PDFBox library.
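
To illustrate the text-to-HTML half of that pipeline, here is a minimal sketch in plain Java. It is not the project's code and does not call PDFBox: the class name `TextToHtmlSketch`, its methods, and the sample string are all hypothetical, and the input stands in for text that would already have been extracted from a PDF.

```java
public class TextToHtmlSketch {

    // Escape the characters that are special in HTML text content.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap plain text in a minimal HTML page, one <p> per non-blank line.
    public static String toHtml(String title, String text) {
        StringBuilder sb = new StringBuilder();
        sb.append("<html><head><title>").append(escape(title))
          .append("</title></head><body>\n");
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                sb.append("<p>").append(escape(trimmed)).append("</p>\n");
            }
        }
        sb.append("</body></html>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a PDF (assumption for illustration).
        String extracted = "Totals & notes\n\nx < y";
        System.out.print(toHtml("filename.pdf", extracted));
    }
}
```

The escaping step matters because text taken from a PDF can contain `<`, `>`, or `&`, which would otherwise be read as markup in the generated page.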

BUILD

To build the project, run ant in the root directory.

Run jarall to produce a single merged jar with the dependencies included.

RUN

To run the project, execute java -jar pdf2html-all.jar filename.pdf. A file named filename.html will be written as output.

For more information run java -jar pdf2html-all.jar filename.pdf

License

pdf2html is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

pdf2html is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with pdf2html. If not, see http://www.gnu.org/licenses/.
