anumhosen/pdf2html

 
 


pdf2html

INFO

pdf2html is a tool that extracts the text from a PDF file and writes it to an HTML file.

It is based on the ExtractText utility from the Apache PDFBox library.
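As a rough illustration of the "output to HTML" half of that job, the sketch below wraps already-extracted text in a minimal HTML page. It is not the project's actual code: the class name HtmlPage, the methods escape and render, and the one-paragraph-per-line layout are all assumptions made for the example, and the text extraction itself (done by PDFBox in the real tool) is left out.

```java
// Hypothetical sketch, not taken from the pdf2html sources.
final class HtmlPage {

    // Escape the characters that have special meaning in HTML.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build a complete page, emitting one <p> element per non-blank line
    // of the extracted text (an assumed layout, chosen for simplicity).
    static String render(String title, String extractedText) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head>\n")
            .append("<meta charset=\"UTF-8\">\n")
            .append("<title>").append(escape(title)).append("</title>\n")
            .append("</head>\n<body>\n");
        for (String line : extractedText.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escape(trimmed)).append("</p>\n");
            }
        }
        html.append("</body>\n</html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text that a PDF text extractor would return.
        System.out.print(render("sample", "Tom & Jerry\n\n1 < 2"));
    }
}
```

Escaping before writing matters because text taken from a PDF can contain characters such as < or &, which would otherwise be read as markup.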

BUILD

To build the project, run ant in the root directory.

Run jarall to build a single merged jar with the dependencies included.

RUN

To run the project, execute java -jar pdf2html-all.jar filename.pdf; a file named filename.html will be written.
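The README only states that filename.pdf produces filename.html. One plausible way to derive the output name is sketched below; the class OutputName, the method htmlNameFor, and the handling of names without a .pdf extension are assumptions for illustration, not the tool's confirmed behavior.

```java
import java.util.Locale;

// Hypothetical sketch, not taken from the pdf2html sources.
final class OutputName {

    // Replace a trailing ".pdf" (in any letter case) with ".html".
    // If the name has no ".pdf" extension, simply append ".html".
    static String htmlNameFor(String pdfName) {
        String lower = pdfName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return pdfName.substring(0, pdfName.length() - ".pdf".length()) + ".html";
        }
        return pdfName + ".html";
    }

    public static void main(String[] args) {
        String input = args.length > 0 ? args[0] : "filename.pdf";
        System.out.println(htmlNameFor(input));
    }
}
```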

For more information, run java -jar pdf2html-all.jar filename.pdf.

License

pdf2html is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

pdf2html is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with pdf2html. If not, see http://www.gnu.org/licenses/.
