Summary
pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (no text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for OCR of scanned books or journals. It can recognize the page layout, even for multicolumn text. Essentially, pdfsandwich is a wrapper script that calls the following binaries: unpaper, convert, gs, and tesseract. It supports parallel processing on multiprocessor systems.
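Because pdfsandwich delegates its work to external binaries, a quick check that those helpers are on the PATH can save a failed run. The following is a minimal sketch, not part of pdfsandwich itself: the helper name `missing_tools` and the `which` parameter are hypothetical, and the tool list is taken directly from the summary above.

```python
import shutil

# Helper binaries named in the summary above.
REQUIRED_TOOLS = ("unpaper", "convert", "gs", "tesseract")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the given order.

    `which` is injectable (a hypothetical convenience) so the lookup can be
    replaced in tests; by default it is shutil.which.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing helper binaries:", ", ".join(missing))
    else:
        print("all helper binaries found")
```

On a Gentoo system these binaries would normally be pulled in by the runtime dependencies listed further down, so the check is mainly useful when running pdfsandwich outside the package manager.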
Versions
v0.1.7 :: 0 :: gentoo
- Modified
- License: GPL-2
- Keywords: amd64 ~x86
- USE flags: png
USE flags
General
- png: Add support for libpng (PNG images)
Dependencies
dev-lang/ocaml : Programming language supporting functional, imperative & object-oriented styles
Runtime Dependencies
app-text/ghostscript-gpl : Interpreter for the PostScript language and PDF
app-text/poppler : PDF rendering library based on the xpdf-3.0 code base
app-text/tesseract : An OCR Engine, originally developed at HP, now open source
app-text/unpaper : Post-processor for scanned and photocopied book pages
media-gfx/exact-image : A fast, modern and generic image processing library
virtual/imagemagick-tools : Virtual for imagemagick command line tools
Change logs
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: VariableOrderWrong
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Lucio Sauer · gentoo
  */*: inline mirror://sourceforge
  bump copyright of touched ebuilds to 2024
  Signed-off-by: Lucio Sauer <watermanpaint@posteo.net>
  Signed-off-by: Michał Górny <mgorny@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: ignore QA. Built with ocaml
  Closes: https://bugs.gentoo.org/924965
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: stabilize 0.1.7 for amd64
  Closes: https://bugs.gentoo.org/919707
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: filter lto
  Closes: https://bugs.gentoo.org/866043
  Package-Manager: Portage-3.0.30, Repoman-3.0.3
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Aaron Bauman · gentoo
  app-text/pdfsandwich: drop old EAPI=5
  Signed-off-by: Aaron Bauman <bman@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: fix dependency and full description
  Closes: https://bugs.gentoo.org/611532
  Package-Manager: Portage-2.3.99, Repoman-2.3.22
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Repository mirror & CI · gentoo
  Merge updates from master
- Alfredo Tupone · gentoo
  app-text/pdfsandwich: version bump to 0.1.7
  Package-Manager: Portage-2.3.99, Repoman-2.3.22
  Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
- Robin H. Johnson · gentoo
  Drop $Id$ per council decision in bug #611234.
  Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
- David Seifert · gentoo
  app-text/pdfsandwich: Depend on virtual/imagemagick-tools
  Package-Manager: Portage-2.3.3, Repoman-2.3.1
  Closes: https://github.com/gentoo/gentoo/pull/3907
- Robin H. Johnson · gentoo
  proj/gentoo: Initial commit
  This commit represents a new era for Gentoo: Storing the gentoo-x86 tree in Git, as converted from CVS.
  This commit is the start of the NEW history. Any historical data is intended to be grafted onto this point.
  Creation process:
  1. Take final CVS checkout snapshot
  2. Remove ALL ChangeLog* files
  3. Transform all Manifests to thin
  4. Remove empty Manifests
  5. Convert all stale $Header$/$Id$ CVS keywords to non-expanded Git $Id$
  5.1. Do not touch files with -kb/-ko keyword flags.
  Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
  X-Thanks: Alec Warner <antarus@gentoo.org> - did the GSoC 2006 migration tests
  X-Thanks: Robin H. Johnson <robbat2@gentoo.org> - infra guy, herding this project
  X-Thanks: Nguyen Thai Ngoc Duy <pclouds@gentoo.org> - Former Gentoo developer, wrote Git features for the migration
  X-Thanks: Brian Harring <ferringb@gentoo.org> - wrote much python to improve cvs2svn
  X-Thanks: Rich Freeman <rich0@gentoo.org> - validation scripts
  X-Thanks: Patrick Lauer <patrick@gentoo.org> - Gentoo dev, running new 2014 work in migration
  X-Thanks: Michał Górny <mgorny@gentoo.org> - scripts, QA, nagging
  X-Thanks: All of other Gentoo developers - many ideas and lots of paint on the bikeshed