app-text / pdfsandwich

generator of sandwich OCR pdf files

Official package site: http://www.tobias-elze.de/pdfsandwich

pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (no text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. It is a command-line tool intended for OCR of scanned books and journals, and it can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script that calls the following binaries: unpaper, convert, gs, and tesseract. It supports parallel processing on multiprocessor systems.
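The wrapper idea described above can be illustrated with a minimal Python sketch. This is not pdfsandwich's own code (pdfsandwich's actual options and intermediate formats may differ); it only shows how the four named binaries could be chained per page and run in parallel. The function names (page_commands, merge_command, run_all, sandwich), the file naming scheme, and the defaults for language, resolution, and thread count are all assumptions made for illustration. Only basic, well-known invocations of each tool are used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def page_commands(pdf, page, workdir, lang="eng", dpi=300):
    """Build the per-page commands (rasterize, clean, OCR) for a 0-based page index."""
    stem = f"{workdir}/page{page:04d}"  # hypothetical naming scheme
    return [
        # ImageMagick: render one PDF page to a bitmap; "[n]" selects the page.
        ["convert", "-density", str(dpi), f"{pdf}[{page}]", f"{stem}.pnm"],
        # unpaper: clean up the scanned page image.
        ["unpaper", f"{stem}.pnm", f"{stem}_clean.pnm"],
        # tesseract: the "pdf" config writes <stem>.pdf with an invisible text layer.
        ["tesseract", f"{stem}_clean.pnm", stem, "-l", lang, "pdf"],
    ]


def merge_command(page_pdfs, out_pdf):
    """Build the Ghostscript command that concatenates the per-page PDFs."""
    return ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
            f"-sOutputFile={out_pdf}", *page_pdfs]


def run_all(cmds):
    """Run one page's commands in order, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


def sandwich(pdf, out_pdf, n_pages, workdir, threads=4):
    """Process pages in parallel, then merge them in page order."""
    jobs = [page_commands(pdf, p, workdir) for p in range(n_pages)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(run_all, jobs))  # list() re-raises any worker error
    page_pdfs = [f"{workdir}/page{p:04d}.pdf" for p in range(n_pages)]
    subprocess.run(merge_command(page_pdfs, out_pdf), check=True)
```

A call such as sandwich("scan.pdf", "scan_ocr.pdf", n_pages=12, workdir="/tmp/w") would assume that the working directory exists and that all four tools are installed. Threads are sufficient here because the heavy work happens in the external processes, not in Python.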

v0.1.7 :: 0 :: gentoo

License: GPL-2
Keywords: ~amd64 ~x86
USE flags: png

v0.1.4-r1 :: 0 :: gentoo

License: GPL-2
Keywords: ~amd64 ~x86
USE flags: png

General

png : Add support for libpng (PNG images)

dev-lang / ocaml : Programming language supporting functional, imperative & object-oriented styles

sys-apps / gawk : GNU awk pattern-matching language

app-text / ghostscript-gpl : Interpreter for the PostScript language and PDF

app-text / poppler : PDF rendering library based on the xpdf-3.0 code base

app-text / tesseract : An OCR Engine, originally developed at HP, now open source.

app-text / unpaper : Post-processor for scanned and photocopied book pages

media-gfx / exact-image : A fast, modern and generic image processing library

virtual / imagemagick-tools : Virtual for imagemagick command line tools

Repository mirror & CI · gentoo
Merge updates from master
Alfredo Tupone · gentoo
app-text/pdfsandwich: fix dependency and full description
Closes: https://bugs.gentoo.org/611532
Package-Manager: Portage-2.3.99, Repoman-2.3.22
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Repository mirror & CI · gentoo
Merge updates from master
Alfredo Tupone · gentoo
app-text/pdfsandwich: version bump to 0.1.7
Package-Manager: Portage-2.3.99, Repoman-2.3.22
Signed-off-by: Alfredo Tupone <tupone@gentoo.org>
Robin H. Johnson · gentoo
Drop $Id$ per council decision in bug #611234.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
David Seifert · gentoo
app-text/pdfsandwich: Depend on virtual/imagemagick-tools
Package-Manager: Portage-2.3.3, Repoman-2.3.1
Closes: https://github.com/gentoo/gentoo/pull/3907
Robin H. Johnson · gentoo
proj/gentoo: Initial commit
This commit represents a new era for Gentoo: Storing the gentoo-x86 tree in Git, as converted from CVS.
This commit is the start of the NEW history. Any historical data is intended to be grafted onto this point.
Creation process:
1. Take final CVS checkout snapshot
2. Remove ALL ChangeLog* files
3. Transform all Manifests to thin
4. Remove empty Manifests
5. Convert all stale $Header$/$Id$ CVS keywords to non-expanded Git $Id$
5.1. Do not touch files with -kb/-ko keyword flags.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
X-Thanks: Alec Warner <antarus@gentoo.org> - did the GSoC 2006 migration tests
X-Thanks: Robin H. Johnson <robbat2@gentoo.org> - infra guy, herding this project
X-Thanks: Nguyen Thai Ngoc Duy <pclouds@gentoo.org> - Former Gentoo developer, wrote Git features for the migration
X-Thanks: Brian Harring <ferringb@gentoo.org> - wrote much python to improve cvs2svn
X-Thanks: Rich Freeman <rich0@gentoo.org> - validation scripts
X-Thanks: Patrick Lauer <patrick@gentoo.org> - Gentoo dev, running new 2014 work in migration
X-Thanks: Michał Górny <mgorny@gentoo.org> - scripts, QA, nagging
X-Thanks: All of other Gentoo developers - many ideas and lots of paint on the bikeshed