app-text / pdfsandwich

generator of sandwich OCR pdf files

Official package sites : http://www.tobias-elze.de/pdfsandwich

pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (no text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for running OCR on scanned books or journals. It can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script that calls the following binaries: convert, cuneiform, gs, and hocr2pdf. It is known to run on Unix systems and has been tested on Linux and Mac OS X. It supports parallel processing on multiprocessor systems.
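To make the wrapper idea above concrete, here is a minimal Python sketch of how the four named binaries could be chained for one page and then merged. It only builds the command lines and does not run them. The stage order, the command-line flags, the BMP intermediate format, and the names `Step`, `plan_page`, and `plan_merge` are illustrative assumptions, not taken from pdfsandwich itself.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional


@dataclass
class Step:
    """One external command in the hypothetical pipeline."""
    argv: List[str]
    stdin_path: Optional[str] = None  # file to feed to the command's standard input


def plan_page(pdf: str, page: int, workdir: str, dpi: int = 300) -> List[Step]:
    """Build the assumed per-page steps: rasterize, OCR to hOCR, make a sandwich page."""
    work = PurePosixPath(workdir)
    stem = f"page_{page:04d}"
    image = str(work / f"{stem}.bmp")   # assumption: BMP as the intermediate image
    hocr = str(work / f"{stem}.hocr")
    out = str(work / f"{stem}.pdf")
    return [
        # ImageMagick: render a single PDF page (zero-based index) to an image.
        Step(["convert", "-density", str(dpi), f"{pdf}[{page}]", image]),
        # cuneiform: recognize the text and write it in hOCR format.
        Step(["cuneiform", "-f", "hocr", "-o", hocr, image]),
        # hocr2pdf: read hOCR on stdin and place the text behind the page image.
        Step(["hocr2pdf", "-i", image, "-o", out], stdin_path=hocr),
    ]


def plan_merge(page_pdfs: List[str], output: str) -> Step:
    """Build the assumed Ghostscript call that joins the per-page PDFs."""
    return Step(["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
                 f"-sOutputFile={output}", *page_pdfs])


if __name__ == "__main__":
    # Print the plan for a hypothetical two-page scan.
    pages = [plan_page("scan.pdf", n, "/tmp/work") for n in range(2)]
    for steps in pages:
        for step in steps:
            print(" ".join(step.argv))
    outputs = [steps[-1].argv[-1] for steps in pages]
    print(" ".join(plan_merge(outputs, "scan_ocr.pdf").argv))
```

Because each page's steps are independent of every other page, a plan like this could be dispatched to a worker pool, which is one plausible way to get the parallel processing mentioned above.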

v0.1.4-r1 :: 0 :: gentoo

Modified
License
GPL-2
Keywords
~amd64 ~x86
USE flags
png

General

png
Add support for libpng (PNG images)

dev-lang / ocaml : Type-inferring functional programming language descended from the ML family

sys-apps / gawk : GNU awk pattern-matching language

app-text / ghostscript-gpl : Interpreter for the PostScript language and PDF

app-text / tesseract : An OCR Engine, originally developed at HP, now open source.

app-text / unpaper : Post-processor for scanned and photocopied book pages

media-gfx / exact-image : A fast, modern and generic image processing library

virtual / imagemagick-tools : Virtual for imagemagick command line tools

611532
app-text/pdfsandwich-0.1.6 version bump, description update
Robin H. Johnson · gentoo
Drop $Id$ per council decision in bug #611234.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
David Seifert · gentoo
app-text/pdfsandwich: Depend on virtual/imagemagick-tools
Package-Manager: Portage-2.3.3, Repoman-2.3.1
Closes: https://github.com/gentoo/gentoo/pull/3907
Robin H. Johnson · gentoo
proj/gentoo: Initial commit
This commit represents a new era for Gentoo: Storing the gentoo-x86 tree in Git, as converted from CVS.
This commit is the start of the NEW history. Any historical data is intended to be grafted onto this point.
Creation process:
1. Take final CVS checkout snapshot
2. Remove ALL ChangeLog* files
3. Transform all Manifests to thin
4. Remove empty Manifests
5. Convert all stale $Header$/$Id$ CVS keywords to non-expanded Git $Id$
5.1. Do not touch files with -kb/-ko keyword flags.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
X-Thanks: Alec Warner <antarus@gentoo.org> - did the GSoC 2006 migration tests
X-Thanks: Robin H. Johnson <robbat2@gentoo.org> - infra guy, herding this project
X-Thanks: Nguyen Thai Ngoc Duy <pclouds@gentoo.org> - Former Gentoo developer, wrote Git features for the migration
X-Thanks: Brian Harring <ferringb@gentoo.org> - wrote much python to improve cvs2svn
X-Thanks: Rich Freeman <rich0@gentoo.org> - validation scripts
X-Thanks: Patrick Lauer <patrick@gentoo.org> - Gentoo dev, running new 2014 work in migration
X-Thanks: Michał Górny <mgorny@gentoo.org> - scripts, QA, nagging
X-Thanks: All of other Gentoo developers - many ideas and lots of paint on the bikeshed