Summary
HtmlCleaner is an open source HTML parser written in Java. HTML found on the Web is usually dirty, ill-formed and unsuitable for further processing. For any serious consumption of such documents, it is necessary to first clean up the mess and bring some order to the tags, attributes and ordinary text. For any given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those most web browsers use to build the Document Object Model. However, you can provide custom tag and rule sets for tag filtering and balancing.
Versions
v2.13 :: 0 :: gentoo
- Modified
- License: BSD
- Keywords: ~amd64 ~x86
- USE flags: doc source test
USE flags
General
- doc: Add extra documentation (API, Javadoc, etc). It is recommended to enable this per package instead of globally.
- source: Zip the sources and install them.
- test: Enable dependencies and/or preparations necessary to run tests (usually controlled by FEATURES=test but can be toggled independently).
elibc
- FreeBSD: ELIBC setting for systems that use the FreeBSD C library.
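The doc flag description above recommends enabling it per package rather than globally. On Gentoo this is normally done with an entry under /etc/portage/package.use. The following is a minimal sketch, not an official procedure: the scratch directory `root` and the file name `htmlcleaner` are assumptions made for illustration, so that the commands can be tried without touching a real system.

```shell
#!/bin/sh
# Scratch directory standing in for "/" (an assumption for this sketch);
# on a real Gentoo system the target would be /etc/portage/package.use/.
root=$(mktemp -d)
mkdir -p "$root/etc/portage/package.use"

# One line per package atom, followed by the USE flags to enable for it.
# The file name "htmlcleaner" is arbitrary.
echo 'dev-java/htmlcleaner doc source' > "$root/etc/portage/package.use/htmlcleaner"

# Show the resulting entry.
cat "$root/etc/portage/package.use/htmlcleaner"
```

With such an entry in place, the doc and source flags apply only to dev-java/htmlcleaner and leave the global USE setting unchanged.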
Dependencies
app-arch / unzip : unzipper for pkzip-compressed files
app-arch / zip : Info ZIP (encryption support)
dev-java / ant-core : Java-based build tool similar to 'make' that uses XML configuration files
dev-java / java-config : Java environment configuration query tool
dev-java / jdom : Java API to manipulate XML data
dev-java / junit : Simple framework to write repeatable tests
Runtime Dependencies
app-arch / zip : Info ZIP (encryption support)
dev-java / java-config : Java environment configuration query tool
Change logs
- Repository mirror & CI · gentoo
  Merge updates from master
- Michał Górny · gentoo
  */*: [QA] Fix trivial cases of MissingTestRestrict
  The result was achieved via the following pipeline:
      pkgcheck scan -c RestrictTestCheck -R FormatReporter \
        --format '{category}/{package}/{package}-{version}.ebuild' |
      xargs -n32 grep -L RESTRICT |
      xargs -n32 sed -i -e '/^IUSE=.*test/aRESTRICT="!test? ( test )"'
  The resulting metadata was compared before and after the change. Few Go ebuilds had to be fixed manually due to implicit RESTRICT=strip added by the eclass. Two ebuilds have to be fixed because of multiline IUSE.
  Suggested-by: Robin H. Johnson <robbat2@gentoo.org>
  Closes: https://github.com/gentoo/gentoo/pull/13942
  Signed-off-by: Michał Górny <mgorny@gentoo.org>
- Robin H. Johnson · gentoo
  Drop $Id$ per council decision in bug #611234.
  Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
- Patrice Clement · gentoo
  dev-java/htmlcleaner: Stable for amd64. Retroactively mark stable for the remaining arches using the ALLARCHES policy.
  Package-Manager: portage-2.2.26
- James Le Cuirot · gentoo
  dev-java/htmlcleaner: JAVA_CLASSPATH_EXTRA has been renamed
  Package-Manager: portage-2.2.20.1
- James Le Cuirot · gentoo
  dev-java/htmlcleaner: Imported and bumped from java-overlay
  Closes bug #369977. Package-Manager: portage-2.2.20.1
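The last two stages of the pipeline quoted in the MissingTestRestrict entry can be tried on throwaway files. The sketch below is only an illustration: the two sample ebuilds and their contents are made up, the pkgcheck stage is replaced by a plain list of file paths, and the one-line form of the sed `a` command assumes GNU sed.

```shell
#!/bin/sh
# Scratch directory with two made-up ebuilds (hypothetical names and contents).
tmpdir=$(mktemp -d)

# Lacks RESTRICT: grep -L reports it, so sed should modify it.
printf '%s\n' 'EAPI=7' 'IUSE="doc source test"' 'SLOT="0"' > "$tmpdir/a-1.0.ebuild"
# Already sets RESTRICT: grep -L skips it, so it stays untouched.
printf '%s\n' 'EAPI=7' 'IUSE="test"' 'RESTRICT="test"' > "$tmpdir/b-1.0.ebuild"

# Stand-in for the pkgcheck stage: just print the candidate paths.
# grep -L keeps only files that do NOT contain "RESTRICT"; sed then appends
# a RESTRICT line after each line starting with IUSE= that mentions "test".
printf '%s\n' "$tmpdir"/*.ebuild |
  xargs -n32 grep -L RESTRICT |
  xargs -n32 sed -i -e '/^IUSE=.*test/aRESTRICT="!test? ( test )"'

# Show the modified file.
cat "$tmpdir/a-1.0.ebuild"
```

Because the sed address matches single lines only, an IUSE declaration spread over several lines is not handled this way, which is consistent with the commit's note that ebuilds with multiline IUSE needed separate fixes.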