This repository contains reusable filters that can be added to a Funnelback collection to extend its filtering capabilities.
Crawl filters can be added to the main filter chain (filter.classes) and operate on whole documents as they are filtered during the gather phase.
See: Developing custom filters
- CA extra filters: Additional content filters for use with the content auditor.
Jsoup filters can be added to the Jsoup filter chain (filter.jsoup.classes).
Jsoup filters are used to transform HTML documents by operating on a Jsoup object representing the HTML structure.
See: Jsoup filters
- Metadata delimiters: Replace delimiters in specified metadata fields.
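To illustrate the kind of transformation the Metadata delimiters filter describes, here is a minimal standalone sketch in plain Java. It is not the filter's actual implementation and does not use the Funnelback or Jsoup APIs. The class name `MetadataDelimiterSketch`, the method `replaceDelimiters`, and the sample field names are all hypothetical, and metadata is modelled as a simple map of field name to content value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: NOT the repository's filter, only an illustration
// of replacing a delimiter in selected metadata fields.
public class MetadataDelimiterSketch {

    // Returns a copy of the metadata in which every literal occurrence of
    // oldDelimiter is replaced with newDelimiter, but only for the fields
    // named in targetFields. All other fields are copied unchanged.
    public static Map<String, String> replaceDelimiters(
            Map<String, String> metadata,
            Set<String> targetFields,
            String oldDelimiter,
            String newDelimiter) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String value = entry.getValue();
            if (targetFields.contains(entry.getKey())) {
                // String.replace treats both arguments as literal text, not as a regex
                value = value.replace(oldDelimiter, newDelimiter);
            }
            result.put(entry.getKey(), value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample metadata; field names and values are made up for this example
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("keywords", "search, filters, metadata");
        metadata.put("description", "Filters, explained");

        Map<String, String> updated =
                replaceDelimiters(metadata, Set.of("keywords"), ", ", "|");
        System.out.println(updated);
    }
}
```

In this sketch only the `keywords` field is targeted, so its value becomes `search|filters|metadata` while `description` keeps its comma. The real filter operates on the Jsoup representation of the document and is enabled through filter.jsoup.classes; see its own documentation for the supported configuration.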