There is now a new release of the Validator.nu HTML Parser. Change highlights:
- Adds optional support for heuristic encoding sniffing using the ICU4J sniffer, jchardet or both.
- Adds support for rewinding and reparsing when the parser becomes confident about the character encoding and the tentative encoding turns out to have been wrong.
- Performs encoding name matching per spec instead of using the JDK mechanism.
- Implements spec changes up until just before SVG and MathML support. (Those will merit 1.1 or something.)
- Warning: If you have your own token handler (unlikely), note that the semantics of the doctype token have changed.
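
To make the rewind-and-reparse item above more concrete, here is a rough sketch of the general idea in Java. It is not the parser's actual code or API: the class `ReparseSketch`, its methods, and the regex-based charset detection are all hypothetical and heavily simplified. It assumes the whole input is already buffered in memory, and it resolves encoding names with the JDK's `Charset` lookup for brevity, which is precisely the mechanism this release moves away from.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Validator.nu HTML Parser.
final class ReparseSketch {

    // Simplified stand-in for real <meta> prescanning: finds "charset=<name>"
    // anywhere in the decoded text. The name must start with a letter or digit
    // so that it is always a syntactically legal JDK charset name.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9_:.-]*)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared encoding name in lower case, or null if none is found.
    static String declaredCharset(String text) {
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    // Decodes with the tentative encoding first. If the document declares a
    // different, supported encoding, "rewinds" to the buffered bytes and
    // decodes them again with the declared encoding.
    static String decode(byte[] bytes, Charset tentative) {
        String firstPass = new String(bytes, tentative);
        String label = declaredCharset(firstPass);
        if (label == null || !Charset.isSupported(label)) {
            return firstPass; // nothing better known; keep the tentative result
        }
        Charset declared = Charset.forName(label);
        if (declared.equals(tentative)) {
            return firstPass; // the tentative guess was right; no reparse needed
        }
        return new String(bytes, declared); // tentative guess was wrong: reparse
    }
}
```

For example, feeding UTF-8 bytes containing `<meta charset=utf-8>` to `decode` with ISO-8859-1 as the tentative encoding triggers the second pass, so non-ASCII characters come out correctly instead of as mojibake. A real parser works on a stream and has to decide how much to buffer before committing, which this sketch ignores.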
This item was originally posted at: http://blog.whatwg.org and is licensed under the MIT license