June 2008
May 2008
Parser Generators
John Resig - Pure JavaScript HTML Parser
Is HTML5 parsing hard to implement? (I also contemplated porting the HTML 5 parser, wholesale, but that seemed like a herculean effort.)
April 2008
Ian Bicking: a blog :: Python HTML Parser Performance
A performance comparison of several parsers and document models. The situation is a little complex because there are different steps in handling HTML: 1. Parse the HTML. 2. Parse it into something (a document object). 3. Serialize it.
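The three steps quoted above can be sketched with the Python standard library alone. This is only an illustration, not the setup benchmarked in the linked post: the names `TreeBuilder`, `parse_html`, `serialize`, the `document` wrapper element and the `VOID_TAGS` set are all hypothetical, and the error recovery is far simpler than what a real HTML parser does.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumption: a small, incomplete set of elements that never take an end tag.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class TreeBuilder(HTMLParser):
    """Steps 1 and 2: tokenize the HTML and build a document object."""

    def __init__(self):
        super().__init__()
        # Hypothetical wrapper element so the tree always has a single root.
        self.root = ET.Element("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Bare attributes arrive with a value of None; store them as "".
        elem = ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})
        if tag not in VOID_TAGS:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text after a child element belongs to that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def parse_html(source):
    """Parse a string of HTML into an ElementTree element."""
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def serialize(root):
    """Step 3: turn the document object back into a string."""
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    doc = parse_html("<p>Hello <b>world</b>!</p>")
    print(serialize(doc))
```

Timing each function separately shows why such comparisons get complicated: a library can be fast at tokenizing but slow at building or serializing its document model.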
March 2008
Messages in a bottle » Blog Archive » Grune and Jacobs, Parsing Techniques, Second Edition
The second edition of Parsing Techniques: A Practical Guide, by Dick Grune and Ceriel J. H. Jacobs, has now appeared.
February 2008
PHP Simple HTML DOM Parser
January 2008
librdfa - a pure C RDFa parser from Manu Sporny on 2008-01-31 (public-rdf-in-xhtml-tf@w3.org from January 2008)
librdfa is a pure C implementation of a standards-compliant RDFa parser. The library is quite easy to use (there are only 5 functions). librdfa is stream-based, very small and quite fast.
December 2007
PHP: DOM - Manual
PHP: SimpleXML - Manual
Javascript RDF/Turtle Parser
Parses RDF/Turtle string and generates Javascript/JSON array of RDF triples, which is the same format as that of Jim Ley's Javascript RDF Parser.
ReportLab - PyRXP
pyRXP - the fastest XML parser? ReportLab are proud to present pyRXP version 0.9, the fastest validating XML parser available for Python, and quite possibly anywhere :-).
