In some cases, only part of a document needs to be parsed. One option is
to use load_structure/3
or one of its variations and extract the desired elements from the
returned structure. This is a clean solution, especially for small and
medium-sized documents. It is, however, unsuitable for really big
documents, because the complete structure must be built in memory. Such
documents can only be handled with the call-back output interface
realised by the
call(Event, Action) option of sgml_parse/2.
Event-driven processing, however, is not very natural in Prolog.
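
For comparison, the first option (load everything, then extract) can be sketched as follows. This is a minimal illustration, not part of the library: rdf_descriptions/2 and rdf_child/2 are hypothetical helper names, and the element is assumed to be named plain 'RDF', without a namespace prefix.

```prolog
:- use_module(library(sgml)).
:- use_module(library(lists)).

% rdf_descriptions(+Source, -Elements)
% Hypothetical helper: parse the whole Source into memory, then collect
% every element that is a direct child of an element named 'RDF'.
% Source is anything load_structure/3 accepts, e.g. a file name or
% stream(Stream).
rdf_descriptions(Source, Elements) :-
    load_structure(Source, DOM, [dialect(xml)]),
    findall(E, rdf_child(DOM, E), Elements).

% rdf_child(+Content, -Child)
% Enumerate on backtracking the element children of each 'RDF' element
% found anywhere in Content. Text nodes are skipped.
rdf_child(Content, Child) :-
    member(element(Tag, _, Children), Content),
    (   Tag == 'RDF'
    ->  member(Child, Children),
        Child = element(_, _, _)
    ;   rdf_child(Children, Child)
    ).
```

The whole document is held as one term before anything is extracted, which is what makes this approach impractical for very large inputs.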
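
The SGML2PL library allows for a mixed approach: call-backs locate the
elements of interest, and a nested call to sgml_parse/2 then reads each
such element as a Prolog term. Consider the case where we want to
process all descriptions from RDF elements in a document. The code below
calls the user-supplied process_rdf_description(Element)
on each element that is directly inside an RDF element.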
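
:- use_module(library(sgml)).

:- dynamic
    in_rdf/0.               % true while the parser is inside an RDF element

load_rdf(File) :-
    retractall(in_rdf),
    open(File, read, In),
    new_sgml_parser(Parser, []),
    set_sgml_parser(Parser, file(File)),
    set_sgml_parser(Parser, dialect(xml)),
    sgml_parse(Parser,
               [ source(In),
                 call(begin, on_begin),
                 call(end, on_end)
               ]),
    close(In),
    free_sgml_parser(Parser).   % release the parser when done

% Leaving the RDF element: stop collecting descriptions.
on_end('RDF', _) :-
    retractall(in_rdf).

% Entering the RDF element: start collecting descriptions.
on_begin('RDF', _, _) :-
    assert(in_rdf).
% Any element opened while inside RDF: read its content as a term by
% calling the parser recursively, then process the complete element.
on_begin(Tag, Attr, Parser) :-
    in_rdf, !,
    sgml_parse(Parser,
               [ document(Content),
                 parse(content)
               ]),
    process_rdf_description(element(Tag, Attr, Content)).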
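
The code above leaves process_rdf_description/1 to the application. To try it out, a placeholder definition such as the one below may be enough. It is only an assumption about what a handler could look like: it reports the tag and the number of content nodes of each element it receives.

```prolog
% Hypothetical handler for illustration only: print the tag of the
% element and how many content nodes (text or child elements) it has.
process_rdf_description(element(Tag, _Attrs, Content)) :-
    length(Content, N),
    format("~w: ~d content node(s)~n", [Tag, N]).
```

With this definition loaded next to load_rdf/1, a query such as ?- load_rdf('data.rdf'). (where data.rdf is a placeholder file name) should print one line per element found directly inside the RDF element.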