Flushes a twig up to (and including) the current element, then deletes all
unnecessary elements from the tree that's kept in memory. flush keeps track
of which elements need to be open/closed, so if you flush from handlers you
don't have to worry about anything. Just keep flushing the twig every time
you're done with a sub-tree and it will come out well-formed. Once the
parse is complete, don't forget to flush one more time to print the end of
the document. The doctype and entity declarations are also printed.
OPTIONAL_OPTIONS
Update_DTD: use this option if you have updated the (internal) DTD and/or
the entity list and you want the updated DTD to be output.
Example:
    $t->flush( Update_DTD => 1);
    $t->flush( \*FILE, Update_DTD => 1);
    $t->flush( \*FILE);
flush takes an optional filehandle as an argument.
- print OPTIONAL_FILEHANDLE OPTIONAL_OPTIONS
-
Prints the whole document associated with the twig. To be used only AFTER
the parse.
OPTIONAL_OPTIONS: see flush.
- print_prolog OPTIONAL_FILEHANDLE OPTIONAL_OPTIONS
-
Prints the prolog (XML declaration + DTD + entity declarations) of a
document.
OPTIONAL_OPTIONS: see flush.
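As a rough illustration of print being used only after the parse, the sketch below parses a small document held in a string and prints it to a filehandle. The sample document and the in-memory filehandle are invented for the example, and the parse method is assumed to be the one inherited from XML::Parser.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document, not taken from the XML::Twig distribution.
my $xml = '<doc><p>some text</p></doc>';

my $t = XML::Twig->new;
$t->parse($xml);    # once this returns, the whole tree is in memory

# Print only after the parse; here the output goes to an in-memory filehandle.
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
$t->print($out);
close $out;

print $buffer, "\n";
```

Any real filehandle (for example \*STDOUT or \*FILE) can be passed in the same position.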
- new
-
Should be private.
- set_gi ($gi)
-
Sets the gi of an element
- gi
-
Returns the gi of the element
- closed
-
Returns true if the element has been closed. Might be useful if you are
somewhere in the tree, during the parse, and have no idea whether a parent
element is completely loaded or not.
- set_pcdata ($text)
-
Sets the text of a #PCDATA element. Returns the text or undef if the
element was not a #PCDATA.
- pcdata
-
Returns the text of a #PCDATA element or undef
- root
-
Returns the root of the twig containing the element
- twig
-
Returns the twig containing the element.
- parent ($optional_gi)
-
Returns the parent of the element, or the first ancestor whose gi is $gi.
- first_child ($optional_gi)
-
Returns the first child of the element, or the first child whose gi is $gi
(i.e. the first of the element's children whose gi matches).
- last_child ($optional_gi)
-
Returns the last child of the element, or the last child whose gi is $gi
(i.e. the last of the element's children whose gi matches).
- prev_sibling ($optional_gi)
-
Returns the previous sibling of the element, or the first one whose gi is
$gi.
- next_sibling ($optional_gi)
-
Returns the next sibling of the element, or the first one whose gi is $gi.
- atts
-
Returns a hash ref containing the element attributes
- set_atts ({att1 => $att1_val, att2 => $att2_val, ... })
-
Sets the element attributes with the hash supplied as argument
- del_atts
-
Deletes all the element attributes.
- set_att ($att, $att_value)
-
Sets the attribute of the element to a value
- att ($att)
-
Returns the attribute value
- del_att ($att)
-
Deletes the attribute for the element
- set_id ($id)
-
Sets the id attribute of the element to a value. See elt_id to change the
id attribute name.
- id
-
Gets the id attribute value
- children ($optional_gi)
-
Returns the list of children (optionally whose gi is $gi) of the element
- ancestors ($optional_gi)
-
Returns the list of ancestors (optionally whose gi is $gi) of the element
- next_elt ($optional_gi)
-
Returns the next elt (optionally whose gi is $gi) of the element. This is
defined as the next element which opens after the current element opens,
which usually means the first child of the element. Counter-intuitive as it
might look, this allows you to loop through the whole document by starting
from the root.
- prev_elt ($optional_gi)
-
Returns the previous elt (optionally whose gi is $gi) of the element. This
is the first element which opens before the current one. So it's usually
either the last descendant of the previous sibling or simply the parent.
- level ($optional_gi)
-
Returns the depth of the element in the tree (root is 1). If the optional
gi is given then only ancestors of the given type are counted.
- in ($potential_parent)
-
Returns true if the element is in the potential_parent
- in_context ($gi, $optional_level)
-
Returns true if the element is included in an element whose gi is $gi,
within $optional_level levels.
- cut
-
Cuts the element from the tree.
- paste ($optional_position, $ref)
-
Pastes a (previously cut) element. The optional position argument can be
- - first_child (default)
-
The element is pasted as the first child of the $ref element
- - last_child
-
The element is pasted as the last child of the $ref element
- - before
-
The element is pasted before the $ref element, as its previous sibling
- - after
-
The element is pasted after the $ref element, as its next sibling
- erase
-
Erases the element: the element is deleted and all of its children are
pasted in its place.
- delete
-
Cuts the element and frees its memory
- DESTROY
-
Frees the element from memory
- start_tag
-
Returns the string for the start tag for the element, including the />
at the end of an empty element tag
- end_tag
-
Returns the string for the end tag of an element, empty for an empty one.
- print OPTIONAL_FILEHANDLE
-
Prints an entire element, including the tags, optionally to a FILEHANDLE
- sprint
-
Returns the string for an entire element, including the tags. To be used
with caution!
- text
-
Returns a string consisting of all the PCDATA in an element, without the
tagging
- set_text ($string)
-
Sets the text for the element: if the element is a PCDATA, just set its
text, otherwise cut all the children of the element and create a single
PCDATA child for it, which holds the text
- set_content (@list_of_elt_and_strings)
-
Sets the content for the element, from a list of strings and elements.
Cuts all the element children, then pastes the list elements, creating a
PCDATA element for strings.
- private methods
-
- close
-
- set_parent ( $parent)
-
- set_first_child ( $first_child)
-
- set_last_child ( $last_child)
-
- set_prev_sibling ( $prev_sibling)
-
- set_next_sibling ( $next_sibling)
-
- flushed
-
- flush
-
These methods should not be used, unless of course you find some creative
and interesting, not to mention useful, ways to do it.
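The element methods above can be combined along the following lines. This is only a sketch: the sample document is made up, the twig is built from a string with the parse method inherited from XML::Parser, and the root is assumed to be reachable through $t->root.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document used only for this illustration.
my $xml = '<doc><title>T</title><sect><p>one</p><p>two</p></sect></doc>';

my $t = XML::Twig->new;
$t->parse($xml);

# next_elt follows opening order, so starting at the root visits everything.
my @gis;
for (my $elt = $t->root; $elt; $elt = $elt->next_elt) {
    push @gis, $elt->gi;
}
print join(' ', @gis), "\n";

# Move the first <p> of <sect> to the end of <sect>.
my $sect = $t->root->first_child('sect');
my $p    = $sect->first_child('p');
$p->cut;
$p->paste('last_child', $sect);

# Attributes and text of the rearranged children.
$sect->set_att(status => 'reordered');
my @texts = map { $_->text } $sect->children('p');
print join(', ', @texts), "\n";
print $sect->att('status'), "\n";
```

Note that the loop also visits the PCDATA elements, so filter on gi (or pass a gi to next_elt) if only some elements are of interest.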
- new
-
Creates an entity list
- add ($ent)
-
Adds an entity to an entity list.
- delete ($ent or $gi).
-
Deletes an entity (defined by its name or by the Entity object) from the
list.
- print (OPTIONAL_FILEHANDLE)
-
Prints the entity list
- new ($name, $val, $sysid, $pubid, $ndata)
-
Same arguments as the Entity handler for XML::Parser
- print (OPTIONAL_FILEHANDLE)
-
Prints an entity declaration
- text
-
Returns the entity declaration text
See the test file in XML-Twig-1.6/t/test[1-n].t
To figure out what flush does, call the following script with an XML file
and an element name as arguments:
    use XML::Twig;

    my ($file, $elt) = @ARGV;
    my $t = new XML::Twig(
        TwigHandlers => {
            # flush every time an element named $elt is closed
            $elt => sub { $_[0]->flush; print "\n[flushed here]\n"; },
        },
    );
    $t->parsefile( $file, ErrorContext => 2);
    $t->flush;      # final flush prints the end of the document
    print "\n";
3 possibilities here
- No DTD
-
No doctype, no DTD information, no entity information, the world is
simple...
- Internal DTD
-
The XML document includes an internal DTD, and maybe entity declarations.
If you use the TwigLoadDTD option when creating the twig, the DTD
information and the entity declarations can be accessed.
The DTD and the entity declarations will be flush'ed (or print'ed) either
as is (if they have not been modified) or as reconstructed (poorly: comments
are lost, order is not kept; due to its content this DTD should not be
viewed by anyone) if they have been modified. You can also modify them
directly by changing the $twig->{twig_doctype}->{internal} field
(straight from XML::Parser, see the Doctype handler doc).
- External DTD
-
The XML document includes a reference to an external DTD, and maybe entity
declarations.
If you use the TwigLoadDTD when creating the twig the DTD information and
the entity declarations can be accessed. The entity declarations will be
flush'ed (or print'ed) either as is (if they have not been modified) or as
reconstructed (badly: comments are lost, order is not kept).
You can change the doctype through the $twig->set_doctype method and
print the dtd through the $twig->dtd_text or $twig->dtd_print
methods.
If you need to modify the entity list this is probably the easiest way to
do it.
If an element contains ONLY whitespaces (as in the regexp \s), then
XML::Twig does not generate a PCDATA child for this element.
This can bite you if you are interested in the white spaces included in
some elements. This could be improved in a future version, through a
general option that processes all spaces, or by giving a list of elements
for which whitespaces are to be processed. Let me know what your
requirements are!
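A small check along these lines can show whether whitespace-only content produced a PCDATA child. It is a sketch under assumptions: the sample document is invented, the twig is built from a string with the parse method inherited from XML::Parser, and whitespace handling may differ between XML::Twig versions.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document: <blank> holds only whitespace, <full> holds text.
my $xml = "<doc><blank>  \n  </blank><full> some text </full></doc>";

my $t = XML::Twig->new;
$t->parse($xml);

my $blank = $t->root->first_child('blank');
my $full  = $t->root->first_child('full');

# first_child returns nothing when the element has no children at all.
my $blank_has_child = defined $blank->first_child ? 1 : 0;
my $full_has_child  = defined $full->first_child  ? 1 : 0;

print "blank has a child: $blank_has_child\n";
print "full has a child:  $full_has_child\n";
```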
If you set handlers and use flush, do not forget to flush the twig one last
time AFTER the parsing, or you might be missing the end of the document.
Remember that element handlers are called when the element is CLOSED, so if
you have handlers for nested elements the inner handlers will be called
first. This makes it trickier than it would seem to, for example, number
nested clauses.
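The calling order can be observed with a sketch like the one below. The clause element, its name attribute and the num attribute are hypothetical, and numbering after the parse with next_elt is just one possible workaround, not necessarily the recommended one.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical document with one clause nested inside another.
my $xml = '<doc><clause name="outer"><clause name="inner"/></clause></doc>';

# Record the order in which the clause handler is called.
my @order;
my $t = XML::Twig->new(
    TwigHandlers => {
        clause => sub {
            my ($twig, $elt) = @_;
            push @order, $elt->att('name');
        },
    },
);
$t->parse($xml);

# The inner clause is closed first, so its handler runs first.
print join(', ', @order), "\n";

# Numbering in document (opening) order is easier once the parse is over.
my $n = 0;
for (my $c = $t->root->next_elt('clause'); $c; $c = $c->next_elt('clause')) {
    $c->set_att(num => ++$n);
}
my $outer = $t->root->first_child('clause');
print $outer->att('name'), ' is clause number ', $outer->att('num'), "\n";
```

This only works if the clauses are still in memory after the parse, so it does not combine with flushing them from the handler.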
- - ID list
-
The ID list is NOT updated at the moment when ID's are modified or elements
cut or deleted.
- - change_gi
-
Does not work if you do: $twig->change_gi( $old1, $new);
$twig->change_gi( $old2, $new); $twig->change_gi( $new, $even_newer);
- - sanity check on XML::Parser method calls
-
XML::Twig should really prevent calls to some XML::Parser methods,
especially the setHandlers one.
- - Notation declarations
-
Are not output (in fact they are completely ignored).
- - multiple twigs are not well supported
-
A number of twig features are just global at the moment. These include the
ID list and the ``gi pool'' (if you use change_gi then you change the gi
for ALL twigs).
The next version will try to support these while trying not to be too hard
on performance (at least when a single twig is used!).
- - XML::Parser-like handlers
-
Sometimes it would be nice to be able to use both XML::Twig handlers and
XML::Parser handlers, for example to perform generic tasks on all open
tags, like adding an ID, or taking care of the autonumbering.
Next version...
- - create an element (not a twig) from a string.
-
You can use the benchmark
file to do additional benchmarks. Please send me benchmark information for
additional systems.
Michel Rodriguez
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
Bug reports and comments to m.v.rodriguez@ieee.org.
XML