EzXML

XML/HTML handling tools for primates

EzXML.jl - XML/HTML tools for primates

This package is still beta quality; the APIs may change in the future.

EzXML.jl is a package to handle XML/HTML documents for primates.

The main features are:

  • Reading and writing XML/HTML documents.
  • Traversing XML/HTML trees with DOM interfaces.
  • Searching elements using XPath.
  • Proper namespace handling.
  • Capturing error messages.
  • Automatic memory management.
  • Document validation.
  • Streaming parsing for large XML files.

Installation

You have to install libxml2 first.

For Debian/Ubuntu users:

apt-get install libxml2

For Homebrew users:

brew install libxml2

Then install EzXML.jl:

julia -e 'Pkg.add("EzXML")'

Usage

using EzXML

# Parse an XML string
# (use `readxml(<filename>)` to read a document from a file).
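doc = parsexml("""
<primates>
    <genus name="Homo">
        <species name="sapiens">Human</species>
    </genus>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>
""")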

# Get the root element from `doc`.
primates = root(doc)

# Iterate over child elements.
for genus in eachelement(primates)
    # Get an attribute value by name.
    genus_name = genus["name"]
    println("- ", genus_name)
    for species in eachelement(genus)
        # Get the content within an element.
        species_name = nodecontent(species)
        println("  └ ", species["name"], " (", species_name, ")")
    end
end
println()

# Find text nodes using an XPath query.
for species_name in nodecontent.(find(primates, "//species/text()"))
    println("- ", species_name)
end

Quick reference

See the reference page or docstrings for more details.

Types:

  • EzXML.Document: an XML/HTML document
  • EzXML.Node: an XML/HTML node including elements, attributes, texts, etc.
  • EzXML.XMLError: an error raised by libxml2
  • EzXML.StreamReader: a streaming XML reader

IO:

  • From file: read(EzXML.Document, filename), readxml(filename|stream), readhtml(filename|stream)
  • From string or byte array: parse(EzXML.Document, string), parsexml(string), parsehtml(string)
  • To file: write(filename, doc)
  • To stream: print(io, doc)
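The IO functions above can be combined into a round trip. The following is a minimal sketch, assuming an in-memory IOBuffer stands in for a real stream; the greeting element and its lang attribute are made up for illustration.

```julia
using EzXML

# Parse a small document from a string (the element name is illustrative).
doc = parsexml("<greeting lang=\"en\">hello</greeting>")

# Serialize the document to an in-memory stream with print(io, doc).
io = IOBuffer()
print(io, doc)
xml_text = String(take!(io))

# Parse the serialized text again and read back the root element.
doc2 = parsexml(xml_text)
root2 = root(doc2)
roundtrip_content = nodecontent(root2)
```

Writing to disk should follow the same pattern with write(filename, doc) in place of print(io, doc).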

Accessors:

  • Node information: nodetype(node), nodepath(node), nodename(node), nodecontent(node), setnodename!(node, name), setnodecontent!(node, content)
  • Document: root(doc), dtd(doc), hasroot(doc), hasdtd(doc), setroot!(doc, element_node), setdtd!(doc, dtd_node)
  • Attributes: node[name], node[name] = value, haskey(node, name), delete!(node, name)
  • Node predicate:
    • Document: hasdocument(node)
    • Parent: hasparentnode(node), hasparentelement(node)
    • Child: hasnode(node), haselement(node)
    • Sibling: hasnextnode(node), hasprevnode(node), hasnextelement(node), hasprevelement(node)
    • Node type: iselement(node), isattribute(node), istext(node), iscdata(node), iscomment(node), isdtd(node)
  • Tree traversal:
    • Document: document(node)
    • Parent: parentnode(node), parentelement(node)
    • Child: firstnode(node), lastnode(node), firstelement(node), lastelement(node)
    • Sibling: nextnode(node), prevnode(node), nextelement(node), prevelement(node)
  • Tree modifiers:
    • Link: link!(parent_node, child_node), linknext!(target_node, node), linkprev!(target_node, node)
    • Unlink: unlink!(node)
    • Create: addelement!(parent_node, name, [content])
  • Iterators:
    • Iterator: eachnode(node), eachelement(node), eachattribute(node)
    • Vector: nodes(node), elements(node), attributes(node)
  • Counters: countnodes(node), countelements(node), countattributes(node)
  • Namespaces: namespace(node), namespaces(node)
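A few of the accessors listed above are sketched below on a small, made-up document. The list and item element names and the id attribute are assumptions for illustration only.

```julia
using EzXML

doc = parsexml("""
<list>
    <item id="a">first</item>
    <item id="b">second</item>
</list>
""")
list = root(doc)

# Node information and counters.
root_name = nodename(list)
n_items = countelements(list)

# Attributes are read and written like dictionary entries.
first_item = firstelement(list)
first_id = first_item["id"]
first_item["id"] = "z"
has_id = haskey(first_item, "id")

# The *element variants skip the whitespace text nodes between elements.
second_item = nextelement(first_item)
second_text = nodecontent(second_item)

# Append a new child element with text content.
addelement!(list, "item", "third")
new_count = countelements(list)
```

The node variants (firstnode, nextnode, countnodes) also visit text nodes, including the whitespace between elements, so the element variants are usually the more convenient choice for documents like this one.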

Constructors:

  • EzXML.Document type: XMLDocument(version="1.0"), HTMLDocument(uri=nothing, externalID=nothing)
  • EzXML.Node type: XMLDocumentNode(version="1.0"), HTMLDocumentNode(uri, externalID), ElementNode(name), TextNode(content), CommentNode(content), CDataNode(content), AttributeNode(name, value), DTDNode(name, [systemID, [externalID]])
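The constructors can be combined with link! and setroot! to build a document from scratch. This is a sketch under the assumption that nodes are linked into the tree before their attributes are set; the element names mirror the primates example from the Usage section.

```julia
using EzXML

# Create an empty XML document and attach a root element.
doc = XMLDocument()
primates = ElementNode("primates")
setroot!(doc, primates)

# Create a child element, link it under the root, then set an attribute.
genus = ElementNode("genus")
link!(primates, genus)
genus["name"] = "Homo"

# Create a leaf element holding a text node.
species = ElementNode("species")
link!(genus, species)
link!(species, TextNode("Human"))

# Dump the resulting document to standard output.
print(stdout, doc)
```

For the common case of an element with text content, addelement!(parent_node, name, content) does the create-and-link steps in one call.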

Queries:

  • XPath: find(doc|node, xpath), findfirst(doc|node, xpath), findlast(doc|node, xpath)

Examples

Other XML/HTML packages in Julia

First Commit

11/02/2016

Commits

195 commits