xmlstarlet usage notes

version 2023-08-15 | news

Introduction

xmlstarlet (xmlstar.sf.net) is the no-nonsense XML multitool that lets you write simple queries or edits on the command line, avoids most of the stylesheet formal stuff, and gives your <o/><o/>-looking eyes a moment of relief. It’s also a cranky minimalist tool which targets the 1.0 versions of XPath / XSLT / EXSLT still widely used and, it seems, users with little need for documentation.

This is an edited version of my personal notes on xmlstarlet with worked examples – knowledge gained as an outsider through use, trial and error – focusing on the select and edit commands and EXSLT. It’s not a tutorial or a FAQ; it requires a grasp of XML tools and the POSIX shell. Copyright is retained.

xmlstarlet features

Notation used in this document

-q (--quiet) means either short option -q or long option --quiet can be used.

«name» in a command or message is a placeholder for the actual name used, e.g. xmlXPathCompOpEval: function «name» not found.

Code looks like this: test -s file.xml || log …, occasionally with an (ellipsis) inside for brevity. For readability longer commands are usually split across lines and indented.

Admonitions look like this: Caution.

All shell code samples were made for a POSIX shell (dash 0.5.10) with xmlstarlet 1.6.1 (linked with libxml2 20913, libxslt 10134, and libexslt 820) from the Debian distribution.

A word about stylesheet syntax

[T]he use of SGML syntax for stylesheets was proposed as long ago as 1994, and it seems that this idea gradually became the accepted wisdom. It’s difficult to trace exactly what the overriding arguments were, and when you find yourself writing something like:

    <xsl:variable name="y">
      <xsl:call-template name="f">
        <xsl:with-param name="x"/>
      </xsl:call-template>
    </xsl:variable>

to express what in other languages would be written as y = f(x);, then you may find yourself wondering how such a decision came to be made.

– Michael Kay, XSLT Programmer’s Reference, Ch.1, ISBN 1861005067

Documentation

User’s guide, examples, source code, forums

Selected resources

List the generated XSLT: xmlstarlet select -C

select’s -C (--comp) option lists the stylesheet the current command line will generate – it requires no input file – e.g. 

xmlstarlet select -T -C -t -m 'str:tokenize("Hello, world",",o")' -v '.' -n

Output:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:str="http://exslt.org/strings" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt str">
  <xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="str:tokenize(&quot;Hello, world&quot;,&quot;,o&quot;)">
      <xsl:call-template name="value-of-template">
        <xsl:with-param name="select" select="."/>
      </xsl:call-template>
      <xsl:value-of select="'&#10;'"/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template name="value-of-template">
    <xsl:param name="select"/>
    <xsl:value-of select="$select"/>
    <xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
      <xsl:value-of select="'&#10;'"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

xmlstarlet commands

General notes

Caution: “Other features need some work too”, warns the user’s guide.

Perhaps they’re thinking of these:

Special characters

The special XML characters are &<>'" or – as references to predefined general entities – &amp; &lt; &gt; &apos; &quot;. With few exceptions[1] they are never entered as entity references in an xmlstarlet command, and they are output as literals when xmlstarlet select’s -T (--text) option is in effect.

For example, in XPath predicates xmlstarlet wants < (less-than) on the command line, as in factor[. < 2.19], whereas XSLT stylesheets require &lt; inside attribute values, or > (greater-than) with operands swapped. Likewise, a numeric character reference such as &#9; for a tab character belongs in an XML file, not on the xmlstarlet command line.

[1] Exceptions

The predefined entity ref:s – as well as character ref:s below &#x100; – are recognized in the following which therefore require &amp; to represent an & (ampersand) character:

Caution: The example in the user’s guide section 4.1 meant to convert newlines to blanks using a character reference for the newline – xml sel … -v "translate(. , '&#10;', ' ')" … – in fact converts & (ampersand) characters to blanks, and strips #, 1, 0, and ; characters. The translate(…) expression would work as intended in an XSLT stylesheet but means something rather different on the xmlstarlet select command line.

See also: xmlstarlet esc / xmlstarlet unesc | Replace text sample

xmlstarlet edit handling special characters:

printf '%s' '<e/>'|
xmlstarlet edit -O \
  -s     '*' -t elem -n esuv -v '' -u '$prev' -v '&Save as <.oona>' \
  -a '$prev' -t elem -n eaux -v '' -u '$prev' -x '"&Save as <.oona>"' \
  -s     '*' -t elem -n esv1 -v '&amp;Save as <.oona>' \
  -a '$prev' -t elem -n eav1 -v '&amp;Save&#x20;&#x61;&#x73;&#x20;&lt;.oona&gt;' \
  -i '$prev' -t elem -n eiv1 -v "$(xmlstarlet escape '&Save as <.oona>')"

Output:

<e>
  <esuv>&amp;Save as &lt;.oona&gt;</esuv>
  <eaux>&amp;Save as &lt;.oona&gt;</eaux>
  <esv1>&amp;Save as &lt;.oona&gt;</esv1>
  <eiv1>&amp;Save as &lt;.oona&gt;</eiv1>
  <eav1>&amp;Save as &lt;.oona&gt;</eav1>
</e>

Using variables with xmlstarlet select, e.g. 

Using variables with xmlstarlet edit, e.g.

See also: select --var | edit --var

XML declaration

The various xmlstarlet commands each handle the XML declaration in their own way but all print it with a trailing newline if requested:

Adding an XML declaration using select:

$ printf '<v w="x"/>' |
  xmlstarlet select -D -E 'ISO-8859-2' -t -c '/'
<?xml version="1.0" encoding="ISO-8859-2"?>
<v w="x"/>

XML parsing and serialization

Several xmlstarlet commands allow selected options to be passed to the libxml2 XML parser (cf. API reference and source code) or the libxml2 XML serializer (cf. API reference and source code).

format’s -H (--html) and transform’s --html options substitute the libxml2 HTML 4.0 parser.

The c14n command converts an XML document to canonical form (Canonical XML, C14N).

To expand empty-element tags, changing <p/> to <p></p>, for example:

xmlstarlet edit --pf -s '//*[not(node())]' -t text -n ignored -v '' file.xml

See also: network access | Try out edit’s formatting options example

The xmlEscapeEntities function in libxml2’s xmlsave.c serialization module gives special treatment to characters &<> (output as &amp;, &lt;, and &gt;) but neither apostrophe nor double quote ('"). xmlstarlet has no option to override this.

Using a CDATA section to keep the serializer from applying default rules:

$ printf '%s\n' '<v><w>x</w><x>🧩</x></v>' |
  xmlstarlet edit -O -P -d '*/w'
<v><x>&#x1F9E9;</x></v>
$ :
$ printf '%s\n' '<v><w>x</w><x><![CDATA[🧩]]></x></v>' |
  xmlstarlet edit -O -P -d '*/w'
<v><x><![CDATA[🧩]]></x></v>

XML external entities

Caution: Of xmlstarlet’s commands only c14n, select, and transform seem to understand an entity reference like <doc>&e;</doc>, according to the following test script. This makes pre/post-processing a requirement if using xmlstarlet’s other commands to handle external entities.

#!/bin/sh
# Test xmlstarlet commands with external general parsed entity.
# - ${xdata} holds data file contents, defaults to a few <e>N</e>
# - ${keepf} non-empty to keep temporary files in $TMPDIR
# - ${dryrun} non-empty to print but not execute commands
# - ${doecho} non-empty to also print commands before executing

skelf=$(mktemp -t "xsskel-$$-XXXXXXXXXX.xml")
dataf=$(mktemp -t "xsdata-$$-XXXXXXXXXX.xml")
idxff=$(mktemp -t "xsidxf-$$-XXXXXXXXXX.xsl")
test "${keepf}" ||
    trap "rm '${skelf}' '${dataf}' '${idxff}'" INT EXIT

printf '%s\n' \
  '<!DOCTYPE skel [<!ENTITY e SYSTEM "'"${dataf}"'">]><doc>&e;</doc>' \
> "${skelf}"
printf '%s' \
  "${xdata:-<e>1</e><e>2</e><e>3</e><e>4</e>}" \
> "${dataf}"
printf '<v/>' | xmlstarlet select -t \
 -e xsl:transform -a version -o 1.0 -b \
 -e xsl:template -a match -o '@*|node()' -b \
   -e xsl:copy -e xsl:apply-templates -a select -o '@*|node()' \
> "${idxff}"    ## identity transform

for cmd  in  c14n ed el fo pyx sel tr val
do
  case ${cmd} in
    (c14n|el|fo|pyx|val)
            set -- ;;
    (ed)    set -- -d '*/*[3]' ;;
    (sel)   set -- -T -t -c / -n ;;
    (tr)    set -- "${idxff}" ;;
    (*)     break ;;
  esac
  set -- xmlstarlet "${cmd}" "$@" "${skelf}"
  if test "${dryrun}${doecho}"; then
    printf '\n\n# command:'; printf " '%s'" "$@"; printf '\n'
  fi
  if ! test "${dryrun}"; then
    "$@"; printf '\n## %s returned %d\n\n' "${cmd}" "$?"
  fi
done

Given a data file containing <e>1</e><e>2</e><e>3</e><e>4</e> (making it well-formed XML) pyx returns 4 (outputs the doctype but says Entity 'e' not defined) while c14n, ed, el, fo, sel, tr, and val all return zero. But ed, el, and fo (plus val, presumably) fail to expand the entity reference.

Given a data file containing <a>B</c> (clearly making it non-XML) the ed, el, and val commands all return zero – and val even pronouncing «datafile» - valid – while c14n, fo, pyx, sel, and tr return 3, 2, 4, 3, and 6, respectively.

xmllint from the libxml2-utils package has a --noent option to substitute entity values for entity references (e.g. xmllint --noent --dropdtd file.xml).

Exit values

src/xmlstar.h defines the following exit values for xmlstarlet:

but mind these:

Numeric representation

XPath 1.0 does not support numbers expressed in scientific notation, cf. W3C recommendation (Number ::= Digits ('.' Digits?)? | '.' Digits and Digits ::= [0-9]+).

Tools based on libxml2 do support it, however, cf.  xmlXPathFormatNumber() (snprintf(work, sizeof(work),"%*.*e", integer_place, fraction_place, number);).

Here are a few examples of libxml2 handling XPath computations – and libxslt handling the XSLT format-number() function.

printf '%s\n' '<v>1240057409536</v>' |
xmlstarlet select -T -t \
  -v '*' -n \
  -v '0 + *' -n \
  -v '* div 1' -n \
  -v '* div 1000 * 1E3' -n \
  -v '* div 1.240057409536e+12' -n \
  -o '---' -n \
  -v 'round(* div 1)' -n \
  -v 'round(* div 10)' -n \
  -v 'round(* div 100)' -n \
  -v 'round(* div 1000)' -n \
  -o '---' -n \
  -v 'format-number(* div 1,"#")' -n \
  -v 'format-number(* div 1,"#,###")' -n

Output:

1240057409536
1.240057409536e+12
1.240057409536e+12
1.240057409536e+12
1
---
1.240057409536e+12
1.24005740954e+11
1.2400574095e+10
1240057410
---
1240057409536
1,240,057,409,536

One-lining

In this document longer commands are usually split across lines and indented, like this:

xmlstarlet select -T -t \
  --var sq -o "'" -b \
  -o 'xmlstarlet edit --omit-decl '\\ -n \
  -o "  --var N 'Names/Name' \\" -n \
  -m '*/*' \
    -o '  -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
    -o '  -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n \
  -b \
  -f -n \
"${infile:-file.xml}"

To strip such a command of line continuation characters and leading whitespace, pipe it through the following sed command (changing one line, not an entire shell script),

sed -e ':1' -e 's/^[[:blank:]]*//' -e '/\\$/!b' -e '$b' -e 'N' -e 's/\\\n[[:blank:]]*//' -e 'b1'

or, as an alias, silently using xsel to paste from the clipboard, call sed, have paste add a trailing newline if needed, and return the result to the clipboard:

alias mfyoi="xsel -b -o |
sed -e 's/^[[:blank:]]*//' -e ':1' -e '/\\\\\$/!b' \
    -e '\$b' -e 'N' -e 's/\\\\\\n[[:blank:]]*//' -e 'b1' |
paste -s -d '\\n' |
xsel -b -i"

Thus minified:

xmlstarlet select -T -t --var sq -o "'" -b -o 'xmlstarlet edit --omit-decl '\\ -n -o "  --var N 'Names/Name' \\" -n -m '*/*' -o '  -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' -o '  -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n -b -f -n "${infile:-file.xml}"
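The sed script can be tried without the clipboard on a toy command. The three input lines below are made up for this sketch; the sed invocation is the one given above.

```shell
# Feed a hypothetical three-line command to the sed script.
# Each input line but the last ends in a backslash (line continuation).
printf '%s\n' \
  'xmlstarlet select -T -t \' \
  '  -m "*/*" \' \
  '    -v "name()" -n' |
sed -e ':1' -e 's/^[[:blank:]]*//' -e '/\\$/!b' -e '$b' -e 'N' -e 's/\\\n[[:blank:]]*//' -e 'b1'
# prints: xmlstarlet select -T -t -m "*/*" -v "name()" -n
```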

makefile notes (GNU Make)

Links: GNU Make manual | Ask Mr. Make article on GNU Make escaping

Sample makefile:

SHELL := /bin/sh
space := $(info) $(info)
tab   := $(shell printf '\t')
define newline =


endef
# next line defines U+0023 NUMBER SIGN (aka \043, pound sign, hashtag, …)
\H    := \#
.RECIPEPREFIX = >
.PHONY: all
all:
> printf '%s' '<v a="fee" b="fi" c="fo" d="fum"/>' | \
  xmlstarlet select -T -t --var x='*/@*' -v '$$x' -n | \
  paste -s -d '$$ ' -
> printf '%s\n' '$(space)x$(tab)\$(newline)'"$${OLDPWD$(\H)$(\H)*/}" \
  "process $$$$ exiting"

Output from make -s:

fee$fi fo$fum
 x	\
incubator
process 20965 exiting

Global options and parameters

Global options go before the command, as in xmlstarlet -q format file.

An input filename starting with - (dash) – unless it’s short for stdin – must be prefixed with ./ (dot slash) otherwise it will be parsed as an option, possibly causing select (Caution) to ignore the file.

Beware of known bugs for filenames containing ' (single quote) (#123) or URL-encoded characters, e.g. %20 (#110).

See also: couldn’t read file | failed to load external entity

--help

xmlstarlet --help shows the general usage reminder, xmlstarlet «command» -h (--help) the command-specific ditto.

--version

Prints version information and terminates.

Sample output from xmlstarlet --version:

1.6.1
compiled against libxml2 2.9.4, linked with 20910
compiled against libxslt 1.1.33, linked with 10134

-q (--quiet): suppress error output

Error messages from libxml2 or libxslt are suppressed by this option.

Caution: this option also suppresses ordinary output (to stdout) from xmlstarlet select.

See also: select -Q (--quiet) local option | format -Q (--quiet) local option

--no-doc-namespace: don’t use namespace bindings from input’s root element

--doc-namespace: extract namespace bindings from input’s root element (default)

By default (--doc-namespace being in effect) namespaces declared in input’s root element (aka document element) can be referred to without explicit -N options; if the default namespace is declared there it is bound to the _ (underscore) (aka DEFAULT) prefix.

Although --no-doc-namespace and --doc-namespace are global options only xmlstarlet select and xmlstarlet edit use them.

See also: User’s guide ch. 5 | Use a namespace | select -N | edit -N

Network access (--net)

Several xmlstarlet commands - select, edit, format, c14n, validate, and transform - have a --net option to allow network access, to fetch remote DTDs and entities. --net clears the XML_PARSE_NONET flag for the libxml2 XML parser (API ref).

For security, network access is disallowed by default, cf. article on XML external entity attack.

uri replacing input file

xmlstarlet --help says,

Wherever file name mentioned in command help it is assumed that URL can be used instead as well.

Should work with HTTP and FTP protocols, not HTTPS (due to libxml2 limitations). (Distribution-dependent?)

See also: --net

Display structure: xmlstarlet elements

xmlstarlet elements (aka el) displays the structure of an XML document by listing the paths of elements and optionally attributes and attribute values.

Usage: elements [option] [«xml-file»]

At most one option and one input file are accepted.

Local options

-a - include attributes

-v - include attribute values

-u - sorted unique lines

-dN - sorted unique lines to depth N

Examples

Sample session

$ : "${infile=recently-used.xbel}"
$ :
$ xmlstarlet elements -u "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
xbel/bookmark/info/metadata
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/mime:mime-type
$ :
$ xmlstarlet el -d3 "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
$ :
$ # Skip repetitions
$ xmlstarlet el -a "${infile}" | awk '!seen[$1]++' | head -n 10
xbel
xbel/@xmlns:bookmark
xbel/@xmlns:mime
xbel/@version
xbel/bookmark
xbel/bookmark/@href
xbel/bookmark/@added
xbel/bookmark/@modified
xbel/bookmark/@visited
xbel/bookmark/info
$ :
$ xmlstarlet el -v "${infile}" | sed '2d;9q'
xbel[@xmlns:bookmark='http://www.freedesktop.org/standards/desktop-bookmarks' and @xmlns:mime='http://www.freedesktop.org/standards/shared-mime-info' and @version='1.0']
xbel/bookmark/info
xbel/bookmark/info/metadata[@owner='http://freedesktop.org']
xbel/bookmark/info/metadata/mime:mime-type[@type='image/jpeg']
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application[@name='Image Viewer' and @exec="'eog %u'" and @modified='2022-03-28T07:27:27Z' and @count='1']
$ :
$ # Compute tree height as maximum branch node depth
$ xmlstarlet el -u "${infile}" | awk -F / '{d=NF-1;if(d>h)h=d}END{print 0+h}'
5

See also: Print XPath of selected elements or attributes example

Query: xmlstarlet select

xmlstarlet select (aka sel) is basically a shorthand XSLT generator that can either process or print the stylesheet it generates. Typically used to extract and format data it supports a subset of XSLT 1.0 elements, all XPath 1.0 and XSLT 1.0 functions, plus the EXSLT functions offered by libexslt.

select implements 7 XSLT instruction elements – xsl:attribute, xsl:choose, xsl:copy-of, xsl:element, xsl:for-each, xsl:text, xsl:value-of – plus xsl:variable (and xsl:stylesheet, xsl:template, xsl:output partially) but note the absence of xsl:apply-templates and xsl:key, among others. This means recursion and identity transforms are off-limits (unless resorting to code generation).

xmlstarlet select returns the same system-property() values as xmlstarlet transform. A stylesheet generated by select appears as located in the current directory.

Like grep, xmlstarlet select returns an exit value of 1 if no nodes were selected, e.g. xmlstarlet select -T -t -m '(//xsl:document)[1]' -f *.xsl returns 0 if at least one input file matches the XPath expression, otherwise 1 (with or without the -Q (--quiet) option).

See also: XML parsing and serialization

Caution: xmlstarlet select does not flag invalid non-template options (src/xml_select.c#selParseOptions()) and ignores characters following the first letter in short template options (src/xml_select.c#selGenTemplate()). The next command outputs:

optfuscation
xmlstarlet select --nonet --rsn -:=% -C -t -i\*r 2=2 -eR_W- x -a'!e'ee y -omit z -bar -b:rrrf -newln | {
xmlstarlet select -C -t -i 2=2 -e x -a y -o z -b -b -n |
cmp -s - /dev/fd/3
} 3<&0 && echo 'optfuscation' || echo 'returned non-zero'

Usage: select [option …] template … [«xml-file» …]

Non-template options

-h (--help) - display help

-Q (--quiet) - do not write anything to standard output

See also: global option -q (--quiet) (lowercase -q)

-C (--comp) - display generated XSLT

Lists the XSLT stylesheet that will be generated from the current template options. No input file is required for this option. It produces no output other than a stylesheet or an error message.

Usage samples: -t -m … | --output … | --value-of … | --xinclude

See also: Introspection example

-R (--root) - print root element <xsl-select>

Wraps a container element named xsl-select around output. It includes any namespace nodes declared with -N «prefix»=«value» except the predefined namespaces.

-T (--text) - output is text (default is XML)

Sets method="text" on the xsl:output element.

$ cat file.xml
<v><w>a&amp;</w><w>l&lt;</w><w>q&quot;</w><w>g&gt;</w></v>
$ :
$ xmlstarlet select -t -c '*/*[position()>2]' -n file.xml
<w>q"</w><w>g&gt;</w>
$ :
$ xmlstarlet select --text -t -c '*/*[position()<3]' -n file.xml
a&l<

See also: Special characters | -o (--output)

-I (--indent) - indent output

Sets indent="yes" on the xsl:output element.

To re-indent, for example:

xmlstarlet select -B -I -t -c / in.xml > out.xml

See also: XML parsing and serialization | -B (--noblanks)

-D (--xml-decl) - do not omit XML declaration line

Sets omit-xml-declaration="no" on the xsl:output element.

Use with -E (--encode) to specify encoding.

See also: XML declaration

-B (--noblanks) - remove nonsignificant whitespace from XML tree

To strip nonsignificant whitespace, for example:

xmlstarlet select -B -t -c / in.xml > out.xml

See also: XML parsing and serialization

-E (--encode) «encoding» - output in the given encoding

Encoding value for the XML declaration, e.g. UTF-8, ISO-8859-2. Use with -D (--xml-decl).

See also: XML declaration

-N «prefix»=«value» - declare namespaces

This option is repeatable. E.g. -N xsql='urn:oracle-xsql' -N X='http://www.w3.org/1999/xhtml'. Either side of the equal sign may be empty[1], e.g. -N ''='' (or -N =) for xmlns="".

Not needed for predefined namespaces or those declared in input’s root element (see --doc-namespace) but required

See also: Use a namespace | -R (--root) | edit -N

[1] -N foo='' is not allowed; echo '<v/>' | xmlstarlet sel -N foo= -t -e a -e foo:k outputs <a><k/></a>.

--net - allow fetch DTDs or entities over network

See also: network access

Template options

-t (--template) is <xsl:template match="/">

The -t (--template) option marks the beginning of an xsl:template element which ends at a following -t option (i.e. non-nestable) or at the last option after -t. -t must be followed by at least one template option. NB: <xsl:template match="expression"> cannot be generated by combining -t and -m options.

-t (--template) makes the root node (/, not the root element /*) the current node so XPath expressions can be relative,

xmlstarlet select -t -m '*/*/r' -v '@id' -n file

even obscure,

echo '<q>2</q>' | xmlstarlet sel -t -v '*******************'

with thanks to Michael Kay for his original Christmas cracker the output of which is 1024.

As xmlstarlet select --help shows, two or more --templates are implemented as:

<xsl:template match="/">
  <xsl:call-template name="t1"/>
  <xsl:call-template name="t2"/>
  …
</xsl:template>

See also: List the generated XSLT -C (--comp)

-m (--match) is <xsl:for-each select="xpath-expr">

-m (--match) is a rare misnomer among xmlstarlet’s option names: it translates to the xsl:for-each element and has nothing to do with an xsl:template pattern. -m (--match) is nestable and can be explicitly terminated with -b (--break).

Links: XSLT current node | XSLT xsl:for-each | XSLT current() | XPath context node

xsl:for-each changes the current node. The XPath functions position() and last() return the context position and context size, respectively.

$ printf '<v w="a:b:c:d:e:f:g:h:i:j"/>' |
  xmlstarlet select --text -t \
    -m 'str:split(v/@w,":")' \
      --if 'position() mod 3 = 0' \
        -v 'concat(position()," ",.,"  ")'
3 c  6 f  9 i  

whereas -m 'str:split(v/@w,":")[position() mod 3 = 0]' -v 'concat(…)' outputs 1 c 2 f 3 i.

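Keeping a reference to the input’s root node across current-node changes. The nodes returned by str:split belong to a separate tree, so inside the -m loop of the first command //e no longer searches the input document and finds nothing; saving / in a variable beforehand restores access: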

$ cat file.xml
<r><e id="a">fee</e><e id="b">fi</e><e id="c">fo</e><e id="d">fum</e></r>
$ :
$ xmlstarlet select -T -t \
  -m 'str:split("a b c d")' \
    -v 'concat(//e[@id=current()],". ")' \
  -b -n \
  file.xml
. . . . 
$ :
$ xmlstarlet select -T -t \
  --var R='/' \
  -m 'str:split("a b c d")' \
    -v 'concat($R//e[@id=current()],". ")' \
  -b -n \
  file.xml
fee. fi. fo. fum. 

-s (--sort) is <xsl:sort …/>

To process a nodeset in sorted order add one or more -s (--sort) 'X:Y:Z' 'xpath' options immediately after -m (--match).

For example:

See also: examples/sort* | Query Euro rates | Remove all but the latest member of each group

--var «name» «value» --break is <xsl:variable name="…">«value»</xsl:variable>

--var «name»=«value» is <xsl:variable name="…"/>

xmlstarlet select has 2 forms of --var, cf. xsl:variable:

  1. --var name=value, e.g.
    • --var n='5'
    • --var s='"fee fi fo fum"'
    • --var f='true()'
    • --var V='//_:abc[@class="def"]'
    • --var W='$V/_:ghi[boolean(@jkl)]'
    • -m 'str:split($ws)' --var w='.' …
    • --var lut='document("")//xsl:variable[@name="rtf"]/*'
  2. --var name value --break
    defining string content and terminated with -b (--break), e.g.
    • --var nl -n -b
    • --var s -o '<f&g>' -b
    • --var stuff -e doranc -c 'a[c] | d[c]' -b -b
    • --var lines -m '$expr' -v '…' -n -b -b
    • --var reply --if '$v > 4' -o 'yes' --elif '$v < 2' -o 'no' --else -o 'maybe' -b -b

(xmlstarlet edit has 1 form: --var name xpath.)

See also: --var «name»=«value» namespace issue

Result tree fragment (RTF) demo:

printf '<v/>\n' |
xmlstarlet select -t \
  --var rtf \
    -e x  -a k -o 1st -b  -o First. -b \
    -e x  -a k -o 2nd -b  -o Second. -b \
    -e x  -a k -o 3rd -b  -o Third. -b \
  -b \
  --var tbl='exslt:node-set($rtf)' \
  -v 'exslt:object-type($rtf)' -o ' rtf ' -v '$rtf' -n -c '$rtf' -n \
  -v 'exslt:object-type($tbl)' -o ' tbl ' -v '$tbl' -n -c '$tbl' -n

Output:

RTF rtf First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>
node-set tbl First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>

$tbl/x[@k="2nd"] is a valid XPath expression, $rtf/x[@k="2nd"] is not and triggers an Invalid type run-time error.

See also: RTF examples file list | accumulation

Links: exslt:node-set | nodeset vs. RTF by David Carlisle, Jörg Pietschmann | RTF background by Michael Kay

--var «name»=«value» namespace issue

Caution: An EXSLT namespace prefix (other than exslt (?)) used only inside xmlstarlet select’s --var name='…' triggers runtime error xmlXPathCompOpEval: function «func» bound to undefined prefix «ns» unless option -N ns=… is given. Workaround: use -N ns=… or use the prefix outside --var name='…', e.g. in -v or -m or (for string content) --var name … -b.

$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
  --var d='str:split(v/@s)' \
  -v '$d' -n 
xmlXPathCompOpEval: function split bound to undefined prefix str
runtime error: element variable
Failed to evaluate the expression of variable 'd'.
no result for -
$ :
$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
  -m 'str:split(v/@s)' \
    -v . -b -n
abc

-o (--output) is <xsl:text>«value»</xsl:text>

$ xmlstarlet select -T -C -t -o 'A<&'\''">z'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
  <xsl:template match="/">
    <xsl:text>A&lt;&amp;'"&gt;z</xsl:text>
  </xsl:template>
</xsl:stylesheet>

-o '' translates to an empty <xsl:text/> element.

See also: Special characters

To manage parameters of the xsl:output element, see XML parsing and serialization.

-e (--elem) is <xsl:element name="…">

-e is nestable and can be explicitly terminated with -b (--break).

See also: Create a namespace | Create a SOAP envelope example

-a (--attr) is <xsl:attribute name="…">

-a can be explicitly terminated with -b (--break).

In XSLT, the later of two same-named attributes wins, e.g.

$ echo '<v/>' | 
  xmlstarlet select -t -e doc  -a f -o n -b  -a f -o y
<doc f="y"/>

-c (--copy-of) is <xsl:copy-of select="xpath-expr"/>

See examples at: -T (--text) | -I (--indent)

-v (--value-of) is string-join((xpath-expr),newline)

With zero or one nodeset members in xpath-expr, -v (--value-of) works exactly as XSLT 1.0’s <xsl:value-of select="xpath-expr"/>; otherwise (like string-join() in XSLT 2.0) all members are output, stringified and separated by newlines.

$ echo '<v><w>fee</w><w>fi</w><w>fo</w><w>fum</w></v>' |
  xmlstarlet select -T -t -v '*/*' -t -n
fee
fi
fo
fum

Adding -C (--comp) option to list the XSLT code for the value-of-template:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt">
  <xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
  <xsl:template match="/">
    <xsl:call-template name="t1"/>
    <xsl:call-template name="t2"/>
  </xsl:template>
  <xsl:template name="t1">
    <xsl:call-template name="value-of-template">
      <xsl:with-param name="select" select="*/*"/>
    </xsl:call-template>
  </xsl:template>
  <xsl:template name="t2">
    <xsl:value-of select="'&#10;'"/>
  </xsl:template>
  <xsl:template name="value-of-template">
    <xsl:param name="select"/>
    <xsl:value-of select="$select"/>
    <xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
      <xsl:value-of select="'&#10;'"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

-i (--if) [--elif …] [--else] is <xsl:when> … [<xsl:otherwise>]

-i (--if) is nestable and can be explicitly terminated with -b (--break). It translates to an xsl:choose element.

-b (--break) ends current container element

-b (--break) closes the currently open container element, one of:

These can be followed by a variable number of options and so must be terminated explicitly unless followed by one of:

closing all open elements. In other words, trailing -bs may be omitted if they’re the last options in the current template.

A -b (--break) too many can trigger compilation error: xsltParseStylesheetTop: unknown «name» element.

-n (--nl) prints a newline

-f (--inp-name) prints pathname / URI of current input

Shorthand for -v '$inputFile' (a predefined variable). Outputs - (dash) for standard input (stdin).

Examples

Query Euro rates

Download (< 2K) and convert the European Central Bank’s Euro rates sorted by currency in Ascending order as Text, Upper-first:

wget -qO- 'https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml' |
xmlstarlet select --text -t \
  -m '//_:Cube[@currency]' \
    -s 'A:T:U' '@currency' \
    -v 'concat(@currency," ",@rate)' -n

See also: -s (--sort)

List XML files matching XPath expression

List files in current dir and subdirs containing at least one milk element (returns non-zero if no match):

find . -type f -name '*.xml' -exec \
  xmlstarlet select -T -t -m '(//*[local-name()="milk"])[1]' -f -n {} +

Return zero if at least one XML element text exactly matches milk, otherwise non-zero (no output is produced):

find . -type f -name '*.xml' -exec \
  xmlstarlet select -Q -T -t -m '(//*[text()="milk"])[1]' -f -n {} +

find’s {} + fills up the command line with pathnames.

See also: -f (--inp-name) | -Q (--quiet) | exit values

Print XPath of selected elements or attributes

Note: This handles element or attribute nodes but no other node types.

: ${fileglob:=/usr/share/*/xslt/docbook/common/db-common.xsl}
: ${target:='//xsl:param[string(@select)]'}

xmlstarlet select --text -t \
  -m "${target}" \
    -m 'ancestor-or-self::*' \
      --var pos='1+count(preceding-sibling::*[name() = name(current())])' \
      -v 'concat("/",name(),"[",$pos,"]")' \
    -b \
    --if 'count(. | ../@*) = count(../@*)' \
      -v 'concat("/@",name())' \
    -b \
    -n \
${fileglob}

where:

Output:

/xsl:stylesheet[1]/xsl:template[1]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[4]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[2]
/xsl:stylesheet[1]/xsl:template[6]/xsl:param[1]

Output if called with target='//xsl:*/@test[contains(.,"position")]':

/xsl:stylesheet[1]/xsl:template[2]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[3]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[3]/@test

See also: xmlstarlet elements

Mark up plaintext

If the plaintext input is uncomplicated perhaps EXSLT’s string functions can do the conversion. Note that str:replace, str:split, and str:tokenize are available for xmlstarlet select (and transform), but not for edit.

<root>
A;2022-08-10;db #1
B;sortie bidon;50.0
A;2022-08-12;db Cth
B;mali climber;40.0
C;fray illumine;9.75
</root>

xmlstarlet select --indent -t \
  --var ifs -o ';' -b \
  --var iss -n -b \
  --var irs='concat($iss,"A")' \
  -e recs \
    -m 'str:split(*,$irs)' \
      -e rec \
        --var sr='str:split(.,$iss)' \
        --var hd='str:split($sr[1],$ifs)' \
        -e hd \
          -e dt -v '$hd[1]' -b \
          -e wd -v '$hd[2]' -b \
        -b \
        -e bd \
          -m '$sr[position()!=1]' \
            --var f='str:split(.,$ifs)' \
            -e fld \
              -a typ -v '$f[1]' -b \
              -e dsc -v '$f[2]' -b \
              -e amt -v '$f[3]' -b \
"${infile:-file.xml}"

See also: --var | -m (--match) | -e (--elem) | -b (--break)

Output:

<recs>
  <rec>
    <hd>
      <dt>2022-08-10</dt>
      <wd>db #1</wd>
    </hd>
    <bd>
      <fld typ="B">
        <dsc>sortie bidon</dsc>
        <amt>50.0</amt>
      </fld>
    </bd>
  </rec>
  <rec>
    <hd>
      <dt>2022-08-12</dt>
      <wd>db Cth</wd>
    </hd>
    <bd>
      <fld typ="B">
        <dsc>mali climber</dsc>
        <amt>40.0</amt>
      </fld>
      <fld typ="C">
        <dsc>fray illumine</dsc>
        <amt>9.75</amt>
      </fld>
    </bd>
  </rec>
</recs>

Use the document() function

Links: document() in W3C rec

The XSLT document() function

Examples: merge 2 XML files | extract and merge records | introspection | external lookup table

ex1: merge 2 XML files

Insert child nodes of ${partfile}’s root element into ${infile}’s ${destination} element – using a 3-stage pipeline:

xmlstarlet select -R -t \
  --var part -o "${partfile:-file2.xml}" -b \
  -c ' / | document($part)' "${infile:-file.xml}" |
xmlstarlet edit -m '/xsl-select/*[2]/node()' '/xsl-select'"${destination:-/..}" |
xmlstarlet select -B -I -t -c '/xsl-select/*[1]'

See also: -R (--root) | --var … -b | -B (--noblanks) | -I (--indent) | -c (--copy-of)

If called with this ${partfile}

<items>
  <item>1</item><item>2</item><item>3</item>
</items>

and this ${infile}

<doc><g><g1/><g2/><g3/></g></doc>

and destination=/doc//g1, then output becomes:

<doc>
  <g>
    <g1>
      <item>1</item>
      <item>2</item>
      <item>3</item>
    </g1>
    <g2/>
    <g3/>
  </g>
</doc>

See also: transform --xinclude

ex2: extract and merge records

Given a number of similar XML input files, each containing a simple record set, copy the r records of all files into one document:

echo '<v/>' |
xmlstarlet select -R -I -t \
  --var fls \
    -e f -o 'data/rs1.xml' -b \
    -e f -o 'data/rs2.xml' -b \
    -e f -o 'data/rs3.xml' -b \
  -b \
  -c 'document(exslt:node-set($fls)/f) /*/r'

See also: -R (--root) | -I (--indent) | select --var | -e (--elem)

Output:

<xsl-select>
  <r a1="x" a2="42" a3="-2"/>
  <r a1="x" a2="41" a3="-2"/>
  <!-- etc. -->
</xsl-select>

Also possible:

If the file order determined by sort is sufficient, the EXSLT str:split() function can split the newline-separated output from find into a nodeset:

echo '<v/>' |
xmlstarlet sel -R -I -t \
  --var sep -n -b \
  --var fls2 -o "$(find 'data' -type f -name 'rs*.xml' | sort)" -b \
  -c 'document(str:split($fls2,$sep)) /*/r'

ex3: introspection

With a different -c (--copy-of) argument in the previous example,

  -c 'document("")'

outputs the stylesheet like the -C (--comp) option (but inside a wrapper element here because of -R (--root)).

With

  -c 'document("")//xsl:variable[@name="fls"]'

the file list variable is copied:

<xsl-select>
  <xsl:variable xmlns:xsl="http://www.w3.org/1999/XSL/Transform" name="fls">
    <xsl:element name="f">data/rs1.xml</xsl:element>
    <xsl:element name="f">data/rs2.xml</xsl:element>
    <xsl:element name="f">data/rs3.xml</xsl:element>
  </xsl:variable>
</xsl-select>

ex4: external lookup table

A simple food composition table lists – per 100 gram food – the calorie count (kcal) as well as the amount in grams of protein, fat, and carbohydrate:

<fc:foodcomp xmlns:fc="urn:foodcomp-subset">
  <fc:nutrient nid="n0893" kcal="297" prot="24.3" fat="1.9" carb="48.8" name="Lentils, green, dried, raw"/>
  <fc:nutrient nid="n2443" kcal="98" prot="7.9" fat="0.6" carb="16.3" name="Garlic, raw"/>
  <!-- etc. -->
</fc:foodcomp>

With an input file containing a culinary recipe of the form

<recipe servings="4" name="Lentil and goats' cheese salad">
  <ingredients>
    <ing foodid="n0893" grams="200" name="green lentils"/>
    <ing foodid="n2443" grams="10" name="garlic"/>
    <!-- etc. -->
  </ingredients>
  <method><!-- etc. --></method>
</recipe>

specify the calorie count per serving per ingredient:

xmlstarlet select --text -N fc='urn:foodcomp-subset' -t \
  --var fcfile -o "${lutfile:-file2.xml}" -b \
  --var FC='document($fcfile)/fc:foodcomp' \
  --var RP='/recipe' \
  -m '//ing' \
    --var kcal='$FC/*[@nid = current()/@foodid]/@kcal' \
    --var kcal-per-serv='$kcal div 100.0 * @grams div $RP/@servings' \
    -v 'str:align(current()/@name,str:padding(20," ."),"left")' \
    -o ' : ' \
    -v 'str:align(format-number($kcal-per-serv,"0 kcal"),str:padding(8),"right")' \
    -n \
  -b \
"${infile:-file.xml}"

Output:

green lentils. . . . : 149 kcal
garlic . . . . . . . :   2 kcal
lemon juice. . . . . :   0 kcal
extra virgin olive o :  56 kcal
fresh basil. . . . . :   4 kcal
goats' cheese. . . . :  96 kcal
black pepper . . . . :   0 kcal
salt . . . . . . . . :   0 kcal

To compute nutritional values for an entire recipe collect the gram-weighted food composition data – here in a result tree fragment (RTF, cf. select --var) as data size is modest – and sum(…) vertically, along the lines of:

…
  --var attrib='str:split("kcal prot fat carb")' \
  --var nutr-weighted-rtf \
    -m '//ing' \
      --var ing='.' \
      -e data \
        -c '@foodid' \
        -m '$attrib' \
          -a '{.}' -v '$FC/*[@nid = $ing/@foodid]/@*[name() = current()] div 100.0 * $ing/@grams' -b \
        -b \
      -b \
    -b \
  -b \
  --var nutr-wt='exslt:node-set($nutr-weighted-rtf)' \
  -o 'Nutrition per serving: ' \
  -m '$attrib' \
    --var sum-per-serv='sum($nutr-wt/data/@*[name() = current()]) div $RP/@servings' \
    -v 'concat(.," ",format-number($sum-per-serv,"0"))' \
…

Output:

Nutrition per serving: kcal 307, prot 19g, fat 15g, carb 26g

Extract subtree to depth N removing namespaces

Caution:
This method does not preserve document order: element nodes are copied after other node types, so mixed content (e.g. marked up text) gets messed up. Unless added to the argument of -c (--copy-of), comments and processing instructions are ignored.

Recursion and xsl:template are off-limits to xmlstarlet select but nesting -m (--match) options is OK, i.e. using nested xsl:for-each elements. To make this script extract a subtree to depth N – while (Caution) removing selected prefixed namespaces – repeat the -m '*' line to reach the desired depth, and it might be the hack that works…

: "${exclude:=(//namespace::xsi)[1] | (//namespace::ns3)[1] }"
: "${subroot:=//soupenv:body}"

xmlstarlet select -B -I -N ns3='https://www.example.com/ns/ns3' -t \
  --var nl -n -b \
  --var xlist -n -v "${exclude}" -n -b \
  -m "${subroot}" -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
           -m '*' -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
           -m '*' -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
"${infile:-file.xml}"

Notes:

Edit: xmlstarlet edit

xmlstarlet edit (aka ed) copies its input to output, supporting basic create, update, delete, rename, and move actions (operations).

Note that edit

To do conditional updates, or to dynamically create -n names or -v values for an edit command, it may be worthwhile having xmlstarlet select generate it.
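
As a rough sketch of that approach (not taken from the original notes): select writes one -r action per e element, building each new name from the element's own id attribute, and the shell then runs the generated edit command. The sample document, the e_ name prefix, and the variable names infile, cmd, and out are made up for illustration. Attribute values end up inside a command that is passed to eval, so this is only reasonable for trusted input.

```shell
infile=$(mktemp)
printf '%s' '<doc><e id="a">1</e><e id="b">2</e></doc>' > "${infile}"

# Stage 1: select emits the text of an edit command, one -r action per e.
# Each action addresses its element by position and takes the new name from @id.
cmd=$(
  xmlstarlet select --text -t \
    -o 'xmlstarlet edit -O' \
    -m 'doc/e' \
      -o ' -r "doc/*[' -v 'position()' -o ']" -v "e_' -v '@id' -o '"' \
    -b \
    -o ' "${infile}"' \
  "${infile}"
)
printf '%s\n' "${cmd}"   # inspect the generated command before running it

# Stage 2: run it in the current shell so that ${infile} expands.
out=$(eval "${cmd}")
printf '%s\n' "${out}"
rm -f "${infile}"
```

Printing the command first gives a chance to review it; in a script that step could be replaced by a sanity check on the id values before the eval.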

Usage: edit option […] [action …] [«xml-file-or-uri» …]

Non-action options

-h (--help) - display help

-O (--omit-decl) - omit XML declaration

-P (--pf) - preserve original formatting

-S (--ps) - preserve non-significant spaces

-O (--omit-decl), -P (--pf), and -S (--ps) set/unset libxml2 flags, cf. XML parsing and serialization.

See also: Try out edit’s formatting options | select -I (--indent)

-L (--inplace) - edit input file(s) in-place

This option

--net - allow network access

See also: network access

-N «prefix»=«value» - declare namespaces

This option is repeatable; must be last non-action option(s). E.g. -N xsql='urn:oracle-xsql'.

Not needed for predefined namespaces or those declared in input’s root element (--doc-namespace) but required

See: Use a namespace | select -N

Caution: xmlstarlet edit isn’t an XSLT processor so with or without the -N … option,

printf '%s' '<a/>' |
xmlstarlet edit --pf -O -N b='https://www.example.org/b' \
  -s '*' -t elem -n 'b:c' -v 'd'

generates:

<a><b:c>d</b:c></a>

See also: Create a SOAP envelope

Action options

-i (--insert) - add node before

-a (--append) - add node after

-s (--subnode) - add node as child

There are 3 ways to add an element, an attribute, or a text node to each member of a nodeset:

xmlstarlet edit OP xpath -t node-type -n node-name -v value

where

Basic usecase (v/e may replace $prev here):

$ printf '%s' '<v/>' |
  xmlstarlet edit -O \
    -s 'v' -t elem -n 'e' -v '42' \
    -s '$prev' -t attr -n 'a' -v 'y'
<v>
  <e a="y">42</e>
</v>

Examples: examples/ed-append | examples/ed-insert | examples/ed-subnode | Insert HTML <link …/>

See also: -u (--update)

Back reference $prev variable (aka $xstar:prev)

The $prev (aka $xstar:prev) variable refers to the nodeset created by the most recent -i (--insert), -a (--append), or -s (--subnode) option, which all define or redefine it. To reset $prev (to avoid a false match later) use, for example, -a '/..' -t elem -n nil, which fails to match as the root node has no parent.

$prev isn’t mentioned in the user’s guide; examples are given in doc/xmlstarlet.txt and in this section.
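
A small sketch of such a reset, with made-up element names x and nil: the -a '/..' action is expected to match nothing and so leave $prev empty, which should turn the final -s '$prev' into a no-op. Without the middle action the attribute would presumably be added to the new x element.

```shell
out=$(
  printf '%s' '<v><e/></v>' |
  xmlstarlet edit -O \
    -s 'v' -t elem -n 'x' \
    -a '/..' -t elem -n 'nil' \
    -s '$prev' -t attr -n 'a' -v 'y'
)
# The x element should be present, the a attribute and nil element absent.
printf '%s\n' "${out}"
```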

--var name 'xpath'

The --var name xpath option to define an xmlstarlet edit variable is mentioned in doc/xmlstarlet.txt but not in the user’s guide. It uses a different format than select’s --var.

Examples:

xmlstarlet edit --inplace \
  --var T '//_:p[@class="eyg"] | //_:span[contains(@class,"eyg_")]' \
  --var res "$((3 * 7 * 2))" \
  -u '$T' -x 'concat(.,", currently ",$res)' \
file.xhtml

xmlstarlet edit \
  -s '/doc/abc' -t elem -n 'ns:nd' \
  --var nsnd '$prev' \
  # ...

See also: xpath arguments | -u (--update) | $prev

-u (--update) 'xpath' -v (--value) 'value'

-u (--update) 'xpath' -x (--expr) 'xpath'

There are 2 ways to modify the value of each member of a nodeset:

where

-x makes a deep copy of its argument. Given an element e, <e a="v"><c1/><c2/></e>, -x 'e' copies the entire thing whereas -x 'e/node() | e/@*' copies e’s child nodes and e’s attribute nodes (cf. the many-to-many move example). (Attributes are not children of their parent – background.)

Creation of a new (empty) node is often followed by an update using $prev, for example to enable an -x expression:

xmlstarlet edit --inplace \
  -s '*' -t elem -n entry \
  -u '$prev' -x 'date:date-time()' \
  -s '$prev' -t attr -n user -v "${LOGNAME}" \
log.xml

See also: Moving nodes

-d (--delete) 'xpath'

See also: xpath arguments | Delete a namespace

-r (--rename) 'xpath' -v (--value) 'new-name'

new-name is an XML QName such as item or svg:g.

Caution: The -v (--value) clause of this option admits its new-name argument unmodified – accepting an empty string or one containing tab, newline, or XML special characters, among others – ignoring XML QName requirements. In this respect it works like the -n (--name) clause of the -i, -a, and -s options.

See also: xpath arguments | Rename elements example

-m (--move) 'xpath1' 'xpath2'

The source (xpath1) of the -m (--move) action can be nodes other than root and namespace: element, attribute, text, comment, or processing instruction.

The destination (xpath2) must be a single element (the only container node) – otherwise xmlstarlet exits with an error message and a non-zero return code. Source nodes will be appended as last nodes at destination.

Caution: With overlapping source and destination --move completes with exit value 0 and no messages.

Caution: -m (--move) causes a segmentation fault (exit value 139) if attempting a many-to-many move.

See also: xpath arguments | Moving nodes | Move a namespace

xpath arguments

For the xpath argument – in the -i, -a, -s, -d, -r, -m, -u, -x, and --var options – xmlstarlet edit can use an XPath 1.0 expression, incl. XPath 1.0 functions[1] and selected EXSLT functions, but using XSLT functions such as current(), document(), generate-id(), or format-number() triggers (as expected) the error xmlXPathCompOpEval: function «name» not found.

[1] The XPath functions position() and last() rely on an evaluation context. With xmlstarlet edit they can be used inside an XPath predicate (e.g. --move '…' '…[last()]') – last() returning the context size – but using them outside a predicate (as in -u '…' -x 'substring("abcdef",position(),1)') triggers an Invalid context position error or an Invalid context size error.

position() alternative: 1 + count(preceding-sibling::«node») or count(preceding::«node»).
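
To illustrate the position() alternative, this sketch numbers sibling elements by counting on the preceding-sibling axis. The item element name, the sample input, and the variable name out are assumptions made for the example.

```shell
out=$(
  printf '%s' '<r><item/><item/><item/></r>' |
  xmlstarlet edit -O \
    -u 'r/item' -x '1 + count(preceding-sibling::item)'
)
# Each item should now hold its own 1-based position among its item siblings.
printf '%s\n' "${out}"
```

The -x expression is evaluated relative to each updated node, as in the other relative -x examples in these notes, which is what makes the count differ per item.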

EXSLT in xpath arguments

Based on the exslt«name»XpathCtxtRegister functions in libexslt xmlstarlet edit supports selected functions from the EXSLT modules dates-and-times, math, sets, and strings in its xpath arguments:

All date, math, and set functions are there, but note the absence of str:replace (removed from xmlstarlet edit in 2012), str:split, and str:tokenize.

Hello, EXSLT:

printf '%s' '<v y="." e="." pi="." z="." r="."/>' | 
xmlstarlet edit -O \
  --var e 'math:constant("E",10)' \
  -u '*/@*[math:power(1e3,0)]' -x 'date:day-name("2011-09-24")' \
  -u '*/@e' -x '$e' \
  -u '*/@z' -x 'count(set:distinct(/*/@*))' \
  -u '*/@pi' -x 'math:constant("PI",10)' \
  -u '*/@r' -x '$e*/*/@z*/*/@pi' \
  -r '*/@*[starts-with(.,str:align(25,"  "))]' -v 'ezpi'

The number 1e3 (in scientific notation) isn’t XPath 1.0 but understood by libxml2.

Output:

<v y="Saturday" e="2.71828182" pi="3.14159265" z="3" ezpi="25.6192025590219"/>

See also: Divide a document into sections example

“No kidding” issue

Caution: xmlstarlet edit -u '…' -x '…' silently deletes the first child node at destination (whether absolute or relative -x XPath expression), -u '…' -v '…' silently deletes all child nodes. This suggests that the -u (--update) option is intended to modify leaf nodes (aka external nodes) but xmlstarlet’s silence in this matter extends to both documentation and source code.

Links: src/xml_edit.c#edUpdate()

Given this input,

<r><a><a1 k="v1">V1</a1>
      <a2 k="v2">V2</a2></a>
   <b><b1><D/></b1><b2/><b3><F1/><F2/></b3></b>
</r>

xmlstarlet edit -O -P \
  -u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"

produces:

<r><a><a1 k="v1">V1</a1>
      <a2 k="v2">V2</a2></a>
   <b><b1>V1</b1><b2>V1</b2><b3><F2/>V1</b3></b>
</r>

Workaround for -x: insert a sacrificial element (or non-whitespace text) node as first child. The -i (--insert) option has no effect if no nodes exist on destination’s child axis.

xmlstarlet edit -O -P \
  -i 'r/b/*/node()[1]' -t elem -n 'herenow' \
  -u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"

Output:

<r><a><a1 k="v1">V1</a1>
      <a2 k="v2">V2</a2></a>
   <b><b1><D/>V1</b1><b2>V1</b2><b3><F1/><F2/>V1</b3></b>
</r>

If instead -i … -u 'r/b/*' -v 'gonethere':

…  <b><b1>gonethere</b1><b2>gonethere</b2><b3>gonethere</b3></b> …

Examples

xmlstarlet edit --omit-decl --pf \
  -s '/_:html/_:head' -t elem -n link \
  --var lk '$prev' \
  -s '$lk' -t attr -n 'rel' -v 'stylesheet' \
  -s '$lk' -t attr -n 'type' -v 'text/css' \
  -s '$lk' -t attr -n 'href' -v 'style/www.css' \
in.xhtml > out.xhtml

<link rel="stylesheet" type="text/css" href="style/www.css"/>

See also: -P (--pf) | -s (--subnode) | --var | $prev | Use a namespace

Toggle 0/1 or false/true

$ cat file.xml
<doc><e f="0">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -L -O --pf --var T 'doc/e/@f' -u '$T' -x '($T+1) mod 2' file.xml
$ cat file.xml
<doc><e f="1">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -O --pf --var T 'doc/e/text()' -u '$T' -x 'not($T="true")' file.xml
<doc><e f="1">true<c>false</c></e></doc>

The -L (--inplace) option edits the input file in-place.

See also: -P (--pf) | --var | -u (--update)

Replace text

The EXSLT function str:replace was removed from xmlstarlet edit in 2012 so it’s either straight XPath 1.0:

xmlstarlet edit -O \
  -u 'doc/e' \
  -x 'concat(substring-before(.,"&text=Ulysses"),
              substring-after(.,"&text=Ulysses"))' \
"${infile:-file.xml}"

or invoke xmlstarlet select first to apply str:replace:

avar="$(xmlstarlet select --text -t \
  -v 'str:replace(doc/e,"&text=Ulysses","")' "${infile:-file.xml}")"
xmlstarlet edit -O -u 'doc/e' -v "${avar}" "${infile:-file.xml}"

With this XML input:

<doc>
  <e>abcdefghi/q?sid=ry12345&amp;text=Ulysses&amp;ofmt=x'"ml</e>
</doc>

both commands produce:

<doc>
  <e>abcdefghi/q?sid=ry12345&amp;ofmt=x'"ml</e>
</doc>

See also: pyx, depyx

Update or add if not exists

If a record rec has a desc element add the text of the sibling name element to it, otherwise add a new desc element and set its value to the text of name.

xmlstarlet edit \
  -u '//recs/rec/desc' -x 'concat(.," - ",../name/text())' \
  -a '//recs/rec[not(desc)]/name' -t elem -n 'desc' \
  -u '$prev' -x '../name/text()' \
"${infile:-file.xml}"

See also: -u (--update) | -a (--append) | $prev | -s (--subnode)

If desc and name are attributes, instead:

xmlstarlet edit \
  -u '//recs/rec/@desc' -x 'concat(.," - ",../@name)' \
  -s '//recs/rec[not(@desc)]' -t attr -n 'desc' \
  -u '$prev' -x 'string(../@name)' \
"${infile:-file.xml}"

Note that -x 'string(../@name)' – and ../@name as a concat() string argument – copies the attribute value, -x '../@name' the attribute node; the latter fails as attributes cannot contain other nodes (causing an empty value to be assigned to @desc).

Add if «condition»

xmlstarlet edit has no if-then-else construct so the following snippets use standard XPath 1.0 expressions and edit’s back reference $prev variable to apply conditions. All use the following nodeset variable:

--var T '/Server/Service[@name="Catalina"]'

Add $T/Connector after last ditto:

-a '$T/Connector[last()]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '7654'

Nothing is added if no $T/Connector node exists – in which case $prev becomes null and -s '$C' … has no effect – otherwise a Connector element is appended as first following sibling (to become the new last Connector) and given a port attribute.

Add $T/Connector if not exists, as last child of $T:

-s '$T[not(Connector)]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '8765'

Nothing is added if a $T/Connector node exists – if the first -s … matches nothing then the second -s … (due to a null $prev) will match nothing.

Add $T/Connector if not exists, after $T/Executor[1] if exists:

-a '$T[not(Connector)]/Executor[1]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '9876'

Nothing is added if a $T/Connector node exists or no $T/Executor node exists, otherwise appended as first following sibling of first $T/Executor.

See also: --var | -a (--append) | -s (--subnode) | $prev | -u (--update)

Duplicate an element, keep the formatting

This code duplicates a bean element from a formatted input file, inserts the copy right after the original, changes its @id, and restores the original inter-element whitespace. Use with edit’s -P (--pf) or -S (--ps) option.

xmlstarlet edit --ps \
  --var N '/beans/bean[@id="bean4"]' \
  --var ws '$N/following::text()[1][normalize-space()=""]' \
  -a '$N' -t elem -n bean \
  -u '$prev' -x '$N/node() | $N/@*' \
  -u '$prev/@id' -v 'bean4a' \
  -i '$prev' -t text -n whitespace -v '' \
  -u '$prev' -x '$ws' \
file.xml > newfile.xml

Notes:

See also: --var | -a (--append) | -u (--update) | $prev | -i (--insert)

Moving nodes using xmlstarlet edit -m or -a, -u, -r, -d

The basic usecases for moving XML nodes (except namespace nodes) from source to destination are:

The ground rules are:

In this context one-to-many is a copy (update) operation handled by -u … -v … or -u … -x … followed by -d ….
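
A minimal one-to-many sketch under those terms, using an invented input with one src element and two empty d elements: the update copies the source text into every destination and the delete then removes the source.

```shell
out=$(
  printf '%s' '<r><src>S</src><d/><d/></r>' |
  xmlstarlet edit -O \
    -u 'r/d' -x 'string(../src)' \
    -d 'r/src'
)
# Both d elements should carry the text S and src should be gone.
printf '%s\n' "${out}"
```

The destinations are empty here, so the child-deleting behaviour of -u described under the “No kidding” issue does not come into play.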

See also: xpath arguments | Namespaces

The following examples (one-to-one | many-to-one | many-to-many | move to position N) use this input XML file:

<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p><span><a id="a2">value 2</a></span></p>
  <p id="vol"/>
  <p/>
</div>

ex1: one-to-one

Moving one node to another:

xmlstarlet edit --omit-decl \
  -m '/div/p[4]' '/div/p[3]' \
  -m '/div/p[3]/@id' '/div/p[2]' \
  -m '/div/a' '/div' \
"${infile:-file.xml}"

Note the placement of a as the last element in destination.

See also: -m (--move)

Output:

<div>
  <p>
    <span>
      <a id="a1">value 1</a>
    </span>
  </p>
  <p id="vol">
    <span>
      <a id="a2">value 2</a>
    </span>
  </p>
  <p>
    <p/>
  </p>
  <a>anchor</a>
</div>

ex2: many-to-one

Moving the first 3 p elements to the 4th p:

xmlstarlet edit --omit-decl \
  -m '/div/p[position() <= 3]' '/div/p[4]' \
"${infile:-file.xml}"

Destination can also be expressed as (//p)[4], being the 4th p in document order.

See also: -m (--move)

Output:

<div>
  <a>anchor</a>
  <p>
    <p>
      <span>
        <a id="a1">value 1</a>
      </span>
    </p>
    <p>
      <span>
        <a id="a2">value 2</a>
      </span>
    </p>
    <p id="vol"/>
  </p>
</div>

Whereas overlapping -m '/div/p[position() <= 3]' '/div/p[3]' gives:

<div>
  <a>anchor</a>
  <p/>
</div>

ex3: many-to-many

Move all a children of span elements up one level, then remove the emptied spans (untagging).

xmlstarlet edit --omit-decl --pf \
  --var N '//span[a]' \
  -a '$N' -t elem -n 'a' \
  -u '$prev' -x 'preceding-sibling::span[1]/a/node() | preceding-sibling::span[1]/a/@*' \
  -d '$N' \
"${infile:-file.xml}"

In the general case use the following-sibling axis with the -i (--insert) action and the preceding-sibling axis with -a (--append). In this specific case preceding::span[1] or ../span also refers to preceding-sibling::span[1].

Output:

<div>
  <a>anchor</a>
  <p><a id="a1">value 1</a></p>
  <p><a id="a2">value 2</a></p>
  <p id="vol"/>
  <p/>
</div>

Alternatively, untag using an identity transform plus a template such as:

<xsl:template match="span[a]">
  <xsl:apply-templates/>
</xsl:template>

ex4: move to position N

Move the 3rd p element to position 2, to become the new 1st p.

The move destination cannot be a list position so work around:

xmlstarlet edit --omit-decl \
  --var src '/div/p[3]' \
  --var tgt '/div/*[2]' \
  -i '$tgt' -t elem -n 'p_TMP' \
  -u '$prev' -x '$src/node() | $src/@*' \
  -d '$src' \
  -r '$prev' -v 'p' \
"${infile:-file.xml}"

Both -u '$prev' -x '…' and -m '…' '$prev' work here.

See also: -i (--insert) | -u (--update) | $prev | -d (--delete) | -r (--rename) | -m (--move)

Output:

<div>
  <a>anchor</a>
  <p id="vol"/>
  <p>
    <span>
      <a id="a1">value 1</a>
    </span>
  </p>
  <p>
    <span>
      <a id="a2">value 2</a>
    </span>
  </p>
  <p/>
</div>

Divide a document into sections

This example – a many-to-many move operation – uses xpath arguments with relative expressions and the EXSLT set:leading function to group by h2 and move elements into divs.

See also: Moving nodes | -i (--insert) | $prev | -u … -x … | -d (--delete) | EXSLT in xpath arguments

<doc>
<h1/>
<h2/><p1/><p2/>
<h2/><p3/><p4/><p5/>
<h2/><p id="p6"><v/></p><p/>
</doc>

xmlstarlet edit -O \
  -i 'doc/h2' -t elem -n div \
  -u '$prev' -x 'set:leading(following-sibling::*, following-sibling::div[1])' \
  -d 'doc/div/following-sibling::*[not(self::div)]' \
"${infile:-file.xml}"

Output:

<doc>
  <h1/>
  <div>
    <h2/>
    <p1/>
    <p2/>
  </div>
  <div>
    <h2/>
    <p3/>
    <p4/>
    <p5/>
  </div>
  <div>
    <h2/>
    <p id="p6">
      <v/>
    </p>
    <p/>
  </div>
</doc>

See also: Use set:leading and set:trailing example

Try out xmlstarlet edit’s formatting options

Test edit’s various formatting options on this input file:

<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p>
    <span>  <a 
             id="a2"
            >  value 2  </a
    >       </span>
  </p>
  <p id="empty"></p>
  <p/>
</div>

See also: XML parsing and serialization | Duplicate an element, keep the formatting | -O (--omit-decl) | -P (--pf) | -S (--ps)


Default formatting:

xmlstarlet edit -O "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p>
    <span>
      <a id="a1">value 1</a>
    </span>
  </p>
  <p>
    <span>
      <a id="a2">  value 2  </a>
    </span>
  </p>
  <p id="empty"/>
  <p/>
</div>

See also: select -I (--indent)

xmlstarlet edit -O --pf "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p>
    <span>  <a id="a2">  value 2  </a>       </span>
  </p>
  <p id="empty"/>
  <p/>
</div>
xmlstarlet edit -O --ps "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p>
    <span>  <a id="a2">  value 2  </a>       </span>
  </p>
  <p id="empty"/>
  <p/>
</div>

The --pf --ps combination appears to match the output of xmllint --pretty 2 file.xml. It could prove a temptation to a regex user (but beware). Swapping --pf and --ps makes no difference.

xmlstarlet edit -O --pf --ps "${infile:-file.xml}"
<div
  >
  <a
    >anchor</a
  >
  <p
    ><span
      ><a
          id="a1"
        >value 1</a
      ></span
    ></p
  >
  <p
    >
    <span
      >  <a
          id="a2"
        >  value 2  </a
      >       </span
    >
  </p
  >
  <p
      id="empty"
  />
  <p
  />
</div
>

Add a subnode:

xmlstarlet edit -O -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p>
    <span>
      <a id="a1">value 1</a>
    </span>
  </p>
  <p>
    <span>
      <a id="a2">  value 2  </a>
    </span>
  </p>
  <p id="empty"/>
  <p>
    <added/>
  </p>
</div>
xmlstarlet edit -O --pf -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p>
    <span>  <a id="a2">  value 2  </a>       </span>
  </p>
  <p id="empty"/>
  <p><added/></p>
</div>
xmlstarlet edit -O --ps -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p><span><a id="a1">value 1</a></span></p>
  <p>
    <span>  <a id="a2">  value 2  </a>       </span>
  </p>
  <p id="empty"/>
  <p><added/></p>
</div>

Delete whitespace-only text nodes:

xmlstarlet edit -O --pf -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div><a>anchor</a><p><span><a id="a1">value 1</a></span></p><p><span><a id="a2">  value 2  </a></span></p><p id="empty"/><p/></div>

See also: select -B (--noblanks)

xmlstarlet edit -O --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div>
  <a>anchor</a>
  <p>
    <span>
      <a id="a1">value 1</a>
    </span>
  </p>
  <p>
    <span>
      <a id="a2">  value 2  </a>
    </span>
  </p>
  <p id="empty"/>
  <p/>
</div>
xmlstarlet edit -O --pf --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div
  ><a
    >anchor</a
  ><p
    ><span
      ><a
          id="a1"
        >value 1</a
      ></span
    ></p
  ><p
    ><span
      ><a
          id="a2"
        >  value 2  </a
      ></span
    ></p
  ><p
      id="empty"
  /><p
  /></div
>

Format: xmlstarlet format

The format (aka fo) command is an XML code formatter which accepts one input file, default is stdin.

See also: XML parsing and serialization | select -I (--indent) | Try out edit’s formatting options

Usage: format [option …] [«xml-file»]

Local options

-h (--help) - display help

-e (--encode) «encoding» - output in the given encoding

Cf. select’s -E option

-n (--noindent) - do not indent

Sets indentation to zero spaces, left-aligning the output. Does not strip nonsignificant whitespace from input.

See also: select -B (--noblanks)

-o (--omit-decl) - omit XML declaration

Caution: Setting this option causes xmlstarlet format to return an exit value equal to the number of bytes written (or -1 in case of error) modulo 256 (src/xml_format.c#foProcess(), cf. <libxml/xmlIO.h>).
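
The arithmetic behind that caution can be sketched in plain shell. status_for is a hypothetical helper, not part of xmlstarlet; it only shows which 8-bit exit status a shell would report for a given return value.

```shell
# status_for N: exit status (0..255) a shell reports for a return value of N
status_for() {
  echo $(( ( $1 % 256 + 256 ) % 256 ))
}

status_for 300   # 44
status_for 512   # 0, indistinguishable from success
status_for -1    # 255
```

So a non-zero $? after format -o says little by itself, and an output of exactly 256, 512, ... bytes would look like a plain zero. A script that needs a dependable result may be better off testing the output rather than the exit value.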

See also: XML declaration

-s (--indent-spaces) «N» - indent output with N spaces

Default indentation per level is 2 spaces.

-t (--indent-tab) - indent output with tabulation

-C (--nocdata) - replace CDATA section with text nodes

$ infile=$(mktemp)
$ printf '<v><![CDATA[A\t%s\nZ]]></v>' '"&'\''<>' > "${infile}"
$ :
$ xmlstarlet format -o -C "${infile}"
<v>A	"&amp;'&lt;&gt;
Z</v>
$ :
$ xmlstarlet format -o "${infile}"
<v><![CDATA[A	"&'<>
Z]]></v>
$ :
$ xmlstarlet pyx "${infile}"
(v
[A\t"&'<>\nZ
)v

See also: pyx

-D (--dropdtd) - remove the DOCTYPE of the input doc

Alternatives: xmllint --dropdtd a.xml | … or xsltproc --novalid b.xsl a.xml.

-H (--html) - input is HTML

Reads input using the libxml2 HTML 4.0 parser, cf. API reference.

Attempt to convert HTML – or broken XML – to usable XHTML:

wget -qO- "${url}" |
xmlstarlet -q format --html --recover --dropdtd --omit-decl > output

See also: global -q (--quiet) option

Links: HTML Tidy | W3C’s html-xml-utils | xmllint

-N (--nsclean) - remove redundant namespace declarations

See also: Remove namespace declarations

-Q (--quiet) [undocumented] - suppress error output

Does what the -q (--quiet) global option does.

-R (--recover) - try to recover what is parsable

See also: -H (--html)

--net - allow network access

See also: network access

Compare: xmlstarlet c14n

The c14n (aka canonic) command is used to convert an XML document to Canonical XML, a normal format intended to allow relatively simple comparison of pairs of XML documents for equivalence.

The W3C recommendations list examples of XML canonicalization. Examples of the c14n command are given in the source code’s examples/c14n*.

Links: Canonical XML - Wikipedia | Canonical XML - W3C rec | Exclusive XML Canonicalization - W3C rec

Caution: xmlstarlet c14n does not flag invalid options, cf. src/xml_C14N.c#c14nMain().

Usage: c14n [option] [«mode»] «xml-file» [«xpath-file»] [«inclusive-ns-list»]

Local options

-h (--help) - display help

--net - allow network access

See also: network access

Parameters

«mode» - canonicalization mode

«mode» is one of the following:

«xml-file» - input XML document file name (stdin is used if -)

Basic use case:

xml-generator-command | xmlstarlet c14n |
{ xmlstarlet c14n expected.xml | diff -b -C 1 - /dev/fd/3; } 3<&0 || log …

«xpath-file» - XML file with document subset expression

Cf. document subset in the W3C recommendation.

Sample xpath-file, from examples/xml/c14n.xpath:

<?xml version="1.0"?>
<XPath xmlns:n0="http://a.example.com" xmlns:n1="http://b.example">
(//. | //@* | //namespace::*)[ancestor-or-self::n1:elem1]
</XPath>

«inclusive-ns-list» - list of inclusive namespace prefixes

The InclusiveNamespaces PrefixList as a comma-separated (Caution: the user’s guide says blank-separated) list of namespace prefixes; for exclusive canonicalization only.

Validate: xmlstarlet validate

The validate (aka val) command performs validation on XML documents. Examples of the val command are given in the source code’s examples/valid1. NB: XML Schemas (XSD) are not fully supported due to incomplete support in libxml2.

Wikipedia links: XML schemas in general | XSD (W3C) | RELAX NG | DTD

See also: XML parsing and serialization | External entities

Usage: validate [option …] [«xml-file-or-uri» …]

Local options

-w (--well-formed) - validate well-formedness only (default)

-d (--dtd) «dtd-file» - validate against DTD

--net - allow network access

See also: network access

-s (--xsd) «xsd-file» - validate against XSD schema

-E (--embed) - validate using embedded DTD

-r (--relaxng) «rng-file» - validate against Relax-NG schema

-e (--err) - print verbose error messages on stderr

-S (--stop) - stop on first error

-b (--list-bad) - list only files which do not validate

-g (--list-good) - list only files which validate

-q (--quiet) - do not list files (return result code only)
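
Since -q reduces validate to its result code, it slots into shell conditionals. A minimal sketch, assuming a non-zero status for a file that fails the check; check_wf is a made-up helper name and parser messages are simply discarded:

```shell
# Print the names of files that are not well-formed XML.
# check_wf is a hypothetical helper; it relies only on the exit status.
check_wf() {
  for f in "$@"; do
    xmlstarlet validate --quiet "$f" 2>/dev/null ||
      printf 'not well-formed: %s\n' "$f"
  done
}
```

E.g. check_wf ./*.xml. The -b (--list-bad) option produces a similar listing without a loop.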

PYX: xmlstarlet pyx, depyx

xmlstarlet’s pyx (aka xmln) and depyx (aka p2x) commands convert XML to PYX and back again during processing. PYX is a simple line-oriented, text-based format usable with standard text tools such as grep, sed, or awk. Given xmlstarlet’s lack of native support for regular expressions this type of processing is occasionally useful, but beware of side effects: a pyx | depyx pipeline does not guarantee an accurate roundtrip. pyx uses a SAX parser.

PYX’s simplicity and lack of structure (and namespaces) makes it a good choice for certain types of operations – e.g. queries or editing of non-complex data like config files or database record sets – and a poor choice for handling complex documents or operations.

The PYX format lives a quiet life these days; xml.com still carries its article on Pyxie whereas IBM’s intro is now at archive.org.

The first character of each line of PYX indicates the type of parsing event:

char  event
----  -----
 (    start-tag
 )    end-tag
 A    attribute or namespace
 -    character data
 ?    processing instruction
 C    comment
 [    CDATA section
 D    DTD declaration
 N    notation declaration
 U    unparsed entity
 &    external entity

Caution: pyx strips an XML declaration if present.
Caution: & (ampersand) is buggy, e.g. external entities, cf.  src/xml_pyx.c.
Caution: depyx outputs non-collapsed empty elements, e.g.  <void></void>.
Caution: depyx outputs XML special characters inside comments as entity references, e.g. & as &amp;.
Caution: depyx may output spurious newlines, for example after a comment, cf.  src/xml_depyx.c.
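
Because the event type is always the first character of a line, the table above maps directly onto awk patterns. A minimal sketch that tallies start-tag events per element name; pyx_tally is a made-up name and the PYX input is hand-written for illustration:

```shell
# Count start-tag events per element name in a PYX stream read from stdin.
pyx_tally() {
  awk '/^\(/ { n[substr($0, 2)]++ }
       END   { for (e in n) print n[e], e }' | sort -k 2
}

# Hand-written PYX for <rs><r id="1"/><r id="2">x</r></rs>.
printf '%s\n' '(rs' '(r' 'Aid 1' ')r' '(r' 'Aid 2' '-x' ')r' ')rs' | pyx_tally
```

With a real document the input would come from xmlstarlet pyx file.xml instead of printf.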

Links: packages.debian.org xml2

Usage: pyx [--help] [«xml-file»]

Usage: depyx [--help] [«pyx-file»]

PYX output sample

xmlstarlet pyx "${infile:-pom.xml}" | head -n 40

Output:

(project
Axmlns http://maven.apache.org/POM/4.0.0
Axmlns:xsi http://www.w3.org/2001/XMLSchema-instance
Axsi:schemaLocation http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd
-\n  
(modelVersion
-4.0.0
)modelVersion
-\n\n  
(groupId
-com.github.example8
)groupId
-\n  
(artifactId
-maven-simple
)artifactId
-\n  
(version
-0.2-SNAPSHOT
)version
-\n  
(packaging
-jar
)packaging
-\n\n  
(name
-Simple Maven example
)name
-\n  
(url
-https://example8.io/#example8/maven-simple/0.1
)url
-\n\n  
(dependencies
-\n    
(dependency
-\n      
(groupId
-junit
)groupId

Examples

Grep for Dvorak in tags’ text

xmlstarlet pyx '/usr/share/X11/xkb/rules/base.xml' | grep '^-.*Dvorak'

Extract URIs

xmlstarlet pyx index.xhtml | sed -n 's/^Ahref //p'

Convert date attributes to extended ISO 8601 format

Using GNU sed (for e flag of s command) and GNU date (for -d (--date) option and %F format):

xmlstarlet pyx "${infile:-file.xml}" |
sed -E '1v;/^(Adate )(.*)/ s//date -d "\2" "+\1%F"/e' |
xmlstarlet depyx

The do-nothing v command fails on non-GNU seds. The empty regex causes the last applied regex to be reused.

Convert datetime elements from basic to extended ISO 8601 format

xmlstarlet pyx "${infile:-file.xml}" |
sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }' |
xmlstarlet depyx

Assuming non-nested datetime elements. The /^-…/ condition leaves non-text nodes (incl. CDATA sections) unmodified.
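
To see the scoping at work without an XML file, the same sed script can be fed hand-written PYX. This is a sketch only: pyx_datetime_ext is a made-up wrapper name and the sample records are invented. The text inside id is identical to the one inside datetime but lies outside the address range, so it should pass through unchanged.

```shell
# The sed script from above, wrapped in a function for reuse.
pyx_datetime_ext() {
  sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }'
}

# Hand-written PYX: only the text line inside datetime is rewritten.
printf '%s\n' '(rec' '(id' '-20230815130407' ')id' \
              '(datetime' '-20230815130407' ')datetime' ')rec' |
pyx_datetime_ext
```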

Delete foo elements

xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' '
$0=="(foo" {flag++; next;}
$0==")foo" {flag--; next;}
!flag
' | 
xmlstarlet depyx

!flag prints current line if flag is zero.
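
flag is a counter rather than a boolean so that nested foo elements are dropped as a whole. A small check on hand-written PYX (pyx_drop_foo is a made-up wrapper name, the input is invented):

```shell
# The awk filter from above, wrapped in a function.
pyx_drop_foo() {
  awk -v FS='\n' '
    $0=="(foo" {flag++; next;}
    $0==")foo" {flag--; next;}
    !flag
  '
}

# A foo nested inside a foo: everything from the outer start-tag to the
# outer end-tag is suppressed, bar and doc survive.
printf '%s\n' '(doc' '(foo' '-a' '(foo' '-b' ')foo' '-c' ')foo' \
              '(bar' '-d' ')bar' ')doc' |
pyx_drop_foo
```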

Split XML file by parent/element

This awk script reads a PYX-format file and extracts each group element having a parent glist element (including any nested ditto) to a separate numbered file, converting each chunk from PYX to XML by invoking xmlstarlet depyx. Alternatively, output PYX-format files and convert them in parallel.

Caution: Doesn’t understand namespaces, and beware of pyx/depyx side effects.

xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' -v partfmt='./part%04d.xml' -v element='group' -v parent='glist' '
  /^\(/ { E[++level] = substr($0,2) }
  $0 ~ "[()]" element "$"  &&  E[level-1] == parent {
    if ( !flag && "(" == substr($0,1,1) ) {
      fxml = sprintf(partfmt,++partnum)
      fpyx = fxml ".pyx.tmp"
      flag = level
    } else if ( flag == level && ")" == substr($0,1,1) ) {
      print >> fpyx
      close(fpyx)
      system("xmlstarlet depyx " fpyx " > " fxml " && rm " fpyx)
      flag=0
    }
  }
  /^\)/ { --level }
  flag { print >> fpyx }
'

See also: Create multiple result documents example

Special characters: xmlstarlet escape, unescape

The escape (aka esc) command converts

taking its input from the first text string on the command line, or stdin if it’s - (dash) or absent.

The unescape (aka unesc) command does the inverse. Caution: unesc leaves longer references such as &#x20AC; unmodified (cf. MAX_ENTITY_NAME = 1+4 in src/xml_escape.c), prints a diagnostic message, and returns zero.

See also: Special characters | --xinclude (for parse="text")

Usage: escape [--help] [«text»]

Usage: unescape [--help] [«text»]

Sample session

$ xmlstarlet escape 'a&<>'\''"z'
a&amp;&lt;&gt;'"z
$ :
$ xmlstarlet unescape 'a&amp;&lt;&gt;&apos;&quot;z'; printf '\n'
a&<>'"z
$ :
$ # Unicode U+20AC EURO SIGN
$ xmlstarlet esc '€'
&#x20AC;
$ :
$ xmlstarlet unesc '&#x61;&#x20AC;&#x9;100&#x7A;&#xA;'
entity name too long: &#x20AC
a&#x20AC;	100z
$ :
$ xmlstarlet esc < "${infile:-file.xml}"
&lt;foo&gt;if they treat children as they do &lt;bar&gt;documentation&lt;/bar&gt; they'll be &lt;bat&gt;prosecuted&lt;/bat&gt;&lt;/foo&gt;
$ :
$ xmlstarlet esc < "${infile:-file.xml}" | xmlstarlet unesc
<foo>if they treat children as they do <bar>documentation</bar> they'll be <bat>prosecuted</bat></foo>

Directory list: xmlstarlet list

The list (aka ls) command prints the contents of a file system directory in XML format. It accepts a directory name as its only argument; default is current dir. No recursion option and no -h (--help) option available.

Usage: list [«directory-name»]

Output sample

xmlstarlet list /etc/sgml

Output:

<dir>
<f p="rw-r--r--" a="20220704T194637Z" m="20211001T213455Z" s="376"              n="docbook-xml.cat"/>
<d p="rwxr-xr-x" a="20220706T075545Z" m="20220419T100344Z" s="4096"             n="docbook-xml"/>
<l p="rwxrwxrwx" a="20220706T075532Z" m="20220702T102624Z" s="31"               n="catalog"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20201229T232017Z" s="652"              n="sgml-data.cat"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20190227T001849Z" s="45"               n="xml-core.cat"/>
</dir>

Elements inside the dir document element have a one-char name indicating the file type,

f   regular file
d   directory
c   character device
b   block device
l   symlink
p   FIFO
s   socket
u   unknown

and attributes as returned by stat:

p   read-write-execute permissions for user, group, and other
a   UTC time of last access in ISO 8601 basic format
m   UTC time of last modification in ISO 8601 basic format
s   file size in bytes
n   filename

See man 7 inode for permissions s (S_ISUID, S_ISGID) and t (S_ISVTX).

The T in XSLT: xmlstarlet transform

The transform (aka tr) command is an XSLT processor supporting XSLT 1.0 plus several EXSLT, crypto, and saxon extensions.

xmlstarlet transform returns the same system-property() values as xmlstarlet select and xsltproc:

xsl:vendor      libxslt
xsl:vendor-url  http://xmlsoft.org/XSLT/
xsl:version     1.0

Caution: xmlstarlet transform doesn’t flag invalid options (src/xml_trans.c#trParseOptions()).

Caution: The --catalogs option mentioned in the user’s guide was never implemented, it seems; not listed by xmlstarlet transform --help.

Usage: transform [option …] «xsl-file» [-p|-s «name»=«value» …] [«xml-file-or-uri» …]

Local options

-h (--help) - display help

--omit-decl - omit XML declaration

See also: XML declaration

-E (--embed) - allow applying embedded stylesheet

Links: <?xml-stylesheet?> - W3C recommendation | Embedding stylesheets - W3C XSLT 1.0 Rec

With an e.xml XML document containing an <?xml-stylesheet type="text/xml" href="e.xsl"?> processing instruction before the document element, the following command will run the XSLT stylesheet e.xsl on e.xml.

xmlstarlet tr -E e.xml > output

This option is mentioned in doc/xmlstarlet.txt but not in the user’s guide.

--show-ext - show list of extensions

Prints a list of registered XSLT extensions to stderr and terminates.

--val - allow validate against DTDs or schemas

--net - allow fetch DTDs or entities over network

See also: network access

--xinclude - do XInclude processing on document input

Links: XML Inclusions XInclude - W3C recommendation

See also: the XSLT document() function

Basic XInclude example: include file2.xml in file1.xml.

cat << 'HERE' > 'file1.xml'
<root>
  <gs>
    <xi:include
      xmlns:xi="http://www.w3.org/2001/XInclude" 
      href="file2.xml"
      xpointer="xpointer(//g[@id='items']/*)"
      parse="xml"
    />
  </gs>
</root>
HERE

cat << 'HERE' > 'file2.xml'
<doc><g id="items"><g1/><g2/><g3/><g4/></g></doc>
HERE

xmlstarlet select -C -t -c / |
xmlstarlet transform --xinclude /dev/stdin 'file1.xml'

Output:

<root>
  <gs>
    <g1/><g2/><g3/><g4/>
  </gs>
</root>

If instead parse="text":

<root>
  <gs>
    &lt;doc&gt;&lt;g id="items"&gt;&lt;g1/&gt;&lt;g2/&gt;&lt;g3/&gt;&lt;g4/&gt;&lt;/g&gt;&lt;/doc&gt;

  </gs>
</root>

An alternative to XInclude or document():

The hxincl utility from the W3C html-xml-utils package is HTML/XML-aware and expands certain embedded comments – or prints a makefile rule listing the dependent include files – e.g. hxincl -x -s incfnm=file2.xml file1.xml.

--maxdepth value - increase the maximum depth

Used to detect template loops, cf. variable xsltMaxDepth in xslt.h.

--html - input document(s) are in HTML format

Reads input using the libxml2 HTML 4.0 parser, cf. XML parsing and serialization.

Parameters

«xsl-file» - main XSLT stylesheet for transformation

Cf. option -E (--embed).

-p - parameter is an XPath expression

-s - parameter is a string literal

-p and -s are repeatable, up to a maximum of 256 key-value pairs.

«name»=«value» - name and value of the parameter passed to XSLT processor

E.g. … -p m1='"Hello, XSLT"' -s m2='0xab 0xbb' file.xml
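
The difference between -p and -s shows best with a stylesheet that declares the parameters. A minimal sketch: hello.xsl, its parameters who and n, and the wrapper hello_xslt are all made up for illustration, and the input is read from stdin as described for «xml-file».

```shell
# Write a throwaway stylesheet with two global parameters, then run it
# with whatever -p / -s options are passed to the function.
hello_xslt() {
  cat > hello.xsl << 'HERE'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="who" select="'nobody'"/>
  <xsl:param name="n" select="0"/>
  <xsl:template match="/">
    <xsl:value-of select="concat('Hello, ', $who, ' ', $n + 1, '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
HERE
  printf '<v/>' | xmlstarlet transform hello.xsl "$@"
}
```

hello_xslt -s who='XSLT' -p n='2 * 20' passes who as a literal string, whereas n is evaluated as an XPath expression before the stylesheet adds 1 to it.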

«xml-file» - input XML document file name (stdin is used if missing)

This parameter is repeatable.

Namespaces

Links: Understanding XML namespaces - Evan Lenz | Namespaces in XML 1.0 / 1.1 - W3C rec | The “xml:” namespace - W3C memo | Namespaces at Pawson Q&A

Topics

Use a namespace

xmlstarlet predefines the namespaces xml, xsl, and those used with EXSLT functions and elements minus crypto plus saxon. By default (global option --doc-namespace being in effect) select and edit can use the namespaces declared in the input’s root element (document element) without explicit -N «prefix»=«value» options; if the default namespace is declared there it is bound to the _ (underscore) (aka DEFAULT) prefix.

A QName (qualified name) with no prefix appearing in an XPath expression uses the null namespace, not the default namespace.
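
A quick way to convince yourself of this is to count the same elements with and without a prefix. A sketch under stated assumptions: count_e and show_counts are made-up helpers, urn:x is an invented namespace URI, and the _ prefix is the one described above.

```shell
# Count the nodes selected by the XPath in $1 in a document whose
# elements are all in a default namespace.
count_e() {
  printf '<r xmlns="urn:x"><e/><e/></r>' |
  xmlstarlet select --text -t -v "count($1)" -n
}

# Unprefixed name (null namespace) versus the _ prefix (default namespace).
show_counts() {
  count_e '//e'
  count_e '//_:e'
}
```

Calling show_counts should print 0 for the unprefixed name and 2 for the _ prefix.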

Prefixed namespace:

xmlstarlet select --text -t \
  -m 'set:distinct(//mime:mime-type/@type)' -v '.' -n \
recently-used.xbel

Default namespace:

xmlstarlet edit --inplace --pf \
  -u '/_:html/_:head/_:link/@href[.="www.css"]' -v 'solarized.css' \
  -d '//_:*[contains(@class,"pull-quote")] | //_:aside' \
article.xhtml

Null namespace:

xmlstarlet select -T -t \
  -m 'recs/rec' -v '@date' -n \
file.xml

See also: User’s guide ch. 5 | Undefined namespace prefix error | name bound to undefined prefix error

$ cat nspre.xml
<p:r xmlns:p="urn:ns1">r1
  <r xmlns="urn:ns2">r2
    <p:e>e1</p:e>
    <e>e2</e>
  </r>
</p:r>
$ :
$ xmlstarlet select -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r1
e1
$ :
$ xmlstarlet select -N p='urn:ns2' -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r2
e2

Create a namespace

A namespace declaration cannot be created directly with XSLT 1.0[1]. It’s done by adding element and attribute nodes which have a (possibly null) namespace and a local name. Hint: In the following examples, add -C (--comp) before select’s -t option to list the generated XSLT code.

[1] xmlstarlet edit isn’t so picky: see edit -N | Create a SOAP envelope

See also: select -R (--root)

Prefixed namespace:

echo '<v/>' |
xmlstarlet select -N m=urn:ssssssssskeyssssstickingagain:local -t \
  -e m:doc -a flag -o 1 -b -a m:flag -o 0
<m:doc xmlns:m="urn:ssssssssskeyssssstickingagain:local" flag="1" m:flag="0"/>

Default namespace:

echo '<v/>' |
xmlstarlet select -N ''='https://www.example.org' -t \
  -e 'doc' -a 'flag' -v '"x"'
<doc xmlns="https://www.example.org" flag="x"/>

Null namespace:

printf '<fi/>' |
xmlstarlet select -t \
   -e fee -a faw -o fum -b -e fi -e fo -b -o fum
<fee faw="fum"><fi><fo/>fum</fi></fee>

Input file:

<h:rs id="hrs" xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h">
  <f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
</h:rs>

Query:

xmlstarlet select -t \
  -m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs xmlns="urn:e" id="hrs"/>

Use -N ''='' for xmlns="":

xmlstarlet select -N = -t \
  -m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs id="hrs"/>

Edit:

xmlstarlet edit --omit-decl --pf \
  -s 'h:rs' -t elem -n 'foo' -v 'bar' \
  -s '$prev' -t attr -n 'xmlns' -v '' \
"${infile:-file.xml}"
<h:rs xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h" id="hrs">
  <f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
<foo xmlns="">bar</foo></h:rs>

Move a namespace

xmlstarlet edit -m '//namespace::xsi' '/_:doc/_:el' examples/xml/S0.xml returns non-zero and the error message FIXME: can't move namespace nodes.

Links: examples/xml/S0.xml

Delete a namespace

xmlstarlet edit -d '//namespace::xsi' examples/xml/S0.xml returns non-zero and the error message FIXME: can't delete namespace nodes.

Links: examples/xml/S0.xml

See also: null-ns hack

Remove unnecessary namespace declarations

Tools to remove redundant namespace declarations include xmlstarlet format’s --nsclean option, xmlstarlet c14n, and the --nsclean option of xmllint – all with side effects – but they won’t remove xmlns:xi nodes left by XInclude processing.

xml2/2xml or pyx/depyx and grep can do the doctoring (Caution: no questions asked):

xml2 < file.xml | grep -v '^/doc/@xmlns:xi' | 2xml > newfile.xml

Insert node with namespace prefix

Caution: xmlstarlet edit silently ignores the namespace of an inserted node referencing a previously inserted node having a namespace prefix.

For instance, to insert an element such as <ns1:c class="caveat"/> it’s logical to say,

xmlstarlet edit \
  -s '/a/b' -t elem -n 'ns1:c' \
  -s '/a/b/ns1:c' -t attr -n 'class' -v 'caveat' \
file.xml

but the output will not contain the attribute node as the following -s (or -i or -a) option returns an empty nodeset. In other words ns1:c gets inserted but is not available as such in following edit actions. This is on the to-do list as hinted by NULL /* TODO: NS */ in src/xml_edit.c#edInsert().

Workaround: Use the $prev back reference instead, as in … -s '$prev' -t attr ….
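
Spelled out in full, the workaround reads as follows. This is a sketch only: it reuses the element and attribute names from the failing command above and wraps them in a made-up helper, insert_caveat, which takes the input file as its optional argument.

```shell
# Insert ns1:c under /a/b, then attach the attribute via $prev, which
# refers to the node(s) created by the preceding insert action.
insert_caveat() {
  xmlstarlet edit \
    -s '/a/b' -t elem -n 'ns1:c' \
    -s '$prev' -t attr -n 'class' -v 'caveat' \
  "${1:-file.xml}"
}
```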

See also: -s (--subnode)

Examples

Display namespace nodes in Clark notation

Links: XPath recommendation: Namespace nodes | namespace axis

<doc xmlns="http://www.example.org"
     xmlns:xi="http://www.w3.org/2001/XInclude">
  a
  <xi:include href="b.xml"/>
  b
  <c xmlns="urn:my:local"/>
  <d xmlns="">In no namespace</d>
</doc>
xmlstarlet select -T -t \
  -m 'set:distinct(//namespace::*)' \
    -v 'concat("{",.,"}",name())' -n \
"${infile:-file.xml}"

Output:

{http://www.w3.org/XML/1998/namespace}xml
{http://www.w3.org/2001/XInclude}xi
{http://www.example.org}
{urn:my:local}
{}

Create a SOAP envelope

Links: SOAP on Wikipedia

printf '%s' '<v/>' |
xmlstarlet select --xml-decl --indent \
  -N xsi='http://www.w3.org/2001/XMLSchema-instance' \
  -N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
  -N my='http://www.example.org/myService' \
  -t \
  -e 'soapenv:Envelope' \
    -e 'soapenv:Header' -o '' -b \
    -e 'soapenv:Body' \
      -e 'my:Service' \
        -e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
        -e 'Param2' -a 'xsi:type' -o 'string'  -b -o 'message' -b
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header/>
  <soapenv:Body>
    <my:Service xmlns:my="http://www.example.org/myService">
      <Param1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="integer">1</Param1>
      <Param2 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="string">message</Param2>
    </my:Service>
  </soapenv:Body>
</soapenv:Envelope>

To force namespace nodes into the root element (namespace normalization) append dummy attributes there, then strip them:

printf '%s' '<v/>' |
xmlstarlet select \
  -N xsi='http://www.w3.org/2001/XMLSchema-instance' \
  -N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
  -N my='http://www.example.org/myService' \
  -t \
  -e 'soapenv:Envelope' -a 'xsi:nslift' -b -a 'my:nslift' -b \
    -e 'soapenv:Header' -o '' -b \
    -e 'soapenv:Body' \
      -e 'my:Service' \
        -e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
        -e 'Param2' -a 'xsi:type' -o 'string'  -b -o 'message' -b \
| xmlstarlet edit -d 'soapenv:*/@xsi:nslift | soapenv:*/@my:nslift'
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:my="http://www.example.org/myService">
  <soapenv:Header/>
  <soapenv:Body>
    <my:Service>
      <Param1 xsi:type="integer">1</Param1>
      <Param2 xsi:type="string">message</Param2>
    </my:Service>
  </soapenv:Body>
</soapenv:Envelope>

To have xmlstarlet edit produce the latter version, for example:

printf '%s' '<v/>' |
xmlstarlet edit \
  -r '*' -v 'soapenv:Envelope' \
  -a '*' -t attr -n 'xmlns:soapenv' -v 'http://schemas.xmlsoap.org/soap/envelope/' \
  -a '*' -t attr -n 'xmlns:xsi' -v 'http://www.w3.org/2001/XMLSchema-instance' \
  -a '*' -t attr -n 'xmlns:my' -v 'http://www.example.org/myService' \
  -s '*' -t elem -n 'soapenv:Header' -v '' \
  -s '*' -t elem -n 'soapenv:Body' \
  -s '$prev' -t elem -n 'my:Service' \
  --var svc '$prev' \
  -s '$svc'  -t elem -n 'Param1'   -v '1' \
  -s '$prev' -t attr -n 'xsi:type' -v 'integer' \
  -s '$svc'  -t elem -n 'Param2'   -v 'message' \
  -s '$prev' -t attr -n 'xsi:type' -v 'string'

Error messages

A non-exhaustive list of xmlstarlet messages follows.

(No output and no error)

See xmlstarlet user’s guide ch. 5 “Namespaces and default namespace”.

See also: Use a namespace

(xmlstarlet edit displays its usage reminder)

… but offers no other clue

couldn’t read file

See failed to load external entity (re stdin)

failed to load external entity

Triggered by:

FIXME: can’t delete namespace nodes

See Delete a namespace.

FIXME: can’t move namespace nodes

See Move a namespace.

Invalid context position / Invalid context size

See xpath arguments.

move destination is not a single node

The destination operand of xmlstarlet edit’s -m (--move) option does not exist or is not a single element node.

Namespace prefix «name» … is not defined

This is a message from the XML parser: a warning but not necessarily an error. Recall that : (colon) in component names is tolerated but not recommended, as it makes a document not namespace-well-formed.

Example:

<doc><div vid="yo" abc:txt="hello"/></doc>

To copy the value of @abc:txt to @vid, for example:

xmlstarlet -q edit \
  -u '*/*[@*[local-name()="abc:txt"][namespace-uri()=""]]/@vid' \
  -x 'string(../@*[local-name()="abc:txt"][namespace-uri()=""])' \
file.xml

where

See also: Use a namespace | Undefined namespace prefix

Undefined namespace prefix

Triggered by

See also: Use a namespace | Namespace prefix «name» … is not defined

Warning: unrecognized option «option»

edit’s -N option must be the last non-action option.

xmlSAX2Characters: huge text node: out of memory

In libxml2 the XML_PARSE_HUGE option is disabled by default to prevent denial-of-service attacks. This triggers the xmlSAX2Characters: huge text node: out of memory error when loading a text node larger than 10 MB.

For a workaround see this patch.

xmlXPathCompOpEval: function «name» bound to undefined prefix «name»

Triggered by

See also: Use a namespace | select -N.

xmlXPathCompOpEval: function «name» not found

xmlstarlet edit’s xpath arguments do not support XSLT functions.

xsl:extension-element-prefix : undefined namespace «name»

See EXSLT.

xsltParseStylesheetTop: unknown «name» element

A -b (--break) too many was used.

EXSLT and other extensions

EXSLT functions and elements

EXSLT is an extension library for XSLT, mainly for XSLT 1.0. It provides missing language features such as functions to handle strings, math, dates, and sets, as well as nodeset coercion, user-defined functions, and dynamic evaluation of strings containing XPath expressions.

Linked with the libexslt library xmlstarlet’s XSLT processing commands, select and transform, support a larger number of EXSLT functions (and a few elements) whereas xmlstarlet edit supports a subset.

xmlstarlet predefines the EXSLT namespaces with prefixes date, dyn, exslt (not exsl), math, set, and str, as well as the saxon and test namespaces. To use crypto functions (or the func elements) declare the namespace explicitly with -N; for the exslt:document element see example.

Note: In 2012 str:replace was removed as broken when used in an XPath context (xmlstarlet edit) but remains available when used in an XSLT context (select and transform).

See also: List of XSLT extensions | transform --show-ext | select -N

Links: EXSLT docs on github.io | EXSLT project on github.com | EXSLT on stackoverflow.com

Observations:

Extensions not found in EXSLT documentation

Examples

Compute XML tree height

Compute the height of an XML tree as the maximum depth of a branch node of the tree. Root and leaf nodes count as zero.

xmlstarlet select -T -t \
  -v 'math:max(dyn:map(descendant::*,"count(ancestor::*)"))' -n \
"${infile:-file.xml}"

Links: Tree (data structure) on Wikipedia
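
As a cross-check that needs no EXSLT, the same figure can be derived from a PYX stream by tracking the nesting level of start and end tags. A minimal sketch; pyx_height is a made-up name and the PYX input is hand-written:

```shell
# Maximum number of element ancestors of any element, read from PYX.
pyx_height() {
  awk '/^\(/ { if (++level > max) max = level }
       /^\)/ { --level }
       END   { print (max ? max - 1 : 0) }'
}

# Hand-written PYX for <a><b><c/></b><d/></a>: c has two element ancestors.
printf '%s\n' '(a' '(b' '(c' ')c' ')b' '(d' ')d' ')a' | pyx_height
```

With a real document: xmlstarlet pyx file.xml | pyx_height.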

XML to CSV

A simple recordset enclosed in a root element,

<rs>
  <r id="1" user="3" name="abc" date="2017-08" flag1="false"/>
  <r id="2" user="7" name="defg" date="2019-12" flag1="false"/>
  <r id="3" user="9" name="hijkl" date="2020-02" flag1="true"/>
  <r id="4" user="11" name="mno" date="2022-01" flag1="false"/>
  <r id="5" user="14" name="pqrs" date="2022-01" flag1="false"/>
</rs>

is converted to TSV by:

xmlstarlet select --text -t \
  --var ishdr="${hdr:-1}" \
  --var ofs -o "$(printf '\t')" -b \
  --var ors -n -b \
  --var fnhdr='"concat($ofs,name())"' \
  --var fnrow='"concat($ofs,string())"' \
  -m '*/*[$ishdr and position() = 1]' \
    -v 'substring-after(str:concat(dyn:map(@*,$fnhdr)),$ofs)' -v '$ors' \
  -b \
  -m '*/*' \
    -v 'substring-after(str:concat(dyn:map(@*,$fnrow)),$ofs)' -v '$ors' \
  -b \
"${infile:-file.xml}"

where:

TSV output:

id      user    name    date    flag1
1       3       abc     2017-08 false
2       7       defg    2019-12 false
3       9       hijkl   2020-02 true
4       11      mno     2022-01 false
5       14      pqrs    2022-01 false

If the data items exist as child elements of /rs/r (e.g. after: xmlstarlet sel -t -e '{name(*)}' -m '*/*' -e '{name()}' -m '@*' -e '{name()}' -v . "${infile}"), use dyn:map(*,…) instead (2 places) to process child::* rather than attribute::*.

Links: packages.debian.org xml2

From h2 sections with foo titles select div with the most p children

Process an HTML document where each h2 element heads a number of divs:

xmlstarlet select -t \
  --var T='//_:div[_:p][contains(preceding::_:h2[1]/text(),"foo")]' \
  -c '($T[count(_:p) = math:max(dyn:map($T,"count(_:p)"))])[1]' \
file.xhtml

Convert between local and UTC time

Converting between local time (L) and UTC time (Z) in different time zones, the TZ environment variable selecting an entry in /usr/share/zoneinfo, putting EXSLT functions date:date-time, date:add, date:duration, and date:seconds to use.

Links: EXSLT dates-and-times docs | tz database on Wikipedia

for zone in \
  'America/Vancouver' 'Europe/Vatican' 'Asia/Manila' 'Pacific/Chatham'
do
  printf '<v/>\n' | 
  TZ=":$zone" xmlstarlet select --text \
  -t \
    --var ofs -o "$(printf '\t')" -b \
    --var ors -n -b \
    --var tz -o "$zone" -b \
    --var dttodayL -v 'date:date-time()' -b \
    --var tzoffset='substring($dttodayL,20)' \
    --var tzseconds='number(concat(
                     translate(substring($tzoffset,1,1),"-−+","--"),
                     substring($tzoffset,2,2) * 60 * 60 +
                     substring($tzoffset,5,2) * 60))' \
    --var dtepochL='concat("1970-01-01T00:00:00",$tzoffset)' \
    --var dtepochZ -o '1970-01-01T00:00:00+00:00' -b \
    --var dttodayZ='concat(substring-before(date:add($dtepochZ,
                    date:duration(date:seconds($dttodayL))),
                    "Z"),"+00:00")' \
    --var dttoday2L='date:add($dtepochL, 
                     date:duration(date:seconds($dttodayZ)+$tzseconds))' \
    -v 'concat(
       "# ",$tz,$ors
       ,"dttodayL", $ofs, $dttodayL, $ors
       ,"tzoffset", $ofs, $tzoffset, $ors
       ,"tzseconds",$ofs, $tzseconds,$ors
       ,"dttodayZ", $ofs, $dttodayZ, $ors
       ,"dttoday2L",$ofs, $dttoday2L,$ors
       )'
done |
expand -t 16

Using … -v 'date:date-time()' -b (rather than …='date:date-time()') to avoid xmlstarlet select’s EXSLT namespace issue.

Output:

# America/Vancouver
dttodayL        2023-08-15T13:04:07-07:00
tzoffset        -07:00
tzseconds       -25200
dttodayZ        2023-08-15T20:04:07+00:00
dttoday2L       2023-08-15T13:04:07-07:00
# Europe/Vatican
dttodayL        2023-08-15T22:04:07+02:00
tzoffset        +02:00
tzseconds       7200
dttodayZ        2023-08-15T20:04:07+00:00
dttoday2L       2023-08-15T22:04:07+02:00
# Asia/Manila
dttodayL        2023-08-16T04:04:07+08:00
tzoffset        +08:00
tzseconds       28800
dttodayZ        2023-08-15T20:04:07+00:00
dttoday2L       2023-08-16T04:04:07+08:00
# Pacific/Chatham
dttodayL        2023-08-16T08:49:07+12:45
tzoffset        +12:45
tzseconds       45900
dttodayZ        2023-08-15T20:04:07+00:00
dttoday2L       2023-08-16T08:49:07+12:45
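
The tzseconds arithmetic can be checked independently of xmlstarlet with plain shell parameter expansion. A sketch only: tz_seconds is a made-up name, and it accepts only offsets of the form +hh:mm or -hh:mm with an ASCII sign.

```shell
# Convert a time zone offset such as -07:00 or +12:45 to seconds.
tz_seconds() {
  sign=${1%"${1#?}"}          # first character: + or -
  hh=${1#?}; hh=${hh%%:*}     # two-digit hours
  mm=${1##*:}                 # two-digit minutes
  hh=${hh#0}; mm=${mm#0}      # drop a leading zero: 08 is invalid octal in $(( ))
  secs=$(( hh * 3600 + mm * 60 ))
  if [ "$sign" = '-' ]; then secs=$(( 0 - secs )); fi
  echo "$secs"
}

for off in -07:00 +02:00 +08:00 +12:45; do
  printf '%s\t%s\n' "$off" "$(tz_seconds "$off")"
done
```

The four results match the tzseconds lines in the output above.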

Use set:leading and set:trailing

Links: EXSLT set functions on github.io

Using an explicit namespace declaration -N str='…' to avoid xmlstarlet select’s EXSLT namespace issue.

printf '%s\n' '<v s="/fee/fi/fo/fum"/>' |
xmlstarlet select -R \
  -N str='http://exslt.org/strings' \
  -t \
  --var sep='"/"' \
  --var T='str:split(*/@s,$sep)' \
  -n -c '$T' -n \
  -n -c 'set:leading($T,$T[.="fo"])' -n \
  -n -m '$T' -c  'set:leading($T,following-sibling::*[1])' -n -b \
  -n -m '$T' -c 'set:trailing($T,preceding-sibling::*[1])' -n -b \
  -n -e 'foo' -m 'set:trailing($T,$T[.="fee"])' -v 'concat($sep,.)' -b -b -n

Output:

<xsl-select>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>

<token>fee</token><token>fi</token>

<token>fee</token>
<token>fee</token><token>fi</token>
<token>fee</token><token>fi</token><token>fo</token>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>

<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fi</token><token>fo</token><token>fum</token>
<token>fo</token><token>fum</token>
<token>fum</token>

<foo>/fi/fo/fum</foo>
</xsl-select>

See also: Divide a document into sections | Generate a date sequence

Generate a date sequence

Links: ISO 8601 standard on Wikipedia | Daylight saving time (DST) on Wikipedia | TZ env.var. on OpenGroup

This is where the EXSLT strings, sets, and dates-and-times modules come together to compute a datetime series from 3 arguments:

  1. start, default value is today in ISO 8601 extended format
  2. step, default value is 1 day in ISO 8601 format
  3. maxct, the maximum count, default value is 100

XSLT doesn’t do loops but EXSLT lets you create a string of any length and str:split it into a nodeset, each member of which contains the step interval. An initial empty time period (PT0S) is added to handle the first item. For each item in the series, set:leading collects the steps up to and including that item, date:sum adds them up, and date:add adds the sum to start. It’s an inefficient algorithm so nil points for performance (though probably fast enough for ordinary maxct values).

xsdateseq0() {
  printf '<v start="%s" step="%s" maxct="%s"/>\n' \
          "${1:-$(date '+%Y-%m-%d')}" "${2:-P1D}" "${3:-100}" |
  xmlstarlet select --text \
    -N str='http://exslt.org/strings' \
    -t --var start='*/@start' \
       --var padlen='(*/@maxct - 1) * (1 + string-length(*/@step))' \
       --var D='str:split(concat("PT0S",str:padding($padlen, concat(" ",*/@step))))' \
    -m '$D' -v 'date:add($start, date:sum(set:leading($D,following-sibling::*[1])))' -n 
}

Using an explicit namespace declaration -N str='…' to avoid xmlstarlet select’s EXSLT namespace issue. libexslt doesn’t support date:format-date but there’s an implementation (EXSLT function and XSLT template) by Jeni Tennison.
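
To visualise what $D holds before it is split, the string that concat and str:padding build can be reproduced with a shell loop (the very thing XSLT lacks). A sketch only; duration_list is a made-up name.

```shell
# Build "PT0S" followed by maxct-1 copies of the step, blank-separated,
# i.e. the string that str:split turns into the nodeset $D.
duration_list() {
  step=$1 maxct=$2 list='PT0S'
  i=1
  while [ "$i" -lt "$maxct" ]; do
    list="$list $step"
    i=$(( i + 1 ))
  done
  printf '%s\n' "$list"
}

duration_list P7D 4
```

For step P7D and maxct 4 the padded part is " P7D P7D P7D", 12 characters long, which agrees with padlen = (4 - 1) * (1 + 3).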

Print 53 dates starting on January 1st with a step value of 7 days.

TZ=':Europe/Vatican' xsdateseq0 '2023-01-01' P7D 53 | pr -t -8 -s' ' - 
2023-01-01 2023-02-19 2023-04-09 2023-05-28 2023-07-16 2023-09-03 2023-10-15 2023-11-26
2023-01-08 2023-02-26 2023-04-16 2023-06-04 2023-07-23 2023-09-10 2023-10-22 2023-12-03
2023-01-15 2023-03-05 2023-04-23 2023-06-11 2023-07-30 2023-09-17 2023-10-29 2023-12-10
2023-01-22 2023-03-12 2023-04-30 2023-06-18 2023-08-06 2023-09-24 2023-11-05 2023-12-17
2023-01-29 2023-03-19 2023-05-07 2023-06-25 2023-08-13 2023-10-01 2023-11-12 2023-12-24
2023-02-05 2023-03-26 2023-05-14 2023-07-02 2023-08-20 2023-10-08 2023-11-19 2023-12-31
2023-02-12 2023-04-02 2023-05-21 2023-07-09 2023-08-27

Print 10 datetimes with a step value of 1 day, 1 hour, 1 minute, and 5 seconds.
Note the lack of DST adjustment.

TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T07:30:00+01:00' 'P1DT1H1M5S' 10 
2022-10-26T07:30:00+01:00
2022-10-27T08:31:05+01:00
2022-10-28T09:32:10+01:00
2022-10-29T10:33:15+01:00
2022-10-30T11:34:20+01:00
2022-10-31T12:35:25+01:00
2022-11-01T13:36:30+01:00
2022-11-02T14:37:35+01:00
2022-11-03T15:38:40+01:00
2022-11-04T16:39:45+01:00

Print 8 datetimes in email format (RFC 822). Uses GNU date for -R and -f options and DST adjustment.

TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T11:30:00' '' 8 | date -Rf-
Wed, 26 Oct 2022 13:30:00 +0200
Thu, 27 Oct 2022 13:30:00 +0200
Fri, 28 Oct 2022 13:30:00 +0200
Sat, 29 Oct 2022 13:30:00 +0200
Sun, 30 Oct 2022 12:30:00 +0100
Mon, 31 Oct 2022 12:30:00 +0100
Tue, 01 Nov 2022 12:30:00 +0100
Wed, 02 Nov 2022 12:30:00 +0100

Group by element name and merge text

xmlstarlet select doesn’t support xsl:key but grouping can be done using EXSLT functions. As an example, group repeating fields in each record by element name and merge their texts in document order.

<recs>
  <rec>
    <fb>fee</fb>
    <fa>foo</fa>
    <fd>zzz</fd>
    <fc>bat</fc>
    <fa>bar</fa>
    <fb>faw</fb>
    <fd>bat</fd>
    <fb>fum</fb>
    <fa>quux</fa>
  </rec>
  <rec>
    <fa>fee</fa>
    <fc>fo</fc>
    <fc>fum</fc>
    <fa>fi</fa>
  </rec>
</recs>
xmlstarlet select --indent -t \
  --var sfs="'${sfs:- }'" \
  -e '{name(*)}' \
    -m '*/*' \
      --var rec='.' \
      -e '{name()}' \
        -m 'set:distinct(dyn:map(*,"name()"))' \
          -s 'A:T:-' '.' \
          -e '{.}' \
            -v 'substring-after(
                  str:concat(
                    dyn:map($rec/*[name()=current()],"concat($sfs,text())")
                  )
               ,$sfs)' \
"${infile:-file.xml}"

Output:

<recs>
  <rec>
    <fa>foo bar quux</fa>
    <fb>fee faw fum</fb>
    <fc>bat</fc>
    <fd>zzz bat</fd>
  </rec>
  <rec>
    <fa>fee fi</fa>
    <fc>fo fum</fc>
  </rec>
</recs>
See also: Remove all but the latest member of each group example

Code generation

This section takes xmlstarlet off the beaten track.

select as edit script generator

xmlstarlet select doesn’t copy its input to output; edit cannot do xsl:for-each, xsl:choose, or use XSLT functions. In tandem they have a wider range – but so does an XSLT stylesheet.

Links: shell quoting | shell word expansions
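
Because the generated text is handed straight to sh, a syntax check before execution can catch quoting slips in the generator. The following is a minimal sketch, not part of xmlstarlet: run_generated is a hypothetical helper name, and mktemp is assumed to be available (POSIX does not specify it).

```sh
# run_generated (hypothetical helper): read a generated script on stdin,
# check its syntax with sh -n (parse only, run nothing), then execute it.
run_generated() {
  # mktemp is an assumption here; it is widely available but not POSIX
  script=$(mktemp "${TMPDIR:-/tmp}/xmlgen.XXXXXX") || return 1
  cat > "$script"
  if sh -n "$script"; then
    sh "$script"
    status=$?
  else
    status=2    # syntax error: nothing was executed
  fi
  rm -f "$script"
  return "$status"
}
```

Usage, with the same placeholder as in the examples: xmlstarlet-select-command | run_generated > result.xml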

Examples

Rename elements

xmlstarlet edit’s rename action requires a literal value for the new name so XPath functions are out. But select can generate the edit command, for example to number elements (here using the XSLT format-number() function):

<Names>
  <Name>fee</Name>
  <Name>faw</Name>
  <Name>fum</Name>
</Names>
# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
  --var sq -o "'" -b \
  -o "xmlstarlet edit --omit-decl \\" -n \
  -o "  --var N 'Names/Name' \\" -n \
  -m '*/*' \
    -o '  -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
    -o '  -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o " \\" -n \
  -b \
  -f -n \
"${infile:-file.xml}"

Output:

xmlstarlet edit --omit-decl \
  --var N 'Names/Name' \
  -r '$N[1]'  -v 'Name0001' \
  -r '$N[2]'  -v 'Name0002' \
  -r '$N[3]'  -v 'Name0003' \
file.xml

To execute the output as a shell script:

xmlstarlet-select-command | sh -s > result.xml

Alternatively, replace $N with (Names/Name), or process elements in reverse order by repeatedly renaming Names/Name[last()] – the predicate […] binding to the nearest XPath location step.
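
The reverse-order variant can be sketched as follows, assuming xmlstarlet is on the PATH; rename_last_demo is a hypothetical wrapper used only for illustration. Each -r renames the last element still called Name, which then no longer matches Names/Name, so the next -r picks the one before it.

```sh
# rename_last_demo (hypothetical): rename from the end so that
# Names/Name[last()] always addresses the next element to process.
rename_last_demo() {
  printf '%s\n' '<Names><Name>fee</Name><Name>faw</Name><Name>fum</Name></Names>' |
  xmlstarlet edit --omit-decl \
    -r 'Names/Name[last()]' -v 'Name0003' \
    -r 'Names/Name[last()]' -v 'Name0002' \
    -r 'Names/Name[last()]' -v 'Name0001'
}
```

Calling rename_last_demo should leave fee in Name0001, faw in Name0002, and fum in Name0003.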

Remove all but the latest member of each group

EXSLT functions provide another way to do grouping. Here’s how to create a shell script invoking xmlstarlet edit to delete all but the latest member of each group. The input file has module ID strings of the form: group ID, _ (underscore), major version number, . (dot), minor version number – as shown in this snippet:

<mod>mrR_0.9</mod>
<mod>mrR_0.10</mod>
<mod>mrM_0.19</mod>
<mod>mrM_0.2</mod>
<mod>mrM_0.20</mod>
<mod>mrM_0.3</mod>

Method:

# shellcheck shell=sh disable=SC2016,SC2064
xmlstarlet select --text -t \
  --var dq -o '"' -b \
  --var sep1='"_"' \
  --var sep2='"."' \
  --var fngrpid -o 'substring-before(.,$sep1)' -b \
  --var fnverno -o 'format-number(.,"0000")' -b \
  --var allm='//_:mods/_:mod' \
  -o "xmlstarlet edit \\" -n \
  -o "  --var M '//_:mods/_:mod' \\" -n \
  -o "  --var keep '/.. " \
  -m 'set:distinct(dyn:map($allm,$fngrpid))' \
    --var grpid_='concat(.,$sep1)' \
    -m '$allm[starts-with(.,$grpid_)]' \
      -s 'D:N:-' '0 + str:concat(dyn:map(str:split(substring-after(.,$sep1),$sep2),$fnverno))' \
      --if 'position() = 1' \
        -n -v 'concat("  | $M[.=",$dq,current(),$dq,"]")' \
      -b \
    -b \
  -b \
  -o "' \\" -n \
  -o "  --delete 'set:difference(\$M,\$keep)' \\" -n \
  -f -n \
"${infile:-file.xml}"

See also: select’s --var | -m (--match) | -s (--sort) | -i (--if) | -b (--break) | -f (--inp-name) | Group by element name and merge text example

Links: XSLT functions format-number() | current()

Output:

xmlstarlet edit \
  --var M '//_:mods/_:mod' \
  --var keep '/.. 
  | $M[.="mrR_1.11"]
  | $M[.="mrS_0.7"]
  | $M[.="mrE_2.2"]
  | $M[.="mrM_0.20"]' \
  --delete 'set:difference($M,$keep)' \
file.xml

To execute the output as a shell script:

xmlstarlet-select-command | sh -s > result.xml

See also: edit’s --var | -d (--delete)
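
The sort key in the generator pads the major and minor numbers to four digits and reads the concatenation as a number, so 0.20 (key 20) outranks 0.3 (key 3). As a cross-check outside XPath, the same ordering can be sketched in awk. latest_mods is a hypothetical helper, and it assumes IDs with exactly one underscore and one dot, with both numbers below 10000.

```sh
# latest_mods (hypothetical): read module IDs, one per line, on stdin and
# print the member with the highest version of each group, sorted by ID.
latest_mods() {
  awk -F '[_.]' '
    {
      # numeric value of the zero-padded major+minor concatenation
      key = $2 * 10000 + $3
      if (!($1 in best) || key > best[$1]) { best[$1] = key; id[$1] = $0 }
    }
    END { for (g in id) print id[g] }
  ' | sort
}
```

Fed the six IDs of the snippet above, this prints mrM_0.20 and mrR_0.10.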

Translate XLIFF file

This is the basic “update node with result of shell command” use case.

Create a shell script to have xmlstarlet edit add missing targets to an XLIFF version 2.0 localization data file by invoking translate-shell to supply translated phrases:

# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
  --var sq -o "'" -b \
  --var dq -o '"' -b \
  --var cmdopt='concat("trans -from ",/_:xliff/@srcLang," -to ",/_:xliff/@trgLang)' \
  -o 'xmlstarlet edit --pf '\\ -n \
  -m '//_:segment[not(_:target)]' \
    --var xpath-a='concat("//_:unit[@id=",$dq,parent::_:unit/@id,$dq,"]/_:segment/_:source")' \
    --var src-e -v 'str:replace(_:source,$sq,concat($sq,"\",$sq,$sq))' -b \
    -o '  -a ' -v 'concat($sq,$xpath-a,$sq)' -o ' -t elem -n target '\\ -n \
    -o "  -u '\$xstar:prev'" -o ' -v "$(' -v 'concat($cmdopt," ",$sq,$src-e,$sq)' -o ')" '\\ -n \
    -o "  -i '\$xstar:prev' -t text -n indent -v '' \\" -n \
    -o "  -u '\$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=\"\"]' \\" -n \
    -o '  '\\ -n \
  -b \
  -f -n \
"${infile:-file.xml}"

Snippets from sample data file:

<source>Über "O'ona"$tra"</source>
<source>&amp;Speichern als &lt;.oona&gt;</source>

Sample output:

xmlstarlet edit --pf \
  -a '//_:unit[@id="2"]/_:segment/_:source' -t elem -n target \
  -u '$xstar:prev' -v "$(trans -from de -to en 'Über "O'\''ona"$tra"')" \
  -i '$xstar:prev' -t text -n indent -v '' \
  -u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
  \
  -a '//_:unit[@id="24"]/_:segment/_:source' -t elem -n target \
  -u '$xstar:prev' -v "$(trans -from de -to en '&Speichern als <.oona>')" \
  -i '$xstar:prev' -t text -n indent -v '' \
  -u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
  \
file.xml

The output can be executed directly as a shell script:

xmlstarlet-select-command | sh -s > result.xlf

Snippets from result.xlf:

<target>About "O'ona"$tra"</target>
<target>&amp;Save as &lt;.oona&gt;</target>
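
The str:replace() call in the generator turns each ' in the source text into '\'' so that the text survives inside single quotes in the generated script. The same rule can be written as a plain shell function; this is a sketch, and shquote is a hypothetical name.

```sh
# shquote (hypothetical): print TEXT wrapped in single quotes for use in sh code.
# Each embedded ' becomes '\'' (close quote, escaped quote, reopen quote).
shquote() {
  printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}
```

A round trip such as eval "copy=$(shquote "$text")" should leave copy equal to text, except that command substitution drops trailing newlines.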

select as XSLT stylesheet generator

This section is included for completeness.

Links: shell quoting | shell word expansions

Spell it out

To use XSLT or extension elements not supported by xmlstarlet select’s options, it’s possible to have select spell out an XSLT stylesheet. This example inserts one document into another; the first xsl:template is the identity transform.

: "${xml1=z1.xml}"  "${xml2=z2.xml}"
test -s "$xml1" || printf '%s\n' '<v><THERE/></v>' > "$xml1"
test -s "$xml2" || printf '%s\n' '<w><x q="what">ever</x></w>' > "$xml2"

echo '<v/>' |
xmlstarlet select -t \
  -e xsl:transform  -a version -o 1.0 -b \
    -e xsl:param  -a name -o xdoc -b  -o /dev/null -b \
    -e xsl:template  -a match -o '@*|node()' -b \
      -e xsl:copy \
        -e xsl:apply-templates  -a select -o '@*|node()' -b  -b \
      -b \
    -b \
    -e xsl:template  -a match -o THERE -b \
      -e xsl:copy-of  -a select -o 'document($xdoc,/)' -b  -b \
    -b \
  -b |
xmlstarlet transform --omit-decl /dev/stdin -s xdoc="$xml2" "$xml1"

Output: <v><w><x q="what">ever</x></w></v>

Notes:

See also: -t (--template) | -e (--elem) | -a (--attr) | -o (--output) | -b (--break) | transform

Links: XSLT document()

Next step: don’t repeat yourself

Take this one step further and create a library of shorthand shell functions (causing shellcheck.net, among others, to vociferate):

xslxfm() {  ## xsl:transform(); non-closed
  printf " -e xsl:transform  -a version -v '1.0' -b "
}

xsltpl() {  ## xsl:template(match name?); non-closed
  printf " -e xsl:template  -a match -v '%s' -b%s" \
    "${1:?usage: xsltpl(match name?)}" "${2:+  -a name -v '$2' -b }"
}

xslapt()  ## xsl:apply-templates(select?); closed
  case $# in
  (0) printf ' -e xsl:apply-templates -b ' ;;
  (1) printf " -e xsl:apply-templates -a select -v '%s' -b -b " "$1" ;;
  (*) printf ' usage: xslapt(select?)\n' 1>&2; false ;;
  esac

xslIDN() { ## xsl:template name=identity; closed
  xsltpl '@*|node()' 'identity'
    printf " -e xsl:copy "
      xslapt '@*|node()'
  printf " -b -b "
}

xslhelp() { ## list xsl* functions in this file
  sed -n -e '/^\(xsl[^ ]*\)()[ {]*## \(.*\)/ s//\1	\2/p' "${_pnself_:-$0}" | 
  expand -t 12
}

Produce the same output as above having select, not transform, include the external document,

. "${pathto:-./}xsdefs.sh"
echo '<v/>' |
xmlstarlet sel -I -t $(xslxfm) $(xslIDN) $(xsltpl THERE) -c "document('${xml2}',/)" -b |
xmlstarlet tr --omit-decl /dev/stdin "$xml1"

providing the following stylesheet to the XSLT processor,

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="@*|node()" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="THERE">
    <w>
      <x q="what">ever</x>
    </w>
  </xsl:template>
</xsl:transform>

as tee /dev/stderr | inserted in the pipeline will show.

Links: The XML version of the XSLT 1.0 Rec contains element syntax and function prototypes

Examples

Create multiple result documents

Use the EXSLT exslt:document element as a template to split an XML document into multiple parts and output them as separate files in an existing directory.

Caution: Leaving out the exslt:nop attribute here triggers an xsl:extension-element-prefix : undefined namespace exslt error, with or without -N exslt='http://exslt.org/common' (exslt is predefined).

printf '%s\n' '<v><x>fee fi</x><y>fo fum</y></v>' |
xmlstarlet select -I -t \
  --var part-prefix -o "${outDir:-/tmp/}part" -b \
  -e 'xsl:transform' \
    -a 'version' -o '1.0' -b \
    -a 'exslt:nop' -o '' -b \
    -a 'extension-element-prefixes' -o 'exslt' -b \
    -e 'xsl:template' \
      -a 'match' -o '/' -b \
      -m '*/*' \
        -e 'exslt:document' \
          -a 'href' -v 'concat($part-prefix,format-number(position(),"000"),".xml")' -b \
          -a 'method' -o 'xml' -b \
          -a 'omit-xml-declaration' -o 'yes' -b \
          -e 'part' \
            -a 'no' -v 'position()' -b \
            -a 'of' -v 'last()' -b \
            -c '.' |
{ printf '%s\n' '<v/>' | xmlstarlet transform /dev/fd/3 /dev/stdin ; } 3<&0

Generated XSLT script:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" exslt:nop="" extension-element-prefixes="exslt">
  <xsl:template match="/">
    <exslt:document href="/tmp/part001.xml" method="xml" omit-xml-declaration="yes">
      <part no="1" of="2">
        <x>fee fi</x>
      </part>
    </exslt:document>
    <exslt:document href="/tmp/part002.xml" method="xml" omit-xml-declaration="yes">
      <part no="2" of="2">
        <y>fo fum</y>
      </part>
    </exslt:document>
  </xsl:template>
</xsl:transform>

Generated documents:

<part no="1" of="2"><x>fee fi</x></part>
<part no="2" of="2"><y>fo fum</y></part>

See also: Split XML file example
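
The last line of the pipeline above needs two inputs at once: the generated stylesheet and a dummy document. It duplicates the outer pipe onto file descriptor 3 (3<&0) so that the inner printf can take over stdin. Below is a minimal sketch of just that plumbing, with cat standing in for xmlstarlet transform and assuming a system that provides /dev/fd and /dev/stdin; fd3_demo is a hypothetical name.

```sh
# fd3_demo (hypothetical): read the outer pipe via fd 3 and the inner pipe via stdin.
fd3_demo() {
  printf '%s\n' 'from outer pipe' |
  { printf '%s\n' 'from inner pipe' | cat /dev/fd/3 /dev/stdin ; } 3<&0
}
```

Calling fd3_demo prints the outer line first and the inner line second, mirroring how transform reads the stylesheet from /dev/fd/3 and the document from /dev/stdin.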

Appendix A: xmlstarlet XSLT extensions

List of registered extension functions, elements, and modules:

Registered XSLT Extensions
--------------------------
Registered Extension Functions:
{http://exslt.org/common}node-set
{http://exslt.org/common}object-type
{http://exslt.org/crypto}md4
{http://exslt.org/crypto}md5
{http://exslt.org/crypto}rc4_decrypt
{http://exslt.org/crypto}rc4_encrypt
{http://exslt.org/crypto}sha1
{http://exslt.org/dates-and-times}add
{http://exslt.org/dates-and-times}add-duration
{http://exslt.org/dates-and-times}date
{http://exslt.org/dates-and-times}date-time
{http://exslt.org/dates-and-times}day-abbreviation
{http://exslt.org/dates-and-times}day-in-month
{http://exslt.org/dates-and-times}day-in-week
{http://exslt.org/dates-and-times}day-in-year
{http://exslt.org/dates-and-times}day-name
{http://exslt.org/dates-and-times}day-of-week-in-month
{http://exslt.org/dates-and-times}difference
{http://exslt.org/dates-and-times}duration
{http://exslt.org/dates-and-times}hour-in-day
{http://exslt.org/dates-and-times}leap-year
{http://exslt.org/dates-and-times}minute-in-hour
{http://exslt.org/dates-and-times}month-abbreviation
{http://exslt.org/dates-and-times}month-in-year
{http://exslt.org/dates-and-times}month-name
{http://exslt.org/dates-and-times}second-in-minute
{http://exslt.org/dates-and-times}seconds
{http://exslt.org/dates-and-times}sum
{http://exslt.org/dates-and-times}time
{http://exslt.org/dates-and-times}week-in-month
{http://exslt.org/dates-and-times}week-in-year
{http://exslt.org/dates-and-times}year
{http://exslt.org/dynamic}evaluate
{http://exslt.org/dynamic}map
{http://exslt.org/math}abs
{http://exslt.org/math}acos
{http://exslt.org/math}asin
{http://exslt.org/math}atan
{http://exslt.org/math}atan2
{http://exslt.org/math}constant
{http://exslt.org/math}cos
{http://exslt.org/math}exp
{http://exslt.org/math}highest
{http://exslt.org/math}log
{http://exslt.org/math}lowest
{http://exslt.org/math}max
{http://exslt.org/math}min
{http://exslt.org/math}power
{http://exslt.org/math}random
{http://exslt.org/math}sin
{http://exslt.org/math}sqrt
{http://exslt.org/math}tan
{http://exslt.org/sets}difference
{http://exslt.org/sets}distinct
{http://exslt.org/sets}has-same-node
{http://exslt.org/sets}intersection
{http://exslt.org/sets}leading
{http://exslt.org/sets}trailing
{http://exslt.org/strings}align
{http://exslt.org/strings}concat
{http://exslt.org/strings}decode-uri
{http://exslt.org/strings}encode-uri
{http://exslt.org/strings}padding
{http://exslt.org/strings}replace
{http://exslt.org/strings}split
{http://exslt.org/strings}tokenize
{http://icl.com/saxon}eval
{http://icl.com/saxon}evaluate
{http://icl.com/saxon}expression
{http://icl.com/saxon}line-number
{http://icl.com/saxon}systemId
{http://xmlsoft.org/XSLT/}test

Registered Extension Elements:
{http://exslt.org/common}document
{http://exslt.org/functions}result
{http://xmlsoft.org/XSLT/}test

Registered Extension Modules:
http://exslt.org/functions
http://icl.com/saxon
http://xmlsoft.org/XSLT/

… as output by:

xmlstarlet transform --show-ext 2>&1 |
awk -F '\n' -v S='sort' '/^Registered|^-*$/{ close(S); print; next } { print | S }'

Caution: Missing from the above list is the EXSLT extension element {http://exslt.org/functions}function (aka func:function) which works as documented in libexslt despite element-available("func:function") returning false.

Appendix B: xmlstarlet news summary

Snipped from SourceForge news and files sections. Covers versions 1.0.3 through 1.6.1 (in reverse order).


XMLStarlet 1.6.1 Released

1.6.1: August 9, 2014

- handle unicode arguments under Windows

There is no difference for non-Windows platforms.
Posted by Noam Postavsky 2014-08-09


XMLStarlet 1.6.0 Released

Changes:

    get rid of "helpful" message about namespaces
    update user guide
    Enhancements:
        add --stop option to val
        add global option --no-doc-namespace
    Build:
        let the make install target succeed even if docs aren't built.

Posted by Noam Postavsky 2014-06-13


XMLStarlet 1.5.0 is released, changes:

    Bugs:
        avoid segfault on pyx non-existant file
        fix unescaping of entities straddling 4K byte boundary (Bug #102)
    Enhancements:
        unescape hex entities (&#xXX;)
        give a helpful message if doc has default namespace and nothing matched
        add "_" and "DEFAULT" as names for document's top-level default namespace
        Adding a global quiet option
        ed: Allow omitting value argument to create empty element.
        use default attribute values in sel subcommand
    Build:
        fix test variables to work with newer automake (1.11 -> 1.13)
        fix usage2c.awk for mawk
        scripts for building on mingw

Posted by Noam Postavsky 2013-07-07 


1.4.2: Dec 28, 2012

    - pyx: avoid segfault on documents with multiple attributes (Bug 
      #3595212)


1.4.1: Dec 8, 2012

    - avoid segfault when attempting to edit the document node (Bug
      #3575722)

    - Packaging:
      - include doc/xmlstar-fodoc-style.xsl in the dist so that the
        --enable-build-docs option works from the tarball (Bug
        #3580667)
      - AC_SUBST PACKAGE_TARNAME for automake so that documentation is
        installed to the right place (Bug #3561958)

    - Test Suite:
      - avoid test failures due to XML formatting and whitespace
        changes (also fixes Bug #3572789)
      - use automake's parallel test suite
      - make bigxml tests much faster by using whitespace instead of nodes
      - don't test str:replace() with ed: it doesn't work outside of
        xslt in new libxslt
      - ignore extra errors from libxml 2.9.0 bug
      - let tests run using busybox
      - add runAllTests.sh to run tests without make


1.4.0: Aug 26, 2012

    - Documentation:
        - executable name used in documentation now matches
          --transform-program-name (Bug #3283713)
        - added Makefile rules for generating documentation
          (./configure --enable-build-docs)

    - ed subcommand:
        - relative XPaths are now handled correctly (Bug #3527850)
        - the last nodeset inserted by an edit operation can be
          accessed as the XPath variable $prev (or $xstar:prev)
        - add --var option to define XPath variables
        - allow ed -u -x to insert nodesets instead of converting to
          string
        - remove hard limit for number of edit operations (Bug
          #3488240)

    - pyx now handles namespaces correctly


1.3.1: Jan 14, 2012

    - handle multiple values for --value-of properly (Bug #2563866)
    - substitute external entities (Bug #3467320)
    - pyx output needs space between attribute name and value (Bug #3440797)


1.3.0: Oct 7, 2011

    - avoid ASCII CRs in UTF-16/32 text (reported by Ming Chen)
    - --value-of outputs concat values of all nodes (Req #2563866)
    - encode special chars for ed -u -x
    - allow use of exslt functions in ed -u -x
    - add --var to select (allow --var <name>=<value> as well as --var
      <name> <value> --break)
    - work around libxml bug that passes bogus data to error handler
      (Bug #3362217)

Source: README.1.3.0, updated 2011-10-02 


1.2.1: July 07, 2011

    - check for NULL nodeset result (Bugs #3323189, #3323196)
    - "-" was being confused with --elif
    - generated XSLT should also have automatic namespaces
    - allow -N after other option (Bug #3325166)
    - namespace values were being registered as prefixes
    - avoid segfault when asked to move namespace nodes
    - missing newline in ed --help message
    - test scripts portability
      - no bashisms allowed in NetBSD sh
      - make BRE portable: '+' is not allowed
      - deal with msys path conversion properly (Bug #3178657)
    - don't use XML_SAVE_WSNONSIG #if libxml < 2.7.8 (Bug #3310475)

Source: README.1.2.1, updated 2011-07-07


1.2.0: June 1, 2011

    - implement ed --update --expr
    - use top-level namespace definitions from first input file, this
      should remove the need to define namespaces on the command line
      with -N in most cases.
    - select exits with 0 only if result is non-empty (Req #3155702)
    - add -Q to select, like grep's -q
    - add column number to error messages
    - restore input context (lost in version 1.0.3) to error messages
      (Bug #3305659)
    - print extra string information in error messages
    - use entity definitions from dtd (Bug #3305659)
    - add --net option to c14n, ed, fo, and val (Req #1071398)
    - remove --catalog from tr --help message since it isn't actually supported
    - add --elif and --else to sel --help message

Source: README.1.2.0, updated 2011-06-01 


1.1.0: Apr 3, 2011

    - bug fix for BSD/OSX: check that O_BINARY is declared before
      #including io.h (Bug 3211822)
    - select improvements
      - add --elif and --else options
      - sorting on multiple fields
      - correct (for English) lexical sorting instead of ASCIIbetical
      - only outputs namespaces that are actually used
      - only outputs xsl:param inputFile if it's used
      - don't make separate templates if there is only 1
    - link to shared libxml and libxslt libraries by default
    - add library version info to --version output
    - add directory argument for ls; exit status indicates
      failure/success instead of file count
    - stop using old SAX1 interface, xmlstarlet will now link with a
      libxml configured --without-sax1 and --without-legacy

Source: README.1.1.0, updated 2011-04-04 


1.0.6: Mar 13 2011:

    - Bug fixes:
       - c14n: set stdout to binary mode on Windows to avoid carriage
         returns (Bug 840665)
       - fix broken --help options
    - put actual behaviour of -P, -S options in --help output (see
      Bug/Feature Request 2858514)
    - remove unneeded escape of quote in ./configure --help
    - don't distribute xmlstarlet.spec: it's generated by ./configure
    - add src/xml.o depends on version.h to Makefile.am so compile
      will succeed without dependency info (eg after make distclean)
    - add test for subcommands' --help option
    - Portability fixes:
       - yes isn't portable, use an awk program instead
       - neither read -r nor xargs -0 are portable, escape the command
         lines to xargs instead
       - don't use nonportable echo -n option

Source: README.1.0.6, updated 2011-03-13 


1.0.5: Feb 11 2011:

   - Bug fixes:
      - use XSLT_PARSE_OPTIONS, else CDATA nodes can cause corruption (Bug 3158482)
      - fix typo in help message
      - get rid of warnings in -ansi -pedantic mode
      - required libxml2 version is 2.6.23
      - usage strings use argv[0] as program name
      - --help prints to stdout and exits with success
      - double /'s under msys to avoid path conversion
   - Portability fixes:
      - don't use xargs (-d isn't portable)
      - use -Wall only for gcc
   -Build system:
      - use -ansi in configure, and check for strdup declaration
      - seperate list of sources and tests into subdirs
      - check git version during make, not just autoconf
      - tarball releases of configure.ac have actual version number
        instead of querying git

Source: README.1.0.5, updated 2011-02-11 


1.0.4: Jan 16 2011:

   - Bug fixes:
      - encode special XML characters in arguments (can now include quotes in xpath)
      - non-zero exit code when input file is not found (Bug 3158488)
      - ed with --pf/--ps options doesn't reformat output (Bug 3158490)
      - exit() instead of segfault when trying to delete namespace nodes
         (Bug 1120417)
   - added --disable-static-libs ./configure option to use shared libxml2 and libxslt
   - non-recursive make
   - use TESTS and XFAIL_TESTS for testing, nicer output

Source: README, updated 2011-01-16 


1.0.3: Nov 18 2010:

   - Bug fixes:
      escape --value in update mode (Bug 3052978)
      c14n now includes default attributes (Bug 1505579)
      Allow special characters in sel --output literal (Bug 1912978)
      remove warning from xml_trans.c (Bug 1521756)
      Use xmlReader interface so line numbers are 32-bit (Bug 1219072)
      test for error messages on lines past 2^16 (Bug 1219072)
      don't look for embedded dtd if not asked (Bug 1167215)

Source: README, updated 2010-11-18 

Appendix C: xmlstarlet wishlist AD 2003

In 2003 Mikhail Grushinskiy posted his xmlstarlet wishlist.

Mikhail Grushinskiy - 2003-05-14

Here is a list of next steps in XmlStarlet on TODO or wishlist:

1. Editing xml documents with xml 'ed' option must be improved.

2. add --recover to fix broken XML documents

3. Document how to use proxy in XmlStarlet with nanohttp/ftp via http_proxy, ftp_proxy environment variables
ex: export http_proxy=http://192.168.0.1:8080/

4. Add ability to specify xpath expression in XmlStarlet 'el' option

5. -u option of XmlStarlet 'xml el' should work with others too. I.e. sort | uniq equivalent should work when attributes and attributes values are printed out.

6. Think about 'join' analogue

7.  Something like xml sel -t -m <xpath> --exec <shell-cmd> --args <args> is needed

8. How would be possible to insert one XML fragment into another XML document from command line without XInclude?

9. Make use of regular expressions ex: Make all element names uppercase

10. Start thinking about diff and patch. Several tree diff algorithms could be implemented for ordered and non ordered labeled trees. What about creating context diff? How to define context in XML space? Good luck solving NP-Complete problems.

11. What about XUpdate implementation?

12. How about making output with syntax coloring in case if it is running in terminal (not batch) mode. Similar to GNU ls?

13. Convert XML to Lisp S-expressions

14. XML Namespace normalization process (There is a XSLT stylesheet floating on the web which could do it).

15. Make use of performance updates from libxml2. mmap() for document chunks, XMLReader interface, etc.

16. More regression testing test cases required.

17. Better Documentation User Guide and Tutorial is needed. More good and real-world examples.

If you wish to enhance/add something to this list, please, reply.

XmlStarlet home page,
http://xmlstar.sourceforge.net/

Thanks,
--MG
 

    Mikhail Grushinskiy - 2003-05-23

    Few additions

    1. Better namespace support.
    2. something like xml head, and xml tail
    3. list directories in XML
    4. Defining variables in xml sel

    Ex: xml sel -t --m / -d var_name -v @elem

    -d would translate into

    <xsl:variable name="var_name">
    </xml:variable>

    and this variable could be referenced as $var_name
    in XPATH

    5. CygWin binaries?

Table of contents