xmlstarlet usage notes
xmlstarlet (xmlstar.sf.net) is the no-nonsense XML multitool that lets you write simple queries or edits on the command line, avoids most of the stylesheet formal stuff, and gives your <o/><o/>-looking eyes a moment of relief. It’s also a cranky minimalist tool which targets the 1.0 versions of XPath / XSLT / EXSLT still widely used and, it seems, users with little need for documentation.
This is an edited version of my personal notes on xmlstarlet with worked examples – knowledge gained as an outsider through use, trial and error – focusing on the select and edit commands and EXSLT. It’s not a tutorial or a FAQ; it requires a grasp of XML tools and the POSIX shell. Copyright is retained.
xmlstarlet features
xmlstarlet relies on libxml2 and libxslt which are limited to XPath 1.0 and XSLT 1.0, plus a number of EXSLT and extension functions
xmlstarlet is currently at version 1.6.1 which appeared in 2014 (that’s more or less a century ago in internet years); a number of FIXMEs and TODOs remain in the source code
xmlstarlet select supports XPath 1.0, a subset of XSLT 1.0 (but not xsl:apply-templates, xsl:key a.o.), and EXSLT
xmlstarlet edit supports XPath 1.0 – plus some EXSLT but no XSLT – functions in its xpath arguments
xmlstarlet transform is a regular XSLT 1.0 processor with extensions, like xsltproc
xmlstarlet commands do document formatting, canonicalization, validation, structure display, conversion of PYX and special characters, and file directory listing
There should be no limit on input XML (apart from available memory on your system) – forum posting by the original xmlstarlet developer
xmlstarlet is Copyright (c) 2002-2004 Mikhail Grushinskiy. All Rights Reserved. (cf. SourceForge or Fossies)
-q (--quiet) means either short option -q or long option --quiet can be used.
«name» in a command or message is a placeholder for the actual name used, e.g. xmlXPathCompOpEval: function «name» not found.
Code looks like this: test -s file.xml || log …, occasionally with an … (ellipsis) inside for brevity. For readability longer commands are usually split across lines and indented.
Admonitions look like this: Caution.
All shell code samples were made for a POSIX shell (dash 0.5.12) with xmlstarlet 1.6.1 (linked with libxml2 20914 and libxslt 10139) from the Debian distribution.
[T]he use of SGML syntax for stylesheets was proposed as long ago as 1994, and it seems that this idea gradually became the accepted wisdom. It’s difficult to trace exactly what the overriding arguments were, and when you find yourself writing something like:
<xsl:variable name="y"> <xsl:call-template name="f"> <xsl:with-param name="x"/> </xsl:call-template> </xsl:variable>
to express what in other languages would be written as y = f(x);, then you may find yourself wondering how such a decision came to be made.
– Michael Kay, XSLT Programmer’s Reference, Ch.1, ISBN 1861005067
man xmlstarlet
xmlstarlet --help
xmlstarlet «command-name» --help
xmlstarlet on SourceForge: homepage | docs | user’s guide | news | source | files | discussion | bugs
doc/xmlstarlet.txt there is not the latest version as it doesn’t mention $prev, --var, -L (--inplace), and -E (--embed) – the user’s guide is still silent on these
xmlstarlet on Fossies – an accessible presentation of source, examples, and more: xmlstarlet-1.6.1.tar.gz contents | xmlstarlet user’s guide (1-page) | doc/xmlstarlet.txt (latest version)
xmlstarlet forums on StackExchange: stackoverflow.com | unix.stackexchange.com
github.io: EXSLT docs
gnome.org: libxml2 Wiki Home (with links to standards, API, utilities a.o.) | libxml2 source | libxslt Wiki Home (with links to XSLT + EXSLT API a.o.) | libxslt+libexslt source | libxslt extensions (xmlsoft.org now redirects to gnome.org)
packages.debian.org: xmlstarlet | xsltproc | libxml2-utils (xmllint) | html-xml-utils | xml2 | tidy
stackoverflow.com forums: XPath | XSLT 1.0 | EXSLT | XML
repology.org versions pages for xmlstarlet | libxml2 | libxslt
xmlstarlet select -C
select’s -C (--comp) option lists the stylesheet the current command line will generate – it requires no input file – e.g.
xmlstarlet select -T -C -t -m 'str:tokenize("Hello, world",",o")' -v '.' -n
Output:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:str="http://exslt.org/strings" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt str">
  <xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="str:tokenize(&quot;Hello, world&quot;,&quot;,o&quot;)">
      <xsl:call-template name="value-of-template">
        <xsl:with-param name="select" select="."/>
      </xsl:call-template>
      <xsl:value-of select="'&#10;'"/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template name="value-of-template">
    <xsl:param name="select"/>
    <xsl:value-of select="$select"/>
    <xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
      <xsl:value-of select="'&#10;'"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
xmlstarlet commands
Caution: Other features need some work too warns the user’s guide. Perhaps they’re thinking of these:
edit’s “no kidding” issue
select’s EXSLT namespace issue
pyx and depyx’ several issues
select, c14n, and transform while edit returns non-zero printing nothing but its lengthy usage reminder
The special XML characters are &<>'" or – as references to predefined general entities – &amp; &lt; &gt; &apos; &quot;. With few exceptions[1] they are never entered as entity references in an xmlstarlet command, and they are output as literals when xmlstarlet select’s -T (--text) option is in effect.
For example, in XPath predicates xmlstarlet wants < (less-than) on the command line, as in factor[. < 2.19], whereas XSLT stylesheets require &lt; inside attribute values, or > (greater-than) with operands swopped. Likewise, a numeric character reference such as &#9; for a tab character belongs in an XML file, not on the xmlstarlet command line.
[1] Exceptions
The predefined entity ref:s – as well as character ref:s below U+0100 (Ā) – are recognized in the following which therefore require &amp; to represent an & (ampersand) character:
-v (--value) clause of xmlstarlet edit’s -i (--insert), -a (--append), and -s (--subnode) options when used with -t (--type) elem
xmlstarlet unescape conversion utility
Caution: The example in the user’s guide section 4.1 meant to convert newlines to blanks using a character reference for the newline – xml sel … -v "translate(. , '&#10;', ' ')" … – in fact converts & (ampersand) characters to blanks, and strips #, 1, 0, and ; characters. The translate(…) expression would work as intended in an XSLT stylesheet but means something rather different on the xmlstarlet select command line.
See also: xmlstarlet esc / xmlstarlet unesc | Replace text sample
xmlstarlet edit handling special characters:
printf '%s' '<e/>' |
xmlstarlet edit -O \
  -s '*' -t elem -n esuv -v '' -u '$prev' -v '&Save as <.oona>' \
  -a '$prev' -t elem -n eaux -v '' -u '$prev' -x '"&Save as <.oona>"' \
  -s '*' -t elem -n esv1 -v '&amp;Save as <.oona>' \
  -a '$prev' -t elem -n eav1 -v '&amp;Save as <.oona>' \
  -i '$prev' -t elem -n eiv1 -v "$(xmlstarlet escape '&Save as <.oona>')"
-u (--update) follows xmlstarlet’s general rule for a literal (esuv) or an XPath expression (eaux): special characters are encoded behind the scenes
-i (--insert), -a (--append), and -s (--subnode) used with -t (--type) elem are special: certain entity/character ref:s are recognized so &amp; is required for & (esv1, eav1, eiv1) whereas <>'" may be used to represent themselves
-i, -a, -s, and -r (--rename) accept their name argument without modification
Output:
<e>
<esuv>&amp;Save as &lt;.oona&gt;</esuv>
<eaux>&amp;Save as &lt;.oona&gt;</eaux>
<esv1>&amp;Save as &lt;.oona&gt;</esv1>
<eiv1>&amp;Save as &lt;.oona&gt;</eiv1>
<eav1>&amp;Save as &lt;.oona&gt;</eav1>
</e>
Using variables with xmlstarlet select, e.g.
--var newline -n -b
--var tab -o "$(printf '\t')" -b
--var sq -o "'" -b
…
-v 'concat($sq,"//_:",local-name(),$sq)'
--var eurosign='"€"'
--var eurosign -o "$(python3 -c 'print("{0:c}".format(0x20ac))')" -b
Using variables with xmlstarlet edit, e.g.
--var sq '"'\''"'
--var dq "'\"'"
printf argument, and inner double quotes to make an XPath string expression:
--var tab "$(printf '"\t"')"
--var tab '" "'
--var nl 'substring-before('"$(printf '"\nA"')"',"A")'
See also: select --var | edit --var
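The quoting recipes above each handle one kind of quote. As a rough sketch only (the helper name xpath_literal is invented for this illustration and is not part of xmlstarlet), a POSIX shell function can turn an arbitrary string into an XPath 1.0 string literal. It falls back to concat() when the string holds both quote characters, since XPath 1.0 string literals have no escape mechanism. The variables it sets are global, as POSIX sh has no local.

```shell
# xpath_literal STRING
# Print STRING as an XPath 1.0 string literal (hypothetical helper).
xpath_literal() {
    sq="'"
    case $1 in
    (*"'"*'"'*|*'"'*"'"*)
        # Both quote characters occur: build concat('…',"'",'…')
        rest=$1
        out='concat('
        while :; do
            case $rest in
            (*"'"*)
                head=${rest%%"$sq"*}      # text before the first single quote
                rest=${rest#*"$sq"}       # text after the first single quote
                out="${out}'${head}',\"'\","
                ;;
            (*)
                out="${out}'${rest}')"
                break
                ;;
            esac
        done
        printf '%s\n' "$out"
        ;;
    (*"'"*) printf '"%s"\n' "$1" ;;       # only single quotes: wrap in double quotes
    (*)     printf "'%s'\n" "$1" ;;       # otherwise: wrap in single quotes
    esac
}

# Possible use: xmlstarlet select -t -v "//e[. = $(xpath_literal "$text")]" file.xml
```

The result is meant for places where xmlstarlet expects an XPath expression, such as select's --var name=value form or edit's --var.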
The various xmlstarlet commands each handle the XML declaration in their own way but all print it with a trailing newline if requested:
xmlstarlet edit … -O (--omit-decl) …
xmlstarlet format … -o (--omit-decl) …
xmlstarlet list never outputs an XML declaration
xmlstarlet pyx always strips the XML declaration if present
xmlstarlet select … -D (--xml-decl) … outputs an XML declaration
xmlstarlet transform … --omit-decl …
Adding an XML declaration using select:
$ printf '<v w="x"/>' |
xmlstarlet select -D -E 'ISO-8859-2' -t -c '/'
<?xml version="1.0" encoding="ISO-8859-2"?>
<v w="x"/>
Several xmlstarlet commands allow selected options to be passed to the libxml2 XML parser (cf. API reference and source code) or the libxml2 XML serializer (cf. API reference and source code).
select
  --net: clears XML_PARSE_NONET
  -B (--noblanks): sets XML_PARSE_NOBLANKS
  -D (--xml-decl): clears XML_SAVE_NO_DECL; sets omit-xml-declaration="no" on xsl:output
  -E (--encode): sets encoding="«encoding»" on xsl:output, with -D (--xml-decl) also in the XML declaration
  -I (--indent): sets XML_SAVE_FORMAT; sets indent="yes" on xsl:output
  -T (--text): sets method="text" on xsl:output
edit
  --net: clears XML_PARSE_NONET
  -O (--omit-decl): sets XML_SAVE_NO_DECL
  -P (--pf): sets XML_SAVE_FORMAT
  -S (--ps): sets XML_SAVE_WSNONSIG (requires libxml2 2.7.8+)
format
  --net: clears XML_PARSE_NONET
  -C (--nocdata): sets XML_PARSE_NOCDATA
  -N (--nsclean): sets XML_PARSE_NSCLEAN
  -R (--recover): sets XML_PARSE_RECOVER
  -o (--omit-decl): sets XML_SAVE_NO_DECL
  -n (--noindent), -s (--indent-spaces), and -t (--indent-tab) offer an alternative to the default indentation
c14n
  --net: clears XML_PARSE_NONET
validate
  --net: clears XML_PARSE_NONET
  -E (--embed): sets XML_PARSE_DTDVALID
  XML_PARSE_DTDLOAD and XML_PARSE_DTDATTR are set by default (src/validate.c#valMain())
transform
  --net: clears XML_PARSE_NONET
  --omit-decl: sets XML_SAVE_NO_DECL
format’s -H (--html) and transform’s --html options substitute the libxml2 HTML 4.0 parser.
The c14n command converts an XML document to a normal format.
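As a small illustration of that normal form (Canonical XML), the sketch below writes a sample document to a temporary file and canonicalizes it in the default mode. The sample content is invented for the example. Canonical XML sorts attributes by name and expands empty-element tags.

```shell
# Canonicalize a small sample document (contents invented for this example).
tmpf=$(mktemp) &&
printf '%s\n' '<a  b="1"   a="2"><c/></a>' > "$tmpf" &&
xmlstarlet c14n "$tmpf"     # prints <a a="2" b="1"><c></c></a>
rm -f "$tmpf"
```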
To expand empty-element tags, changing <p/> to <p></p>, for example:
xmlstarlet edit --pf -s '//*[not(node())]' -t text -n ignored -v '' file.xml
See also: network access | Try out
edit
’s formatting options example
The xmlEscapeEntities function in libxml2’s xmlsave.c serialization module gives special treatment to characters &<> (output as &amp;, &lt;, and &gt;) but neither apostrophe nor double quote ('"). xmlstarlet has no option to override this.
Using a CDATA section to keep the serializer from applying default rules:
$ printf '%s\n' '<v><w>x</w><x>🧩</x></v>' |
xmlstarlet edit -O -P -d '*/w'
<v><x>&#x1F9E9;</x></v>
$ :
$ printf '%s\n' '<v><w>x</w><x><![CDATA[🧩]]></x></v>' |
xmlstarlet edit -O -P -d '*/w'
<v><x><![CDATA[🧩]]></x></v>
Caution: Of xmlstarlet’s commands only c14n, select, and transform seem to understand an entity reference like <doc>&e;</doc>, according to the following test script. This makes pre/post-processing a requirement if using xmlstarlet’s other commands to handle external entities.
#!/bin/sh
# Test xmlstarlet commands with external general parsed entity.
# - ${xdata} holds data file contents, defaults to a few <e>N</e>
# - ${keepf} non-empty to keep temporary files in $TMPDIR
# - ${dryrun} non-empty to print but not execute commands
# - ${doecho} non-empty to also print commands before executing
skelf=$(mktemp -t "xsskel-$$-XXXXXXXXXX.xml")
dataf=$(mktemp -t "xsdata-$$-XXXXXXXXXX.xml")
idxff=$(mktemp -t "xsidxf-$$-XXXXXXXXXX.xsl")
test "${keepf}" ||
trap "rm '${skelf}' '${dataf}' '${idxff}'" INT EXIT
printf '%s\n' \
'<!DOCTYPE skel [<!ENTITY e SYSTEM "'"${dataf}"'">]><doc>&e;</doc>' \
> "${skelf}"
printf '%s' \
"${xdata:-<e>1</e><e>2</e><e>3</e><e>4</e>}" \
> "${dataf}"
printf '<v/>' | xmlstarlet select -t \
-e xsl:transform -a version -o 1.0 -b \
-e xsl:template -a match -o '@*|node()' -b \
-e xsl:copy -e xsl:apply-templates -a select -o '@*|node()' \
> "${idxff}" ## identity transform
for cmd in c14n ed el fo pyx sel tr val
do
  case ${cmd} in
    (c14n|el|fo|pyx|val)
          set -- ;;
    (ed)  set -- -d '*/*[3]' ;;
    (sel) set -- -T -t -c / -n ;;
    (tr)  set -- "${idxff}" ;;
    (*)   break ;;
  esac
  set -- xmlstarlet "${cmd}" "$@" "${skelf}"
  if test "${dryrun}${doecho}"; then
    printf '\n\n# command:'; printf " '%s'" "$@"; printf '\n'
  fi
  if ! test "${dryrun}"; then
    "$@"; printf '\n## %s returned %d\n\n' "${cmd}" "$?"
  fi
done
Given a data file containing <e>1</e><e>2</e><e>3</e><e>4</e> (making it well-formed XML) pyx returns 4 (outputs the doctype but says Entity 'e' not defined) while c14n, ed, el, fo, sel, tr, and val all return zero. But ed, el, and fo (plus val, presumably) fail to expand the entity reference.
Given a data file containing <a>B</c> (clearly making it non-XML) the ed, el, and val commands all return zero – with val even pronouncing «datafile» - valid – while c14n, fo, pyx, sel, and tr return 3, 2, 4, 3, and 6, respectively.
xmllint from the libxml2-utils package has a --noent option to substitute entity values for entity references (e.g. xmllint --noent --dropdtd file.xml).
src/xmlstar.h defines the following exit values for xmlstarlet:
EXIT_SUCCESS
EXIT_FAILURE
EXIT_BAD_ARGS
EXIT_BAD_FILE
EXIT_LIB_ERROR
EXIT_INTERNAL_ERROR
but mind these:
xmlstarlet select exits grep-style returning the query result as 0 or 1
xmlstarlet format’s --omit-decl option exits, er, shuf-style
document() failure, unescape – both producing stderr output)
XPath 1.0 does not support numbers expressed in scientific notation, cf. W3C recommendation (Number ::= Digits ('.' Digits?)? | '.' Digits and Digits ::= [0-9]+).
Tools based on libxml2 do support it, however, cf. xmlXPathFormatNumber() (snprintf(work, sizeof(work),"%*.*e", integer_place, fraction_place, number);).
Here are a few examples of libxml2 handling XPath computations – and libxslt handling the XSLT format-number() function.
printf '%s\n' '<v>1240057409536</v>' |
xmlstarlet select -T -t \
-v '*' -n \
-v '0 + *' -n \
-v '* div 1' -n \
-v '* div 1000 * 1E3' -n \
-v '* div 1.240057409536e+12' -n \
-o '---' -n \
-v 'round(* div 1)' -n \
-v 'round(* div 10)' -n \
-v 'round(* div 100)' -n \
-v 'round(* div 1000)' -n \
-o '---' -n \
-v 'format-number(* div 1,"#")' -n \
-v 'format-number(* div 1,"#,###")' -n
Output:
1240057409536
1.240057409536e+12
1.240057409536e+12
1.240057409536e+12
1
---
1.240057409536e+12
1.24005740954e+11
1.2400574095e+10
1240057410
---
1240057409536
1,240,057,409,536
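Where format-number() is not at hand, for instance when the number has already left xmlstarlet, the scientific notation can be undone downstream. The following is a sketch only: plain_number is a made-up helper name, it relies on awk's double-precision arithmetic (exact for integers up to 2^53), and it rounds away any fraction.

```shell
# plain_number: read one value per line from stdin and rewrite those in
# scientific notation as plain integers; other lines pass through unchanged.
# Hypothetical helper, integers only (%.0f drops fractions).
plain_number() {
    awk '
        /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)[eE][-+]?[0-9]+$/ {
            printf("%.0f\n", $0); next
        }
        { print }
    '
}

printf '%s\n' 1.240057409536e+12 1.24005740954e+11 1240057410 | plain_number
# prints:
# 1240057409536
# 124005740954
# 1240057410
```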
In this document longer commands are usually split across lines and indented, like this:
xmlstarlet select -T -t \
--var sq -o "'" -b \
-o 'xmlstarlet edit --omit-decl '\\ -n \
-o " --var N 'Names/Name' \\" -n \
-m '*/*' \
-o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
-o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n \
-b \
-f -n \
"${infile:-file.xml}"
To strip such a command of line continuation characters and leading whitespace pipe it through the following sed command (changing one line, not an entire shell script),
sed -e ':1' -e 's/^[[:blank:]]*//' -e '/\\$/!b' -e '$b' -e 'N' -e 's/\\\n[[:blank:]]*//' -e 'b1'
or, as an alias, silently using xsel to paste from the clipboard, call sed, have paste add a trailing newline if needed, and return the result to the clipboard:
alias mfyoi="xsel -b -o |
sed -e 's/^[[:blank:]]*//' -e ':1' -e '/\\\\\$/!b' \
-e '\$b' -e 'N' -e 's/\\\\\\n[[:blank:]]*//' -e 'b1' |
paste -s -d '\\n' |
xsel -b -i"
Thus minified:
xmlstarlet select -T -t --var sq -o "'" -b -o 'xmlstarlet edit --omit-decl '\\ -n -o " --var N 'Names/Name' \\" -n -m '*/*' -o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' -o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n -b -f -n "${infile:-file.xml}"
makefile notes (GNU Make)
$ (dollar sign) starts expansion of a variable / parameter
  make, e.g. $< $T $(mvar) ${mvar}, use $$ for a literal
  $# $$ $svar ${svar}
  $xvar
\ (backslash) is make’s (and the shell’s) escape character, it has no special meaning in XPath or XSLT
make functions and variables are expanded before the shell is invoked to execute a recipe
xmlstarlet’s exit values aren’t all orthodox
Links: GNU Make manual | Ask Mr. Make article on GNU Make escaping
Sample makefile:
SHELL := /bin/sh
space := $(info) $(info)
tab := $(shell printf '\t')
define newline =


endef
# next line defines U+0023 NUMBER SIGN (aka \043, pound sign, hashtag, …)
\H := \#
.RECIPEPREFIX = >
.PHONY: all
all:
> printf '%s' '<v a="fee" b="fi" c="fo" d="fum"/>' | \
  xmlstarlet select -T -t --var x='*/@*' -v '$$x' -n | \
  paste -s -d '$$ ' -
> printf '%s\n' '$(space)x$(tab)\$(newline)'"$${OLDPWD$(\H)$(\H)*/}" \
  "process $$$$ exiting"
Output from make -s:
fee$fi fo$fum
x \
incubator
process 20965 exiting
Global options go before the command, as in xmlstarlet -q format file.
An input filename starting with - (dash) – unless it’s short for stdin – must be prefixed with ./ (dot slash) otherwise it will be parsed as an option, possibly causing select (Caution) to ignore the file.
Beware of known bugs for filenames containing (#123) ' (single quote), or (#110) urlencoded characters, e.g. %20.
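The dash rule lends itself to a tiny wrapper. This is a sketch under the assumption that a lone - should keep meaning stdin; the function name xs_path is invented for the illustration.

```shell
# xs_path NAME: print NAME in a form that cannot be mistaken for an option.
# A lone "-" is kept as is, since it stands for stdin. Hypothetical helper.
xs_path() {
    case $1 in
    (-)  printf '%s\n' "$1" ;;
    (-*) printf './%s\n' "$1" ;;
    (*)  printf '%s\n' "$1" ;;
    esac
}

# Possible use: xmlstarlet format "$(xs_path "$infile")"
```

It does nothing about the single-quote and urlencoding bugs mentioned above.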
See also: couldn’t read file | failed to load external entity | invalid expression: ‘«pathname»’
--help
xmlstarlet --help shows the general usage reminder, xmlstarlet «command» -h (--help) the command-specific ditto.
--version
Prints version information and terminates.
Sample output from xmlstarlet --version:
1.6.1
compiled against libxml2 2.9.4, linked with 20910
compiled against libxslt 1.1.33, linked with 10134
-q (--quiet): suppress error output
Error messages from libxml2 or libxslt are suppressed by this option.
Caution: this option also suppresses ordinary output (to stdout) from xmlstarlet select.
See also: select -Q (--quiet) local option | format -Q (--quiet) local option
--no-doc-namespace: don’t use namespace bindings from input’s root element
--doc-namespace: extract namespace bindings from input’s root element (default)
By default (--doc-namespace being in effect) namespaces declared in the root element (the outermost element aka the document element) of the first input file can be referred to without explicit -N options; if the default namespace is declared there it is bound to the _ (underscore) (aka DEFAULT) prefix.
Although --no-doc-namespace and --doc-namespace are global options only xmlstarlet select and xmlstarlet edit use them. select and edit both support multiple input files.
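A short sketch of both modes, using an Atom-like sample document invented for the example; the prefix a in the second command is an arbitrary choice.

```shell
# Sample input with a default namespace on the root element (invented).
sample='<feed xmlns="http://www.w3.org/2005/Atom"><title>Notes</title></feed>'

# Default behaviour: the root element's default namespace is bound to "_"
printf '%s\n' "$sample" |
xmlstarlet select -T -t -v '/_:feed/_:title' -n

# With --no-doc-namespace the binding has to be declared explicitly with -N
printf '%s\n' "$sample" |
xmlstarlet --no-doc-namespace select -N a='http://www.w3.org/2005/Atom' \
  -T -t -v '/a:feed/a:title' -n
```

Both commands are expected to print Notes.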
See also: User’s guide ch. 5 | Use a namespace | select -N | edit -N
--net
Several xmlstarlet commands - select, edit, format, c14n, validate, and transform - have a --net option to allow network access, to fetch remote DTDs and entities. --net clears the XML_PARSE_NONET flag for the libxml2 XML parser (API ref).
For security, network access is disallowed by default, cf. article on XML external entity attack.
uri replacing input file
xmlstarlet --help says,
Wherever file name mentioned in command help it is assumed that URL can be used instead as well.
Should work with HTTP and FTP protocols, not HTTPS (due to libxml2 limitations). (Distribution-dependent?)
See also: --net
xmlstarlet elements
xmlstarlet elements (aka el) displays the structure of an XML document by listing the paths of elements and optionally attributes and attribute values.
elements [option] [«xml-file»]
At most one option and one input file is accepted.
-a - include attributes
-v - include attribute values
-u - sorted unique lines
-dN - sorted unique lines to depth N
$ : "${infile=recently-used.xbel}"
$ :
$ xmlstarlet elements -u "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
xbel/bookmark/info/metadata
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/mime:mime-type
$ :
$ xmlstarlet el -d3 "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
$ :
$ # Skip repetitions
$ xmlstarlet el -a "${infile}" | awk '!seen[$1]++' | head -n 10
xbel
xbel/@xmlns:bookmark
xbel/@xmlns:mime
xbel/@version
xbel/bookmark
xbel/bookmark/@href
xbel/bookmark/@added
xbel/bookmark/@modified
xbel/bookmark/@visited
xbel/bookmark/info
$ :
$ xmlstarlet el -v "${infile}" | sed '2d;9q'
xbel[@xmlns:bookmark='http://www.freedesktop.org/standards/desktop-bookmarks' and @xmlns:mime='http://www.freedesktop.org/standards/shared-mime-info' and @version='1.0']
xbel/bookmark/info
xbel/bookmark/info/metadata[@owner='http://freedesktop.org']
xbel/bookmark/info/metadata/mime:mime-type[@type='image/jpeg']
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application[@name='Image Viewer' and @exec="'eog %u'" and @modified='2022-03-28T07:27:27Z' and @count='1']
$ :
$ # Compute tree height as maximum branch node depth
$ xmlstarlet el -u "${infile}" | awk -F / '{d=NF-1;if(d>h)h=d}END{print 0+h}'
5
See also: Print XPath of selected elements or attributes example
Using awk to indent the output from elements -u:
xmlstarlet elements -u "${infile}" |
awk -v FS='/' -v indent=2 '
{ for ( i = 0; ++i <= NF; )
if ( i > prevlen || $(i) != prev[i] )
printf("%*s%s\n",(i-1)*indent,"",$(i))
prevlen = split($0,prev,FS)
}'
Output:
xbel
bookmark
info
metadata
bookmark:applications
bookmark:application
bookmark:groups
bookmark:group
mime:mime-type
With graphviz installed use awk and dot to make a diagram:
xmlstarlet elements -u "${infile}" |
awk -F/ 'BEGIN{print "digraph{rankdir=\"TB\";"}END{print "}"}
NF!=1{printf("\"%s\" -> \"%s\"\n",$(NF-1),$NF)}' |
dot -Tsvg -Nshape=plain -o "${outfile}"
TB is for top-bottom, LR left-right; there’s also BT and RL plus a generous amount of options.
xmlstarlet select
xmlstarlet select (aka sel) is basically a shorthand XSLT generator that can either process or print the stylesheet it generates. Typically used to extract and format data it supports a subset of XSLT 1.0 elements, all XPath 1.0 and XSLT 1.0 functions, plus the EXSLT functions offered by libexslt.
select implements 8 XSLT instruction elements – xsl:attribute, xsl:choose, xsl:copy-of, xsl:element, xsl:for-each, xsl:sort, xsl:text, xsl:value-of – plus xsl:variable (and xsl:stylesheet, xsl:template, xsl:output partially) but note the absence of xsl:apply-templates, xsl:key a.o. This means recursion and identity transforms are off-limits (unless resorting to code generation).
xmlstarlet select returns the same system-property() values as xmlstarlet transform. A stylesheet generated by select appears as located in the current directory.
Like grep, xmlstarlet select returns an exit value of 1 if no nodes were selected, e.g.
xmlstarlet select -T -t -m '(//xsl:document)[1]' -f *.xsl
returns 0 if at least one input file matches the XPath expression, otherwise 1 (with or without the -Q (--quiet) option).
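The grep-style exit value makes select usable as a shell condition. A minimal sketch, assuming -Q only silences standard output and leaves the exit value alone, as stated above; has_node is a made-up helper name and the sample file is invented.

```shell
# has_node XPATH FILE: succeed if XPATH selects at least one node in FILE.
# Hypothetical helper; note that other failures (unreadable file, invalid
# XPath) also yield a non-zero exit value.
has_node() {
    xmlstarlet select -Q -t -m "$1" -o x "$2"
}

# Usage with a throwaway sample file:
tmpf=$(mktemp) && printf '%s\n' '<r><e id="a"/></r>' > "$tmpf"
if has_node '//e[@id="a"]' "$tmpf"; then echo found; else echo missing; fi
rm -f "$tmpf"
```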
See also: XML parsing and serialization
Caution: xmlstarlet select does not flag invalid non-template options (src/xml_select.c#selParseOptions()) and ignores characters following the first letter in short template options (src/xml_select.c#selGenTemplate()).
Next command outputs: optfuscation
xmlstarlet select --nonet --rsn -:=% -C -t -i\*r 2=2 -eR_W- x -a'!e'ee y -omit z -bar -b:rrrf -newln | {
xmlstarlet select -C -t -i 2=2 -e x -a y -o z -b -b -n |
cmp -s - /dev/fd/3
} 3<&0 && echo 'optfuscation' || echo 'returned non-zero'
select [option …] template … [«xml-file» …]
-h (--help) - display help
-Q (--quiet) - do not write anything to standard output
See also: global option -q (--quiet) (lowercase -q)
-C (--comp) - display generated XSLT
Lists the XSLT stylesheet that will be generated from the current template options. No input file is required for this option. It produces no output other than a stylesheet or an error message.
Usage samples: -t -m … | --output … | --value-of … | --xinclude
See also: Introspection example
-R (--root) - print root element <xsl-select>
Wraps a container element named xsl-select around output. It includes namespace nodes declared with -N «prefix»=«value» except the predefined namespaces.
-T (--text) - output is text (default is XML)
Sets method="text" on the xsl:output element.
$ cat file.xml
<v><w>a&amp;</w><w>l&lt;</w><w>q&quot;</w><w>g&gt;</w></v>
$ :
$ xmlstarlet select -t -c '*/*[position()>2]' -n file.xml
<w>q"</w><w>g&gt;</w>
$ :
$ xmlstarlet select --text -t -c '*/*[position()<3]' -n file.xml
a&l<
See also: Special characters | -o (--output)
-I (--indent) - indent output
Sets indent="yes" on the xsl:output element.
To re-indent, for example:
xmlstarlet select -B -I -t -c / in.xml > out.xml
See also: XML parsing and serialization | -B (--noblanks)
-D (--xml-decl) - do not omit XML declaration line
Sets omit-xml-declaration="no" on the xsl:output element.
Use with -E (--encode) to specify encoding.
See also: XML declaration
-B (--noblanks) - remove nonsignificant whitespace from XML tree
To strip nonsignificant whitespace, for example:
xmlstarlet select -B -t -c / in.xml > out.xml
See also: XML parsing and serialization
-E (--encode) «encoding» - output in the given encoding
Sets encoding="«encoding»" (e.g. UTF-8, ISO-8859-2) on the xsl:output element, with -D (--xml-decl) also in the XML declaration.
See also: XML declaration
-N «prefix»=«value» - declare namespaces
This option is repeatable. E.g. -N xsql='urn:oracle-xsql' -N X='http://www.w3.org/1999/xhtml'.
Either side of the equal sign may be empty[1], e.g. -N ''='' (or -N =) for xmlns="".
Not needed for predefined namespaces or those declared in the root element (see --doc-namespace) of the first input file but required
document() function)
--no-doc-namespace global option is in effect
select’s --var «name»=«value» namespace issue
crypto (see EXSLT)
See also: Use a namespace | -R (--root) | edit -N
[1] -N foo='' is not allowed; echo '<v/>' | xmlstarlet sel -N foo= -t -e a -e foo:k outputs <a><k/></a>.
--net - allow fetch DTDs or entities over network
See also: network access
-t (--template) is <xsl:template match="/">
The -t (--template) option marks the beginning of an xsl:template element which ends at a following -t option (i.e. non-nestable) or at the last option after -t. -t must be followed by at least one template option. NB: <xsl:template match="pattern"> cannot be generated by combining -t and -m options.
-t (--template) makes the root node (/, not the root element /*) the current node so XPath expressions can be relative,
xmlstarlet select -t -m '*/*/r' -v '@id' -n file
even obscure,
echo '<q>2</q>' | xmlstarlet sel -t -v '*******************'
with thanks to Michael Kay for his original Christmas cracker the output of which is 1024.
As xmlstarlet select --help shows, two or more --templates are implemented as:
<xsl:template match="/">
<xsl:call-template name="t1"/>
<xsl:call-template name="t2"/>
…
</xsl:template>
See also: List the generated XSLT -C (--comp)
-m (--match) is <xsl:for-each select="xpath-expr">
-m (--match) is a rare misnomer among xmlstarlet’s option names: it translates to the xsl:for-each element and has nothing to do with an xsl:template pattern. -m (--match) is nestable and can be explicitly terminated with -b (--break).
Links: XSLT current node | XSLT xsl:for-each | XSLT current() | XPath context node
xsl:for-each changes the current node. The XPath functions position() and last() return the context position and context size, respectively.
$ printf '<v w="a:b:c:d:e:f:g:h:i:j"/>' |
xmlstarlet select --text -t \
-m 'str:split(v/@w,":")' \
--if 'position() mod 3 = 0' \
-v 'concat(position()," ",.," ")'
3 c 6 f 9 i
whereas -m 'str:split(v/@w,":")[position() mod 3 = 0]' -v 'concat(…)' outputs 1 c 2 f 3 i.
Keeping a reference to root for node changes.
$ cat file.xml
<r><e id="a">fee</e><e id="b">fi</e><e id="c">fo</e><e id="d">fum</e></r>
$ :
$ xmlstarlet select -T -t \
-m 'str:split("a b c d")' \
-v 'concat(//e[@id=current()],". ")' \
-b -n \
file.xml
. . . .
$ :
$ xmlstarlet select -T -t \
--var R='/' \
-m 'str:split("a b c d")' \
-v 'concat($R//e[@id=current()],". ")' \
-b -n \
file.xml
fee. fi. fo. fum.
-s (--sort) is <xsl:sort …/>
To process a nodeset in sorted order add one or more -s (--sort) 'X:Y:Z' 'xpath' options immediately after -m (--match).
X is one of A | D | - to set order ascending | descending | unspecified
Y is one of N | T | - to set data-type number | text | unspecified
Z is one of U | L | - to set case-order upper-first | lower-first | unspecified
- (unspecified)
For example:
-s 'A:N:-' '.'
-s 'D:N:-' 'position()'
DD.MM.YYYY dates by year, month, date:
-s 'A:T:-' 'concat(substring(.,7,4), substring(.,4,2), substring(.,1,2))'
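Put together as a complete command, the date sort might look like the sketch below. The input document and its element names r and d are invented for the illustration.

```shell
# Sort DD.MM.YYYY dates chronologically via a YYYYMMDD text key.
printf '%s\n' '<r><d>24.12.2021</d><d>01.03.2020</d><d>15.01.2021</d></r>' |
xmlstarlet select -T -t \
  -m 'r/d' \
  -s 'A:T:-' 'concat(substring(.,7,4),substring(.,4,2),substring(.,1,2))' \
  -v . -n
```

The keys compare as 20200301, 20210115, 20211224, so the dates should come out as 01.03.2020, 15.01.2021, 24.12.2021.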
See also: examples/sort* | Query Euro rates | Remove all but the latest member of each group
--var «name» «value» --break is <xsl:variable name="…">«value»</xsl:variable>
--var «name»=«value» is <xsl:variable name="…" select="«value»"/>
xmlstarlet select has 2 forms of --var, cf. xsl:variable:
--var name=value, e.g.
--var n='5'
--var s='"fee fi fo fum"'
--var f='true()'
--var V='//_:abc[@class="def"]'
--var W='$V/_:ghi[boolean(@jkl)]'
-m 'str:split($ws)' --var w='.' …
--var lut='document("")//xsl:variable[@name="rtf"]/*'
--var name value --break -b (--break), e.g.
--var nl -n -b
--var s -o '<f&g>' -b
--var stuff -e doranc -c 'a[c] | d[c]' -b -b
--var lines -m '$expr' -v '…' -n -b -b
--var reply --if '$v > 4' -o 'yes' --elif '$v < 2' -o 'no' --else -o 'maybe' -b -b
(xmlstarlet edit has 1 form: --var name xpath.)
See also: --var «name»=«value» namespace issue
Result tree fragment (RTF) demo:
printf '<v/>\n' |
xmlstarlet select -t \
--var rtf \
-e x -a k -o 1st -b -o First. -b \
-e x -a k -o 2nd -b -o Second. -b \
-e x -a k -o 3rd -b -o Third. -b \
-b \
--var tbl='exslt:node-set($rtf)' \
-v 'exslt:object-type($rtf)' -o ' rtf ' -v '$rtf' -n -c '$rtf' -n \
-v 'exslt:object-type($tbl)' -o ' tbl ' -v '$tbl' -n -c '$tbl' -n
Output:
RTF rtf First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>
node-set tbl First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>
$tbl/x[@k="2nd"] is a valid XPath expression, $rtf/x[@k="2nd"] is not and triggers an Invalid type run-time error.
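A trimmed-down variant of the demo shows the node-set form used as a lookup table. It is a sketch that reuses only options already shown above, with -T added so that plain text is printed.

```shell
# Build a two-row RTF, convert it with exslt:node-set(), then look up one row.
printf '<v/>\n' |
xmlstarlet select -T -t \
  --var rtf \
    -e x -a k -o 1st -b -o First. -b \
    -e x -a k -o 2nd -b -o Second. -b \
  -b \
  --var tbl='exslt:node-set($rtf)' \
  -v '$tbl/x[@k="2nd"]' -n
```

This should print Second.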
See also: RTF examples file list | accumulation
Links: exslt:node-set | nodeset vs. RTF by David Carlisle, Jörg Pietschmann | RTF background by Michael Kay
--var «name»=«value» namespace issue
Caution: An EXSLT namespace prefix (other than exslt (?)) used only inside xmlstarlet select’s --var name='…' triggers runtime error xmlXPathCompOpEval: function «func» bound to undefined prefix «ns» unless option -N ns=… is given.
Workaround: use -N ns=… or use the prefix outside --var name='…', e.g. in -v or -m or (for string content) --var name … -b.
$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
--var d='str:split(v/@s)' \
-v '$d' -n
xmlXPathCompOpEval: function split bound to undefined prefix str
runtime error: element variable
Failed to evaluate the expression of variable 'd'.
no result for -
$ :
$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
-m 'str:split(v/@s)' \
-v . -b -n
abc
-o (--output) is <xsl:text>«value»</xsl:text>
$ xmlstarlet select -T -C -t -o 'A<&'\''">z'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
<xsl:template match="/">
<xsl:text>A&lt;&amp;'"&gt;z</xsl:text>
</xsl:template>
</xsl:stylesheet>
-o ''
translates to an empty
<xsl:text/>
element.
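A quick way to check this, sketched under the assumption that xmlstarlet 1.6.1 is installed; the variable name xsl is arbitrary:

```shell
# -C (--comp) prints the generated stylesheet instead of running it,
# so no input document is needed.
xsl=$(xmlstarlet select -C -t -o '')
printf '%s\n' "$xsl" | grep -F '<xsl:text/>'
```

The same -C trick works for any select option whose XSLT translation is in doubt.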
See also: Special characters
To manage parameters of the xsl:output
element, see XML parsing and
serialization.
-e (--elem)
is
<xsl:element name="…">
-e
is nestable and can be explicitly terminated with -b (--break)
.
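A small sketch of nesting, assuming xmlstarlet 1.6.1; the element names outer, first, and second are invented for the example:

```shell
# Each -b closes the most recently opened -e, so first and second
# end up as siblings inside outer.
out=$(printf '%s\n' '<v/>' |
  xmlstarlet select -t \
    -e outer \
      -e first -o 'x' -b \
      -e second -o 'y' -b \
    -b -n)
printf '%s\n' "$out"
```

Without the -b after first, second would be created inside first instead of next to it.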
See also: Create a namespace | Create a SOAP envelope example
-a (--attr)
is
<xsl:attribute name="…">
-a
can be explicitly terminated with -b (--break)
.
In XSLT, the latter of two same-named attributes is accepted, e.g.
$ echo '<v/>' |
xmlstarlet select -t -e doc -a f -o n -b -a f -o y
<doc f="y"/>
-c (--copy-of)
is
<xsl:copy-of select="xpath-expr"/>
See examples at: -T (--text)
| -I (--indent)
-v (--value-of)
is
string-join((xpath-expr),newline)
With zero or one nodeset members in xpath-expr
-v (--value-of)
works exactly as XSLT 1.0’s
<xsl:value-of select="xpath-expr"/>
, otherwise (like
string-join()
in XSLT 2.0) all members are output,
stringified and separated by newlines.
$ echo '<v><w>fee</w><w>fi</w><w>fo</w><w>fum</w></v>' |
xmlstarlet select -T -t -v '*/*' -t -n
fee
fi
fo
fum
Adding -C (--comp)
option
to list the XSLT code for the value-of-template
:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt">
<xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
<xsl:template match="/">
<xsl:call-template name="t1"/>
<xsl:call-template name="t2"/>
</xsl:template>
<xsl:template name="t1">
<xsl:call-template name="value-of-template">
<xsl:with-param name="select" select="*/*"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="t2">
<xsl:value-of select="'&#10;'"/>
</xsl:template>
<xsl:template name="value-of-template">
<xsl:param name="select"/>
<xsl:value-of select="$select"/>
<xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
<xsl:value-of select="'&#10;'"/>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
-i (--if) [--elif …] [--else]
is
<xsl:when> … [<xsl:otherwise>]
-i (--if)
is nestable and can be explicitly terminated
with -b (--break)
. It
translates to an xsl:choose
element.
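A hedged sketch of a three-way branch, assuming xmlstarlet 1.6.1; the helper function classify and the attribute n are hypothetical and exist only for this example:

```shell
# classify N: print yes, no, or maybe depending on the value of v/@n.
classify() {
  printf '<v n="%s"/>\n' "$1" |
    xmlstarlet select -T -t \
      --if 'v/@n > 4' -o 'yes' \
      --elif 'v/@n < 2' -o 'no' \
      --else -o 'maybe' -b -n
}
classify 5
classify 1
classify 3
```

A single -b closes the whole --if / --elif / --else chain, as it is one xsl:choose element.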
-b (--break)
ends current
container element
-b (--break)
closes the currently open container
element, one of:
-m (--match)
(nestable)
-i (--if, --elif, --else)
(nestable)
-e (--elem)
(nestable)
-a (--attr)
-a 'data-dec' -o 'pre' -v '.' -o 'suf' -b
--var
without =
(nestable)
--var idls -m 'was[not(was)]' -v 'concat(@id,$sep,@class)' -n -b -b
--var
s are local to the enclosing
--var
but must have unique variable names.
-t (--template)
These can be followed by a variable number of options and so must be terminated explicitly unless followed by one of:
-t (--template)
option
xmlstarlet
command
closing all open elements. In other words, trailing -b
s
may be omitted if they’re the last options in the current template.
A -b (--break)
too many can trigger compilation error:
xsltParseStylesheetTop: unknown «name» element
.
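That trailing -b options are optional can be checked with a sketch like this one, assuming xmlstarlet 1.6.1; the variable names explicit and implicit are arbitrary:

```shell
# Same template twice: once with both containers closed explicitly,
# once relying on the end of the command line to close them.
explicit=$(printf '%s\n' '<v/>' | xmlstarlet select -t -e doc -a f -o 'y' -b -b)
implicit=$(printf '%s\n' '<v/>' | xmlstarlet select -t -e doc -a f -o 'y')
[ "$explicit" = "$implicit" ] && printf 'same output: %s\n' "$explicit"
```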
-n (--nl)
prints a newline
-f (--inp-name)
prints
pathname / URI of current input
Shorthand for -v '$inputFile'
(a predefined variable).
Outputs -
(dash) for standard input
(stdin
).
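A one-line sketch of the stdin case, assuming xmlstarlet 1.6.1; the variable name is arbitrary:

```shell
# With input piped in, -f reports the input name as a dash.
name=$(printf '%s\n' '<v/>' | xmlstarlet select -T -t -f -n)
printf 'input name: %s\n' "$name"
```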
Download (< 2K) and convert the European Central Bank’s Euro rates
sorted by currency in Ascending order as Text, Upper-first:
wget -qO- 'https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml' |
xmlstarlet select --text -t \
-m '//_:Cube[@currency]' \
-s 'A:T:U' '@currency' \
-v 'concat(@currency," ",@rate)' -n
See also: -s (--sort)
List files in current dir and subdirs containing at least one
milk
element (returns non-zero if no match):
find . -type f -name '*.xml' -exec \
xmlstarlet select -T -t -m '(//*[local-name()="milk"])[1]' -f -n {} +
Return zero if at least one XML element text exactly matches
milk
, otherwise non-zero (no output is produced):
find . -type f -name '*.xml' -exec \
xmlstarlet select -Q -T -t -m '(//*[text()="milk"])[1]' -f -n {} +
find
’s
{} +
fills up the command line with pathnames.
See also: -f (--inp-name)
| -Q (--quiet)
| exit values
Note: This handles element or attribute nodes but no other node types.
: ${fileglob:=/usr/share/*/xslt/docbook/common/db-common.xsl}
: ${target:='//xsl:param[string(@select)]'}
xmlstarlet select --text -t \
-m "${target}" \
-m 'ancestor-or-self::*' \
--var pos='1+count(preceding-sibling::*[name() = name(current())])' \
-v 'concat("/",name(),"[",$pos,"]")' \
-b \
--if 'count(. | ../@*) = count(../@*)' \
-v 'concat("/@",name())' \
-b \
-n \
${fileglob}
where:
${varname:=…}
assigns a default value by shell parameter
expansion via the built-in colon
utility
-m (--match)
option specifies the target, with optional search conditions given as
XPath predicates
-m (--match)
builds the XPath of elements
from root to target, calculating position by counting siblings using the
XSLT 1.0 current()
function
ancestor-or-self::*
– the -i (--if)
clause adjusts the
XPath
Output:
/xsl:stylesheet[1]/xsl:template[1]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[4]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[2]
/xsl:stylesheet[1]/xsl:template[6]/xsl:param[1]
Output if called with
target='//xsl:*/@test[contains(.,"position")]'
:
/xsl:stylesheet[1]/xsl:template[2]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[3]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[3]/@test
See also: xmlstarlet elements
If the plaintext input is uncomplicated perhaps EXSLT’s string
functions can do the conversion. Note that str:replace
,
str:split
,
and str:tokenize
are available for xmlstarlet select
(and transform
), but not for edit
.
<root>
A;2022-08-10;db #1
B;sortie bidon;50.0
A;2022-08-12;db Cth
B;mali climber;40.0
C;fray illumine;9.75
</root>
ifs
, iss
, and irs
,
respectively
str:split
function, while applying XML markup
A
in input
xmlstarlet select --indent -t \
--var ifs -o ';' -b \
--var iss -n -b \
--var irs='concat($iss,"A")' \
-e recs \
-m 'str:split(*,$irs)' \
-e rec \
--var sr='str:split(.,$iss)' \
--var hd='str:split($sr[1],$ifs)' \
-e hd \
-e dt -v '$hd[1]' -b \
-e wd -v '$hd[2]' -b \
-b \
-e bd \
-m '$sr[position()!=1]' \
--var f='str:split(.,$ifs)' \
-e fld \
-a typ -v '$f[1]' -b \
-e dsc -v '$f[2]' -b \
-e amt -v '$f[3]' -b \
"${infile:-file.xml}"
See also: --var
| -m (--match)
| -e (--elem)
| -b (--break)
Output:
<recs>
<rec>
<hd>
<dt>2022-08-10</dt>
<wd>db #1</wd>
</hd>
<bd>
<fld typ="B">
<dsc>sortie bidon</dsc>
<amt>50.0</amt>
</fld>
</bd>
</rec>
<rec>
<hd>
<dt>2022-08-12</dt>
<wd>db Cth</wd>
</hd>
<bd>
<fld typ="B">
<dsc>mali climber</dsc>
<amt>40.0</amt>
</fld>
<fld typ="C">
<dsc>fray illumine</dsc>
<amt>9.75</amt>
</fld>
</bd>
</rec>
</recs>
document()
function
Links: document()
in W3C rec
The XSLT document()
function
xmlstarlet select
)
xmlstarlet
abends with an error message such as
Extra content at the end of the document
and (Caution) exit value 0
document("")
Examples: merge 2 XML files | extract and merge records | introspection | external lookup table
Insert child nodes of ${partfile}
’s root element into
${infile}
’s ${destination}
element – using a
3-stage pipeline:
xmlstarlet select -R -t \
--var part -o "${partfile:-file2.xml}" -b \
-c ' / | document($part)' "${infile:-file.xml}" |
xmlstarlet edit -m '/xsl-select/*[2]/node()' '/xsl-select'"${destination:-/..}" |
xmlstarlet select -B -I -t -c '/xsl-select/*[1]'
select
to copy the 2 documents and wrap them
(-R
) as /xsl-select/*[1]
and
/xsl-select/*[2]
, using document()
to access the ${partfile}
– either ${infile}
or ${partfile}
can be /dev/stdin
edit
to -m (--move)
children of
${partfile}
’s root element to ${destination}
–
an XPath expression locating an element in ${infile}
–
incoming nodes will be appended as last nodes there
${destination}
(/..
) causes an
error to be generated and must be overridden
select
to extract and format the merged
document
See also: -R (--root)
| --var … -b
| -B (--noblanks)
| -I (--indent)
| -c (--copy-of)
If called with this ${partfile}
<items>
<item>1</item><item>2</item><item>3</item>
</items>
and this ${infile}
<doc><g><g1/><g2/><g3/></g></doc>
and destination=/doc//g1
, then output becomes:
<doc>
<g>
<g1>
<item>1</item>
<item>2</item>
<item>3</item>
</g1>
<g2/>
<g3/>
</g>
</doc>
See also: transform --xinclude
Given a number of similar XML input files each containing a simple record set,
echo '<v/>' |
xmlstarlet select -R -I -t \
--var fls \
-e f -o 'data/rs1.xml' -b \
-e f -o 'data/rs2.xml' -b \
-e f -o 'data/rs3.xml' -b \
-b \
-c 'document(exslt:node-set($fls)/f) /*/r'
select --var
)
exslt:node-set()
function
…/f
) to
document()
document()
returns the root nodes of the XML trees
parsed from the input files
-c (--copy-of)
copies
the r
elements (…/*/r
) from the source trees
to the result tree
See also: -R (--root)
| -I (--indent)
| select --var
| -e (--elem)
Output:
<xsl-select>
<r a1="x" a2="42" a3="-2"/>
<r a1="x" a2="41" a3="-2"/>
<!-- etc. -->
</xsl-select>
Also possible:
-c '(document("…1.xml") | document("…2.xml") | document("…3.xml")) /*/r'
-c 'document("…1.xml")/*/r' \
-c 'document("…2.xml")/*/r' …
If file order determined by sort
is sufficient the EXSLT
str:split()
function can split the newline-separated output from find
into a nodeset:
echo '<v/>' |
xmlstarlet sel -R -I -t \
--var sep -n -b \
--var fls2 -o "$(find 'data' -type f -name 'rs*.xml' | sort)" -b \
-c 'document(str:split($fls2,$sep)) /*/r'
With a different -c (--copy-of)
argument in the previous example,
-c 'document("")'
outputs the stylesheet like the -C (--comp)
option (but inside
a wrapper element here because of -R (--root)
).
With
-c 'document("")//xsl:variable[@name="fls"]'
the file list variable is copied:
<xsl-select>
<xsl:variable xmlns:xsl="http://www.w3.org/1999/XSL/Transform" name="fls">
<xsl:element name="f">data/rs1.xml</xsl:element>
<xsl:element name="f">data/rs2.xml</xsl:element>
<xsl:element name="f">data/rs3.xml</xsl:element>
</xsl:variable>
</xsl-select>
A simple food composition table lists – per 100 gram food – the
calorie count (kcal
) as well as the amount in grams of
protein, fat, and carbohydrate:
<fc:foodcomp xmlns:fc="urn:foodcomp-subset">
<fc:nutrient nid="n0893" kcal="297" prot="24.3" fat="1.9" carb="48.8" name="Lentils, green, dried, raw"/>
<fc:nutrient nid="n2443" kcal="98" prot="7.9" fat="0.6" carb="16.3" name="Garlic, raw"/>
<!-- etc. -->
</fc:foodcomp>
With an input file containing a culinary recipe of the form
<recipe servings="4" name="Lentil and goats' cheese salad">
<ingredients>
<ing foodid="n0893" grams="200" name="green lentils"/>
<ing foodid="n2443" grams="10" name="garlic"/>
<!-- etc. -->
</ingredients>
<method><!-- etc. --></method>
</recipe>
specify the calorie count per serving per ingredient:
-N «prefix»=«value»
to
declare lookup table’s namespace
document()
to access the external lookup table
RP
) and lookup
(FC
) documents as -m (--match)
changes the
current node
format-number()
xmlstarlet select --text -N fc='urn:foodcomp-subset' -t \
--var fcfile -o "${lutfile:-file2.xml}" -b \
--var FC='document($fcfile)/fc:foodcomp' \
--var RP='/recipe' \
-m '//ing' \
--var kcal='$FC/*[@nid = current()/@foodid]/@kcal' \
--var kcal-per-serv='$kcal div 100.0 * @grams div $RP/@servings' \
-v 'str:align(current()/@name,str:padding(20," ."),"left")' \
-o ' : ' \
-v 'str:align(format-number($kcal-per-serv,"0 kcal"),str:padding(8),"right")' \
-n \
-b \
"${infile:-file.xml}"
Output:
green lentils. . . . : 149 kcal
garlic . . . . . . . :   2 kcal
lemon juice. . . . . :   0 kcal
extra virgin olive o :  56 kcal
fresh basil. . . . . :   4 kcal
goats' cheese. . . . :  96 kcal
black pepper . . . . :   0 kcal
salt . . . . . . . . :   0 kcal
To compute nutritional values for an entire recipe collect the
gram-weighted food composition data – here in a result tree
fragment (RTF, cf. select --var
) as data
size is modest – and sum(…)
vertically, along the lines
of:
…
--var attrib='str:split("kcal prot fat carb")' \
--var nutr-weighted-rtf \
-m '//ing' \
--var ing='.' \
-e data \
-c '@foodid' \
-m '$attrib' \
-a '{.}' -v '$FC/*[@nid = $ing/@foodid]/@*[name() = current()] div 100.0 * $ing/@grams' -b \
-b \
-b \
-b \
-b \
--var nutr-wt='exslt:node-set($nutr-weighted-rtf)' \
-o 'Nutrition per serving: ' \
-m '$attrib' \
--var sum-per-serv='sum($nutr-wt/data/@*[name() = current()]) div $RP/@servings' \
-v 'concat(.," ",format-number($sum-per-serv,"0"))' \
…
Output:
Nutrition per serving: kcal 307, prot 19g, fat 15g, carb 26g
xmlstarlet edit
xmlstarlet edit
(aka ed
) copies its input
to output, supporting basic create, update, delete, rename, and move
actions (operations).
Note that edit
/
) as current node
$prev
variable
as a back reference to the most recently created node
xpath
arguments
To do conditional updates, or to dynamically
create -n
names or -v
values for an
edit
command, it may be worthwhile having
xmlstarlet select
generate
it.
edit
option […] [action …]
[«xml-file-or-uri» …]
-h (--help)
- display help
-O (--omit-decl)
- omit XML
declaration
-P (--pf)
- preserve original
formatting
-S (--ps)
- preserve non-significant
spaces
-O (--omit-decl)
, -P (--pf)
, and
-S (--ps)
set/unset libxml2
flags, cf. XML parsing and
serialization.
See also: Try out
edit
’s formatting options | select -I (--indent)
-L (--inplace)
- edit input
file(s) in-place
This option
stdout
– if input is
stdin
%20
, cf. Global options and
parameters
-P (--pf)
is given
--net
- allow network access
See also: network access
-N «prefix»=«value»
- declare
namespaces
This option is repeatable; must be last non-action option(s). E.g.
-N xsql='urn:oracle-xsql'
.
Not needed for predefined namespaces or
those declared in the root element (--doc-namespace
) of
the first input file but required
--no-doc-namespace
global option is in effect
See: Use a namespace | select -N
Caution:
xmlstarlet edit
isn’t an XSLT processor so with or without
the -N …
option,
printf '%s' '<a/>' |
xmlstarlet edit --pf -O -N b='https://www.example.org/b' \
-s '*' -t elem -n 'b:c' -v 'd'
generates:
<a><b:c>d</b:c></a>
See also: Create a SOAP envelope
-i (--insert)
- add node
before
-a (--append)
- add node
after
-s (--subnode)
- add node as
child
There are 3 ways to add an element, an attribute, or a text node to each member of a nodeset:
xmlstarlet edit OP xpath -t node-type -n node-name -v value
where
xpath
is an xpath
argument
OP
is one of:
-i (--insert)
- insert before xpath
as
preceding sibling
-a (--append)
- insert after xpath
as
following sibling
-s (--subnode)
- append as last child of
xpath
$prev
(aka $xstar:prev
)
variable
-i
and -a
accept the root element (document
element) as xpath
-t (--type) node-type
selects one of these node types:
elem
- element
attr
- attribute; -i
, -a
, and
-s
all create an attribute in the xpath
element but do not influence attribute order
text
- text
-n (--name) node-name
selects an XML QName,
e.g. item
or svg:g
; required (and ignored) for
text
nodes
xmlstarlet edit
will accept names such as !--
and , <&> .
without turning a hair, cf. -r (--rename)
.
xmlstarlet edit
will create a namespace-prefixed element
like svg:g
but it will not be available as such in
following edit
actions (explanation); workaround: use $prev
.
-v (--value) value
may be omitted if creating an empty
elem
or attr
node but it’s required for
text
, e.g.
-a 'bean' -t text -n ignored -v ''
.
Predefined entity references (&quot;
,
&lt;
a.o.) and certain numeric character references
(&#9;
a.o.) are recognized in the value
text – when used with -t (--type) elem
– in which
case &amp;
is required to represent &
(ampersand), cf. example at special
characters. (For value
used with type attr
or type text
the general rule applies.)
Basic usecase (v/e
may replace $prev
here):
$ printf '%s' '<v/>' |
xmlstarlet edit -O \
-s 'v' -t elem -n 'e' -v '42' \
-s '$prev' -t attr -n 'a' -v 'y'
<v>
<e a="y">42</e>
</v>
Examples: examples/ed-append
| examples/ed-insert
| examples/ed-subnode
| Insert HTML
<link …/>
See also: -u (--update)
$prev
variable (aka
$xstar:prev
)
The $prev
(aka $xstar:prev
) variable refers
to the nodeset created by the most recent -i (--insert)
, -a (--append)
, or -s (--subnode)
option, which
all define or redefine it. To reset $prev
(to avoid a false
match later) use, for example, -a '/..' -t elem -n nil
which
fails as the root node has no parent.
$prev
isn’t mentioned in the user’s guide; examples are
given in doc/xmlstarlet.txt
and in this section.
--var name 'xpath'
The --var name xpath
option to define an
xmlstarlet edit
variable is mentioned in doc/xmlstarlet.txt
but not in the user’s guide. It uses a different format than
select
’s --var
.
Examples:
xmlstarlet edit --inplace \
--var T '//_:p[@class="eyg"] | //_:span[contains(@class,"eyg_")]' \
--var res "$((3 * 7 * 2))" \
-u '$T' -x 'concat(.,", currently ",$res)' \
file.xhtml
xmlstarlet edit \
-s '/doc/abc' -t elem -n 'ns:nd' \
--var nsnd '$prev' \
# ...
See also: xpath
arguments | -u (--update)
| $prev
-u (--update) 'xpath' -v (--value) 'value'
-u (--update) 'xpath' -x (--expr) 'xpath'
There are 2 ways to modify the value of each member of a nodeset:
xmlstarlet edit -u (--update) 'xpath' -v (--value) 'value'
xmlstarlet edit -u (--update) 'xpath' -x (--expr) 'xpath'
where
-u 'xpath'
is an xpath
argument describing
the destination nodeset
-v
expects a literal value, e.g. hello
,
'a b c<&d>'
, or
"$(grep -e '^rev' abc.txt)"
-x 'xpath'
is one of:
absolute xpath
argument: all -u
nodes updated with same value /
nodeset / variable
relative xpath
argument: -u
nodes updated with whatever the XPath
resolves to,
-u '…' -x '. * 1.25'
-u '…' -x 'concat("prefix-",.,"-suffix")'
-u '…' -x '../../@name'
-u '…' -x 'string(../../@name)'
-x
makes a deep copy of its argument. Given an element
e
,
<e a="v"><c1/><c2/></e>
,
-x 'e'
copies the entire thing whereas
-x 'e/node() | e/@*'
copies e
’s child nodes
and e
’s attribute nodes (cf. the many-to-many move
example). (Attributes are
not children of their parent – background.)
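The difference can be tried out with a sketch like the following, assuming xmlstarlet 1.6.1; the wrapper r and the empty target element t are invented for the example:

```shell
# Copy e's attributes and child nodes (but not e itself) into the empty t.
out=$(printf '%s' '<r><e a="v"><c1/><c2/></e><t/></r>' |
  xmlstarlet edit -O -P \
    -u 'r/t' -x '../e/node() | ../e/@*')
printf '%s\n' "$out"
```

Replacing the -x argument with '../e' should nest a complete copy of e inside t instead.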
Creation of a new (empty) node is often followed by an update using
$prev
, for example to enable an
-x
expression:
xmlstarlet edit --inplace \
-s '*' -t elem -n entry \
-u '$prev' -x 'date:date-time()' \
-s '$prev' -t attr -n user -v "${LOGNAME}" \
log.xml
See also: Moving nodes
-d (--delete) 'xpath'
See also: xpath
arguments | Delete a namespace
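A minimal sketch of a delete action, assuming xmlstarlet 1.6.1; the input document is made up:

```shell
# Every node selected by the xpath argument is removed; here both a elements.
out=$(printf '%s' '<v><a/><b/><a/></v>' |
  xmlstarlet edit -O -d 'v/a')
printf '%s\n' "$out"
```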
-r (--rename) 'xpath' -v (--value) 'new-name'
new-name
is an XML QName such as
item
or svg:g
.
Caution: The
-v (--value)
clause of this option admits its
new-name
argument unmodified – accepting an empty string or
one containing tab, newline, XML special
characters a.o. – ignoring XML QName requirements. In this respect
it works like the -n (--name)
clause of the
-i
, -a
, and -s
options.
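A short sketch renaming an element and then one of its attributes, assuming xmlstarlet 1.6.1; all names in it are illustrative:

```shell
# Actions run in order, so the second -r must address the element
# by its new name.
out=$(printf '%s' '<v><a k="1">x</a></v>' |
  xmlstarlet edit -O \
    -r 'v/a' -v 'item' \
    -r 'v/item/@k' -v 'key')
printf '%s\n' "$out"
```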
See also: xpath
arguments | Rename elements
example
-m (--move) 'xpath1' 'xpath2'
The source (xpath1
) of the -m (--move)
action can be nodes other than root and namespace: element, attribute,
text, comment, or processing instruction.
The destination (xpath2
) must be a single element (the
only container node) – otherwise xmlstarlet
exits with an
error message
and a non-zero return code. Source nodes will be appended as last nodes
at destination.
Caution: With
overlapping source and destination --move
completes with
exit value 0 and no messages.
Caution:
-m (--move)
causes a segmentation fault (exit value 139) if
attempting a many-to-many move.
See also: xpath
arguments | Moving nodes | Move a namespace
xpath
arguments
For the xpath
argument – in the -i
,
-a
, -s
, -d
, -r
,
-m
, -u
, -x
, and
--var
options – xmlstarlet edit
can use an
XPath 1.0 expression, incl. XPath 1.0 functions[1] and selected EXSLT functions, but
using XSLT functions such as current()
,
document()
, generate-id()
, or
format-number()
triggers (as expected) the error
xmlXPathCompOpEval: function «name» not found
.
[1] The XPath functions position()
and
last()
rely on an evaluation
context. With xmlstarlet edit
they can be used inside
an XPath predicate (e.g. --move '…' '…[last()]'
) –
last()
returning the context size – but used outside (as in
-u '…' -x 'substring("abcdef",position(),1)'
) they trigger an
Invalid context position
error or an
Invalid context size
error.
position()
alternative:
1 + count(preceding-sibling::«node»)
or
count(preceding::«node»)
.
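The first alternative can be sketched like this, assuming xmlstarlet 1.6.1; the input with three empty w elements is made up:

```shell
# Number each w by counting its preceding w siblings, since position()
# is not usable at the top level of an edit -x expression.
out=$(printf '%s' '<v><w/><w/><w/></v>' |
  xmlstarlet edit -O \
    -u 'v/w' -x '1 + count(preceding-sibling::w)')
printf '%s\n' "$out"
```

The -x expression is evaluated relative to each node selected by -u, which is what makes the sibling count differ per node.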
xpath
arguments
Based on the exslt«name»XpathCtxtRegister
functions in
libexslt
xmlstarlet edit
supports selected functions from the EXSLT modules dates-and-times
,
math
, sets
, and strings
in its xpath
arguments:
dates-and-times
in namespace date
,
cf. libexslt/date.c,
add
, add-duration
, date
,
date-time
, day-abbreviation
,
day-in-month
, day-in-week
,
day-in-year
, day-name
,
day-of-week-in-month
, difference
,
duration
, hour-in-day
, leap-year
,
minute-in-hour
, month-abbreviation
,
month-in-year
, month-name
,
second-in-minute
, seconds
, sum
,
time
, week-in-month
,
week-in-year
, year
math
in namespace math
, cf. libexslt/math.c,
abs
, acos
, asin
,
atan
, atan2
, constant
,
cos
, exp
, highest
,
log
, lowest
, max
,
min
, power
, random
,
sin
, sqrt
, tan
sets
in namespace set
, cf. libexslt/sets.c,
difference
, distinct
,
has-same-node
, intersection
,
leading
, trailing
strings
in namespace str
, cf. libexslt/strings.c,
align
, concat
, decode-uri
,
encode-uri
, padding
All date
, math
, and set
functions are
there but note the absence of str:replace
(removed from
xmlstarlet edit
in 2012),
str:split
, and str:tokenize
.
Hello, EXSLT
:
printf '%s' '<v y="." e="." pi="." z="." r="."/>' |
xmlstarlet edit -O \
--var e 'math:constant("E",10)' \
-u '*/@*[math:power(1e3,0)]' -x 'date:day-name("2011-09-24")' \
-u '*/@e' -x '$e' \
-u '*/@z' -x 'count(set:distinct(/*/@*))' \
-u '*/@pi' -x 'math:constant("PI",10)' \
-u '*/@r' -x '$e*/*/@z*/*/@pi' \
-r '*/@*[starts-with(.,str:align(25," "))]' -v 'ezpi'
The number 1e3
(in scientific notation) isn’t XPath 1.0
but understood by libxml2
.
Output:
<v y="Saturday" e="2.71828182" pi="3.14159265" z="3" ezpi="25.6192025590219"/>
See also: Divide a document into sections example
Caution:
xmlstarlet edit -u '…' -x '…'
silently
deletes the first child node at destination (whether
absolute or relative -x
XPath expression),
-u '…' -v '…'
silently deletes all child
nodes. This suggests that the -u (--update)
option is
intended to modify leaf nodes (aka external nodes) but
xmlstarlet
’s silence in this matter extends to both
documentation and source code.
Links: src/xml_edit.c#edUpdate()
Given this input,
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1><D/></b1><b2/><b3><F1/><F2/></b3></b>
</r>
xmlstarlet edit -O -P \
-u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"
produces:
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1>V1</b1><b2>V1</b2><b3><F2/>V1</b3></b>
</r>
Workaround for -x
: insert a sacrificial element (or
non-whitespace text) node as first child. The -i (--insert)
option has no
effect if no nodes exist on destination’s child axis.
xmlstarlet edit -O -P \
-i 'r/b/*/node()[1]' -t elem -n 'herenow' \
-u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"
Output:
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1><D/>V1</b1><b2>V1</b2><b3><F1/><F2/>V1</b3></b>
</r>
If instead -i … -u 'r/b/*' -v 'gonethere'
:
… <b><b1>gonethere</b1><b2>gonethere</b2><b3>gonethere</b3></b> …
<link …/>
xmlstarlet edit --omit-decl --pf \
-s '/_:html/_:head' -t elem -n link \
--var lk '$prev' \
-s '$lk' -t attr -n 'rel' -v 'stylesheet' \
-s '$lk' -t attr -n 'type' -v 'text/css' \
-s '$lk' -t attr -n 'href' -v 'style/www.css' \
in.xhtml > out.xhtml
…
<link rel="stylesheet" type="text/css" href="style/www.css"/>
…
See also: -P (--pf)
| -s (--subnode)
| --var
| $prev
| Use
a namespace
$ cat file.xml
<doc><e f="0">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -L -O --pf --var T 'doc/e/@f' -u '$T' -x '($T+1) mod 2' file.xml
$ cat file.xml
<doc><e f="1">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -O --pf --var T 'doc/e/text()' -u '$T' -x 'not($T="true")' file.xml
<doc><e f="1">true<c>false</c></e></doc>
The -L (--inplace)
option edits the input file in-place.
See also: -P (--pf)
| --var
| -u (--update)
The EXSLT function str:replace
was removed from
xmlstarlet edit
in 2012 so it’s either
straight XPath 1.0:
xmlstarlet edit -O \
-u 'doc/e' \
-x 'concat(substring-before(.,"&text=Ulysses"),
substring-after(.,"&text=Ulysses"))' \
"${infile:-file.xml}"
or invoke xmlstarlet select
first to apply
str:replace
:
avar="$(xmlstarlet select --text -t \
-v 'str:replace(doc/e,"&text=Ulysses","")' "${infile:-file.xml}")"
xmlstarlet edit -O -u 'doc/e' -v "${avar}" "${infile:-file.xml}"
With this XML input:
<doc>
<e>abcdefghi/q?sid=ry12345&amp;text=Ulysses&amp;ofmt=x'"ml</e>
</doc>
both commands produce:
<doc>
<e>abcdefghi/q?sid=ry12345&amp;ofmt=x'"ml</e>
</doc>
See also: pyx
,
depyx
If a record rec
has a desc
element add the
text of the sibling name
element to it, otherwise add a new
desc
element and set its value to the text of
name
.
xmlstarlet edit \
-u '//recs/rec/desc' -x 'concat(.," - ",../name/text())' \
-a '//recs/rec[not(desc)]/name' -t elem -n 'desc' \
-u '$prev' -x '../name/text()' \
"${infile:-file.xml}"
See also: -u (--update)
| -a (--append)
| $prev
| -s (--subnode)
If desc
and name
are attributes,
instead:
xmlstarlet edit \
-u '//recs/rec/@desc' -x 'concat(.," - ",../@name)' \
-s '//recs/rec[not(@desc)]' -t attr -n 'desc' \
-u '$prev' -x 'string(../@name)' \
"${infile:-file.xml}"
Note that -x 'string(../@name)'
– and
../@name
as a concat()
string argument –
copies the attribute value, -x '../@name'
the attribute
node; the latter fails as attributes cannot contain other nodes (causing
an empty value to be assigned to @desc
).
xmlstarlet edit
has no if-then-else
construct so the following snippets use standard XPath 1.0 expressions
and edit
’s back reference $prev
variable to apply conditions.
All use the following nodeset variable:
--var T '/Server/Service[@name="Catalina"]'
Add $T/Connector
after last ditto:
-a '$T/Connector[last()]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '7654'
Nothing is added if no $T/Connector
node exists – in
which case $prev
becomes null
and
-s '$C' …
has no effect – otherwise a
Connector
element is appended as first following sibling
(to become the new last Connector
) and given a
port
attribute.
Add $T/Connector
if not exists, as last child of
$T
:
-s '$T[not(Connector)]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '8765'
Nothing is added if a $T/Connector
node exists – if the
first -s …
matches nothing then the second
-s …
(due to a null
$prev
) will
match nothing.
Add $T/Connector
if not exists, after
$T/Executor[1]
if exists:
-a '$T[not(Connector)]/Executor[1]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '9876'
Nothing is added if a $T/Connector
node exists or no
$T/Executor
node exists, otherwise appended as first
following sibling of first $T/Executor
.
See also: --var
|
-a (--append)
| -s (--subnode)
| $prev
| -u (--update)
This code duplicates a bean
element from a formatted
input file, inserts the copy right after the original, changing its
@id
, and restores the inter-element whitespace that was.
Use with edit
’s -P (--pf)
or -S (--ps)
option.
xmlstarlet edit --ps \
--var N '/beans/bean[@id="bean4"]' \
--var ws '$N/following::text()[1][normalize-space()=""]' \
-a '$N' -t elem -n bean \
-u '$prev' -x '$N/node() | $N/@*' \
-u '$prev/@id' -v 'bean4a' \
-i '$prev' -t text -n whitespace -v '' \
-u '$prev' -x '$ws' \
file.xml > newfile.xml
Notes:
--var N
references the original bean
element
ws
holds the first text node after original,
provided it’s all whitespace (otherwise empty)
-a …
appends a new empty bean
element as a
following-sibling of the original
$prev
s refer to the new element
-u …
makes a deep copy of the original’s child
and attribute nodes
-i … -u …
copies the whitespace following the original
bean
, to keep the formatting
text
requires a (dummy) name and an
initial value
$prev
refers to the text node created by
-i
See also: --var
|
-a (--append)
| -u (--update)
| $prev
| -i (--insert)
xmlstarlet edit -m or -a, -u, -r, -d
The basic usecases for moving XML nodes (except namespace nodes) from source to destination are:
The ground rules are:
-m (--move)
action can
handle the one-to-one or many-to-one usecases if nodes
are to be appended at destination (an element node)
--var name 'xpath'
collects
a nodeset in a named variable
$prev
variable refers to the node created by the most recent
-i (--insert)
, -a (--append)
, or
-s (--subnode)
action
-x
) xpath
of the -u (--update)
action can
be relative or non-relative
In this context one-to-many is a copy (update) operation
handled by -u … -v …
or -u … -x …
followed by
-d …
.
See also: xpath
arguments | Namespaces
The following examples (one-to-one | many-to-one | many-to-many | move to position N) use this input XML file:
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p><span><a id="a2">value 2</a></span></p>
<p id="vol"/>
<p/>
</div>
Moving one node to another:
xmlstarlet edit --omit-decl \
-m '/div/p[4]' '/div/p[3]' \
-m '/div/p[3]/@id' '/div/p[2]' \
-m '/div/a' '/div' \
"${infile:-file.xml}"
Note the placement of a
as the last element in
destination.
See also: -m (--move)
Output:
<div>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p id="vol">
<span>
<a id="a2">value 2</a>
</span>
</p>
<p>
<p/>
</p>
<a>anchor</a>
</div>
Moving the first 3 p
elements to the 4th
p
:
xmlstarlet edit --omit-decl \
-m '/div/p[position() <= 3]' '/div/p[4]' \
"${infile:-file.xml}"
Destination can also be expressed as (//p)[4]
, being the
4th p
in document order.
See also: -m (--move)
Output:
<div>
<a>anchor</a>
<p>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2">value 2</a>
</span>
</p>
<p id="vol"/>
</p>
</div>
Whereas overlapping
-m '/div/p[position() <= 3]' '/div/p[3]'
gives:
<div>
<a>anchor</a>
<p/>
</div>
Move all a
children of span
elements up one
level, then remove the emptied span
s (untagging).
xmlstarlet edit --omit-decl --pf \
--var N '//span[a]' \
-a '$N' -t elem -n 'a' \
-u '$prev' -x 'preceding-sibling::span[1]/a/node() | preceding-sibling::span[1]/a/@*' \
-d '$N' \
"${infile:-file.xml}"
N
variable contains the
nodeset of all span
elements which have an a
element as an immediate child
-a (--append)
creates a
sibling a
element for each span
element in
$N
-u (--update)
inserts values in each newly created a
element ($prev
) using a relative XPath
expression (-x
) to make a deep copy of a
’s
child and attribute nodes (span
being the first
preceding-sibling
of the new a
element)
-d (--delete) …
deletes
each element in $N
, after conversion
In the general case use the following-sibling
axis with
the -i (--insert)
action and the
preceding-sibling
axis with -a (--append)
. In
this specific case preceding::span[1]
or
../span
also refers to
preceding-sibling::span[1]
.
Output:
<div>
<a>anchor</a>
<p><a id="a1">value 1</a></p>
<p><a id="a2">value 2</a></p>
<p id="vol"/>
<p/>
</div>
Alternatively, untag using an identity transform plus a template such as:
<xsl:template match="span[a]">
<xsl:apply-templates/>
</xsl:template>
Move the 3rd p
element to position 2, to become the new
1st p
.
The move destination cannot be a list position so work around:
xmlstarlet edit --omit-decl \
--var src '/div/p[3]' \
--var tgt '/div/*[2]' \
-i '$tgt' -t elem -n 'p_TMP' \
-u '$prev' -x '$src/node() | $src/@*' \
-d '$src' \
-r '$prev' -v 'p' \
"${infile:-file.xml}"
Both -u '$prev' -x '…'
and -m '…' '$prev'
work here.
See also: -i (--insert)
| -u (--update)
| $prev
| -d (--delete)
| -r (--rename)
| -m (--move)
Output:
<div>
<a>anchor</a>
<p id="vol"/>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2">value 2</a>
</span>
</p>
<p/>
</div>
This example – a many-to-many move operation – uses xpath
arguments with
relative expressions and the EXSLT set:leading
function to group by h2
and move elements into
div
s.
See also: Moving nodes | -i (--insert)
| $prev
| -u … -x …
| -d (--delete)
| EXSLT in xpath
arguments
<doc>
<h1/>
<h2/><p1/><p2/>
<h2/><p3/><p4/><p5/>
<h2/><p id="p6"><v/></p><p/>
</doc>
xmlstarlet edit -O \
-i 'doc/h2' -t elem -n div \
-u '$prev' -x 'set:leading(following-sibling::*, following-sibling::div[1])' \
-d 'doc/div/following-sibling::*[not(self::div)]' \
"${infile:-file.xml}"
Output:
<doc>
<h1/>
<div>
<h2/>
<p1/>
<p2/>
</div>
<div>
<h2/>
<p3/>
<p4/>
<p5/>
</div>
<div>
<h2/>
<p id="p6">
<v/>
</p>
<p/>
</div>
</doc>
See also: Use
set:leading
and set:trailing
example
xmlstarlet edit’s formatting options
Test edit’s various formatting options on this input file:
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a
id="a2"
> value 2 </a
> </span>
</p>
<p id="empty"></p>
<p/>
</div>
See also: XML
parsing and serialization | Duplicate an element,
keep the formatting | -O (--omit-decl)
| -P (--pf)
| -S (--ps)
Default formatting:
xmlstarlet edit -O "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p/>
</div>
See also: select -I (--indent)
xmlstarlet edit -O --pf "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p/>
</div>
xmlstarlet edit -O --ps "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p/>
</div>
This combination appears to match that of
xmllint --pretty 2 file.xml
. It could prove a temptation to
a regex user (but beware).
Swopping --pf
and --ps
makes no
difference.
xmlstarlet edit -O --pf --ps "${infile:-file.xml}"
<div
>
<a
>anchor</a
>
<p
><span
><a
id="a1"
>value 1</a
></span
></p
>
<p
>
<span
> <a
id="a2"
> value 2 </a
> </span
>
</p
>
<p
id="empty"
/>
<p
/>
</div
>
Add a subnode:
xmlstarlet edit -O -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p>
<added/>
</p>
</div>
xmlstarlet edit -O --pf -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p><added/></p>
</div>
xmlstarlet edit -O --ps -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p><added/></p>
</div>
Delete whitespace-only text nodes:
xmlstarlet edit -O --pf -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div><a>anchor</a><p><span><a id="a1">value 1</a></span></p><p><span><a id="a2"> value 2 </a></span></p><p id="empty"/><p/></div>
See also: select -B (--noblanks)
xmlstarlet edit -O --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p/>
</div>
xmlstarlet edit -O --pf --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div
><a
>anchor</a
><p
><span
><a
id="a1"
>value 1</a
></span
></p
><p
><span
><a
id="a2"
> value 2 </a
></span
></p
><p
id="empty"
/><p
/></div
>
xmlstarlet format
The format
(aka fo
) command is an XML code formatter which accepts one
input file, default is stdin
.
See also: XML
parsing and serialization | select -I (--indent)
| Try out
edit
’s formatting options
format [option …] [«xml-file»]
-h (--help) - display help
-e (--encode) «encoding» - output in the given encoding
-n (--noindent) - do not indent
Sets indentation to zero spaces, left-aligning the output. Does not strip nonsignificant whitespace from input.
See also: select -B (--noblanks)
-o (--omit-decl) - omit XML declaration
Caution: Setting this option causes xmlstarlet format to return an exit value equal to the number of bytes written (or -1 in case of error) modulo 256 (src/xml_format.c#foProcess(), cf. <libxml/xmlIO.h>).
See also: XML declaration
-s (--indent-spaces) «N» - indent output with N spaces
Default indentation per level is 2 spaces.
-t (--indent-tab) - indent output with tabulation
-C (--nocdata) - replace CDATA section with text nodes
$ infile=$(mktemp)
$ printf '<v><![CDATA[A\t%s\nZ]]></v>' '"&'\''<>' > "${infile}"
$ :
$ xmlstarlet format -o -C "${infile}"
<v>A "&amp;'&lt;&gt;
Z</v>
$ :
$ xmlstarlet format -o "${infile}"
<v><![CDATA[A "&'<>
Z]]></v>
$ :
$ xmlstarlet pyx "${infile}"
(v
[A\t"&'<>\nZ
)v
See also: pyx
-D (--dropdtd) - remove the DOCTYPE of the input doc
Alternatives: xmllint --dropdtd a.xml | … or xsltproc --novalid b.xsl a.xml.
-H (--html) - input is HTML
Reads input using the libxml2 HTML 4.0 parser, cf. API reference.
Attempt to convert HTML – or broken XML – to usable XHTML:
wget -qO- "${url}" |
xmlstarlet -q format --html --recover --dropdtd --omit-decl > output
See also: global -q (--quiet)
option
Links: HTML Tidy | W3C’s
html-xml-utils | xmllint
-N (--nsclean) - remove redundant namespace declarations
See also: Remove namespace declarations
-Q (--quiet) [undocumented] - suppress error output
Does what the -q (--quiet) global option does.
-R (--recover) - try to recover what is parsable
See also: -H (--html)
If the input file is not well-formed XML (the typical use case for -R) it cannot be indented in the same process; use 2 steps instead:
xmlstarlet format -R -H file | xmlstarlet format -o -
.
--net
- allow network accessSee also: network access
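The caution under -o (--omit-decl) means the exit status says little about success: it is zero only when the byte count happens to be a multiple of 256. A minimal sketch of that arithmetic, assuming the modulo-256 behaviour described there; expected_status is a hypothetical helper, and no xmlstarlet call is involved:

```shell
# Hypothetical helper: the exit status a shell would see for a given return
# value, assuming the value is reduced modulo 256 as described in the caution.
expected_status() {
  echo $(( ( $1 % 256 + 256 ) % 256 ))
}

expected_status 20     # 20 bytes written
expected_status 256    # 256 bytes written: indistinguishable from success
expected_status 300    # 300 bytes written
expected_status -1     # error return value
```

Under that assumption it seems safer to inspect the output (or to validate the input beforehand) than to branch on the exit status when -o is in use.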
xmlstarlet c14n
The c14n
(aka canonic
) command is used to
convert an XML document to Canonical XML, a normal format
intended to allow relatively simple comparison of pairs of XML documents
for equivalence.
The W3C recommendations list examples of XML canonicalization.
Examples of the c14n
command are given in the source code’s
examples/c14n*.
Links: Canonical XML - Wikipedia | Canonical XML - W3C rec | Exclusive XML Canonicalization - W3C rec
Caution
xmlstarlet c14n
does not flag invalid options, cf. src/xml_C14N.c#c14nMain().
c14n [option] [«mode»] «xml-file» [«xpath-file»] [«inclusive-ns-list»]
-h (--help) - display help
--net - allow network access
See also: network access
«mode» - canonicalization mode
«mode» is one of the following:
--with-comments - canonicalization with comments (this is the default mode)
--without-comments - canonicalization without comments
--exc-with-comments - exclusive canonicalization with comments
--exc-without-comments - exclusive canonicalization without comments
«xml-file» - input XML document file name (stdin is used if -)
Basic use case:
xml-generator-command | xmlstarlet c14n |
{ xmlstarlet c14n expected.xml | diff -b -C 1 - /dev/fd/3; } 3<&0 || log …
«xpath-file» - XML file with document subset expression
Cf. document subset in the W3C recommendation.
Sample xpath-file, from examples/xml/c14n.xpath:
<?xml version="1.0"?>
<XPath xmlns:n0="http://a.example.com" xmlns:n1="http://b.example">
(//. | //@* | //namespace::*)[ancestor-or-self::n1:elem1]
</XPath>
«inclusive-ns-list» - the InclusiveNamespaces PrefixList as a comma-separated list of namespace prefixes (Caution: the user’s guide says blank-separated); for exclusive canonicalization only.
xmlstarlet validate
The validate
(aka val
) command performs
validation on XML documents. Examples of the val
command
are given in the source code’s examples/valid1.
NB: XML Schemas (XSD) are not fully supported due to
incomplete support in libxml2
.
Wikipedia links: XML schemas in general | XSD (W3C) | RELAX NG | DTD
See also: XML parsing and serialization | External entities
validate [option …] [«xml-file-or-uri» …]
-w (--well-formed) - validate well-formedness only (default)
-d (--dtd) «dtd-file» - validate against DTD
--net - allow network access
See also: network access
-s (--xsd) «xsd-file» - validate against XSD schema
-E (--embed) - validate using embedded DTD
-r (--relaxng) «rng-file» - validate against Relax-NG schema
-e (--err) - print verbose error messages on stderr
-S (--stop) - stop on first error
-b (--list-bad) - list only files which do not validate
-g (--list-good) - list only files which validate
-q (--quiet) - do not list files (return result code only)
xmlstarlet pyx, depyx
xmlstarlet’s pyx (aka xmln) and depyx (aka p2x) commands are used to convert XML to PYX and back during processing. PYX is a simple line-oriented text-based format usable with standard text tools such as grep, sed, or awk. Given xmlstarlet’s lack of native support for regular expressions this type of processing is occasionally useful, but beware of side effects: a pyx | depyx pipeline does not guarantee an accurate roundtrip.
pyx
uses a SAX parser.
PYX’s simplicity and lack of structure (and namespaces) makes it a good choice for certain types of operations – e.g. queries or editing of non-complex data like config files or database record sets – and a poor choice for handling complex documents or operations.
The PYX format lives a quiet life these days; xml.com
still carries its article on Pyxie
whereas IBM’s intro is now at archive.org.
The first character of each line of PYX indicates the type of parsing event:
char event
---- -----
( start-tag
) end-tag
A attribute or namespace
- character data
? processing instruction
C comment
[ CDATA section
D DTD declaration
N notation declaration
U unparsed entity
& external entity
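Since the event type is always the first character, a line-oriented tool can classify PYX input without any XML parsing. A minimal sketch, run on a hand-written PYX sample rather than real xmlstarlet pyx output; the pyx_tally name and the sample content are made up for illustration:

```shell
# Hypothetical helper: count PYX events by their leading type character.
pyx_tally() {
  awk '
    { count[substr($0, 1, 1)]++ }    # first character = event type
    END {
      printf "start-tags=%d end-tags=%d attributes=%d text=%d\n",
        count["("], count[")"], count["A"], count["-"]
    }
  '
}

# Hand-written sample standing in for the output of: xmlstarlet pyx file.xml
sample_pyx='(rec
Adate 2022-07-04
(name
-abc
)name
)rec'

printf '%s\n' "$sample_pyx" | pyx_tally
```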
Caution:
pyx
strips an XML declaration if present.
Caution:
&
(ampersand) is buggy, e.g. external entities, cf. src/xml_pyx.c.
Caution:
depyx
outputs non-collapsed empty elements, e.g.
<void></void>
.
Caution:
depyx
outputs XML special
characters inside comments as entity references,
e.g. &
as &amp;
.
Caution:
depyx
may output spurious newlines, for example after a
comment, cf. src/xml_depyx.c.
Links: packages.debian.org
xml2
Usage: pyx [--help] [«xml-file»]
Usage: depyx [--help] [«pyx-file»]
xmlstarlet pyx "${infile:-pom.xml}" | head -n 40
Output:
(project
Axmlns http://maven.apache.org/POM/4.0.0
Axmlns:xsi http://www.w3.org/2001/XMLSchema-instance
Axsi:schemaLocation http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd
-\n
(modelVersion
-4.0.0
)modelVersion
-\n\n
(groupId
-com.github.example8
)groupId
-\n
(artifactId
-maven-simple
)artifactId
-\n
(version
-0.2-SNAPSHOT
)version
-\n
(packaging
-jar
)packaging
-\n\n
(name
-Simple Maven example
)name
-\n
(url
-https://example8.io/#example8/maven-simple/0.1
)url
-\n\n
(dependencies
-\n
(dependency
-\n
(groupId
-junit
)groupId
Dvorak in tags’ text
xmlstarlet pyx '/usr/share/X11/xkb/rules/base.xml' | grep '^-.*Dvorak'
xmlstarlet pyx index.xhtml | sed -n 's/^Ahref //p'
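To see what the sed filter above keeps, it can be tried on a few lines of PYX typed by hand. A small sketch; the sample and the list_hrefs name are assumptions for illustration, and real input would come from xmlstarlet pyx:

```shell
# Hand-written PYX sample with two href attributes and one unrelated attribute.
sample_pyx='(a
Ahref https://example.org/one
Aclass ext
-first
)a
(a
Ahref two.html
-second
)a'

# Same sed program as above: print only href attribute values.
list_hrefs() {
  sed -n 's/^Ahref //p'
}

printf '%s\n' "$sample_pyx" | list_hrefs
```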
date attributes to extended ISO 8601 format
Using GNU sed (for the e flag of the s command) and GNU date (for the -d (--date) option and %F format):
xmlstarlet pyx "${infile:-file.xml}" |
sed -E '1v;/^(Adate )(.*)/ s//date -d "\2" "+\1%F"/e' |
xmlstarlet depyx
The do-nothing v
command fails on non-GNU
sed
s. The empty regex causes the last applied regex to be
reused.
datetime elements from basic to extended ISO 8601 format
xmlstarlet pyx "${infile:-file.xml}" |
sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }' |
xmlstarlet depyx
Assuming non-nested datetime
elements. The
/^-…/
condition leaves non-text nodes (incl. CDATA
sections) unmodified.
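The effect of the range address can be checked without xmlstarlet by feeding the same sed program a hand-written PYX sample. A sketch under that assumption; the sample data and the to_extended name are illustrative only:

```shell
# Hand-written PYX sample; the same digits appear inside and outside datetime.
sample_pyx='(doc
(datetime
-20220704194637
)datetime
(other
-20220704194637
)other
)doc'

# Same sed program as above, wrapped in a function for reuse.
to_extended() {
  sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }'
}

printf '%s\n' "$sample_pyx" | to_extended
```

Only the text line between the datetime start and end tags is rewritten; the identical text under other is left alone.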
foo elements
xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' '
$0=="(foo" {flag++; next;}
$0==")foo" {flag--; next;}
!flag
' |
xmlstarlet depyx
!flag
prints current line if flag
is
zero.
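The flag variable acts as a nesting counter, so nested foo elements are skipped as well. A sketch on a hand-written PYX sample (the sample and the drop_foo name are assumptions, not part of xmlstarlet):

```shell
# Same awk program as above, wrapped in a function.
drop_foo() {
  awk '
    $0=="(foo" {flag++; next}   # entering a foo element: start skipping
    $0==")foo" {flag--; next}   # leaving it: resume once the counter is zero
    !flag                       # print lines outside any foo element
  '
}

# Hand-written sample with a foo element nested inside another foo element.
sample_pyx='(doc
(a
-keep
)a
(foo
(foo
-nested
)foo
-drop
)foo
)doc'

printf '%s\n' "$sample_pyx" | drop_foo
```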
parent/element
This awk
script reads a PYX
-format file and
extracts each group
element having a parent
glist
element (including any nested ditto) to a separate
numbered file, converting each chunk from PYX
to XML by
invoking xmlstarlet depyx
. Alternatively, output
PYX
-format files and convert them in parallel.
Caution: Doesn’t
understand namespaces, and beware of pyx/depyx
side effects.
xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' -v partfmt='./part%04d.xml' -v element='group' -v parent='glist' '
/^\(/ { E[++level] = substr($0,2) }
$0 ~ "[()]" element "$" && E[level-1] == parent {
if ( !flag && "(" == substr($0,1,1) ) {
fxml = sprintf(partfmt,++partnum)
fpyx = fxml ".pyx.tmp"
flag=1
} else if ( flag ) {
print >> fpyx
close(fpyx)
system("xmlstarlet depyx " fpyx " > " fxml " && rm " fpyx)
flag=0
}
}
/^\)/ { --level }
flag { print >> fpyx }
'
See also: Create multiple result documents example
xmlstarlet escape
,
unescape
The escape (aka esc) command converts &<> to the equivalent &amp; &lt; &gt; entity references, taking its input from the first text string on the command line, or stdin if it’s - (dash) or absent.
The unescape
(aka unesc
) command does the
inverse. Caution:
unesc
leaves longer references such as
€
unmodified
(cf. MAX_ENTITY_NAME = 1+4
in src/xml_escape.c),
prints a diagnostic message, and returns zero.
See also: Special
characters | --xinclude
(for
parse="text"
)
escape [--help] [«text»]
unescape [--help] [«text»]
$ xmlstarlet escape 'a&<>'\''"z'
a&amp;&lt;&gt;'"z
$ :
$ xmlstarlet unescape 'a&amp;&lt;&gt;&apos;&quot;z'; printf '\n'
a&<>'"z
$ :
$ # Unicode U+20AC EURO SIGN
$ xmlstarlet esc '€'
€
$ :
$ xmlstarlet unesc 'a€	100z
'
entity name too long: €
a€ 100z
$ :
$ xmlstarlet esc < "${infile:-file.xml}"
&lt;foo&gt;if they treat children as they do &lt;bar&gt;documentation&lt;/bar&gt; they'll be &lt;bat&gt;prosecuted&lt;/bat&gt;&lt;/foo&gt;
$ :
$ xmlstarlet esc < "${infile:-file.xml}" | xmlstarlet unesc
<foo>if they treat children as they do <bar>documentation</bar> they'll be <bat>prosecuted</bat></foo>
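For comparison, the three replacements described for escape can be approximated with plain sed. This is a rough stand-in for illustration, not how xmlstarlet implements it, and esc_sketch is a made-up name:

```shell
# Rough stand-in for the &, < and > handling described above.
# The ampersand is replaced first so the later replacements are not re-escaped.
esc_sketch() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' 'a&<>z' | esc_sketch
```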
xmlstarlet list
The list
(aka ls
) command prints the
contents of a file system directory in XML format. It accepts a
directory name as its only argument; default is current dir. No
recursion option and no -h (--help)
option available.
list [«directory-name»]
xmlstarlet list /etc/sgml
Output:
<dir>
<f p="rw-r--r--" a="20220704T194637Z" m="20211001T213455Z" s="376" n="docbook-xml.cat"/>
<d p="rwxr-xr-x" a="20220706T075545Z" m="20220419T100344Z" s="4096" n="docbook-xml"/>
<l p="rwxrwxrwx" a="20220706T075532Z" m="20220702T102624Z" s="31" n="catalog"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20201229T232017Z" s="652" n="sgml-data.cat"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20190227T001849Z" s="45" n="xml-core.cat"/>
</dir>
Elements inside the dir
document element have a one-char
name indicating the file type,
f regular file
d directory
c character device
b block device
l symlink
p FIFO
s socket
u unknown
and attributes as returned by stat
:
p read-write-execute permissions for user, group, and other
a UTC time of last access in ISO 8601 basic format
m UTC time of last modification in ISO 8601 basic format
s file size in bytes
n filename
See man 7 inode
for permissions s
(S_ISUID,
S_ISGID) and t
(S_ISVTX).
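The a and m timestamps are in ISO 8601 basic format. If the extended format is preferred, a sed filter along these lines may do. A sketch only, run here on one line copied from the sample output; to_extended_times is a made-up name:

```shell
# One line copied from the sample output above.
sample='<f p="rw-r--r--" a="20220704T194637Z" m="20211001T213455Z" s="376" n="docbook-xml.cat"/>'

# Rewrite the a= and m= values from basic to extended ISO 8601 format.
to_extended_times() {
  sed -E 's/ ([am])="([0-9]{4})([0-9]{2})([0-9]{2})T([0-9]{2})([0-9]{2})([0-9]{2})Z"/ \1="\2-\3-\4T\5:\6:\7Z"/g'
}

printf '%s\n' "$sample" | to_extended_times
```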
xmlstarlet transform
The transform
(aka tr
) command is an XSLT
processor supporting XSLT 1.0 plus several EXSLT,
crypto
, and saxon
extensions.
xmlstarlet transform
returns the same system-property()
values as xmlstarlet select
and xsltproc
:
xsl:vendor libxslt
xsl:vendor-url http://xmlsoft.org/XSLT/
xsl:version 1.0
Caution:
xmlstarlet transform
doesn’t flag invalid options (src/xml_trans.c#trParseOptions()).
Caution: The
--catalogs
option mentioned in the user’s
guide was never implemented, it seems; not listed by
xmlstarlet transform --help
.
transform [option …] «xsl-file» [-p|-s «name»=«value» …] [«xml-file-or-uri» …]
-h (--help) - display help
--omit-decl - omit XML declaration
See also: XML declaration
-E (--embed) - allow applying embedded stylesheet
Links: <?xml-stylesheet?> - W3C recommendation | Embedding stylesheets - W3C XSLT 1.0 Rec
With an e.xml
XML document containing an
<?xml-stylesheet type="text/xml" href="e.xsl"?>
processing instruction before the document element, the
following command will run the XSLT stylesheet e.xsl
on
e.xml
.
xmlstarlet tr -E e.xml > output
This option is mentioned in doc/xmlstarlet.txt but not in the user’s guide.
--show-ext - show list of extensions
Prints a list of registered XSLT extensions to stderr and terminates.
--val - allow validate against DTDs or schemas
--net - allow fetch DTDs or entities over network
See also: network access
--xinclude - do XInclude processing on document input
Links: XML Inclusions XInclude - W3C recommendation
See also: the XSLT document() function
Basic XInclude example: include file2.xml in file1.xml.
cat << 'HERE' > 'file1.xml'
<root>
<gs>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="file2.xml"
xpointer="xpointer(//g[@id='items']/*)"
parse="xml"
/>
</gs>
</root>
HERE
cat << 'HERE' > 'file2.xml'
<doc><g id="items"><g1/><g2/><g3/><g4/></g></doc>
HERE
xmlstarlet select -C -t -c / |
xmlstarlet transform --xinclude /dev/stdin 'file1.xml'
xmlstarlet transform --xinclude does the XInclude processing using an XSLT stylesheet (generated on the fly by xmlstarlet select) which duplicates its input by copying the root node (/).
xi:include elements may appear in both the including and the included file(s).
parse="xml" is the default and may be omitted, the only other option is parse="text" (sample output below).
The href attribute must refer to an XML document.
If the href attribute is absent (or empty) when parse="xml" it refers to the including document, in which case the xpointer attribute must be present.
If no xpointer attribute is given, or its value is set to xpointer(/), the entire inclusion target will be included.
xml:base attributes will appear in output (cf. XML Base - W3C rec) unless the inclusion source and target(s) use a shared include location (hint: use absolute pathnames).
An xmlns:xi namespace node will appear in output if given outside the xi:include element (and may prove rather sticky).
Output:
<root>
<gs>
<g1/><g2/><g3/><g4/>
</gs>
</root>
If instead parse="text"
:
<root>
<gs>
&lt;doc&gt;&lt;g id="items"&gt;&lt;g1/&gt;&lt;g2/&gt;&lt;g3/&gt;&lt;g4/&gt;&lt;/g&gt;&lt;/doc&gt;
</gs>
</root>
An alternative to XInclude or document()
:
The hxincl
utility from the W3C html-xml-utils
package is
HTML/XML-aware and expands certain embedded comments – or prints a
makefile
rule listing the dependent include files –
e.g. hxincl -x -s incfnm=file2.xml file1.xml
.
--maxdepth value - increase the maximum depth
Used to detect template loops, cf. variable xsltMaxDepth in xslt.h.
--html - input document(s) are in HTML format
Reads input using the libxml2 HTML 4.0 parser, cf. XML parsing and serialization.
«xsl-file» - main XSLT stylesheet for transformation
Cf. option -E (--embed).
-p - parameter is an XPath expression
-s - parameter is a string literal
-p and -s are repeatable, up to a maximum of 256 key-value pairs.
«name»=«value» - name and value of the parameter passed to XSLT processor
E.g.
… -p m1='"Hello, XSLT"' -s m2='0xab 0xbb' file.xml
«xml-file» - input XML document file name (stdin is used if missing)
This parameter is repeatable.
Links: Understanding XML namespaces - Evan Lenz | Namespaces in XML 1.0 / 1.1 - W3C rec | The “xml:” namespace - W3C memo | Namespaces at Pawson Q&A
xmlstarlet
predefines the namespaces xml
,
xsl
, and those used with EXSLT
functions and elements minus crypto
plus
saxon
. By default (global option --doc-namespace
being
in effect) select
and edit
can use the
namespaces declared in the root element (outermost
element) of the first input file without explicit -N «prefix»=«value»
options; if
the default namespace is declared there it is bound to the
_
(underscore) (aka DEFAULT
) prefix.
A QName (qualified name) with no prefix appearing in an XPath expression uses the null namespace, not the default namespace.
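The difference between the null namespace and the default namespace can be tried on a throwaway file. A sketch, assuming xmlstarlet is on the PATH and --doc-namespace is in effect as described above; the ns_demo name, the urn:example:demo URI, and the sample document are made up:

```shell
# Hypothetical demo: count matches with and without the _ prefix.
ns_demo() {
  tmp=$(mktemp) || return 1
  printf '%s\n' '<doc xmlns="urn:example:demo"><e>1</e></doc>' > "$tmp"
  # Unprefixed names use the null namespace, so this should count no nodes.
  xmlstarlet select -T -t -v 'count(/doc/e)' -n "$tmp"
  # The _ prefix is bound to the default namespace declared on the root element.
  xmlstarlet select -T -t -v 'count(/_:doc/_:e)' -n "$tmp"
  rm -f "$tmp"
}

# Run only where xmlstarlet is installed.
if command -v xmlstarlet >/dev/null 2>&1; then
  ns_demo
fi
```

If the behaviour described above holds, the first count should be 0 and the second 1.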
Prefixed namespace:
xmlstarlet select --text -t \
-m 'set:distinct(//mime:mime-type/@type)' -v '.' -n \
recently-used.xbel
Default namespace:
xmlstarlet edit --inplace --pf \
-u '/_:html/_:head/_:link/@href[.="www.css"]' -v 'solarized.css' \
-d '//_:*[contains(@class,"pull-quote")] | //_:aside' \
article.xhtml
Null namespace:
xmlstarlet select -T -t \
-m 'recs/rec' -v '@date' -n \
file.xml
See also: User’s guide ch. 5 | Undefined namespace prefix error | name bound to undefined prefix error
$ cat nspre.xml
<p:r xmlns:p="urn:ns1">r1
<r xmlns="urn:ns2">r2
<p:e>e1</p:e>
<e>e2</e>
</r>
</p:r>
$ :
$ xmlstarlet select -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r1
e1
$ :
$ xmlstarlet select -N p='urn:ns2' -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r2
e2
A namespace declaration cannot be created directly with XSLT
1.0[1]. It’s done by adding element and attribute nodes which
have a (possibly null) namespace and a local name. Hint: In the
following examples, add -C (--comp)
before
select
’s -t
option to list the generated XSLT
code.
[1] xmlstarlet edit
isn’t so picky: see edit -N
| Create a SOAP envelope
See also: select -R (--root)
echo '<v/>' |
xmlstarlet select -N m=urn:ssssssssskeyssssstickingagain:local -t \
-e m:doc -a flag -o 1 -b -a m:flag -o 0
<m:doc xmlns:m="urn:ssssssssskeyssssstickingagain:local" flag="1" m:flag="0"/>
echo '<v/>' |
xmlstarlet select -N ''='https://www.example.org' -t \
-e 'doc' -a 'flag' -v '"x"'
<doc xmlns="https://www.example.org" flag="x"/>
printf '<fi/>' |
xmlstarlet select -t \
-e fee -a faw -o fum -b -e fi -e fo -b -o fum
<fee faw="fum"><fi><fo/>fum</fi></fee>
Input file:
<h:rs id="hrs" xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h">
<f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
</h:rs>
Query:
xmlstarlet select -t \
-m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs xmlns="urn:e" id="hrs"/>
Use -N ''=''
for xmlns=""
:
xmlstarlet select -N = -t \
-m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs id="hrs"/>
Edit:
xmlstarlet edit --omit-decl --pf \
-s 'h:rs' -t elem -n 'foo' -v 'bar' \
-s '$prev' -t attr -n 'xmlns' -v '' \
"${infile:-file.xml}"
<h:rs xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h" id="hrs">
<f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
<foo xmlns="">bar</foo></h:rs>
xmlstarlet edit -m '//namespace::xsi' '/_:doc/_:el' examples/xml/S0.xml
returns non-zero and the error message
FIXME: can't move namespace nodes
.
Links: examples/xml/S0.xml
xmlstarlet edit -d '//namespace::xsi' examples/xml/S0.xml
returns non-zero and the error message
FIXME: can't delete namespace nodes
.
Links: examples/xml/S0.xml
Tools to remove redundant namespace declarations include xmlstarlet format’s --nsclean option, xmlstarlet c14n, and the --nsclean option of xmllint – all with side effects – but they won’t remove xmlns:xi nodes left by XInclude processing.
xml2/2xml
or pyx/depyx
and grep
can do the
doctoring (Caution: no
questions asked):
xml2 < file.xml | grep -v '^/doc/@xmlns:xi' | 2xml > newfile.xml
Caution:
xmlstarlet edit
silently ignores the namespace of an
inserted node referencing a previously inserted node having a namespace
prefix.
For instance, to insert an element such as
<ns1:c class="caveat"/>
it’s logical to say,
xmlstarlet edit \
-s '/a/b' -t elem -n 'ns1:c' \
-s '/a/b/ns1:c' -t attr -n 'class' -v 'caveat' \
file.xml
but the output will not contain the attribute node as the following
-s
(or -i
or -a
) option returns
an empty nodeset. In other words ns1:c
gets inserted but is
not available as such in following edit
actions. This is on
the to-do list as hinted by NULL /* TODO: NS */
in src/xml_edit.c#edInsert().
Workaround: Use the $prev
back
reference instead, as in … -s '$prev' -t attr …
.
See also: -s (--subnode)
Clark notation
Links: XPath recommendation: Namespace nodes | namespace axis
<doc xmlns="http://www.example.org"
xmlns:xi="http://www.w3.org/2001/XInclude">
a
<xi:include href="b.xml"/>
b
<c xmlns="urn:my:local"/>
<d xmlns="">In no namespace</d>
</doc>
xmlstarlet select -T -t \
-m 'set:distinct(//namespace::*)' \
-v 'concat("{",.,"}",name())' -n \
"${infile:-file.xml}"
Output:
{http://www.w3.org/XML/1998/namespace}xml
{http://www.w3.org/2001/XInclude}xi
{http://www.example.org}
{urn:my:local}
{}
Links: SOAP
on
Wikipedia
printf '%s' '<v/>' |
xmlstarlet select --xml-decl --indent \
-N xsi='http://www.w3.org/2001/XMLSchema-instance' \
-N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
-N my='http://www.example.org/myService' \
-t \
-e 'soapenv:Envelope' \
-e 'soapenv:Header' -o '' -b \
-e 'soapenv:Body' \
-e 'my:Service' \
-e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
-e 'Param2' -a 'xsi:type' -o 'string' -b -o 'message' -b
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header/>
<soapenv:Body>
<my:Service xmlns:my="http://www.example.org/myService">
<Param1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="integer">1</Param1>
<Param2 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="string">message</Param2>
</my:Service>
</soapenv:Body>
</soapenv:Envelope>
To force namespace nodes into the root element (namespace normalization) append dummy attributes there, then strip them:
printf '%s' '<v/>' |
xmlstarlet select \
-N xsi='http://www.w3.org/2001/XMLSchema-instance' \
-N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
-N my='http://www.example.org/myService' \
-t \
-e 'soapenv:Envelope' -a 'xsi:nslift' -b -a 'my:nslift' -b \
-e 'soapenv:Header' -o '' -b \
-e 'soapenv:Body' \
-e 'my:Service' \
-e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
-e 'Param2' -a 'xsi:type' -o 'string' -b -o 'message' -b \
| xmlstarlet edit -d 'soapenv:*/@xsi:nslift | soapenv:*/@my:nslift'
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:my="http://www.example.org/myService">
<soapenv:Header/>
<soapenv:Body>
<my:Service>
<Param1 xsi:type="integer">1</Param1>
<Param2 xsi:type="string">message</Param2>
</my:Service>
</soapenv:Body>
</soapenv:Envelope>
To have xmlstarlet edit
produce the latter version, for
example:
printf '%s' '<v/>' |
xmlstarlet edit \
-r '*' -v 'soapenv:Envelope' \
-a '*' --type attr -n 'xmlns:soapenv' -v 'http://schemas.xmlsoap.org/soap/envelope/' \
-a '*' --type attr -n 'xmlns:xsi' -v 'http://www.w3.org/2001/XMLSchema-instance' \
-a '*' --type attr -n 'xmlns:my' -v 'http://www.example.org/myService' \
-s '*' --type elem -n 'soapenv:Header' -v '' \
-s '*' --type elem -n 'soapenv:Body' \
-s '$prev' --type elem -n 'my:Service' \
--var svc '$prev' \
-s '$svc' --type elem -n 'Param1' -v '1' \
-s '$prev' --type attr -n 'xsi:type' -v 'integer' \
-s '$svc' --type elem -n 'Param2' -v 'message' \
-s '$prev' --type attr -n 'xsi:type' -v 'string'
A non-exhaustive list of xmlstarlet
messages
follows.
See xmlstarlet
user’s guide ch. 5
“Namespaces and default namespace”.
See also: Use a namespace
xmlstarlet edit
displays its usage reminder)… but offers no other clue
text
node
without a -n (--name)
or -v (--value)
clause?-v (--value)
clause?Triggered by:
--net
option,
e.g. when the input document has a DOCTYPE declaration (DTD)See also: --dropdtd
See failed to load external
entity (re stdin
)
Only select
, transform
, and
canonic
seem to understand an entity reference; see XML external entities.
Triggered by:
--net
optionuri
parameter using
HTTPS protocolstdin
add a -
(dash) to
the command line to work around parsing issues, e.g. if
format
’s -e (--encode) «encoding»
is
the last option (explanation: «encoding»
mistaken for
filename in src/xml_format.c#foProcess(),
similarly in src/xml_validate.c#valParseOptions())See Delete a namespace.
See Move a namespace.
See xpath
arguments.
xmlstarlet select
fails with a run-time error if input
file pathnames on the command line contain single quotes. This is a
known bug #123.
Workaround: use %27
(as in foo%27bar.xml
),
stdin
or load via document()
.
See also: Global options
The destination operand of xmlstarlet edit
’s -m (--move)
option does not exist
or is not a single element node.
This is a message from the XML parser: a warning but not necessarily
an error. Recall that :
(colon) in component names is tolerated but unrecommended
as it makes a document not namespace-well-formed.
Example:
<doc><div vid="yo" abc:txt="hello"/></doc>
To copy the value of @abc:txt
to @vid
, for
example:
xmlstarlet -q edit \
-u '*/*[@*[local-name()="abc:txt"][namespace-uri()=""]]/@vid' \
-x 'string(../@*[local-name()="abc:txt"][namespace-uri()=""])' \
file.xml
where
-q
option
suppresses messages from the parser about the missing namespace
definitionlocal-name()
and namespace-uri()
are used as a workaround in the special case where an unprefixed name
contains a :
(colon), because @abc:txt
would
cause the parser to look for the non-existing abc
namespaceSee also: Use a namespace | Undefined namespace prefix
Triggered by
--doc-namespace
and
prefix not declared in input’s root element or with -N
option--no-doc-namespace
and prefix not declared with -N
optionSee also: Use a namespace | Namespace prefix «name» … is not defined
edit
’s -N
option
must be the last non-action option.
In libxml2
the XML_PARSE_HUGE
option is
disabled by default to prevent denial-of-service attacks. This triggers
the xmlSAX2Characters: huge text node: out of memory
error
when loading a text node larger than 10 MB.
For a workaround see this patch.
Triggered by
select
’s --var «name»=«value»
namespace issuecrypto
namespace prefix; see EXSLTSee also: Use a namespace | select -N
.
xmlstarlet edit
’s xpath
arguments do not
support XSLT functions.
See EXSLT.
One -b (--break) too many was used.
EXSLT is an extension library for XSLT, mainly for XSLT 1.0. It provides missing language features such as functions to handle strings, math, dates, and sets, as well as nodeset coercion, user-defined functions, and dynamic evaluation of strings containing XPath expressions.
Linked with the libexslt library, xmlstarlet’s XSLT processing commands, select and transform, support a larger number of EXSLT functions (and a few elements) whereas xmlstarlet edit supports a subset.
xmlstarlet
predefines the EXSLT namespaces with prefixes
date
, dyn
, exslt
(not
exsl
), math
, set
, and
str
, as well as the saxon
and
test
namespaces. To use crypto
functions (or
the func
elements) declare the namespace explicitly with -N
, for the exslt:document
element see example.
Note: In 2012 str:replace
was removed as broken
when used in an XPath context (xmlstarlet edit
) but remains
available when used in an XSLT context (select
and
transform
).
See also: List of XSLT
extensions | transform --show-ext
|
select -N
Links: EXSLT docs on github.io | EXSLT project on github.com | EXSLT on stackoverflow.com
Observations:
libexslt
’s math:random
appears to
use the standard C random generator, returning the same number sequence
(beginning with 0.84018771715471
) in every
xmlstarlet
sessionlibexslt
(not EXSLT) limits the length of str:padding
to 100000
(one hundred thousand)crypto
- 5 functions in namespace
http://exslt.org/crypto
(md4
,
md5
, rc4_decrypt
, rc4_encrypt
,
sha1
), cf. libexslt/crypto.c
source codesaxon
- 5 functions in namespace
http://icl.com/saxon
(eval
,
evaluate
, expression
,
line-number
, systemId
), from the saxon 6.5.5
extensionstest
- function and element in namespace
http://xmlsoft.org/XSLT/
- echoes its argument
Compute the height of an XML tree as the maximum depth of a branch node of the tree. Root and leaf nodes count as zero.
-t (--template) makes / (root node) the current node.
The dyn:map function maps each XML element (root node descendants) to its depth: the number of its ancestor elements.
math:max returns the maximum number.
xmlstarlet select -T -t \
-v 'math:max(dyn:map(descendant::*,"count(ancestor::*)"))' -n \
"${infile:-file.xml}"
Links: Tree (data structure) on Wikipedia
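As a rough cross-check of the EXSLT query, the same maximum can be derived from PYX-style input by tracking the nesting level of start and end tags. A sketch on a hand-written sample; pyx_height is a hypothetical helper and this is an alternative technique, not what the query above does internally:

```shell
# Hypothetical helper: maximum number of ancestor elements over all elements.
pyx_height() {
  awk '
    /^\(/ { if (level > max) max = level; level++ }  # level = open ancestors
    /^\)/ { level-- }
    END   { print max + 0 }
  '
}

# Hand-written sample standing in for: xmlstarlet pyx file.xml
sample_pyx='(a
(b
(c
)c
)b
(d
)d
)a'

printf '%s\n' "$sample_pyx" | pyx_height
```

For this sample the deepest element is c, which has two ancestor elements (a and b).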
A simple recordset enclosed in a root element,
<rs>
<r id="1" user="3" name="abc" date="2017-08" flag1="false"/>
<r id="2" user="7" name="defg" date="2019-12" flag1="false"/>
<r id="3" user="9" name="hijkl" date="2020-02" flag1="true"/>
<r id="4" user="11" name="mno" date="2022-01" flag1="false"/>
<r id="5" user="14" name="pqrs" date="2022-01" flag1="false"/>
</rs>
is converted to TSV by:
xmlstarlet select --text -t \
--var ishdr="${hdr:-1}" \
--var ofs -o "$(printf '\t')" -b \
--var ors -n -b \
--var fnhdr='"concat($ofs,name())"' \
--var fnrow='"concat($ofs,string())"' \
-m '*/*[$ishdr and position() = 1]' \
-v 'substring-after(str:concat(dyn:map(@*,$fnhdr)),$ofs)' -v '$ors' \
-b \
-m '*/*' \
-v 'substring-after(str:concat(dyn:map(@*,$fnrow)),$ofs)' -v '$ors' \
-b \
"${infile:-file.xml}"
where:
ofs and ors variables hold output field and record separators, respectively
ishdr flag controls output of a header line with attribute names
fnhdr is the function argument (as text) to the EXSLT dyn:map function mapping an attribute node to its name() preceded by a field separator; dyn:map returns a nodeset which is stringified by the EXSLT str:concat function, then stripped of the initial extra separator by substring-after()
fnrow does the same to map an attribute node to its text value; in this case . (dot) can replace string()
*/* and @* only (both in 2 places)
TSV output:
id user name date flag1
1 3 abc 2017-08 false
2 7 defg 2019-12 false
3 9 hijkl 2020-02 true
4 11 mno 2022-01 false
5 14 pqrs 2022-01 false
If the data items exist as child elements of /rs/r
(e.g. after:
xmlstarlet sel -t -e '{name(*)}' -m '*/*' -e '{name()}' -m '@*' -e '{name()}' -v . "${infile}"
), use dyn:map(*,…) instead (2 places) to process child::* rather than attribute::*.
Links: packages.debian.org xml2
h2 sections with foo titles select div with the most p children
Process an HTML document where each h2 element heads a number of divs:
xmlstarlet select -t \
--var T='//_:div[_:p][contains(preceding::_:h2[1]/text(),"foo")]' \
-c '($T[count(_:p) = math:max(dyn:map($T,"count(_:p)"))])[1]' \
file.xhtml
T collects the div nodes with at least one p child following each h2 title containing the text foo
dyn:map maps each div to the count of its p children, math:max picks the maximum count
($T[…])[1] selects the first of possibly more divs with a maximum p count
This example converts between local time (L) and UTC time (Z) in different time zones, with the TZ environment variable selecting an entry in /usr/share/zoneinfo, and puts the EXSLT functions date:date-time, date:add, date:duration, and date:seconds to use.
Links: EXSLT dates-and-times docs | tz database on Wikipedia
for zone in \
'America/Vancouver' 'Europe/Vatican' 'Asia/Manila' 'Pacific/Chatham'
do
printf '<v/>\n' |
TZ=":$zone" xmlstarlet select --text \
-t \
--var ofs -o "$(printf '\t')" -b \
--var ors -n -b \
--var tz -o "$zone" -b \
--var dttodayL -v 'date:date-time()' -b \
--var tzoffset='substring($dttodayL,20)' \
--var tzseconds='number(concat(
translate(substring($tzoffset,1,1),"-−+","--"),
substring($tzoffset,2,2) * 60 * 60 +
substring($tzoffset,5,2) * 60))' \
--var dtepochL='concat("1970-01-01T00:00:00",$tzoffset)' \
--var dtepochZ -o '1970-01-01T00:00:00+00:00' -b \
--var dttodayZ='concat(substring-before(date:add($dtepochZ,
date:duration(date:seconds($dttodayL))),
"Z"),"+00:00")' \
--var dttoday2L='date:add($dtepochL,
date:duration(date:seconds($dttodayZ)+$tzseconds))' \
-v 'concat(
"# ",$tz,$ors
,"dttodayL", $ofs, $dttodayL, $ors
,"tzoffset", $ofs, $tzoffset, $ors
,"tzseconds",$ofs, $tzseconds,$ors
,"dttodayZ", $ofs, $dttodayZ, $ors
,"dttoday2L",$ofs, $dttoday2L,$ors
)'
done |
expand -t 16
Using … -v 'date:date-time()' -b
(rather than
…='date:date-time()'
) to avoid
xmlstarlet select
’s EXSLT namespace issue.
Output:
# America/Vancouver
dttodayL 2025-03-12T13:37:19-07:00
tzoffset -07:00
tzseconds -25200
dttodayZ 2025-03-12T20:37:19+00:00
dttoday2L 2025-03-12T13:37:19-07:00
# Europe/Vatican
dttodayL 2025-03-12T21:37:19+01:00
tzoffset +01:00
tzseconds 3600
dttodayZ 2025-03-12T20:37:19+00:00
dttoday2L 2025-03-12T21:37:19+01:00
# Asia/Manila
dttodayL 2025-03-13T04:37:19+08:00
tzoffset +08:00
tzseconds 28800
dttodayZ 2025-03-12T20:37:19+00:00
dttoday2L 2025-03-13T04:37:19+08:00
# Pacific/Chatham
dttodayL 2025-03-13T10:22:19+13:45
tzoffset +13:45
tzseconds 49500
dttodayZ 2025-03-12T20:37:19+00:00
dttoday2L 2025-03-13T10:22:19+13:45
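The tzseconds arithmetic is easy to cross-check outside XPath. A minimal sketch in plain POSIX shell, assuming offsets of the form +hh:mm or -hh:mm as in the output above; offset_seconds is a made-up helper name, and unlike the XPath version it handles neither a bare Z nor the Unicode minus sign.

```shell
# offset_seconds: convert a zone offset of the form +hh:mm or -hh:mm to seconds
# (hypothetical helper, not part of xmlstarlet)
offset_seconds() {
  off=$1
  rest=${off#?}                 # hh:mm
  sign=${off%"$rest"}           # leading + or -
  hh=${rest%%:*}
  mm=${rest##*:}
  # strip one leading zero so 08 and 09 aren't taken as octal numbers
  secs=$(( ${hh#0} * 3600 + ${mm#0} * 60 ))
  case $sign in
    (-) secs=$(( 0 - secs )) ;;
  esac
  printf '%s\n' "$secs"
}

offset_seconds -07:00   # -25200
offset_seconds +13:45   # 49500
```

Both values agree with the tzseconds lines for America/Vancouver and Pacific/Chatham above.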
set:leading
and
set:trailing
Links: EXSLT set
functions on github.io
Using an explicit namespace declaration -N str='…'
to
avoid xmlstarlet select
’s EXSLT namespace issue.
printf '%s\n' '<v s="/fee/fi/fo/fum"/>' |
xmlstarlet select -R \
-N str='http://exslt.org/strings' \
-t \
--var sep='"/"' \
--var T='str:split(*/@s,$sep)' \
-n -c '$T' -n \
-n -c 'set:leading($T,$T[.="fo"])' -n \
-n -m '$T' -c 'set:leading($T,following-sibling::*[1])' -n -b \
-n -m '$T' -c 'set:trailing($T,preceding-sibling::*[1])' -n -b \
-n -e 'foo' -m 'set:trailing($T,$T[.="fee"])' -v 'concat($sep,.)' -b -b -n
Output:
<xsl-select>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fee</token><token>fi</token>
<token>fee</token>
<token>fee</token><token>fi</token>
<token>fee</token><token>fi</token><token>fo</token>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fi</token><token>fo</token><token>fum</token>
<token>fo</token><token>fum</token>
<token>fum</token>
<foo>/fi/fo/fum</foo>
</xsl-select>
See also: Divide a document into sections | Generate a date sequence
Links: ISO
8601 standard on Wikipedia | Daylight saving time (DST) on
Wikipedia | TZ
env.var. on OpenGroup
This is where the EXSLT strings, sets, and dates-and-times modules come together to compute a datetime series from 3 arguments:
start, default value is today in ISO 8601 extended format
step, default value is 1 day in ISO 8601 format
maxct, the maximum count, default value is 100
XSLT doesn’t do loops but EXSLT lets you create a string of any length and str:split it into a nodeset each member of which contains the step interval. An initial empty time period (PT0S) is added to handle the first item. For each item in the series, set:leading collects the first N steps, date:sum adds them up, and the sum is added to start to arrive at the result. It’s an inefficient algorithm so nil points for performance (though probably fast enough for ordinary maxct values).
xsdateseq0() {
printf '<v start="%s" step="%s" maxct="%s"/>\n' \
"${1:-$(date '+%Y-%m-%d')}" "${2:-P1D}" "${3:-100}" |
xmlstarlet select --text \
-N str='http://exslt.org/strings' \
-t --var start='*/@start' \
--var padlen='(*/@maxct - 1) * (1 + string-length(*/@step))' \
--var D='str:split(concat("PT0S",str:padding($padlen, concat(" ",*/@step))))' \
-m '$D' -v 'date:add($start, date:sum(set:leading($D,following-sibling::*[1])))' -n
}
Using an explicit namespace declaration -N str='…'
to
avoid xmlstarlet select
’s EXSLT namespace issue.
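To see what $D is split from, here is the same string built with a plain shell loop. A sketch only: pad_length and step_list are made-up names, and the loop stands in for str:padding, which repeats its pad string up to the requested length.

```shell
# pad_length: the length handed to str:padding,
# (maxct - 1) * (1 + length of step)
pad_length() {  # usage: pad_length STEP MAXCT
  echo $(( ($2 - 1) * (1 + ${#1}) ))
}

# step_list: the space-separated string that str:split turns into MAXCT tokens
step_list() {  # usage: step_list STEP MAXCT
  list='PT0S' i=1
  while [ "$i" -lt "$2" ]; do
    list="$list $1"
    i=$((i + 1))
  done
  printf '%s\n' "$list"
}

pad_length P7D 3    # 8, the length of " P7D P7D"
step_list P7D 3     # PT0S P7D P7D
```

Item N of the series is then start plus the sum of the first N tokens, which is why the empty PT0S period is needed for item 1.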
libexslt
doesn’t support date:format-date
but
there’s an implementation
(EXSLT function and XSLT template) by Jeni Tennison.
Print 53 dates starting on January 1st with a step value of 7 days.
TZ=':Europe/Vatican' xsdateseq0 '2025-01-01' P7D 53 | pr -t -8 -s' ' -
2025-01-01 2025-02-19 2025-04-09 2025-05-28 2025-07-16 2025-09-03 2025-10-15 2025-11-26
2025-01-08 2025-02-26 2025-04-16 2025-06-04 2025-07-23 2025-09-10 2025-10-22 2025-12-03
2025-01-15 2025-03-05 2025-04-23 2025-06-11 2025-07-30 2025-09-17 2025-10-29 2025-12-10
2025-01-22 2025-03-12 2025-04-30 2025-06-18 2025-08-06 2025-09-24 2025-11-05 2025-12-17
2025-01-29 2025-03-19 2025-05-07 2025-06-25 2025-08-13 2025-10-01 2025-11-12 2025-12-24
2025-02-05 2025-03-26 2025-05-14 2025-07-02 2025-08-20 2025-10-08 2025-11-19 2025-12-31
2025-02-12 2025-04-02 2025-05-21 2025-07-09 2025-08-27
Print 10 datetimes with a step value of 1 day, 1 hour, 1 minute, and
5 seconds.
Note the lack of DST adjustment.
TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T07:30:00+01:00' 'P1DT1H1M5S' 10
2022-10-26T07:30:00+01:00
2022-10-27T08:31:05+01:00
2022-10-28T09:32:10+01:00
2022-10-29T10:33:15+01:00
2022-10-30T11:34:20+01:00
2022-10-31T12:35:25+01:00
2022-11-01T13:36:30+01:00
2022-11-02T14:37:35+01:00
2022-11-03T15:38:40+01:00
2022-11-04T16:39:45+01:00
Print 8 datetimes in email format (RFC 822). Uses GNU
date
for -R
and -f
options
and DST adjustment.
TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T11:30:00' '' 8 | date -Rf-
Wed, 26 Oct 2022 13:30:00 +0200
Thu, 27 Oct 2022 13:30:00 +0200
Fri, 28 Oct 2022 13:30:00 +0200
Sat, 29 Oct 2022 13:30:00 +0200
Sun, 30 Oct 2022 12:30:00 +0100
Mon, 31 Oct 2022 12:30:00 +0100
Tue, 01 Nov 2022 12:30:00 +0100
Wed, 02 Nov 2022 12:30:00 +0100
xmlstarlet select
doesn’t support xsl:key
but grouping can be done using EXSLT functions. As an example, group
repeating fields in each record by element name and merge their texts in
document order.
<recs>
<rec>
<fb>fee</fb>
<fa>foo</fa>
<fd>zzz</fd>
<fc>bat</fc>
<fa>bar</fa>
<fb>faw</fb>
<fd>bat</fd>
<fb>fum</fb>
<fa>quux</fa>
</rec>
<rec>
<fa>fee</fa>
<fc>fo</fc>
<fc>fum</fc>
<fa>fi</fa>
</rec>
</recs>
xmlstarlet select --indent -t \
--var sfs="'${sfs:- }'" \
-e '{name(*)}' \
-m '*/*' \
--var rec='.' \
-e '{name()}' \
-m 'set:distinct(dyn:map(*,"name()"))' \
-s 'A:T:-' '.' \
-e '{.}' \
-v 'substring-after(
str:concat(
dyn:map($rec/*[name()=current()],"concat($sfs,text())")
)
,$sfs)' \
"${infile:-file.xml}"
sfs takes its value from a shell variable of the same name, defaulting to a single space character
-e (--elem) opens a named element using an attribute value template, duplicating the input structure
-m (--match) iterates over rec elements, the 2nd over unique field names (also used as sort key): dyn:map maps fields to their name and set:distinct eliminates duplicates
rec elements with the same name as the current field dyn:map adds a sub-field separator to each text, returning a nodeset which is stringified by str:concat then stripped of the initial extra separator
Output:
<recs>
<rec>
<fa>foo bar quux</fa>
<fb>fee faw fum</fb>
<fc>bat</fc>
<fd>zzz bat</fd>
</rec>
<rec>
<fa>fee fi</fa>
<fc>fo fum</fc>
</rec>
</recs>
See also: Remove all but
the latest member of each group example
This section takes xmlstarlet off the beaten track.
select as edit script generator
xmlstarlet select doesn’t copy its input to output; edit cannot do xsl:for-each, xsl:choose, or use XSLT functions. In tandem they have a wider range – but so does an XSLT stylesheet.
Links: shell quoting | shell word expansions
xmlstarlet edit
’s rename
action requires a literal value for the new name so XPath functions are
out. But select
can generate the edit command, for example
to number elements (here using the XSLT format-number()
function):
<Names>
<Name>fee</Name>
<Name>faw</Name>
<Name>fum</Name>
</Names>
# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
--var sq -o "'" -b \
-o "xmlstarlet edit --omit-decl \\" -n \
-o " --var N 'Names/Name' \\" -n \
-m '*/*' \
-o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
-o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o " \\" -n \
-b \
-f -n \
"${infile:-file.xml}"
Output:
xmlstarlet edit --omit-decl \
--var N 'Names/Name' \
-r '$N[1]' -v 'Name0001' \
-r '$N[2]' -v 'Name0002' \
-r '$N[3]' -v 'Name0003' \
file.xml
To execute the output as a shell script:
xmlstarlet-select-command | sh -s > result.xml
Alternatively, replace $N
with
(Names/Name)
, or process elements in reverse order by
repeatedly renaming Names/Name[last()]
– the predicate
[…]
binding to the nearest XPath location
step.
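A sketch of the reverse-order variant for the 3-element sample, relying on each rename being applied before the next XPath is evaluated (rename_last_first is a made-up name, and the new names are hard-coded rather than generated by select). Each -r renames the current last Name child, so the next Names/Name[last()] lands on the element before it.

```shell
# rename_last_first: hypothetical wrapper; renames the Name elements of the
# sample from last to first; reads the named file, or stdin if none is given
rename_last_first() {
  xmlstarlet edit --omit-decl \
    -r 'Names/Name[last()]' -v 'Name0003' \
    -r 'Names/Name[last()]' -v 'Name0002' \
    -r 'Names/Name[last()]' -v 'Name0001' \
    "$@"
}

if command -v xmlstarlet >/dev/null 2>&1; then
  printf '%s\n' '<Names><Name>fee</Name><Name>faw</Name><Name>fum</Name></Names>' |
  rename_last_first
fi
```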
EXSLT functions provide another way to do grouping. Here’s how to create a shell script invoking xmlstarlet edit to delete all but the latest member of each group. The input file has module ID strings of the form: group ID, _ (underscore), major version number, . (dot), minor version number – as shown in this snippet:
<mod>mrR_0.9</mod>
<mod>mrR_0.10</mod>
<mod>mrM_0.19</mod>
<mod>mrM_0.2</mod>
<mod>mrM_0.20</mod>
<mod>mrM_0.3</mod>
Method:
dyn:map a module ID string to its group ID (invoking $fngrpid)
set:distinct eliminates duplicates
str:split the version into major and minor number
dyn:map each number to an N-digit string (invoking $fnverno)
str:concat stringifies the nodeset returned by dyn:map creating strings of the form 00010011 (i.e. version 1.11) to be passed to -s (--sort)
/.. (root node has no parent)
edit actions
-n before -v
set:difference ($M minus $keep)
$s (dollar signs) to guard against shell word expansions
# shellcheck shell=sh disable=SC2016,SC2064
xmlstarlet select --text -t \
--var dq -o '"' -b \
--var sep1='"_"' \
--var sep2='"."' \
--var fngrpid -o 'substring-before(.,$sep1)' -b \
--var fnverno -o 'format-number(.,"0000")' -b \
--var allm='//_:mods/_:mod' \
-o "xmlstarlet edit \\" -n \
-o " --var M '//_:mods/_:mod' \\" -n \
-o " --var keep '/.. " \
-m 'set:distinct(dyn:map($allm,$fngrpid))' \
--var grpid_='concat(.,$sep1)' \
-m '$allm[starts-with(.,$grpid_)]' \
-s 'D:N:-' '0 + str:concat(dyn:map(str:split(substring-after(.,$sep1),$sep2),$fnverno))' \
--if 'position() = 1' \
-n -v 'concat(" | $M[.=",$dq,current(),$dq,"]")' \
-b \
-b \
-b \
-o "' \\" -n \
-o " --delete 'set:difference(\$M,\$keep)' \\" -n \
-f -n \
"${infile:-file.xml}"
See also: select
’s
--var
| -m (--match)
| -s (--sort)
| -i (--if)
| -b (--break)
| -f (--inp-name)
| Group by element name and
merge text example
Links: XSLT functions format-number()
| current()
Output:
xmlstarlet edit \
--var M '//_:mods/_:mod' \
--var keep '/..
| $M[.="mrR_1.11"]
| $M[.="mrS_0.7"]
| $M[.="mrE_2.2"]
| $M[.="mrM_0.20"]' \
--delete 'set:difference($M,$keep)' \
file.xml
To execute the output as a shell script:
xmlstarlet-select-command | sh -s > result.xml
See also: edit
’s
--var
| -d (--delete)
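The sort key is the part that’s easiest to get wrong, so here is the same zero-padding done in plain shell for comparison. A sketch assuming module IDs of the form group_major.minor with decimal numbers that carry no leading zeros (printf may read 08 as a bad octal number); version_key is a made-up helper name.

```shell
# version_key: print the 8-digit sort key for a module ID,
# e.g. mrR_1.11 gives 00010011 (hypothetical helper, not part of xmlstarlet)
version_key() {
  ver=${1#*_}          # drop the group ID and the underscore
  major=${ver%%.*}     # text before the first dot
  minor=${ver#*.}      # text after the first dot
  printf '%04d%04d\n' "$major" "$minor"
}

version_key mrR_1.11   # 00010011
version_key mrM_0.20   # 00000020
version_key mrM_0.3    # 00000003
```

Compared as numbers, 0.20 sorts after 0.3, which a plain text sort of the version strings would get wrong.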
This is the basic “update node with result of shell command” use case. Create a shell script to have xmlstarlet edit add missing targets to an XLIFF version 2.0 localization data file by invoking translate-shell to supply translated phrases:
# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
--var sq -o "'" -b \
--var dq -o '"' -b \
--var cmdopt='concat("trans -from ",/_:xliff/@srcLang," -to ",/_:xliff/@trgLang)' \
-o 'xmlstarlet edit --pf '\\ -n \
-m '//_:segment[not(_:target)]' \
--var xpath-a='concat("//_:unit[@id=",$dq,parent::_:unit/@id,$dq,"]/_:segment/_:source")' \
--var src-e -v 'str:replace(_:source,$sq,concat($sq,"\",$sq,$sq))' -b \
-o ' -a ' -v 'concat($sq,$xpath-a,$sq)' -o ' -t elem -n target '\\ -n \
-o " -u '\$xstar:prev'" -o ' -v "$(' -v 'concat($cmdopt," ",$sq,$src-e,$sq)' -o ')" '\\ -n \
-o " -i '\$xstar:prev' -t text -n indent -v '' \\" -n \
-o " -u '\$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=\"\"]' \\" -n \
-o ' '\\ -n \
-b \
-f -n \
"${infile:-file.xml}"
--text option to generate a shell script
--var to define quote characters and longer substrings as XPath variables
-m processes segment elements not having a target
src-e variable escapes single quotes in the source text, str:replace converting ' to '\'', --var … -b to avoid select’s --var «name»=«value» namespace issue
-o and -v generate text quoting correctly for both the shell and XPath, escaping $ (dollar sign) inside double quotes to guard against shell word expansions
Snippets from sample data file:
<source>Über "O'ona"$tra"</source>
<source>&amp;Speichern als &lt;.oona></source>
Sample output:
xmlstarlet edit --pf \
-a '//_:unit[@id="2"]/_:segment/_:source' -t elem -n target \
-u '$xstar:prev' -v "$(trans -from de -to en 'Über "O'\''ona"$tra"')" \
-i '$xstar:prev' -t text -n indent -v '' \
-u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
\
-a '//_:unit[@id="24"]/_:segment/_:source' -t elem -n target \
-u '$xstar:prev' -v "$(trans -from de -to en '&Speichern als <.oona>')" \
-i '$xstar:prev' -t text -n indent -v '' \
-u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
\
file.xml
--pf preserves original formatting
-a appends an empty target element (after source)
trans is invoked through shell command substitution
-u … -v "$(…)" adds the translated text to the new target element – this is faster than -a … -v "$(…)" which requires & (ampersand) encoded as &amp; (i.e. avoids adding xmlstarlet escape to the pipeline)
-i … and -u … -x … indent target to the same column as source
$xstar:prev reference the newly appended target element, the 3rd the indentation text
The output can be executed directly as a shell script:
xmlstarlet-select-command | sh -s > result.xlf
Snippets from result.xlf
:
<target>About "O'ona"$tra"</target>
<target>&amp;Save as &lt;.oona></target>
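The single quote escaping done by str:replace in the src-e variable can be tried in isolation. A minimal sketch using sed instead of XPath; sq_escape is a made-up helper name, and the test string is the first sample source text from above.

```shell
# sq_escape: print $1 with every ' replaced by '\'' so that the result can be
# placed between single quotes in a shell command (hypothetical helper)
sq_escape() {
  printf '%s\n' "$1" | sed "s/'/'\\\\''/g"
}

s='Über "O'\''ona"$tra"'
escaped=$(sq_escape "$s")
printf '%s\n' "$escaped"

# round trip: the shell turns the quoted, escaped text back into the original
eval "printf '%s\n' '$escaped'"
```

The first printf shows the text as it appears inside the single quotes of the generated trans command, the eval prints the original string again.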
select as XSLT stylesheet generator
This section is included for completeness.
Links: shell quoting | shell word expansions
To use XSLT or extension elements not supported by
xmlstarlet select
’s options
it’s possible to have select
spell out an XSLT stylesheet.
This example inserts one document into another, the first
xsl:template
is the identity transform.
: "${xml1=z1.xml}" "${xml2=z2.xml}"
test -s "$xml1" || printf '%s\n' '<v><THERE/></v>' > "$xml1"
test -s "$xml2" || printf '%s\n' '<w><x q="what">ever</x></w>' > "$xml2"
echo '<v/>' |
xmlstarlet select -t \
-e xsl:transform -a version -o 1.0 -b \
-e xsl:param -a name -o xdoc -b -o /dev/null -b \
-e xsl:template -a match -o '@*|node()' -b \
-e xsl:copy \
-e xsl:apply-templates -a select -o '@*|node()' -b -b \
-b \
-b \
-e xsl:template -a match -o THERE -b \
-e xsl:copy-of -a select -o 'document($xdoc,/)' -b -b \
-b \
-b |
xmlstarlet transform --omit-decl /dev/stdin -s xdoc="$xml2" "$xml1"
Output:
<v><w><x q="what">ever</x></w></v>
Notes:
document($xdoc,/) keeps the XSLT processor from resolving $xdoc relative to the stylesheet’s location and attempting to open the probably nonexistent /dev/z2.xml
-b (--break)s may be omitted as they’re not followed by any template options
$var – unlike ${var} – can be an XSLT or a shell variable
See also: -t (--template) | -e (--elem) | -a (--attr) | -o (--output) | -b (--break) | transform
Links: XSLT document()
Take this one step further and create a library of shorthand shell functions (causing shellcheck.net a.o. to vociferate):
xslxfm() ## xsl:transform(); non-closed
printf " -e xsl:transform -a version -v '1.0' -b "
xsltpl() ## xsl:template(match name?); non-closed
printf " -e xsl:template -a match -v '%s' -b%s" \
"${1:?usage: template(match name?)}" "${2:+ -a name -v '$2' -b }"
xslapt() ## xsl:apply-templates(select?); closed
case $# in
(0) printf ' -e xsl:apply-templates -b ' ;;
(1) printf " -e xsl:apply-templates -a select -v '%s' -b -b " "$1" ;;
(*) printf ' usage: xslapt(select?)\n' 1>&2; false ;;
esac
xslIDN() { ## xsl:template name=identity; closed
xsltpl '@*|node()' 'identity'
printf " -e xsl:copy "
xslapt '@*|node()'
printf " -b -b "
}
xslhelp() { ## list xsl* functions in this file
sed -n -e '/^\(xsl[^ ]*\)()[ {]*## \(.*\)/ s//\1 \2/p' "${_pnself_:-$0}" |
expand -t 12
}
Produce the same output as above having select
, not
transform
, include the external document,
. "${pathto:-./}xsdefs.sh"
echo '<v/>' |
xmlstarlet sel -I -t $(xslxfm) $(xslIDN) $(xsltpl THERE) -c "document('${xml2}',/)" -b |
xmlstarlet tr --omit-decl /dev/stdin "$xml1"
providing the following stylesheet to the XSLT processor,
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="THERE">
<w>
<x q="what">ever</x>
</w>
</xsl:template>
</xsl:transform>
as tee /dev/stderr |
inserted in the pipeline will
show.
Links: The XML version of the XSLT 1.0 Rec contains element syntax and function prototypes
Use the EXSLT exslt:document
element as a template to split an XML document into multiple parts and
output them as separate files in an existing directory.
Caution: Leaving out
the exslt:nop
attribute here triggers an
xsl:extension-element-prefix : undefined namespace exslt
error, with or without -N exslt='http://exslt.org/common'
(exslt
is predefined).
printf '%s\n' '<v><x>fee fi</x><y>fo fum</y></v>' |
xmlstarlet select -I -t \
--var part-prefix -o "${outDir:-/tmp/}part" -b \
-e 'xsl:transform' \
-a 'version' -o '1.0' -b \
-a 'exslt:nop' -o '' -b \
-a 'extension-element-prefixes' -o 'exslt' -b \
-e 'xsl:template' \
-a 'match' -o '/' -b \
-m '*/*' \
-e 'exslt:document' \
-a 'href' -v 'concat($part-prefix,format-number(position(),"000"),".xml")' -b \
-a 'method' -o 'xml' -b \
-a 'omit-xml-declaration' -o 'yes' -b \
-e 'part' \
-a 'no' -v 'position()' -b \
-a 'of' -v 'last()' -b \
-c '.' |
{ printf '%s\n' '<v/>' | xmlstarlet transform /dev/fd/3 /dev/stdin ; } 3<&0
Generated XSLT script:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" exslt:nop="" extension-element-prefixes="exslt">
<xsl:template match="/">
<exslt:document href="/tmp/part001.xml" method="xml" omit-xml-declaration="yes">
<part no="1" of="2">
<x>fee fi</x>
</part>
</exslt:document>
<exslt:document href="/tmp/part002.xml" method="xml" omit-xml-declaration="yes">
<part no="2" of="2">
<y>fo fum</y>
</part>
</exslt:document>
</xsl:template>
</xsl:transform>
Generated documents:
<part no="1" of="2"><x>fee fi</x></part>
<part no="2" of="2"><y>fo fum</y></part>
See also: Split XML file example
xmlstarlet XSLT extensions
List of registered extension functions, elements, and modules:
Registered XSLT Extensions
--------------------------
Registered extension functions:
{http://exslt.org/common}node-set
{http://exslt.org/common}object-type
{http://exslt.org/crypto}md4
{http://exslt.org/crypto}md5
{http://exslt.org/crypto}rc4_decrypt
{http://exslt.org/crypto}rc4_encrypt
{http://exslt.org/crypto}sha1
{http://exslt.org/dates-and-times}add
{http://exslt.org/dates-and-times}add-duration
{http://exslt.org/dates-and-times}date
{http://exslt.org/dates-and-times}date-time
{http://exslt.org/dates-and-times}day-abbreviation
{http://exslt.org/dates-and-times}day-in-month
{http://exslt.org/dates-and-times}day-in-week
{http://exslt.org/dates-and-times}day-in-year
{http://exslt.org/dates-and-times}day-name
{http://exslt.org/dates-and-times}day-of-week-in-month
{http://exslt.org/dates-and-times}difference
{http://exslt.org/dates-and-times}duration
{http://exslt.org/dates-and-times}hour-in-day
{http://exslt.org/dates-and-times}leap-year
{http://exslt.org/dates-and-times}minute-in-hour
{http://exslt.org/dates-and-times}month-abbreviation
{http://exslt.org/dates-and-times}month-in-year
{http://exslt.org/dates-and-times}month-name
{http://exslt.org/dates-and-times}second-in-minute
{http://exslt.org/dates-and-times}seconds
{http://exslt.org/dates-and-times}sum
{http://exslt.org/dates-and-times}time
{http://exslt.org/dates-and-times}week-in-month
{http://exslt.org/dates-and-times}week-in-year
{http://exslt.org/dates-and-times}year
{http://exslt.org/dynamic}evaluate
{http://exslt.org/dynamic}map
{http://exslt.org/math}abs
{http://exslt.org/math}acos
{http://exslt.org/math}asin
{http://exslt.org/math}atan
{http://exslt.org/math}atan2
{http://exslt.org/math}constant
{http://exslt.org/math}cos
{http://exslt.org/math}exp
{http://exslt.org/math}highest
{http://exslt.org/math}log
{http://exslt.org/math}lowest
{http://exslt.org/math}max
{http://exslt.org/math}min
{http://exslt.org/math}power
{http://exslt.org/math}random
{http://exslt.org/math}sin
{http://exslt.org/math}sqrt
{http://exslt.org/math}tan
{http://exslt.org/sets}difference
{http://exslt.org/sets}distinct
{http://exslt.org/sets}has-same-node
{http://exslt.org/sets}intersection
{http://exslt.org/sets}leading
{http://exslt.org/sets}trailing
{http://exslt.org/strings}align
{http://exslt.org/strings}concat
{http://exslt.org/strings}decode-uri
{http://exslt.org/strings}encode-uri
{http://exslt.org/strings}padding
{http://exslt.org/strings}replace
{http://exslt.org/strings}split
{http://exslt.org/strings}tokenize
{http://icl.com/saxon}eval
{http://icl.com/saxon}evaluate
{http://icl.com/saxon}expression
{http://icl.com/saxon}line-number
{http://icl.com/saxon}systemId
{http://xmlsoft.org/XSLT/}test
Registered top-level extension elements:
{http://exslt.org/functions}function
Registered instruction extension elements:
{http://exslt.org/common}document
{http://exslt.org/functions}result
{http://xmlsoft.org/XSLT/}test
Registered extension modules:
http://exslt.org/functions
http://icl.com/saxon
http://xmlsoft.org/XSLT/
… as output by:
xmlstarlet transform --show-ext 2>&1 |
awk -F '\n' -v S='sort' '/^Registered|^-*$/{ close(S); print; next } { print | S }'
xmlstarlet news summary
Snipped from SourceForge news and files sections. Covers versions 1.0.3 through 1.6.1 (in reverse order).
XMLStarlet 1.6.1 Released
1.6.1: August 9, 2014
- handle unicode arguments under Windows
There is no difference for non-Windows platforms.
Posted by Noam Postavsky 2014-08-09
XMLStarlet 1.6.0 Released
Changes:
get rid of "helpful" message about namespaces
update user guide
Enhancements:
add --stop option to val
add global option --no-doc-namespace
Build:
let the make install target succeed even if docs aren't built.
Posted by Noam Postavsky 2014-06-13
XMLStarlet 1.5.0 is released, changes:
Bugs:
avoid segfault on pyx non-existant file
fix unescaping of entities straddling 4K byte boundary (Bug #102)
Enhancements:
unescape hex entities (&#xXX;)
give a helpful message if doc has default namespace and nothing matched
add "_" and "DEFAULT" as names for document's top-level default namespace
Adding a global quiet option
ed: Allow omitting value argument to create empty element.
use default attribute values in sel subcommand
Build:
fix test variables to work with newer automake (1.11 -> 1.13)
fix usage2c.awk for mawk
scripts for building on mingw
Posted by Noam Postavsky 2013-07-07
1.4.2: Dec 28, 2012
- pyx: avoid segfault on documents with multiple attributes (Bug
#3595212)
1.4.1: Dec 8, 2012
- avoid segfault when attempting to edit the document node (Bug
#3575722)
- Packaging:
- include doc/xmlstar-fodoc-style.xsl in the dist so that the
--enable-build-docs option works from the tarball (Bug
#3580667)
- AC_SUBST PACKAGE_TARNAME for automake so that documentation is
installed to the right place (Bug #3561958)
- Test Suite:
- avoid test failures due to XML formatting and whitespace
changes (also fixes Bug #3572789)
- use automake's parallel test suite
- make bigxml tests much faster by using whitespace instead of nodes
- don't test str:replace() with ed: it doesn't work outside of
xslt in new libxslt
- ignore extra errors from libxml 2.9.0 bug
- let tests run using busybox
- add runAllTests.sh to run tests without make
1.4.0: Aug 26, 2012
- Documentation:
- executable name used in documentation now matches
--transform-program-name (Bug #3283713)
- added Makefile rules for generating documentation
(./configure --enable-build-docs)
- ed subcommand:
- relative XPaths are now handled correctly (Bug #3527850)
- the last nodeset inserted by an edit operation can be
accessed as the XPath variable $prev (or $xstar:prev)
- add --var option to define XPath variables
- allow ed -u -x to insert nodesets instead of converting to
string
- remove hard limit for number of edit operations (Bug
#3488240)
- pyx now handles namespaces correctly
1.3.1: Jan 14, 2012
- handle multiple values for --value-of properly (Bug #2563866)
- substitute external entities (Bug #3467320)
- pyx output needs space between attribute name and value (Bug #3440797)
1.3.0: Oct 7, 2011
- avoid ASCII CRs in UTF-16/32 text (reported by Ming Chen)
- --value-of outputs concat values of all nodes (Req #2563866)
- encode special chars for ed -u -x
- allow use of exslt functions in ed -u -x
- add --var to select (allow --var <name>=<value> as well as --var
<name> <value> --break)
- work around libxml bug that passes bogus data to error handler
(Bug #3362217)
Source: README.1.3.0, updated 2011-10-02
1.2.1: July 07, 2011
- check for NULL nodeset result (Bugs #3323189, #3323196)
- "-" was being confused with --elif
- generated XSLT should also have automatic namespaces
- allow -N after other option (Bug #3325166)
- namespace values were being registered as prefixes
- avoid segfault when asked to move namespace nodes
- missing newline in ed --help message
- test scripts portability
- no bashisms allowed in NetBSD sh
- make BRE portable: '+' is not allowed
- deal with msys path conversion properly (Bug #3178657)
- don't use XML_SAVE_WSNONSIG #if libxml < 2.7.8 (Bug #3310475)
Source: README.1.2.1, updated 2011-07-07
1.2.0: June 1, 2011
- implement ed --update --expr
- use top-level namespace definitions from first input file, this
should remove the need to define namespaces on the command line
with -N in most cases.
- select exits with 0 only if result is non-empty (Req #3155702)
- add -Q to select, like grep's -q
- add column number to error messages
- restore input context (lost in version 1.0.3) to error messages
(Bug #3305659)
- print extra string information in error messages
- use entity definitions from dtd (Bug #3305659)
- add --net option to c14n, ed, fo, and val (Req #1071398)
- remove --catalog from tr --help message since it isn't actually supported
- add --elif and --else to sel --help message
Source: README.1.2.0, updated 2011-06-01
1.1.0: Apr 3, 2011
- bug fix for BSD/OSX: check that O_BINARY is declared before
#including io.h (Bug 3211822)
- select improvements
- add --elif and --else options
- sorting on multiple fields
- correct (for English) lexical sorting instead of ASCIIbetical
- only outputs namespaces that are actually used
- only outputs xsl:param inputFile if it's used
- don't make separate templates if there is only 1
- link to shared libxml and libxslt libraries by default
- add library version info to --version output
- add directory argument for ls; exit status indicates
failure/success instead of file count
- stop using old SAX1 interface, xmlstarlet will now link with a
libxml configured --without-sax1 and --without-legacy
Source: README.1.1.0, updated 2011-04-04
1.0.6: Mar 13 2011:
- Bug fixes:
- c14n: set stdout to binary mode on Windows to avoid carriage
returns (Bug 840665)
- fix broken --help options
- put actual behaviour of -P, -S options in --help output (see
Bug/Feature Request 2858514)
- remove unneeded escape of quote in ./configure --help
- don't distribute xmlstarlet.spec: it's generated by ./configure
- add src/xml.o depends on version.h to Makefile.am so compile
will succeed without dependency info (eg after make distclean)
- add test for subcommands' --help option
- Portability fixes:
- yes isn't portable, use an awk program instead
- neither read -r nor xargs -0 are portable, escape the command
lines to xargs instead
- don't use nonportable echo -n option
Source: README.1.0.6, updated 2011-03-13
1.0.5: Feb 11 2011:
- Bug fixes:
- use XSLT_PARSE_OPTIONS, else CDATA nodes can cause corruption (Bug 3158482)
- fix typo in help message
- get rid of warnings in -ansi -pedantic mode
- required libxml2 version is 2.6.23
- usage strings use argv[0] as program name
- --help prints to stdout and exits with success
- double /'s under msys to avoid path conversion
- Portability fixes:
- don't use xargs (-d isn't portable)
- use -Wall only for gcc
-Build system:
- use -ansi in configure, and check for strdup declaration
- seperate list of sources and tests into subdirs
- check git version during make, not just autoconf
- tarball releases of configure.ac have actual version number
instead of querying git
Source: README.1.0.5, updated 2011-02-11
1.0.4: Jan 16 2011:
- Bug fixes:
- encode special XML characters in arguments (can now include quotes in xpath)
- non-zero exit code when input file is not found (Bug 3158488)
- ed with --pf/--ps options doesn't reformat output (Bug 3158490)
- exit() instead of segfault when trying to delete namespace nodes
(Bug 1120417)
- added --disable-static-libs ./configure option to use shared libxml2 and libxslt
- non-recursive make
- use TESTS and XFAIL_TESTS for testing, nicer output
Source: README, updated 2011-01-16
1.0.3: Nov 18 2010:
- Bug fixes:
- escape --value in update mode (Bug 3052978)
- c14n now includes default attributes (Bug 1505579)
- Allow special characters in sel --output literal (Bug 1912978)
- remove warning from xml_trans.c (Bug 1521756)
- Use xmlReader interface so line numbers are 32-bit (Bug 1219072)
- test for error messages on lines past 2^16 (Bug 1219072)
- don't look for embedded dtd if not asked (Bug 1167215)
Source: README, updated 2010-11-18
xmlstarlet wishlist AD 2003
In 2003 Mikhail Grushinskiy posted his xmlstarlet wishlist.
Mikhail Grushinskiy - 2003-05-14
Here is a list of next steps in XmlStarlet on TODO or wishlist:
1. Editing xml documents with xml 'ed' option must be improved.
2. add --recover to fix broken XML documents
3. Document how to use proxy in XmlStarlet with nanohttp/ftp via http_proxy, ftp_proxy environment variables
ex: export http_proxy=http://192.168.0.1:8080/
4. Add ability to specify xpath expression in XmlStarlet 'el' option
5. -u option of XmlStarlet 'xml el' should work with others too. I.e. sort | uniq equivalent should work when attributes and attributes values are printed out.
6. Think about 'join' analogue
7. Something like xml sel -t -m <xpath> --exec <shell-cmd> --args <args> is needed
8. How would be possible to insert one XML fragment into another XML document from command line without XInclude?
9. Make use of regular expressions ex: Make all element names uppercase
10. Start thinking about diff and patch. Several tree diff algorithms could be implemented for ordered and non ordered labeled trees. What about creating context diff? How to define context in XML space? Good luck solving NP-Complete problems.
11. What about XUpdate implementation?
12. How about making output with syntax coloring in case if it is running in terminal (not batch) mode. Similar to GNU ls?
13. Convert XML to Lisp S-expressions
14. XML Namespace normalization process (There is a XSLT stylesheet floating on the web which could do it).
15. Make use of performance updates from libxml2. mmap() for document chunks, XMLReader interface, etc.
16. More regression testing test cases required.
17. Better Documentation User Guide and Tutorial is needed. More good and real-world examples.
If you wish to enhance/add something to this list, please, reply.
XmlStarlet home page,
http://xmlstar.sourceforge.net/
Thanks,
--MG
Mikhail Grushinskiy
Mikhail Grushinskiy - 2003-05-23
Few additions
1. Better namespace support.
2. something like xml head, and xml tail
3. list directories in XML
4. Defining variables in xml sel
Ex: xml sel -t --m / -d var_name -v @elem
-d would translate into
<xsl:variable name="var_name">
</xsl:variable>
and this variable could be referenced as $var_name
in XPATH
5. CygWin binaries?
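Item 4 above asks for variables in `sel`. The `select` option summary in these notes lists `--var «name»=«value»`, which generates `<xsl:variable name="…" select="«value»"/>`. A minimal sketch of that form, assuming an `xmlstarlet` binary on the PATH (the wishlist itself calls it `xml`); the sample document and the variable name `n` are made up for illustration:

```sh
#!/bin/sh
# Skip quietly when xmlstarlet is not installed (assumption: binary is named "xmlstarlet").
command -v xmlstarlet >/dev/null 2>&1 || { echo 'xmlstarlet not found, skipping' >&2; exit 0; }

# Hypothetical sample input: a root element with two <e> children.
doc='<r><e/><e/></r>'

# --var n=... binds $n to the result of the XPath expression;
# -v '$n' prints its string value, -n appends a newline.
result=$(printf '%s' "$doc" | xmlstarlet sel -t --var n='count(/r/e)' -v '$n' -n)
echo "$result"
```

The single quotes around `$n` matter: they stop the shell from expanding it, so the XPath variable reference reaches xmlstarlet intact.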
xmlstarlet usage notes
  xmlstarlet commands
    xmlstarlet elements
    xmlstarlet select
      select [option …] template … [«xml-file» …]
      -h (--help) - display help
      -Q (--quiet) - do not write anything to standard output
      -C (--comp) - display generated XSLT
      -R (--root) - print root element <xsl-select>
      -T (--text) - output is text (default is XML)
      -I (--indent) - indent output
      -D (--xml-decl) - do not omit XML declaration line
      -B (--noblanks) - remove nonsignificant whitespace from XML tree
      -E (--encode) «encoding» - output in the given encoding
      -N «prefix»=«value» - declare namespaces
      --net - allow fetch DTDs or entities over network
      -t (--template) is <xsl:template match="/">
      -m (--match) is <xsl:for-each select="xpath-expr">
      -s (--sort) is <xsl:sort …/>
      --var «name» «value» --break is <xsl:variable name="…">«value»</xsl:variable>
      --var «name»=«value» is <xsl:variable name="…" select="«value»"/>
      --var «name»=«value» namespace issue
      -o (--output) is <xsl:text>«value»</xsl:text>
      -e (--elem) is <xsl:element name="…">
      -a (--attr) is <xsl:attribute name="…">
      -c (--copy-of) is <xsl:copy-of select="xpath-expr"/>
      -v (--value-of) is string-join((xpath-expr),newline)
      -i (--if) [--elif …] [--else] is <xsl:when> … [<xsl:otherwise>]
      -b (--break) ends current container element
      -n (--nl) prints a newline
      -f (--inp-name) prints pathname / URI of current input
    xmlstarlet edit
      edit option […] [action …] [«xml-file-or-uri» …]
      -i (--insert) - add node before
      -a (--append) - add node after
      -s (--subnode) - add node as child
      $prev variable (aka $xstar:prev)
      --var name 'xpath'
      -u (--update) 'xpath' -v (--value) 'value'
      -u (--update) 'xpath' -x (--expr) 'xpath'
      -d (--delete) 'xpath'
      -r (--rename) 'xpath' -v (--value) 'new-name'
      -m (--move) 'xpath1' 'xpath2'
      xpath arguments
      xpath arguments
    xmlstarlet format
      format [option …] [«xml-file»]
      -h (--help) - display help
      -e (--encode) «encoding» - output in the given encoding
      -n (--noindent) - do not indent
      -o (--omit-decl) - omit XML declaration
      -s (--indent-spaces) «N» - indent output with N spaces
      -t (--indent-tab) - indent output with tabulation
      -C (--nocdata) - replace CDATA section with text nodes
      -D (--dropdtd) - remove the DOCTYPE of the input doc
      -H (--html) - input is HTML
      -N (--nsclean) - remove redundant namespace declarations
      -Q (--quiet) [undocumented] - suppress error output
      -R (--recover) - try to recover what is parsable
      --net - allow network access
    xmlstarlet c14n
    xmlstarlet validate
      validate [option …] [«xml-file-or-uri» …]
      -w (--well-formed) - validate well-formedness only (default)
      -d (--dtd) «dtd-file» - validate against DTD
      --net - allow network access
      -s (--xsd) «xsd-file» - validate against XSD schema
      -E (--embed) - validate using embedded DTD
      -r (--relaxng) «rng-file» - validate against Relax-NG schema
      -e (--err) - print verbose error messages on stderr
      -S (--stop) - stop on first error
      -b (--list-bad) - list only files which do not validate
      -g (--list-good) - list only files which validate
      -q (--quiet) - do not list files (return result code only)
    xmlstarlet pyx, depyx
    xmlstarlet escape, unescape
    xmlstarlet list
    xmlstarlet transform
      transform [option …] «xsl-file» [-p|-s «name»=«value» …] [«xml-file-or-uri» …]
      -h (--help) - display help
      --omit-decl - omit XML declaration
      -E (--embed) - allow applying embedded stylesheet
      --show-ext - show list of extensions
      --val - allow validate against DTDs or schemas
      --net - allow fetch DTDs or entities over network
      --xinclude - do XInclude processing on document input
      --maxdepth value - increase the maximum depth
      --html - input document(s) are in HTML format
  … xmlstarlet edit displays its usage reminder)
  xmlstarlet XSLT extensions
  xmlstarlet news summary
  xmlstarlet wishlist AD 2003