xmlstarlet usage notes
xmlstarlet (xmlstar.sf.net) is the no-nonsense XML multitool that lets you write simple queries or edits on the command line, avoids most of the stylesheet formal stuff, and gives your <o/><o/>-looking eyes a moment of relief. It’s also a cranky minimalist tool which targets the 1.0 versions of XPath / XSLT / EXSLT still widely used and, it seems, users with little need for documentation.
This is an edited version of my personal notes on xmlstarlet with worked examples – knowledge gained as an outsider through use, trial and error – focusing on the select and edit commands and EXSLT. It’s not a tutorial or a FAQ; it requires a grasp of XML tools and the POSIX shell. Copyright is retained.
xmlstarlet features
xmlstarlet relies on libxml2 and libxslt which are limited to XPath 1.0 and XSLT 1.0, plus a number of EXSLT and extension functions
xmlstarlet is currently at version 1.6.1 which appeared in 2014 (that’s more or less a century ago in internet years); a number of FIXMEs and TODOs remain in the source code
xmlstarlet select supports XPath 1.0, a subset of XSLT 1.0 (but not xsl:apply-templates, xsl:key a.o.), and EXSLT
xmlstarlet edit supports XPath 1.0 – plus some EXSLT but no XSLT – functions in its xpath arguments
xmlstarlet transform is a regular XSLT 1.0 processor with extensions, like xsltproc
Other xmlstarlet commands do document formatting, canonicalization, validation, structure display, conversion of PYX and special characters, and file directory listing
“There should be no limit on input XML (apart from available memory on your system)” – forum posting by the original xmlstarlet developer
xmlstarlet is Copyright (c) 2002-2004 Mikhail Grushinskiy. All Rights Reserved. (cf. SourceForge or Fossies)
-q (--quiet) means either short option -q or long option --quiet can be used.
«name» in a command or message is a placeholder for the actual name used, e.g. xmlXPathCompOpEval: function «name» not found.
Links look like this: external, internal, internal link appearing in a navigation link cloud, ditto* linking to a larger section with a local link cloud, [ sel ] linking into the table of contents. On mouseover headers display a permalink icon, on level 2 and 3 also navigation link icons, on level 4 a section link icon.
Code looks like this: test -s file.xml || log …, occasionally with an … (ellipsis) inside for brevity. For readability longer commands are usually split across lines and indented.
Admonitions look like this: Caution.
All shell code samples were made for a POSIX shell (dash 0.5.10) with xmlstarlet 1.6.1 (linked with libxml2 20913, libxslt 10134, and libexslt 820) from the Debian distribution.
[T]he use of SGML syntax for stylesheets was proposed as long ago as 1994, and it seems that this idea gradually became the accepted wisdom. It’s difficult to trace exactly what the overriding arguments were, and when you find yourself writing something like:
<xsl:variable name="y"> <xsl:call-template name="f"> <xsl:with-param name="x"/> </xsl:call-template> </xsl:variable>
to express what in other languages would be written as
y = f(x);
then you may find yourself wondering how such a decision came to be made.
– Michael Kay, XSLT Programmer’s Reference, Ch.1, ISBN 1861005067
man xmlstarlet
xmlstarlet --help
xmlstarlet «command-name» --help
xmlstarlet on SourceForge: homepage | docs | user’s guide | news | source | files | discussion | bugs
doc/xmlstarlet.txt there is not the latest version as it doesn’t mention $prev, --var, -L (--inplace), and -E (--embed) – the user’s guide is still silent on these
xmlstarlet on Fossies – an accessible presentation of source, examples, and more: xmlstarlet-1.6.1.tar.gz contents | xmlstarlet user’s guide (1-page) | doc/xmlstarlet.txt (latest version)
xmlstarlet forums on StackExchange: stackoverflow.com | unix.stackexchange.com
github.io: EXSLT docs
gnome.org: libxml2 Wiki Home (with links to standards, API, utilities a.o.) | libxml2 source | libxslt Wiki Home (with links to XSLT + EXSLT API a.o.) | libxslt+libexslt source | libxslt extensions (xmlsoft.org now redirects to gnome.org)
packages.debian.org: text/xmlstarlet | text/xsltproc | text/libxml2-utils (xmllint) | text/html-xml-utils | utils/xml2 | web/tidy
stackoverflow.com forums: XPath | XSLT 1.0 | EXSLT | XML
xmlstarlet select -C
select’s -C (--comp) option lists the stylesheet the current command line will generate – it requires no input file – e.g.
xmlstarlet select -T -C -t -m 'str:tokenize("Hello, world",",o")' -v '.' -n
Output:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:str="http://exslt.org/strings" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt str">
<xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
<xsl:template match="/">
<xsl:for-each select="str:tokenize(&quot;Hello, world&quot;,&quot;,o&quot;)">
<xsl:call-template name="value-of-template">
<xsl:with-param name="select" select="."/>
</xsl:call-template>
<xsl:value-of select="'&#10;'"/>
</xsl:for-each>
</xsl:template>
<xsl:template name="value-of-template">
<xsl:param name="select"/>
<xsl:value-of select="$select"/>
<xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
<xsl:value-of select="'&#10;'"/>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
xmlstarlet commands
Caution: “Other features need some work too” warns the user’s guide. Perhaps they’re thinking of these:
edit’s “no kidding” issue
select’s EXSLT namespace issue
pyx and depyx’ several issues
select, c14n, and transform while edit returns non-zero printing nothing but its lengthy usage reminder
The special XML characters are &<>'" or – as references to predefined general entities – &amp; &lt; &gt; &apos; &quot;. With few exceptions[1] they are never entered as entity references in an xmlstarlet command, and they are output as literals when xmlstarlet select’s -T (--text) option is in effect.
For example, in XPath predicates xmlstarlet wants < (less-than) on the command line, as in factor[. < 2.19], whereas XSLT stylesheets require &lt; inside attribute values, or > (greater-than) with operands swapped. Likewise, a numeric character reference such as &#9; for a tab character belongs in an XML file, not on the xmlstarlet command line.
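A minimal sketch of the less-than rule, assuming a made-up three-element input; the helper name below_limit and the sample data are hypothetical, only the predicate factor[. < 2.19] comes from the text. The < is passed literally inside single quotes, so neither the shell nor the user has to deal with an entity reference.

```shell
# Hypothetical sample document (not from the notes).
sample='<r><factor>1.50</factor><factor>2.19</factor><factor>3.00</factor></r>'

# below_limit XML: print every factor smaller than 2.19.
# The literal < sits inside single quotes; it is not written as an entity reference.
below_limit() {
  printf '%s\n' "$1" |
    xmlstarlet select -T -t -v '*/factor[. < 2.19]' -n
}

# Usage: below_limit "$sample"
```

With the sample above only the first factor qualifies, so a single line is expected.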
[1] Exceptions
The predefined entity references – as well as character references below Ā (U+0100) – are recognized in the following, which therefore require &amp; to represent an & (ampersand) character:
the -v (--value) clause of xmlstarlet edit’s -i (--insert), -a (--append), and -s (--subnode) options when used with -t (--type) elem
the xmlstarlet unescape conversion utility
Caution: The example in the user’s guide section 4.1 meant to convert newlines to blanks using a character reference for the newline – xml sel … -v "translate(. , '&#10;', ' ')" … – in fact converts & (ampersand) characters to blanks, and strips #, 1, 0, and ; characters. The translate(…) expression would work as intended in an XSLT stylesheet but means something rather different on the xmlstarlet select command line.
See also: xmlstarlet esc / xmlstarlet unesc | Replace text sample
xmlstarlet edit handling special characters:
printf '%s' '<e/>'|
xmlstarlet edit -O \
-s '*' -t elem -n esuv -v '' -u '$prev' -v '&Save as <.oona>' \
-a '$prev' -t elem -n eaux -v '' -u '$prev' -x '"&Save as <.oona>"' \
-s '*' -t elem -n esv1 -v '&amp;Save as <.oona>' \
-a '$prev' -t elem -n eav1 -v '&amp;Save as <.oona>' \
-i '$prev' -t elem -n eiv1 -v "$(xmlstarlet escape '&Save as <.oona>')"
-u (--update) follows xmlstarlet’s general rule for a literal (esuv) or an XPath expression (eaux): special characters are encoded behind the scenes
-i (--insert), -a (--append), and -s (--subnode) used with -t (--type) elem are special: certain entity/character references are recognized so &amp; is required for & (esv1, eav1, eiv1) whereas <>'" may be used to represent themselves
-i, -a, -s, and -r (--rename) accept their name argument without modification
Output:
<e>
<esuv>&amp;Save as &lt;.oona&gt;</esuv>
<eaux>&amp;Save as &lt;.oona&gt;</eaux>
<esv1>&amp;Save as &lt;.oona&gt;</esv1>
<eiv1>&amp;Save as &lt;.oona&gt;</eiv1>
<eav1>&amp;Save as &lt;.oona&gt;</eav1>
</e>
Using variables with xmlstarlet select, e.g.
--var newline -n -b
--var tab -o "$(printf '\t')" -b
--var sq -o "'" -b
… -v 'concat($sq,"//_:",local-name(),$sq)'
--var eurosign='"€"'
--var eurosign -o "$(python3 -c 'print("{0:c}".format(0x20ac))')" -b
Using variables with xmlstarlet edit, e.g.
--var sq '"'\''"'
--var dq "'\"'"
printf argument, and inner double quotes to make an XPath string expression
--var tab "$(printf '"\t"')"
--var tab '" "'
--var nl 'substring-before('"$(printf '"\nA"')"',"A")'
See also: select --var | edit --var
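Nested quoting like the --var values above is easy to get wrong, so it can help to look at what the shell actually hands over before xmlstarlet is involved. The sketch below uses a hypothetical helper, show_args, which is not part of xmlstarlet; it simply prints each argument between brackets.

```shell
# show_args (hypothetical helper): print each argument on its own line,
# wrapped in [ ], so the result of shell quoting can be inspected.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# The sq and dq values from the edit examples, as delivered by the shell:
show_args '"'\''"' "'\"'"
# prints ["'"]
#        ['"']
```

The first argument arrives as an XPath string holding a single quote, the second as an XPath string holding a double quote.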
The various xmlstarlet commands each handle the XML declaration in their own way but all print it with a trailing newline if requested:
xmlstarlet edit … -O (--omit-decl) …
xmlstarlet format … -o (--omit-decl) …
xmlstarlet list never outputs an XML declaration
xmlstarlet pyx always strips the XML declaration if present
xmlstarlet select … -D (--xml-decl) … outputs an XML declaration
xmlstarlet transform … --omit-decl …
Adding an XML declaration using select:
$ printf '<v w="x"/>' |
xmlstarlet select -D -E 'ISO-8859-2' -t -c '/'
<?xml version="1.0" encoding="ISO-8859-2"?>
<v w="x"/>
Several xmlstarlet commands allow selected options to be passed to the libxml2 XML parser (cf. API reference and source code) or the libxml2 XML serializer (cf. API reference and source code).
select
  --net: clears XML_PARSE_NONET
  -B (--noblanks): sets XML_PARSE_NOBLANKS
  -D (--xml-decl): clears XML_SAVE_NO_DECL; sets omit-xml-declaration="no" on xsl:output
  -I (--indent): sets XML_SAVE_FORMAT; sets indent="yes" on xsl:output
  -T (--text): sets method="text" on xsl:output
edit
  --net: clears XML_PARSE_NONET
  -O (--omit-decl): sets XML_SAVE_NO_DECL
  -P (--pf): sets XML_SAVE_FORMAT
  -S (--ps): sets XML_SAVE_WSNONSIG (requires libxml2 2.7.8+)
format
  --net: clears XML_PARSE_NONET
  -C (--nocdata): sets XML_PARSE_NOCDATA
  -N (--nsclean): sets XML_PARSE_NSCLEAN
  -R (--recover): sets XML_PARSE_RECOVER
  -o (--omit-decl): sets XML_SAVE_NO_DECL
  -n (--noindent), -s (--indent-spaces), and -t (--indent-tab) offer an alternative to the default indentation
c14n
  --net: clears XML_PARSE_NONET
validate
  --net: clears XML_PARSE_NONET
  -E (--embed): sets XML_PARSE_DTDVALID
  XML_PARSE_DTDLOAD and XML_PARSE_DTDATTR are set by default (src/validate.c#valMain())
transform
  --net: clears XML_PARSE_NONET
  --omit-decl: sets XML_SAVE_NO_DECL
format’s -H (--html) and transform’s --html options substitute the libxml2 HTML 4.0 parser.
The c14n command converts an XML document to a normal format.
To expand empty-element tags, changing <p/> to <p></p>, for example:
xmlstarlet edit --pf -s '//*[not(node())]' -t text -n ignored -v '' file.xml
See also: network access | Try out edit’s formatting options example
The xmlEscapeEntities function in libxml2’s xmlsave.c serialization module gives special treatment to characters &<> (output as &amp;, &lt;, and &gt;) but neither apostrophe nor double quote ('"). xmlstarlet has no option to override this.
Using a CDATA section to keep the serializer from applying default rules:
$ printf '%s\n' '<v><w>x</w><x>🧩</x></v>' |
xmlstarlet edit -O -P -d '*/w'
<v><x>&#x1F9E9;</x></v>
$ :
$ printf '%s\n' '<v><w>x</w><x><![CDATA[🧩]]></x></v>' |
xmlstarlet edit -O -P -d '*/w'
<v><x><![CDATA[🧩]]></x></v>
Caution: Of xmlstarlet’s commands only c14n, select, and transform seem to understand an entity reference like <doc>&e;</doc>, according to the following test script. This makes pre/post-processing a requirement if using xmlstarlet’s other commands to handle external entities.
#!/bin/sh
# Test xmlstarlet commands with external general parsed entity.
# - ${xdata} holds data file contents, defaults to a few <e>N</e>
# - ${keepf} non-empty to keep temporary files in $TMPDIR
# - ${dryrun} non-empty to print but not execute commands
# - ${doecho} non-empty to also print commands before executing
skelf=$(mktemp -t "xsskel-$$-XXXXXXXXXX.xml")
dataf=$(mktemp -t "xsdata-$$-XXXXXXXXXX.xml")
idxff=$(mktemp -t "xsidxf-$$-XXXXXXXXXX.xsl")
test "${keepf}" ||
trap "rm '${skelf}' '${dataf}' '${idxff}'" INT EXIT
printf '%s\n' \
'<!DOCTYPE skel [<!ENTITY e SYSTEM "'"${dataf}"'">]><doc>&e;</doc>' \
> "${skelf}"
printf '%s' \
"${xdata:-<e>1</e><e>2</e><e>3</e><e>4</e>}" \
> "${dataf}"
printf '<v/>' | xmlstarlet select -t \
-e xsl:transform -a version -o 1.0 -b \
-e xsl:template -a match -o '@*|node()' -b \
-e xsl:copy -e xsl:apply-templates -a select -o '@*|node()' \
> "${idxff}" ## identity transform
for cmd in c14n ed el fo pyx sel tr val
do
case ${cmd} in
(c14n|el|fo|pyx|val)
set -- ;;
(ed) set -- -d '*/*[3]' ;;
(sel) set -- -T -t -c / -n ;;
(tr) set -- "${idxff}" ;;
(*) break ;;
esac
set -- xmlstarlet "${cmd}" "$@" "${skelf}"
if test "${dryrun}${doecho}"; then
printf '\n\n# command:'; printf " '%s'" "$@"; printf '\n'
fi
if ! test "${dryrun}"; then
"$@"; printf '\n## %s returned %d\n\n' "${cmd}" "$?"
fi
done
Given a data file containing <e>1</e><e>2</e><e>3</e><e>4</e> (making it well-formed XML) pyx returns 4 (outputs the doctype but says Entity 'e' not defined) while c14n, ed, el, fo, sel, tr, and val all return zero. But ed, el, and fo (plus val, presumably) fail to expand the entity reference.
Given a data file containing <a>B</c> (clearly making it non-XML) the ed, el, and val commands all return zero – with val even pronouncing «datafile» - valid – while c14n, fo, pyx, sel, and tr return 3, 2, 4, 3, and 6, respectively.
xmllint from the libxml2-utils package has a --noent option to substitute entity values for entity references (e.g. xmllint --noent --dropdtd file.xml).
src/xmlstar.h defines the following exit values for xmlstarlet:
EXIT_SUCCESS
EXIT_FAILURE
EXIT_BAD_ARGS
EXIT_BAD_FILE
EXIT_LIB_ERROR
EXIT_INTERNAL_ERROR
but mind these:
xmlstarlet select exits grep-style returning the query result as 0 or 1
xmlstarlet format’s --omit-decl option exits, er, shuf-style
document() failure, unescape – both producing stderr output)
XPath 1.0 does not support numbers expressed in scientific notation, cf. W3C recommendation (Number ::= Digits ('.' Digits?)? | '.' Digits and Digits ::= [0-9]+).
Tools based on libxml2 do support it, however, cf. xmlXPathFormatNumber() (snprintf(work, sizeof(work),"%*.*e", integer_place, fraction_place, number);).
Here are a few examples of libxml2 handling XPath computations – and libxslt handling the XSLT format-number() function.
printf '%s\n' '<v>1240057409536</v>' |
xmlstarlet select -T -t \
-v '*' -n \
-v '0 + *' -n \
-v '* div 1' -n \
-v '* div 1000 * 1E3' -n \
-v '* div 1.240057409536e+12' -n \
-o '---' -n \
-v 'round(* div 1)' -n \
-v 'round(* div 10)' -n \
-v 'round(* div 100)' -n \
-v 'round(* div 1000)' -n \
-o '---' -n \
-v 'format-number(* div 1,"#")' -n \
-v 'format-number(* div 1,"#,###")' -n
Output:
1240057409536
1.240057409536e+12
1.240057409536e+12
1.240057409536e+12
1
---
1.240057409536e+12
1.24005740954e+11
1.2400574095e+10
1240057410
---
1240057409536
1,240,057,409,536
In this document longer commands are usually split across lines and indented, like this:
xmlstarlet select -T -t \
--var sq -o "'" -b \
-o 'xmlstarlet edit --omit-decl '\\ -n \
-o " --var N 'Names/Name' \\" -n \
-m '*/*' \
-o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
-o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n \
-b \
-f -n \
"${infile:-file.xml}"
To strip such a command of line continuation characters and leading whitespace, pipe it through the following sed command (changing one line, not an entire shell script),
sed -e ':1' -e 's/^[[:blank:]]*//' -e '/\\$/!b' -e '$b' -e 'N' -e 's/\\\n[[:blank:]]*//' -e 'b1'
or, as an alias, silently using xsel to paste from the clipboard, call sed, have paste add a trailing newline if needed, and return the result to the clipboard:
alias mfyoi="xsel -b -o |
sed -e 's/^[[:blank:]]*//' -e ':1' -e '/\\\\\$/!b' \
-e '\$b' -e 'N' -e 's/\\\\\\n[[:blank:]]*//' -e 'b1' |
paste -s -d '\\n' |
xsel -b -i"
Thus minified:
xmlstarlet select -T -t --var sq -o "'" -b -o 'xmlstarlet edit --omit-decl '\\ -n -o " --var N 'Names/Name' \\" -n -m '*/*' -o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' -o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o ' '\\ -n -b -f -n "${infile:-file.xml}"
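For a quick check without the clipboard, the same sed program can be wrapped in a shell function and fed a small sample. This is only a sketch: the function name minify and the three-line test command are made up for illustration, while the sed expressions are the ones given above.

```shell
# minify (hypothetical name): join backslash-continued lines and drop
# leading blanks, using the sed program from the text above.
minify() {
  sed -e ':1' -e 's/^[[:blank:]]*//' -e '/\\$/!b' -e '$b' \
      -e 'N' -e 's/\\\n[[:blank:]]*//' -e 'b1'
}

# Made-up three-line command used only as test input.
printf '%s\n' 'echo one \' '    two \' '    three' | minify
# prints: echo one two three
```

Lines that do not end in a backslash pass through with only their leading blanks removed.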
makefile notes (GNU Make)
$ (dollar sign) starts expansion of a variable / parameter
make, e.g. $< $T $(mvar) ${mvar}, use $$ for a literal
$# $$ $svar ${svar}
$xvar
\ (backslash) is make’s (and the shell’s) escape character; it has no special meaning in XPath or XSLT
make functions and variables are expanded before the shell is invoked to execute a recipe
xmlstarlet’s exit values aren’t all orthodox
Links: GNU Make manual | Ask Mr. Make article on GNU Make escaping
Sample makefile:
SHELL := /bin/sh
space := $(info) $(info)
tab := $(shell printf '\t')
define newline =
endef
# next line defines U+0023 NUMBER SIGN (aka \043, pound sign, hashtag, …)
\H := \#
.RECIPEPREFIX = >
.PHONY: all
all:
> printf '%s' '<v a="fee" b="fi" c="fo" d="fum"/>' | \
xmlstarlet select -T -t --var x='*/@*' -v '$$x' -n | \
paste -s -d '$$ ' -
> printf '%s\n' '$(space)x$(tab)\$(newline)'"$${OLDPWD$(\H)$(\H)*/}" \
"process $$$$ exiting"
Output from make -s:
fee$fi fo$fum
x \
incubator
process 20965 exiting
Global options go before the command, as in xmlstarlet -q format file.
An input filename starting with - (dash) – unless it’s short for stdin – must be prefixed with ./ (dot slash), otherwise it will be parsed as an option, possibly causing select (Caution) to ignore the file.
Beware of known bugs for filenames containing (#123) ' (single quote), or (#110) urlencoded characters, e.g. %20.
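The dash rule can be applied mechanically in a script. The following is a small sketch with a hypothetical helper, safe_name, which is not part of xmlstarlet; it leaves a lone - (stdin) and ordinary names alone and prefixes ./ to anything else starting with a dash.

```shell
# safe_name (hypothetical helper): make a filename safe to pass as an
# input file by prefixing ./ when it starts with a dash.
safe_name() {
  case $1 in
    -)  printf '%s\n' "$1" ;;      # lone dash means stdin, keep it
    -*) printf './%s\n' "$1" ;;    # would otherwise be parsed as an option
    *)  printf '%s\n' "$1" ;;      # ordinary name, unchanged
  esac
}

safe_name '-draft.xml'   # prints ./-draft.xml
safe_name 'file.xml'     # prints file.xml
safe_name '-'            # prints -
```

It does nothing about the quote and urlencoding bugs mentioned above.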
See also: couldn’t read file | failed to load external entity
--help
xmlstarlet --help shows the general usage reminder, xmlstarlet «command» -h (--help) the command-specific ditto.
--version
Prints version information and terminates.
Sample output from xmlstarlet --version:
1.6.1
compiled against libxml2 2.9.4, linked with 20910
compiled against libxslt 1.1.33, linked with 10134
-q (--quiet): suppress error output
Error messages from libxml2 or libxslt are suppressed by this option.
Caution: this option also suppresses ordinary output (to stdout) from xmlstarlet select.
See also: select -Q (--quiet) local option | format -Q (--quiet) local option
--no-doc-namespace: don’t use namespace bindings from input’s root element
--doc-namespace: extract namespace bindings from input’s root element (default)
By default (--doc-namespace being in effect) namespaces declared in input’s root element (aka document element) can be referred to without explicit -N options; if the default namespace is declared there it is bound to the _ (underscore) (aka DEFAULT) prefix.
Although --no-doc-namespace and --doc-namespace are global options only xmlstarlet select and xmlstarlet edit use them.
See also: User’s guide ch. 5 | Use a namespace | select -N | edit -N
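A minimal sketch of the _ prefix in use, assuming a made-up document whose root element declares a default namespace; the namespace URI, the element names, and the helper name first_e are all hypothetical.

```shell
# Hypothetical input with a default namespace declared on the root element.
nsdoc='<r xmlns="urn:example:demo"><e>fee</e></r>'

# first_e XML: select r/e through the _ prefix; no -N option is given,
# relying on the default --doc-namespace behaviour described above.
first_e() {
  printf '%s\n' "$1" | xmlstarlet select -T -t -v '/_:r/_:e' -n
}

# Usage: first_e "$nsdoc"
```

With --no-doc-namespace in effect the same expression would need an explicit -N binding instead.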
Network access (--net)
Several xmlstarlet commands - select, edit, format, c14n, validate, and transform - have a --net option to allow network access, to fetch remote DTDs and entities. --net clears the XML_PARSE_NONET flag for the libxml2 XML parser (API ref).
For security, network access is disallowed by default, cf. article on XML external entity attack.
uri replacing input file
xmlstarlet --help says,
Wherever file name mentioned in command help it is assumed that URL can be used instead as well.
Should work with HTTP and FTP protocols, not HTTPS (due to libxml2 limitations). (Distribution-dependent?)
See also: --net
xmlstarlet elements
xmlstarlet elements (aka el) displays the structure of an XML document by listing the paths of elements and optionally attributes and attribute values.
elements [option] [«xml-file»]
At most one option and one input file are accepted.
-a - include attributes
-v - include attribute values
-u - sorted unique lines
-dN - sorted unique lines to depth N
$ : "${infile=recently-used.xbel}"
$ :
$ xmlstarlet elements -u "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
xbel/bookmark/info/metadata
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/mime:mime-type
$ :
$ xmlstarlet el -d3 "${infile}"
xbel
xbel/bookmark
xbel/bookmark/info
$ :
$ # Skip repetitions
$ xmlstarlet el -a "${infile}" | awk '!seen[$1]++' | head -n 10
xbel
xbel/@xmlns:bookmark
xbel/@xmlns:mime
xbel/@version
xbel/bookmark
xbel/bookmark/@href
xbel/bookmark/@added
xbel/bookmark/@modified
xbel/bookmark/@visited
xbel/bookmark/info
$ :
$ xmlstarlet el -v "${infile}" | sed '2d;9q'
xbel[@xmlns:bookmark='http://www.freedesktop.org/standards/desktop-bookmarks' and @xmlns:mime='http://www.freedesktop.org/standards/shared-mime-info' and @version='1.0']
xbel/bookmark/info
xbel/bookmark/info/metadata[@owner='http://freedesktop.org']
xbel/bookmark/info/metadata/mime:mime-type[@type='image/jpeg']
xbel/bookmark/info/metadata/bookmark:groups
xbel/bookmark/info/metadata/bookmark:groups/bookmark:group
xbel/bookmark/info/metadata/bookmark:applications
xbel/bookmark/info/metadata/bookmark:applications/bookmark:application[@name='Image Viewer' and @exec="'eog %u'" and @modified='2022-03-28T07:27:27Z' and @count='1']
$ :
$ # Compute tree height as maximum branch node depth
$ xmlstarlet el -u "${infile}" | awk -F / '{d=NF-1;if(d>h)h=d}END{print 0+h}'
5
See also: Print XPath of selected elements or attributes example
xmlstarlet select
xmlstarlet select (aka sel) is basically a shorthand XSLT generator that can either process or print the stylesheet it generates. Typically used to extract and format data, it supports a subset of XSLT 1.0 elements, all XPath 1.0 and XSLT 1.0 functions, plus the EXSLT functions offered by libexslt.
select implements 7 XSLT instruction elements – xsl:attribute, xsl:choose, xsl:copy-of, xsl:element, xsl:for-each, xsl:text, xsl:value-of – plus xsl:variable (and xsl:stylesheet, xsl:template, xsl:output partially) but note the absence of xsl:apply-templates, xsl:key a.o. This means recursion and identity transforms are off-limits (unless resorting to code generation).
xmlstarlet select returns the same system-property() values as xmlstarlet transform. A stylesheet generated by select appears as located in the current directory.
Like grep, xmlstarlet select returns an exit value of 1 if no nodes were selected, e.g. xmlstarlet select -T -t -m '(//xsl:document)[1]' -f *.xsl returns 0 if at least one input file matches the XPath expression, otherwise 1 (with or without the -Q (--quiet) option).
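The grep-like exit value lends itself to shell conditionals. The sketch below wraps it in a hypothetical function, has_nodes, modelled on the -m … -f command above but writing a fixed marker with -o instead of the filename; the function name and argument order are assumptions for illustration.

```shell
# has_nodes FILE XPATH (hypothetical helper): succeed if XPATH selects
# at least one node in FILE, fail otherwise.
# -Q keeps standard output quiet; -m … -o emits a marker per selected node.
has_nodes() {
  xmlstarlet select -Q -t -m "$2" -o 'x' "$1"
}

# Usage: if has_nodes file.xml '//e[@id="b"]'; then echo found; fi
```

Keep in mind that a non-zero status can also mean a bad file or bad arguments, not only an empty selection.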
See also: XML parsing and serialization
Caution: xmlstarlet select does not flag invalid non-template options (src/xml_select.c#selParseOptions()) and ignores characters following the first letter in short template options (src/xml_select.c#selGenTemplate()). The next command outputs optfuscation:
xmlstarlet select --nonet --rsn -:=% -C -t -i\*r 2=2 -eR_W- x -a'!e'ee y -omit z -bar -b:rrrf -newln | {
xmlstarlet select -C -t -i 2=2 -e x -a y -o z -b -b -n |
cmp -s - /dev/fd/3
} 3<&0 && echo 'optfuscation' || echo 'returned non-zero'
select [option …] template … [«xml-file» …]
-h (--help) - display help
-Q (--quiet) - do not write anything to standard output
See also: global option -q (--quiet) (lowercase -q)
-C (--comp) - display generated XSLT
Lists the XSLT stylesheet that will be generated from the current template options. No input file is required for this option. It produces no output other than a stylesheet or an error message.
Usage samples: -t -m … | --output … | --value-of … | --xinclude
See also: Introspection example
-R (--root) - print root element <xsl-select>
Wraps a container element named xsl-select around output. It includes any namespace nodes declared with -N «prefix»=«value» except the predefined namespaces.
-T (--text) - output is text (default is XML)
Sets method="text" on the xsl:output element.
$ cat file.xml
<v><w>a&amp;</w><w>l&lt;</w><w>q&quot;</w><w>g&gt;</w></v>
$ :
$ xmlstarlet select -t -c '*/*[position()>2]' -n file.xml
<w>q"</w><w>g&gt;</w>
$ :
$ xmlstarlet select --text -t -c '*/*[position()<3]' -n file.xml
a&l<
See also: Special characters | -o (--output)
-I (--indent) - indent output
Sets indent="yes" on the xsl:output element.
To re-indent, for example:
xmlstarlet select -B -I -t -c / in.xml > out.xml
See also: XML parsing and serialization | -B (--noblanks)
-D (--xml-decl) - do not omit XML declaration line
Sets omit-xml-declaration="no" on the xsl:output element.
Use with -E (--encode) to specify encoding.
See also: XML declaration
-B (--noblanks) - remove nonsignificant whitespace from XML tree
To strip nonsignificant whitespace, for example:
xmlstarlet select -B -t -c / in.xml > out.xml
See also: XML parsing and serialization
-E (--encode) «encoding» - output in the given encoding
Encoding value for the XML declaration, e.g. UTF-8, ISO-8859-2. Use with -D (--xml-decl).
See also: XML declaration
-N «prefix»=«value» - declare namespaces
This option is repeatable. E.g. -N xsql='urn:oracle-xsql' -N X='http://www.w3.org/1999/xhtml'. Either side of the equal sign may be empty[1], e.g. -N ''='' (or -N =) for xmlns="".
Not needed for predefined namespaces or those declared in input’s root element (see --doc-namespace) but required
document() function)
--no-doc-namespace global option is in effect
select’s --var «name»=«value» namespace issue
crypto (see EXSLT)
See also: Use a namespace | -R (--root) | edit -N
[1] -N foo='' is not allowed; echo '<v/>' | xmlstarlet sel -N foo= -t -e a -e foo:k outputs <a><k/></a>.
--net - allow fetch DTDs or entities over network
See also: network access
-t (--template) is <xsl:template match="/">
The -t (--template) option marks the beginning of an xsl:template element which ends at a following -t option (i.e. non-nestable) or at the last option after -t. -t must be followed by at least one template option. NB: <xsl:template match="expression"> cannot be generated by combining -t and -m options.
-t (--template) makes the root node (/, not the root element /*) the current node so XPath expressions can be relative,
xmlstarlet select -t -m '*/*/r' -v '@id' -n file
even obscure,
echo '<q>2</q>' | xmlstarlet sel -t -v '*******************'
with thanks to Michael Kay for his original Christmas cracker the output of which is 1024.
As xmlstarlet select --help shows, two or more --templates are implemented as:
<xsl:template match="/">
<xsl:call-template name="t1"/>
<xsl:call-template name="t2"/>
…
</xsl:template>
See also: List the generated XSLT -C (--comp)
-m (--match) is <xsl:for-each select="xpath-expr">
-m (--match) is a rare misnomer among xmlstarlet’s option names: it translates to the xsl:for-each element and has nothing to do with an xsl:template pattern. -m (--match) is nestable and can be explicitly terminated with -b (--break).
Links: XSLT current node | XSLT xsl:for-each | XSLT current() | XPath context node
xsl:for-each changes the current node. The XPath functions position() and last() return the context position and context size, respectively.
$ printf '<v w="a:b:c:d:e:f:g:h:i:j"/>' |
xmlstarlet select --text -t \
-m 'str:split(v/@w,":")' \
--if 'position() mod 3 = 0' \
-v 'concat(position()," ",.," ")'
3 c 6 f 9 i
whereas -m 'str:split(v/@w,":")[position() mod 3 = 0]' -v 'concat(…)' outputs 1 c 2 f 3 i.
Keeping a reference to root for node changes.
$ cat file.xml
<r><e id="a">fee</e><e id="b">fi</e><e id="c">fo</e><e id="d">fum</e></r>
$ :
$ xmlstarlet select -T -t \
-m 'str:split("a b c d")' \
-v 'concat(//e[@id=current()],". ")' \
-b -n \
file.xml
. . . .
$ :
$ xmlstarlet select -T -t \
--var R='/' \
-m 'str:split("a b c d")' \
-v 'concat($R//e[@id=current()],". ")' \
-b -n \
file.xml
fee. fi. fo. fum.
-s (--sort) is <xsl:sort …/>
To process a nodeset in sorted order add one or more -s (--sort) 'X:Y:Z' 'xpath' options immediately after -m (--match).
X is one of A | D | - to set order ascending | descending | unspecified
Y is one of N | T | - to set data-type number | text | unspecified
Z is one of U | L | - to set case-order upper-first | lower-first | unspecified
- (unspecified)
For example:
-s 'A:N:-' '.'
-s 'D:N:-' 'position()'
DD.MM.YYYY dates by year, month, date:
-s 'A:T:-' 'concat(substring(.,7,4), substring(.,4,2), substring(.,1,2))'
See also: examples/sort* | Query Euro rates | Remove all but the latest member of each group
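To see the difference between the N and T data types, here is a small sketch on made-up input; the helper name sort_values and the sample document are hypothetical, the option syntax is the 'X:Y:Z' form described above.

```shell
# Hypothetical sample document (not from the notes).
nums='<v><w>3</w><w>10</w><w>2</w></v>'

# sort_values SPEC XML: list the child values of the root element in the
# order given by SPEC, an 'X:Y:Z' sort specification.
sort_values() {
  printf '%s\n' "$2" |
    xmlstarlet select -T -t -m '*/*' -s "$1" '.' -v '.' -n
}

# Usage: sort_values 'A:N:-' "$nums"   # numeric order: 2, 3, 10
#        sort_values 'A:T:-' "$nums"   # text order:    10, 2, 3
```

The -s option follows -m directly, as required; placing it after -v would not sort anything.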
--var «name» «value» --break is <xsl:variable name="…">«value»</xsl:variable>
--var «name»=«value» is <xsl:variable name="…"/>
xmlstarlet select has 2 forms of --var, cf. xsl:variable:
--var name=value, e.g.
--var n='5'
--var s='"fee fi fo fum"'
--var f='true()'
--var V='//_:abc[@class="def"]'
--var W='$V/_:ghi[boolean(@jkl)]'
-m 'str:split($ws)' --var w='.' …
--var lut='document("")//xsl:variable[@name="rtf"]/*'
--var name value --break
-b (--break), e.g.
--var nl -n -b
--var s -o '<f&g>' -b
--var stuff -e doranc -c 'a[c] | d[c]' -b -b
--var lines -m '$expr' -v '…' -n -b -b
--var reply --if '$v > 4' -o 'yes' --elif '$v < 2' -o 'no' --else -o 'maybe' -b -b
(xmlstarlet edit has 1 form: --var name xpath.)
See also: --var «name»=«value» namespace issue
Result tree fragment (RTF) demo:
printf '<v/>\n' |
xmlstarlet select -t \
--var rtf \
-e x -a k -o 1st -b -o First. -b \
-e x -a k -o 2nd -b -o Second. -b \
-e x -a k -o 3rd -b -o Third. -b \
-b \
--var tbl='exslt:node-set($rtf)' \
-v 'exslt:object-type($rtf)' -o ' rtf ' -v '$rtf' -n -c '$rtf' -n \
-v 'exslt:object-type($tbl)' -o ' tbl ' -v '$tbl' -n -c '$tbl' -n
Output:
RTF rtf First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>
node-set tbl First.Second.Third.
<x k="1st">First.</x><x k="2nd">Second.</x><x k="3rd">Third.</x>
$tbl/x[@k="2nd"] is a valid XPath expression, $rtf/x[@k="2nd"] is not and triggers an Invalid type run-time error.
See also: RTF examples file list | accumulation
Links: exslt:node-set | nodeset vs. RTF by David Carlisle, Jörg Pietschmann | RTF background by Michael Kay
--var «name»=«value» namespace issue
Caution: An EXSLT namespace prefix (other than exslt (?)) used only inside xmlstarlet select’s --var name='…' triggers runtime error xmlXPathCompOpEval: function «func» bound to undefined prefix «ns» unless option -N ns=… is given. Workaround: use -N ns=… or use the prefix outside --var name='…', e.g. in -v or -m or (for string content) --var name … -b.
$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
--var d='str:split(v/@s)' \
-v '$d' -n
xmlXPathCompOpEval: function split bound to undefined prefix str
runtime error: element variable
Failed to evaluate the expression of variable 'd'.
no result for -
$ :
$ printf '%s\n' '<v s="a b c"/>' |
xmlstarlet select -t \
-m 'str:split(v/@s)' \
-v . -b -n
abc
-o (--output) is <xsl:text>«value»</xsl:text>
$ xmlstarlet select -T -C -t -o 'A<&'\''">z'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
<xsl:template match="/">
<xsl:text>A&lt;&amp;'"&gt;z</xsl:text>
</xsl:template>
</xsl:stylesheet>
-o '' translates to an empty <xsl:text/> element.
See also: Special characters
To manage parameters of the xsl:output element, see XML parsing and serialization.
-e (--elem) is <xsl:element name="…">
-e is nestable and can be explicitly terminated with -b (--break).
See also: Create a namespace | Create a SOAP envelope example
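A minimal sketch of nesting and closing -e, using made-up element names (a, b, c) and a hypothetical helper name; the first -b closes b so that c becomes its sibling rather than its child, and the remaining open elements are closed at the end of the command.

```shell
# nested_elems (hypothetical helper): build <a> with two children.
# -b after "-o x" closes <b>; <c> and <a> are left to close implicitly.
nested_elems() {
  printf '<v/>\n' |
    xmlstarlet select -t -e a -e b -o x -b -e c -o y
}

# Usage: nested_elems
# expected result: <a><b>x</b><c>y</c></a>
```

Dropping the -b would nest c inside b instead.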
-a (--attr) is <xsl:attribute name="…">
-a can be explicitly terminated with -b (--break).
In XSLT, the latter of two same-named attributes is accepted, e.g.
$ echo '<v/>' |
xmlstarlet select -t -e doc -a f -o n -b -a f -o y
<doc f="y"/>
-c (--copy-of) is <xsl:copy-of select="xpath-expr"/>
See examples at: -T (--text) | -I (--indent)
-v (--value-of) is string-join((xpath-expr),newline)
With zero or one nodeset members in xpath-expr, -v (--value-of) works exactly as XSLT 1.0’s <xsl:value-of select="xpath-expr"/>; otherwise (like string-join() in XSLT 2.0) all members are output, stringified and separated by newlines.
$ echo '<v><w>fee</w><w>fi</w><w>fo</w><w>fum</w></v>' |
xmlstarlet select -T -t -v '*/*' -t -n
fee
fi
fo
fum
Adding the -C (--comp) option lists the generated XSLT code, including the value-of-template:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt">
<xsl:output omit-xml-declaration="yes" indent="no" method="text"/>
<xsl:template match="/">
<xsl:call-template name="t1"/>
<xsl:call-template name="t2"/>
</xsl:template>
<xsl:template name="t1">
<xsl:call-template name="value-of-template">
<xsl:with-param name="select" select="*/*"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="t2">
<xsl:value-of select="'&#10;'"/>
</xsl:template>
<xsl:template name="value-of-template">
<xsl:param name="select"/>
<xsl:value-of select="$select"/>
<xsl:for-each select="exslt:node-set($select)[position()>1]">
<xsl:value-of select="'&#10;'"/>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
-i (--if) [--elif …] [--else] is <xsl:when> … [<xsl:otherwise>]
-i (--if) is nestable and can be explicitly terminated with -b (--break). It translates to an xsl:choose element.
-b (--break) ends current container element
-b (--break) closes the currently open container element, one of:
-m (--match) (nestable)
-i (--if, --elif, --else) (nestable)
-e (--elem) (nestable)
-a (--attr)
-a 'data-dec' -o 'pre' -v '.' -o 'suf' -b
--var without = (nestable)
--var idls -m 'was[not(was)]' -v 'concat(@id,$sep,@class)' -n -b -b
--vars are local to the enclosing --var but must have unique variable names.
-t (--template)
These can be followed by a variable number of options and so must be terminated explicitly unless followed by one of:
-t (--template) option
xmlstarlet command
closing all open elements. In other words, trailing -bs may be omitted if they’re the last options in the current template.
One -b (--break) too many can trigger the compilation error xsltParseStylesheetTop: unknown «name» element.
-n (--nl) prints a newline
-f (--inp-name) prints pathname / URI of current input
Shorthand for -v '$inputFile' (a predefined variable). Outputs - (dash) for standard input (stdin).
Download (< 2K) and convert the European Central Bank’s Euro rates sorted by currency in Ascending order as Text, Upper-first:
wget -qO- 'https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml' |
xmlstarlet select --text -t \
-m '//_:Cube[@currency]' \
-s 'A:T:U' '@currency' \
-v 'concat(@currency," ",@rate)' -n
See also: -s (--sort)
List files in current dir and subdirs containing at least one milk element (returns non-zero if no match):
find . -type f -name '*.xml' -exec \
xmlstarlet select -T -t -m '(//*[local-name()="milk"])[1]' -f -n {} +
Return zero if at least one XML element text exactly matches milk, otherwise non-zero (no output is produced):
find . -type f -name '*.xml' -exec \
xmlstarlet select -Q -T -t -m '(//*[text()="milk"])[1]' -f -n {} +
find’s {} + fills up the command line with pathnames.
See also: -f (--inp-name) | -Q (--quiet) | exit values
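How {} + batches pathnames, compared with one invocation per file, can be seen without xmlstarlet at all. The sketch below uses a hypothetical scratch directory and a tiny sh -c stand-in that merely counts its arguments.

```shell
# Hypothetical scratch directory with three empty files, for illustration only.
tmp=$(mktemp -d)
touch "$tmp/a.xml" "$tmp/b.xml" "$tmp/c.xml"

# '{} +' : pathnames are batched, so a single invocation sees all three.
batched=$(find "$tmp" -type f -name '*.xml' \
  -exec sh -c 'echo "$# pathname(s)"' sh {} +)

# '{} ;' : the command runs once per pathname.
single=$(find "$tmp" -type f -name '*.xml' \
  -exec sh -c 'echo "$# pathname(s)"' sh {} \;)

printf '%s\n' "batched: $batched" "one by one:" "$single"
rm -r "$tmp"
```

With batching, one xmlstarlet process handles many input files, which is why -f (--inp-name) is needed to tell which file produced a match.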
Note: This handles element or attribute nodes but no other node types.
: ${fileglob:=/usr/share/*/xslt/docbook/common/db-common.xsl}
: ${target:='//xsl:param[string(@select)]'}
xmlstarlet select --text -t \
-m "${target}" \
-m 'ancestor-or-self::*' \
--var pos='1+count(preceding-sibling::*[name() = name(current())])' \
-v 'concat("/",name(),"[",$pos,"]")' \
-b \
--if 'count(. | ../@*) = count(../@*)' \
-v 'concat("/@",name())' \
-b \
-n \
${fileglob}
where:
${varname:=…} assigns a default value by shell parameter expansion via the built-in colon utility
the first -m (--match) option specifies the target, with optional search conditions given as XPath predicates
the second -m (--match) builds the XPath of elements from root to target, calculating position by counting siblings using the XSLT 1.0 current() function
an attribute target is not matched by ancestor-or-self::* – the -i (--if) clause adjusts the XPath
Output:
/xsl:stylesheet[1]/xsl:template[1]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[4]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[2]
/xsl:stylesheet[1]/xsl:template[6]/xsl:param[1]
Output if called with target='//xsl:*/@test[contains(.,"position")]':
/xsl:stylesheet[1]/xsl:template[2]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[3]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[3]/@test
See also: xmlstarlet elements
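The default-value idiom used for ${fileglob} and ${target} in the script above can be tried in isolation. This is a minimal sketch of plain POSIX shell behaviour; the two XPath strings are borrowed from the example and the variables first and second exist only to capture the results.

```shell
# Unset variable: the default after := is assigned.
unset target
: "${target:=//xsl:param[string(@select)]}"
first=$target

# Variable already set (e.g. by the caller): the existing value is kept.
target='//xsl:*/@test'
: "${target:=//xsl:param[string(@select)]}"
second=$target

printf '%s\n' "$first" "$second"
```

This is what lets the script be called as target='…' script.sh to override the built-in default without editing it.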
If the plaintext input is uncomplicated perhaps EXSLT’s string functions can do the conversion. Note that str:replace, str:split, and str:tokenize are available for xmlstarlet select (and transform), but not for edit.
<root>
A;2022-08-10;db #1
B;sortie bidon;50.0
A;2022-08-12;db Cth
B;mali climber;40.0
C;fray illumine;9.75
</root>
ifs, iss, and irs, respectively
str:split function, while applying XML markup
A in input
xmlstarlet select --indent -t \
--var ifs -o ';' -b \
--var iss -n -b \
--var irs='concat($iss,"A")' \
-e recs \
-m 'str:split(*,$irs)' \
-e rec \
--var sr='str:split(.,$iss)' \
--var hd='str:split($sr[1],$ifs)' \
-e hd \
-e dt -v '$hd[1]' -b \
-e wd -v '$hd[2]' -b \
-b \
-e bd \
-m '$sr[position()!=1]' \
--var f='str:split(.,$ifs)' \
-e fld \
-a typ -v '$f[1]' -b \
-e dsc -v '$f[2]' -b \
-e amt -v '$f[3]' -b \
"${infile:-file.xml}"
See also: --var | -m (--match) | -e (--elem) | -b (--break)
Output:
<recs>
<rec>
<hd>
<dt>2022-08-10</dt>
<wd>db #1</wd>
</hd>
<bd>
<fld typ="B">
<dsc>sortie bidon</dsc>
<amt>50.0</amt>
</fld>
</bd>
</rec>
<rec>
<hd>
<dt>2022-08-12</dt>
<wd>db Cth</wd>
</hd>
<bd>
<fld typ="B">
<dsc>mali climber</dsc>
<amt>40.0</amt>
</fld>
<fld typ="C">
<dsc>fray illumine</dsc>
<amt>9.75</amt>
</fld>
</bd>
</rec>
</recs>
document() function
Links: document() in W3C rec
The XSLT document() function
xmlstarlet select)
xmlstarlet abends with an error message such as Extra content at the end of the document and (Caution) exit value 0
document("")
Examples: merge 2 XML files | extract and merge records | introspection | external lookup table
Insert child nodes of ${partfile}’s root element into ${infile}’s ${destination} element – using a 3-stage pipeline:
xmlstarlet select -R -t \
--var part -o "${partfile:-file2.xml}" -b \
-c ' / | document($part)' "${infile:-file.xml}" |
xmlstarlet edit -m '/xsl-select/*[2]/node()' '/xsl-select'"${destination:-/..}" |
xmlstarlet select -B -I -t -c '/xsl-select/*[1]'
select to copy the 2 documents and wrap them (-R) as /xsl-select/*[1] and /xsl-select/*[2], using document() to access the ${partfile} – either ${infile} or ${partfile} can be /dev/stdin
edit to -m (--move) children of ${partfile}’s root element to ${destination} – an XPath expression locating an element in ${infile} – incoming nodes will be appended as last nodes there
the default ${destination} (/..) causes an error to be generated and must be overridden
select to extract and format the merged document
See also: -R (--root) | --var … -b | -B (--noblanks) | -I (--indent) | -c (--copy-of)
If called with this ${partfile}
<items>
<item>1</item><item>2</item><item>3</item>
</items>
and this ${infile}
<doc><g><g1/><g2/><g3/></g></doc>
and destination=/doc//g1, then output becomes:
<doc>
<g>
<g1>
<item>1</item>
<item>2</item>
<item>3</item>
</g1>
<g2/>
<g3/>
</g>
</doc>
See also: transform --xinclude
Given a number of similar XML input files each containing a simple record set,
echo '<v/>' |
xmlstarlet select -R -I -t \
--var fls \
-e f -o 'data/rs1.xml' -b \
-e f -o 'data/rs2.xml' -b \
-e f -o 'data/rs3.xml' -b \
-b \
-c 'document(exslt:node-set($fls)/f) /*/r'
select --var)
exslt:node-set() function
…/f) to document()
document() returns the root nodes of the XML trees parsed from the input files
-c (--copy-of) copies the r elements (…/*/r) from the source trees to the result tree
See also: -R (--root) | -I (--indent) | select --var | -e (--elem)
Output:
<xsl-select>
<r a1="x" a2="42" a3="-2"/>
<r a1="x" a2="41" a3="-2"/>
<!-- etc. -->
</xsl-select>
Also possible:
-c '(document("…1.xml") | document("…2.xml") | document("…3.xml")) /*/r'
-c 'document("…1.xml")/*/r' \
-c 'document("…2.xml")/*/r' …
If file order determined by sort is sufficient, the EXSLT str:split() function can split the newline-separated output from find into a nodeset:
echo '<v/>' |
xmlstarlet sel -R -I -t \
--var sep -n -b \
--var fls2 -o "$(find 'data' -type f -name 'rs*.xml' | sort)" -b \
-c 'document(str:split($fls2,$sep)) /*/r'
With a different -c (--copy-of) argument in the previous example,
-c 'document("")'
outputs the stylesheet like the -C (--comp) option (but inside a wrapper element here because of -R (--root)).
With
-c 'document("")//xsl:variable[@name="fls"]'
the file list variable is copied:
<xsl-select>
<xsl:variable xmlns:xsl="http://www.w3.org/1999/XSL/Transform" name="fls">
<xsl:element name="f">data/rs1.xml</xsl:element>
<xsl:element name="f">data/rs2.xml</xsl:element>
<xsl:element name="f">data/rs3.xml</xsl:element>
</xsl:variable>
</xsl-select>
A simple food composition table lists – per 100 gram food – the calorie count (kcal) as well as the amount in grams of protein, fat, and carbohydrate:
<fc:foodcomp xmlns:fc="urn:foodcomp-subset">
<fc:nutrient nid="n0893" kcal="297" prot="24.3" fat="1.9" carb="48.8" name="Lentils, green, dried, raw"/>
<fc:nutrient nid="n2443" kcal="98" prot="7.9" fat="0.6" carb="16.3" name="Garlic, raw"/>
<!-- etc. -->
</fc:foodcomp>
With an input file containing a culinary recipe of the form
<recipe servings="4" name="Lentil and goats' cheese salad">
<ingredients>
<ing foodid="n0893" grams="200" name="green lentils"/>
<ing foodid="n2443" grams="10" name="garlic"/>
<!-- etc. -->
</ingredients>
<method><!-- etc. --></method>
</recipe>
specify the calorie count per serving per ingredient:
-N «prefix»=«value» to declare lookup table’s namespace
document() to access the external lookup table
recipe (RP) and lookup (FC) documents as -m (--match) changes the current node
format-number()
xmlstarlet select --text -N fc='urn:foodcomp-subset' -t \
--var fcfile -o "${lutfile:-file2.xml}" -b \
--var FC='document($fcfile)/fc:foodcomp' \
--var RP='/recipe' \
-m '//ing' \
--var kcal='$FC/*[@nid = current()/@foodid]/@kcal' \
--var kcal-per-serv='$kcal div 100.0 * @grams div $RP/@servings' \
-v 'str:align(current()/@name,str:padding(20," ."),"left")' \
-o ' : ' \
-v 'str:align(format-number($kcal-per-serv,"0 kcal"),str:padding(8),"right")' \
-n \
-b \
"${infile:-file.xml}"
Output:
green lentils. . . . : 149 kcal
garlic . . . . . . . : 2 kcal
lemon juice. . . . . : 0 kcal
extra virgin olive o : 56 kcal
fresh basil. . . . . : 4 kcal
goats' cheese. . . . : 96 kcal
black pepper . . . . : 0 kcal
salt . . . . . . . . : 0 kcal
To compute nutritional values for an entire recipe collect the gram-weighted food composition data – here in a result tree fragment (RTF, cf. select --var) as data size is modest – and sum(…) vertically, along the lines of:
…
--var attrib='str:split("kcal prot fat carb")' \
--var nutr-weighted-rtf \
-m '//ing' \
--var ing='.' \
-e data \
-c '@foodid' \
-m '$attrib' \
-a '{.}' -v '$FC/*[@nid = $ing/@foodid]/@*[name() = current()] div 100.0 * $ing/@grams' -b \
-b \
-b \
-b \
-b \
--var nutr-wt='exslt:node-set($nutr-weighted-rtf)' \
-o 'Nutrition per serving: ' \
-m '$attrib' \
--var sum-per-serv='sum($nutr-wt/data/@*[name() = current()]) div $RP/@servings' \
-v 'concat(.," ",format-number($sum-per-serv,"0"))' \
…
Output:
Nutrition per serving: kcal 307, prot 19g, fat 15g, carb 26g
Caution: This method does not preserve document order as element nodes are copied after other node types, causing mixed content (e.g. marked up text) to be messed up. Unless added to the argument of -c (--copy-of), comments and processing instructions are ignored.
Recursion and xsl:template are off-limits to xmlstarlet select but nesting -m (--match) options is OK, i.e. using nested xsl:for-each elements. To make this script extract a subtree to depth N – while (Caution) removing selected prefixed namespaces – repeat the -m '*' line to reach the desired depth, and it might be the hack that works…
: "${exclude:=(//namespace::xsi)[1] | (//namespace::ns3)[1] }"
: "${subroot:=//soupenv:body}"
xmlstarlet select -B -I -N ns3='https://www.example.com/ns/ns3' -t \
--var nl -n -b \
--var xlist -n -v "${exclude}" -n -b \
-m "${subroot}" -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
-m '*' -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
-m '*' -e '{local-name()}' -c '@*[not(contains($xlist,concat($nl,namespace-uri(),$nl)))] | text()' \
"${infile:-file.xml}"
Notes:
${varname:=…} assigns a default value by shell parameter expansion via the built-in colon utility
-N «prefix»=«value» options if not declared in input’s root element; use -N = to change the default namespace to the null namespace
xlist holds the list of namespace URIs to strip, all bookended by newlines to avoid a false match in contains(…), cf. -v (--value-of), so the exclude shell variable may be specified as an XPath string of newline-separated URIs
recall that -m (--match) changes the current node; any -m '*' lines reaching beyond actual depth match nothing
xmlstarlet edit
xmlstarlet edit (aka ed) copies its input to output, supporting basic create, update, delete, rename, and move actions (operations).
Note that edit
/) as current node
$prev variable as a back reference to the most recently created node
xpath arguments
To do conditional updates, or to dynamically create -n names or -v values for an edit command, it may be worthwhile having xmlstarlet select generate it.
edit
option […] [action …] [«xml-file-or-uri» …]
-h (--help) - display help
-O (--omit-decl) - omit XML declaration
-P (--pf) - preserve original formatting
-S (--ps) - preserve non-significant spaces
-O (--omit-decl), -P (--pf), and -S (--ps) set/unset libxml2 flags, cf. XML parsing and serialization.
See also: Try out edit’s formatting options | select -I (--indent)
-L (--inplace) - edit input file(s) in-place
This option
stdout – if input is stdin
%20, cf. Global options and parameters
-P (--pf) is given
--net - allow network access
See also: network access
-N «prefix»=«value» - declare namespaces
This option is repeatable; must be last non-action option(s). E.g. -N xsql='urn:oracle-xsql'.
Not needed for predefined namespaces or those declared in input’s root element (--doc-namespace) but required
--no-doc-namespace global option is in effect
See: Use a namespace | select -N
Caution: xmlstarlet edit isn’t an XSLT processor so with or without the -N … option,
printf '%s' '<a/>' |
xmlstarlet edit --pf -O -N b='https://www.example.org/b' \
-s '*' -t elem -n 'b:c' -v 'd'
generates:
<a><b:c>d</b:c></a>
See also: Create a SOAP envelope
-i (--insert) - add node before
-a (--append) - add node after
-s (--subnode) - add node as child
There are 3 ways to add an element, an attribute, or a text node to each member of a nodeset:
xmlstarlet edit OP xpath -t node-type -n node-name -v value
where
xpath is an xpath argument
OP is one of:
-i (--insert) - insert before xpath as preceding sibling
-a (--append) - insert after xpath as following sibling
-s (--subnode) - append as last child of xpath
$prev (aka $xstar:prev) variable
-i and -a accept the root element (document element) as xpath
-t (--type) node-type selects one of these node types:
elem - element
attr - attribute; -i, -a, and -s all create an attribute in the xpath element but do not influence attribute order
text - text
-n (--name) node-name selects an XML QName, e.g. item or svg:g; required (and ignored) for text nodes
xmlstarlet edit will accept names such as !-- and , <&> . without turning a hair, cf. -r (--rename).
xmlstarlet edit will create a namespace-prefixed element like svg:g but it will not be available as such in following edit actions (explanation); workaround: use $prev.
-v (--value) value may be omitted if creating an empty elem or attr node but it’s required for text, e.g. -a 'bean' -t text -n ignored -v ''.
Entity references (&quot;, &lt; a.o.) and certain numeric character references (&#9; a.o.) are recognized in the value text – when used with -t (--type) elem – in which case &amp; is required to represent & (ampersand), cf. example at special characters. (For value used with type attr or type text the general rule applies.)
Basic usecase (v/e may replace $prev here):
$ printf '%s' '<v/>' |
xmlstarlet edit -O \
-s 'v' -t elem -n 'e' -v '42' \
-s '$prev' -t attr -n 'a' -v 'y'
<v>
<e a="y">42</e>
</v>
Examples: examples/ed-append | examples/ed-insert | examples/ed-subnode | Insert HTML <link …/>
See also: -u (--update)
$prev variable (aka $xstar:prev)
The $prev (aka $xstar:prev) variable refers to the nodeset created by the most recent -i (--insert), -a (--append), or -s (--subnode) option, which all define or redefine it. To reset $prev (to avoid a false match later) use for example -a '/..' -t elem -n nil, which matches nothing as the root node has no parent.
$prev isn’t mentioned in the user’s guide; examples are given in doc/xmlstarlet.txt and in this section.
--var name 'xpath'
The --var name xpath option to define an xmlstarlet edit variable is mentioned in doc/xmlstarlet.txt but not in the user’s guide. It uses a different format than select’s --var.
Examples:
xmlstarlet edit --inplace \
--var T '//_:p[@class="eyg"] | //_:span[contains(@class,"eyg_")]' \
--var res "$((3 * 7 * 2))" \
-u '$T' -x 'concat(.,", currently ",$res)' \
file.xhtml
xmlstarlet edit \
-s '/doc/abc' -t elem -n 'ns:nd' \
--var nsnd '$prev' \
# ...
See also: xpath arguments | -u (--update) | $prev
-u (--update) 'xpath' -v (--value) 'value'
-u (--update) 'xpath' -x (--expr) 'xpath'
There are 2 ways to modify the value of each member of a nodeset:
xmlstarlet edit -u (--update) 'xpath' -v (--value) 'value'
xmlstarlet edit -u (--update) 'xpath' -x (--expr) 'xpath'
where
-u 'xpath' is an xpath argument describing the destination nodeset
-v expects a literal value, e.g. hello, 'a b c<&d>', or "$(grep -e '^rev' abc.txt)"
-x 'xpath' is one of:
non-relative xpath argument: all -u nodes updated with same value / nodeset / variable
relative xpath argument: -u nodes updated with whatever the XPath resolves to,
-u '…' -x '. * 1.25'
-u '…' -x 'concat("prefix-",.,"-suffix")'
-u '…' -x '../../@name'
-u '…' -x 'string(../../@name)'
-x makes a deep copy of its argument. Given an element e, <e a="v"><c1/><c2/></e>, -x 'e' copies the entire thing whereas -x 'e/node() | e/@*' copies e’s child nodes and e’s attribute nodes (cf. the many-to-many move example). (Attributes are not children of their parent – background.)
Creation of a new (empty) node is often followed by an update using $prev, for example to enable an -x expression:
xmlstarlet edit --inplace \
-s '*' -t elem -n entry \
-u '$prev' -x 'date:date-time()' \
-s '$prev' -t attr -n user -v "${LOGNAME}" \
log.xml
See also: Moving nodes
-d (--delete) 'xpath'
See also: xpath arguments | Delete a namespace
-r (--rename) 'xpath' -v (--value) 'new-name'
new-name is an XML QName such as item or svg:g.
Caution: The -v (--value) clause of this option admits its new-name argument unmodified – accepting an empty string or one containing tab, newline, XML special characters a.o. – ignoring XML QName requirements. In this respect it works like the -n (--name) clause of the -i, -a, and -s options.
See also: xpath arguments | Rename elements example
-m (--move) 'xpath1' 'xpath2'
The source (xpath1) of the -m (--move) action can be nodes other than root and namespace: element, attribute, text, comment, or processing instruction.
The destination (xpath2) must be a single element (the only container node) – otherwise xmlstarlet exits with an error message and a non-zero return code. Source nodes will be appended as last nodes at destination.
Caution: With overlapping source and destination --move completes with exit value 0 and no messages.
Caution: -m (--move) causes a segmentation fault (exit value 139) if attempting a many-to-many move.
See also: xpath arguments | Moving nodes | Move a namespace
xpath arguments
For the xpath argument – in the -i, -a, -s, -d, -r, -m, -u, -x, and --var options – xmlstarlet edit can use an XPath 1.0 expression, incl. XPath 1.0 functions[1] and selected EXSLT functions, but using XSLT functions such as current(), document(), generate-id(), or format-number() triggers (as expected) the error xmlXPathCompOpEval: function «name» not found.
[1] The XPath functions position() and last() rely on an evaluation context. With xmlstarlet edit they can be used inside an XPath predicate (e.g. --move '…' '…[last()]') – last() returning the context size – but used outside a predicate (as in -u '…' -x 'substring("abcdef",position(),1)') they trigger an Invalid context position error or an Invalid context size error.
position() alternative: 1 + count(preceding-sibling::«node») or count(preceding::«node»).
xpath arguments
Based on the exslt«name»XpathCtxtRegister functions in libexslt, xmlstarlet edit supports selected functions from the EXSLT modules dates-and-times, math, sets, and strings in its xpath arguments:
dates-and-times in namespace date, cf. libexslt/date.c,
add, add-duration, date, date-time, day-abbreviation, day-in-month, day-in-week, day-in-year, day-name, day-of-week-in-month, difference, duration, hour-in-day, leap-year, minute-in-hour, month-abbreviation, month-in-year, month-name, second-in-minute, seconds, sum, time, week-in-month, week-in-year, year
math in namespace math, cf. libexslt/math.c,
abs, acos, asin, atan, atan2, constant, cos, exp, highest, log, lowest, max, min, power, random, sin, sqrt, tan
sets in namespace set, cf. libexslt/sets.c,
difference, distinct, has-same-node, intersection, leading, trailing
strings in namespace str, cf. libexslt/strings.c,
align, concat, decode-uri, encode-uri, padding
All date, math, and set functions are there but note the absence of str:replace (removed from xmlstarlet edit in 2012), str:split, and str:tokenize.
Hello, EXSLT:
printf '%s' '<v y="." e="." pi="." z="." r="."/>' |
xmlstarlet edit -O \
--var e 'math:constant("E",10)' \
-u '*/@*[math:power(1e3,0)]' -x 'date:day-name("2011-09-24")' \
-u '*/@e' -x '$e' \
-u '*/@z' -x 'count(set:distinct(/*/@*))' \
-u '*/@pi' -x 'math:constant("PI",10)' \
-u '*/@r' -x '$e*/*/@z*/*/@pi' \
-r '*/@*[starts-with(.,str:align(25," "))]' -v 'ezpi'
The number 1e3 (in scientific notation) isn’t XPath 1.0 but understood by libxml2.
Output:
<v y="Saturday" e="2.71828182" pi="3.14159265" z="3" ezpi="25.6192025590219"/>
See also: Divide a document into sections example
Caution: xmlstarlet edit -u '…' -x '…' silently deletes the first child node at destination (whether the -x XPath expression is absolute or relative); -u '…' -v '…' silently deletes all child nodes. This suggests that the -u (--update) option is intended to modify leaf nodes (aka external nodes) but xmlstarlet’s silence in this matter extends to both documentation and source code.
Links: src/xml_edit.c#edUpdate()
Given this input,
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1><D/></b1><b2/><b3><F1/><F2/></b3></b>
</r>
xmlstarlet edit -O -P \
-u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"
produces:
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1>V1</b1><b2>V1</b2><b3><F2/>V1</b3></b>
</r>
Workaround for -x: insert a sacrificial element (or non-whitespace text) node as first child. The -i (--insert) option has no effect if no nodes exist on destination’s child axis.
xmlstarlet edit -O -P \
-i 'r/b/*/node()[1]' -t elem -n 'herenow' \
-u 'r/b/*' -x '../../a/a1/text()' \
"${infile:-file.xml}"
Output:
<r><a><a1 k="v1">V1</a1>
<a2 k="v2">V2</a2></a>
<b><b1><D/>V1</b1><b2>V1</b2><b3><F1/><F2/>V1</b3></b>
</r>
If instead -i … -u 'r/b/*' -v 'gonethere':
… <b><b1>gonethere</b1><b2>gonethere</b2><b3>gonethere</b3></b> …
<link …/>
xmlstarlet edit --omit-decl --pf \
-s '/_:html/_:head' -t elem -n link \
--var lk '$prev' \
-s '$lk' -t attr -n 'rel' -v 'stylesheet' \
-s '$lk' -t attr -n 'type' -v 'text/css' \
-s '$lk' -t attr -n 'href' -v 'style/www.css' \
in.xhtml > out.xhtml
… <link rel="stylesheet" type="text/css" href="style/www.css"/>
…
See also: -P (--pf) | -s (--subnode) | --var | $prev | Use a namespace
$ cat file.xml
<doc><e f="0">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -L -O --pf --var T 'doc/e/@f' -u '$T' -x '($T+1) mod 2' file.xml
$ cat file.xml
<doc><e f="1">false<c>false</c></e></doc>
$ :
$ xmlstarlet edit -O --pf --var T 'doc/e/text()' -u '$T' -x 'not($T="true")' file.xml
<doc><e f="1">true<c>false</c></e></doc>
The -L (--inplace) option edits the input file in-place.
See also: -P (--pf) | --var | -u (--update)
The EXSLT function str:replace was removed from xmlstarlet edit in 2012 so it’s either straight XPath 1.0:
xmlstarlet edit -O \
-u 'doc/e' \
-x 'concat(substring-before(.,"&text=Ulysses"),
substring-after(.,"&text=Ulysses"))' \
"${infile:-file.xml}"
or invoke xmlstarlet select first to apply str:replace:
avar="$(xmlstarlet select --text -t \
-v 'str:replace(doc/e,"&text=Ulysses","")' "${infile:-file.xml}")"
xmlstarlet edit -O -u 'doc/e' -v "${avar}" "${infile:-file.xml}"
With this XML input:
<doc>
<e>abcdefghi/q?sid=ry12345&amp;text=Ulysses&amp;ofmt=x'"ml</e>
</doc>
both commands produce:
<doc>
<e>abcdefghi/q?sid=ry12345&amp;ofmt=x'"ml</e>
</doc>
See also: pyx, depyx
If a record rec has a desc element add the text of the sibling name element to it, otherwise add a new desc element and set its value to the text of name.
xmlstarlet edit \
-u '//recs/rec/desc' -x 'concat(.," - ",../name/text())' \
-a '//recs/rec[not(desc)]/name' -t elem -n 'desc' \
-u '$prev' -x '../name/text()' \
"${infile:-file.xml}"
See also: -u (--update) | -a (--append) | $prev | -s (--subnode)
If desc and name are attributes, instead:
xmlstarlet edit \
-u '//recs/rec/@desc' -x 'concat(.," - ",../@name)' \
-s '//recs/rec[not(@desc)]' -t attr -n 'desc' \
-u '$prev' -x 'string(../@name)' \
"${infile:-file.xml}"
Note that -x 'string(../@name)' – and ../@name as a concat() string argument – copies the attribute value, -x '../@name' the attribute node; the latter fails as attributes cannot contain other nodes (causing an empty value to be assigned to @desc).
xmlstarlet edit has no if-then-else construct so the following snippets use standard XPath 1.0 expressions and edit’s back reference $prev variable to apply conditions. All use the following nodeset variable:
--var T '/Server/Service[@name="Catalina"]'
Add $T/Connector after the last existing one:
-a '$T/Connector[last()]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '7654'
Nothing is added if no $T/Connector node exists – in which case $prev becomes null and -s '$C' … has no effect – otherwise a Connector element is appended as first following sibling (to become the new last Connector) and given a port attribute.
Add $T/Connector if none exists, as last child of $T:
-s '$T[not(Connector)]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '8765'
Nothing is added if a $T/Connector node exists – if the first -s … matches nothing then the second -s … (due to a null $prev) will match nothing.
Add $T/Connector if none exists, after $T/Executor[1] if that exists:
-a '$T[not(Connector)]/Executor[1]' -t elem -n Connector \
--var C '$prev' -s '$C' -t attr -n port -v '9876'
Nothing is added if a $T/Connector node exists or no $T/Executor node exists, otherwise the new element is appended as first following sibling of the first $T/Executor.
See also: --var | -a (--append) | -s (--subnode) | $prev | -u (--update)
This code duplicates a bean element from a formatted input file, inserts the copy right after the original, changes its @id, and preserves the original inter-element whitespace. Use with edit’s -P (--pf) or -S (--ps) option.
xmlstarlet edit --ps \
--var N '/beans/bean[@id="bean4"]' \
--var ws '$N/following::text()[1][normalize-space()=""]' \
-a '$N' -t elem -n bean \
-u '$prev' -x '$N/node() | $N/@*' \
-u '$prev/@id' -v 'bean4a' \
-i '$prev' -t text -n whitespace -v '' \
-u '$prev' -x '$ws' \
file.xml > newfile.xml
Notes:
--var N references the original bean element
ws holds the first text node after original, provided it’s all whitespace (otherwise empty)
-a … appends a new empty bean element as a following-sibling of the original
$prevs refer to the new element
-u … makes a deep copy of the original’s child and attribute nodes
-i … -u … copies the whitespace following the original bean, to keep the formatting
text requires a (dummy) name and an initial value
$prev refers to the text node created by -i
See also: --var | -a (--append) | -u (--update) | $prev | -i (--insert)
xmlstarlet edit -m or -a, -u, -r, -d
The basic usecases for moving XML nodes (except namespace nodes) from source to destination are:
The ground rules are:
-m (--move) action can handle the one-to-one or many-to-one usecases if nodes are to be appended at destination (an element node)
--var name 'xpath' collects a nodeset in a named variable
$prev variable refers to the node created by the most recent -i (--insert), -a (--append), or -s (--subnode) action
the (-x) xpath of the -u (--update) action can be relative or non-relative
In this context one-to-many is a copy (update) operation handled by -u … -v … or -u … -x … followed by -d ….
See also: xpath arguments | Namespaces
The following examples (one-to-one | many-to-one | many-to-many | move to position N) use this input XML file:
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p><span><a id="a2">value 2</a></span></p>
<p id="vol"/>
<p/>
</div>
Moving one node to another:
xmlstarlet edit --omit-decl \
-m '/div/p[4]' '/div/p[3]' \
-m '/div/p[3]/@id' '/div/p[2]' \
-m '/div/a' '/div' \
"${infile:-file.xml}"
Note the placement of a as the last element in destination.
See also: -m (--move)
Output:
<div>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p id="vol">
<span>
<a id="a2">value 2</a>
</span>
</p>
<p>
<p/>
</p>
<a>anchor</a>
</div>
Moving the first 3 p elements to the 4th p:
xmlstarlet edit --omit-decl \
-m '/div/p[position() <= 3]' '/div/p[4]' \
"${infile:-file.xml}"
Destination can also be expressed as (//p)[4], being the 4th p in document order.
See also: -m (--move)
Output:
<div>
<a>anchor</a>
<p>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2">value 2</a>
</span>
</p>
<p id="vol"/>
</p>
</div>
Whereas overlapping -m '/div/p[position() <= 3]' '/div/p[3]' gives:
<div>
<a>anchor</a>
<p/>
</div>
Move all a children of span elements up one level, then remove the emptied spans (untagging).
xmlstarlet edit --omit-decl --pf \
--var N '//span[a]' \
-a '$N' -t elem -n 'a' \
-u '$prev' -x 'preceding-sibling::span[1]/a/node() | preceding-sibling::span[1]/a/@*' \
-d '$N' \
"${infile:-file.xml}"
N variable contains the nodeset of all span elements which have an a element as an immediate child
-a (--append) creates a sibling a element for each span element in $N
-u (--update) inserts values in each newly created a element ($prev) using a relative XPath expression (-x) to make a deep copy of a’s child and attribute nodes (span being the first preceding-sibling of the new a element)
-d (--delete) … deletes each element in $N, after conversion
In the general case use the following-sibling axis with the -i (--insert) action and the preceding-sibling axis with -a (--append). In this specific case preceding::span[1] or ../span also refers to preceding-sibling::span[1].
Output:
<div>
<a>anchor</a>
<p><a id="a1">value 1</a></p>
<p><a id="a2">value 2</a></p>
<p id="vol"/>
<p/>
</div>
Alternatively, untag using an identity transform plus a template such as:
<xsl:template match="span[a]">
<xsl:apply-templates/>
</xsl:template>
Move the 3rd p
element to position 2, to become the new 1st p
.
The move destination cannot be a list position so work around:
xmlstarlet edit --omit-decl \
--var src '/div/p[3]' \
--var tgt '/div/*[2]' \
-i '$tgt' -t elem -n 'p_TMP' \
-u '$prev' -x '$src/node() | $src/@*' \
-d '$src' \
-r '$prev' -v 'p' \
"${infile:-file.xml}"
Both -u '$prev' -x '…'
and -m '…' '$prev'
work here.
See also: -i (--insert)
| -u (--update)
| $prev
| -d (--delete)
| -r (--rename)
| -m (--move)
Output:
<div>
<a>anchor</a>
<p id="vol"/>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2">value 2</a>
</span>
</p>
<p/>
</div>
This example – a many-to-many move operation – uses xpath
arguments with relative expressions and the EXSLT set:leading
function to group by h2
and move elements into div
s.
See also: Moving nodes | -i (--insert)
| $prev
| -u … -x …
| -d (--delete)
| EXSLT in xpath
arguments
<doc>
<h1/>
<h2/><p1/><p2/>
<h2/><p3/><p4/><p5/>
<h2/><p id="p6"><v/></p><p/>
</doc>
xmlstarlet edit -O \
-i 'doc/h2' -t elem -n div \
-u '$prev' -x 'set:leading(following-sibling::*, following-sibling::div[1])' \
-d 'doc/div/following-sibling::*[not(self::div)]' \
"${infile:-file.xml}"
Output:
<doc>
<h1/>
<div>
<h2/>
<p1/>
<p2/>
</div>
<div>
<h2/>
<p3/>
<p4/>
<p5/>
</div>
<div>
<h2/>
<p id="p6">
<v/>
</p>
<p/>
</div>
</doc>
See also: Use set:leading
and set:trailing
example
xmlstarlet edit
’s formatting options
Test edit
’s various formatting options on this input file:
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a
id="a2"
> value 2 </a
> </span>
</p>
<p id="empty"></p>
<p/>
</div>
See also: XML parsing and serialization | Duplicate an element, keep the formatting | -O (--omit-decl)
| -P (--pf)
| -S (--ps)
Default formatting:
xmlstarlet edit -O "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p/>
</div>
See also: select -I (--indent)
xmlstarlet edit -O --pf "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p/>
</div>
xmlstarlet edit -O --ps "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p/>
</div>
This combination appears to match that of xmllint --pretty 2 file.xml
. It could prove a temptation to a regex user (but beware). Swopping --pf
and --ps
makes no difference.
xmlstarlet edit -O --pf --ps "${infile:-file.xml}"
<div
>
<a
>anchor</a
>
<p
><span
><a
id="a1"
>value 1</a
></span
></p
>
<p
>
<span
> <a
id="a2"
> value 2 </a
> </span
>
</p
>
<p
id="empty"
/>
<p
/>
</div
>
Add a subnode:
xmlstarlet edit -O -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p>
<added/>
</p>
</div>
xmlstarlet edit -O --pf -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p><added/></p>
</div>
xmlstarlet edit -O --ps -s '*/p[4]' -t elem -n added "${infile:-file.xml}"
<div>
<a>anchor</a>
<p><span><a id="a1">value 1</a></span></p>
<p>
<span> <a id="a2"> value 2 </a> </span>
</p>
<p id="empty"/>
<p><added/></p>
</div>
Delete whitespace-only text nodes:
xmlstarlet edit -O --pf -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div><a>anchor</a><p><span><a id="a1">value 1</a></span></p><p><span><a id="a2"> value 2 </a></span></p><p id="empty"/><p/></div>
See also: select -B (--noblanks)
xmlstarlet edit -O --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div>
<a>anchor</a>
<p>
<span>
<a id="a1">value 1</a>
</span>
</p>
<p>
<span>
<a id="a2"> value 2 </a>
</span>
</p>
<p id="empty"/>
<p/>
</div>
xmlstarlet edit -O --pf --ps -d '//text()[normalize-space()=""]' "${infile:-file.xml}"
<div
><a
>anchor</a
><p
><span
><a
id="a1"
>value 1</a
></span
></p
><p
><span
><a
id="a2"
> value 2 </a
></span
></p
><p
id="empty"
/><p
/></div
>
xmlstarlet format
The format
(aka fo
) command is an XML code formatter which accepts one input file, default is stdin
.
See also: XML parsing and serialization | select -I (--indent)
| Try out edit
’s formatting options
format
[option …] [«xml-file»]
-h (--help)
- display help
-e (--encode) «encoding»
- output in the given encoding
-n (--noindent)
- do not indent
Sets indentation to zero spaces, left-aligning the output. Does not strip nonsignificant whitespace from input.
See also: select -B (--noblanks)
-o (--omit-decl)
- omit XML declaration
Caution: Setting this option causes xmlstarlet format
to return an exit value equal to the number of bytes written (or -1 in case of error) modulo 256 (src/xml_format.c#foProcess(), cf. <libxml/xmlIO.h>
).
See also: XML declaration
-s (--indent-spaces) «N»
- indent output with N spaces
Default indentation per level is 2 spaces.
-t (--indent-tab)
- indent output with tabulation
-C (--nocdata)
- replace CDATA section with text nodes
$ infile=$(mktemp)
$ printf '<v><![CDATA[A\t%s\nZ]]></v>' '"&'\''<>' > "${infile}"
$ :
$ xmlstarlet format -o -C "${infile}"
<v>A "&'<>
Z</v>
$ :
$ xmlstarlet format -o "${infile}"
<v><![CDATA[A "&'<>
Z]]></v>
$ :
$ xmlstarlet pyx "${infile}"
(v
[A\t"&'<>\nZ
)v
See also: pyx
-D (--dropdtd)
- remove the DOCTYPE of the input doc
Alternatives: xmllint --dropdtd a.xml | …
or xsltproc --novalid b.xsl a.xml
.
-H (--html)
- input is HTML
Reads input using the libxml2
HTML 4.0 parser, cf. API reference.
Attempt to convert HTML – or broken XML – to usable XHTML:
wget -qO- "${url}" |
xmlstarlet -q format --html --recover --dropdtd --omit-decl > output
See also: global -q (--quiet)
option
Links: HTML Tidy | W3C’s html-xml-utils | xmllint
-N (--nsclean)
- remove redundant namespace declarations
See also: Remove namespace declarations
-Q (--quiet)
[undocumented] - suppress error output
Does what the -q (--quiet)
global option does.
-R (--recover)
- try to recover what is parsable
See also: -H (--html)
--net
- allow network access
See also: network access
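The caution under -o (--omit-decl) above says the exit value is a byte count (or -1) modulo 256. A minimal sketch of that arithmetic in plain shell follows; the helper name status_for is hypothetical and nothing here calls xmlstarlet.

```shell
# status_for BYTES: the exit status predicted by the caution above when
# BYTES bytes are written (-1 on error). Hypothetical helper for
# illustration only; it does not run xmlstarlet.
status_for() {
  # Exit statuses keep only the low 8 bits, i.e. the value modulo 256.
  echo $(( $1 & 255 ))
}

status_for 1000   # 1000 = 3*256 + 232
status_for 512    # an exact multiple of 256 looks like success
status_for -1     # the error value wraps around
```

If the caution holds, a successful run can yield either a zero or a non-zero status, so checking the output itself (for example, that it is non-empty) is probably a safer test than `$?` when this option is set.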
xmlstarlet c14n
The c14n
(aka canonic
) command is used to convert an XML document to Canonical XML, a normal format intended to allow relatively simple comparison of pairs of XML documents for equivalence.
The W3C recommendations list examples of XML canonicalization. Examples of the c14n
command are given in the source code’s examples/c14n*.
Links: Canonical XML - Wikipedia | Canonical XML - W3C rec | Exclusive XML Canonicalization - W3C rec
Caution: xmlstarlet c14n
does not flag invalid options, cf. src/xml_C14N.c#c14nMain().
c14n [option] [«mode»] «xml-file» [«xpath-file»] [«inclusive-ns-list»]
-h (--help)
- display help
--net
- allow network access
See also: network access
«mode»
- canonicalization mode
«mode»
is one of the following:
--with-comments
- canonicalization with comments (this is the default mode)
--without-comments
- canonicalization without comments
--exc-with-comments
- exclusive canonicalization with comments
--exc-without-comments
- exclusive canonicalization without comments
«xml-file»
- input XML document file name (stdin is used if -
)
Basic use case:
xml-generator-command | xmlstarlet c14n |
{ xmlstarlet c14n expected.xml | diff -b -C 1 - /dev/fd/3; } 3<&0 || log …
«xpath-file»
- XML file with document subset expression
Cf. document subset in the W3C recommendation.
Sample xpath-file, from examples/xml/c14n.xpath:
<?xml version="1.0"?>
<XPath xmlns:n0="http://a.example.com" xmlns:n1="http://b.example">
(//. | //@* | //namespace::*)[ancestor-or-self::n1:elem1]
</XPath>
«inclusive-ns-list»
- the InclusiveNamespaces PrefixList as a comma-separated (Caution: the user’s guide says blank-separated) list of namespace prefixes; for exclusive canonicalization only.
xmlstarlet validate
The validate
(aka val
) command performs validation on XML documents. Examples of the val
command are given in the source code’s examples/valid1. NB: XML Schemas (XSD) are not fully supported due to incomplete support in libxml2
.
Wikipedia links: XML schemas in general | XSD (W3C) | RELAX NG | DTD
See also: XML parsing and serialization | External entities
validate
[option …] [«xml-file-or-uri» …]
-w (--well-formed)
- validate well-formedness only (default)
-d (--dtd) «dtd-file»
- validate against DTD
--net
- allow network access
See also: network access
-s (--xsd) «xsd-file»
- validate against XSD schema
-E (--embed)
- validate using embedded DTD
-r (--relaxng) «rng-file»
- validate against Relax-NG schema
-e (--err)
- print verbose error messages on stderr
-S (--stop)
- stop on first error
-b (--list-bad)
- list only files which do not validate
-g (--list-good)
- list only files which validate
-q (--quiet)
- do not list files (return result code only)
xmlstarlet pyx
, depyx
xmlstarlet
’s pyx
(aka xmln
) and depyx
(aka p2x
) commands convert XML to PYX, and PYX back to XML, during processing. PYX is a simple line-oriented text-based format usable with standard text tools such as grep
, sed
, or awk
. Given xmlstarlet
’s lack of native support for regular expressions this type of processing is occasionally useful, but beware of side effects: a pyx | depyx
pipeline does not guarantee an accurate roundtrip. pyx
uses a SAX parser.
PYX’s simplicity and lack of structure (and namespaces) make it a good choice for certain types of operations – e.g. queries or editing of non-complex data like config files or database record sets – and a poor choice for handling complex documents or operations.
The PYX format lives a quiet life these days; xml.com
still carries its article on Pyxie whereas IBM’s intro is now at archive.org.
The first character of each line of PYX indicates the type of parsing event:
char event
---- -----
( start-tag
) end-tag
A attribute or namespace
- character data
? processing instruction
C comment
[ CDATA section
D DTD declaration
N notation declaration
U unparsed entity
& external entity
Caution: pyx
strips an XML declaration if present.
Caution: &
(ampersand) is buggy, e.g. external entities, cf. src/xml_pyx.c.
Caution: depyx
outputs non-collapsed empty elements, e.g. <void></void>
.
Caution: depyx
outputs XML special characters inside comments as entity references, e.g. &
as &amp;
.
Caution: depyx
may output spurious newlines, for example after a comment, cf. src/xml_depyx.c.
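Because every PYX line starts with one of the event characters tabulated above, a line-oriented tool can classify events by looking at the first character only. A minimal sketch, assuming a small hand-written PYX sample (in real use the text would come from xmlstarlet pyx); the function name pyx_tally is hypothetical.

```shell
# Hand-written PYX sample; "\n" is PYX's literal encoding of a newline.
pyx_sample='(project
Axmlns http://maven.apache.org/POM/4.0.0
-\n
(modelVersion
-4.0.0
)modelVersion
)project'

# pyx_tally: count start-tag, end-tag, attribute, and character data
# events by the first character of each line.
pyx_tally() {
  awk '
    { count[substr($0, 1, 1)]++ }
    END {
      printf "start-tags=%d end-tags=%d attributes=%d text=%d\n",
        count["("], count[")"], count["A"], count["-"]
    }'
}

printf '%s\n' "$pyx_sample" | pyx_tally
```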
Links: packages.debian.org
xml2
pyx
[--help] [«xml-file»]
depyx
[--help] [«pyx-file»]
xmlstarlet pyx "${infile:-pom.xml}" | head -n 40
Output:
(project
Axmlns http://maven.apache.org/POM/4.0.0
Axmlns:xsi http://www.w3.org/2001/XMLSchema-instance
Axsi:schemaLocation http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd
-\n
(modelVersion
-4.0.0
)modelVersion
-\n\n
(groupId
-com.github.example8
)groupId
-\n
(artifactId
-maven-simple
)artifactId
-\n
(version
-0.2-SNAPSHOT
)version
-\n
(packaging
-jar
)packaging
-\n\n
(name
-Simple Maven example
)name
-\n
(url
-https://example8.io/#example8/maven-simple/0.1
)url
-\n\n
(dependencies
-\n
(dependency
-\n
(groupId
-junit
)groupId
Dvorak
in tags’ text
xmlstarlet pyx '/usr/share/X11/xkb/rules/base.xml' | grep '^-.*Dvorak'
xmlstarlet pyx index.xhtml | sed -n 's/^Ahref //p'
date
attributes to extended ISO 8601 format
Using GNU sed (for e
flag of s
command) and GNU date (for -d (--date)
option and %F
format):
xmlstarlet pyx "${infile:-file.xml}" |
sed -E '1v;/^(Adate )(.*)/ s//date -d "\2" "+\1%F"/e' |
xmlstarlet depyx
The do-nothing v
command fails on non-GNU sed
s. The empty regex causes the last applied regex to be reused.
datetime
elements from basic to extended ISO 8601 format
xmlstarlet pyx "${infile:-file.xml}" |
sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }' |
xmlstarlet depyx
Assuming non-nested datetime
elements. The /^-…/
condition leaves non-text nodes (incl. CDATA sections) unmodified.
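The effect of the range address can be tried without an XML file. The sketch below feeds the same sed script a hand-written PYX sample; the sample data and the function name to_extended are assumptions for illustration. Note that the regex expects 14 consecutive characters after the dash, i.e. a basic-format value without a T separator.

```shell
# Hand-written PYX sample: one datetime element and one unrelated element
# holding the same text.
pyx_sample='(rec
(datetime
-20220706075545
)datetime
(note
-20220706075545
)note
)rec'

# to_extended: same sed script as above, wrapped in a function.
# Only text lines between "(datetime" and ")datetime" are rewritten.
to_extended() {
  sed '/^(datetime$/,/^)datetime$/ { /^-\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/ s//-\1-\2-\3T\4:\5:\6/; }'
}

printf '%s\n' "$pyx_sample" | to_extended
```

The text inside note is left alone because it lies outside the address range.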
foo
elements
xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' '
$0=="(foo" {flag++; next;}
$0==")foo" {flag--; next;}
!flag
' |
xmlstarlet depyx
!flag
prints current line if flag
is zero.
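The counter logic can be checked on a hand-written PYX sample, including a nested foo element. This is a sketch only; the sample data and the function name drop_foo are hypothetical, and xmlstarlet is not invoked.

```shell
# Hand-written PYX sample with a nested foo element and an attribute.
pyx_sample='(doc
(foo
Aid 1
-drop me
(foo
)foo
)foo
(bar
-keep
)bar
)doc'

# drop_foo: same awk program as above. flag counts how many foo elements
# are currently open; lines are printed only while it is zero.
drop_foo() {
  awk -v FS='\n' '
    $0=="(foo" {flag++; next;}
    $0==")foo" {flag--; next;}
    !flag
  '
}

printf '%s\n' "$pyx_sample" | drop_foo
```

Using a counter rather than a boolean is what keeps the outer foo suppressed after the inner )foo line.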
parent/element
This awk
script reads a PYX
-format file and extracts each group
element having a parent glist
element (including any nested ditto) to a separate numbered file, converting each chunk from PYX
to XML by invoking xmlstarlet depyx
. Alternatively, output PYX
-format files and convert them in parallel.
Caution: Doesn’t understand namespaces, and beware of pyx/depyx
side effects.
xmlstarlet pyx "${infile:-file.xml}" |
awk -v FS='\n' -v partfmt='./part%04d.xml' -v element='group' -v parent='glist' '
/^\(/ { E[++level] = substr($0,2) }
$0 ~ "[()]" element "$" && E[level-1] == parent {
if ( !flag && "(" == substr($0,1,1) ) {
fxml = sprintf(partfmt,++partnum)
fpyx = fxml ".pyx.tmp"
flag=1
} else if ( flag ) {
print >> fpyx
close(fpyx)
system("xmlstarlet depyx " fpyx " > " fxml " && rm " fpyx)
flag=0
}
}
/^\)/ { --level }
flag { print >> fpyx }
'
See also: Create multiple result documents example
xmlstarlet escape
, unescape
The escape
(aka esc
) command converts
&<>
to the equivalent &amp;
&lt;
&gt;
entity references, taking its input from the first text string on the command line, or stdin
if it’s -
(dash) or absent.
The unescape
(aka unesc
) command does the inverse. Caution: unesc
leaves longer references such as &euro;
unmodified (cf. MAX_ENTITY_NAME = 1+4
in src/xml_escape.c), prints a diagnostic message, and returns zero.
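For comparison, the three-character mapping described above can be approximated with sed. This is a rough stand-in under the assumption that only &, < and > need converting; it is not xmlstarlet's implementation, and the function name esc_sketch is hypothetical.

```shell
# esc_sketch: approximate the &, <, > mapping with sed.
# "&" is handled first, otherwise the "&" of a freshly written "&lt;"
# would be escaped a second time.
esc_sketch() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' '<v a="1">x & y</v>' | esc_sketch
```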
See also: Special characters | --xinclude
(for parse="text"
)
escape
[--help] [«text»]
unescape
[--help] [«text»]
$ xmlstarlet escape 'a&<>'\''"z'
a&amp;&lt;&gt;'"z
$ :
$ xmlstarlet unescape 'a&amp;&lt;&gt;&apos;&quot;z'; printf '\n'
a&<>'"z
$ :
$ # Unicode U+20AC EURO SIGN
$ xmlstarlet esc '€'
€
$ :
$ xmlstarlet unesc 'a&euro;	100z
'
entity name too long: &euro;
a&euro; 100z
$ :
$ xmlstarlet esc < "${infile:-file.xml}"
&lt;foo&gt;if they treat children as they do &lt;bar&gt;documentation&lt;/bar&gt; they'll be &lt;bat&gt;prosecuted&lt;/bat&gt;&lt;/foo&gt;
$ :
$ xmlstarlet esc < "${infile:-file.xml}" | xmlstarlet unesc
<foo>if they treat children as they do <bar>documentation</bar> they'll be <bat>prosecuted</bat></foo>
xmlstarlet list
The list
(aka ls
) command prints the contents of a file system directory in XML format. It accepts a directory name as its only argument; default is current dir. No recursion option and no -h (--help)
option available.
list
[«directory-name»]
xmlstarlet list /etc/sgml
Output:
<dir>
<f p="rw-r--r--" a="20220704T194637Z" m="20211001T213455Z" s="376" n="docbook-xml.cat"/>
<d p="rwxr-xr-x" a="20220706T075545Z" m="20220419T100344Z" s="4096" n="docbook-xml"/>
<l p="rwxrwxrwx" a="20220706T075532Z" m="20220702T102624Z" s="31" n="catalog"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20201229T232017Z" s="652" n="sgml-data.cat"/>
<f p="rw-r--r--" a="20220704T194637Z" m="20190227T001849Z" s="45" n="xml-core.cat"/>
</dir>
Elements inside the dir
document element have a one-char name indicating the file type,
f regular file
d directory
c character device
b block device
l symlink
p FIFO
s socket
u unknown
and attributes as returned by stat
:
p read-write-execute permissions for user, group, and other
a UTC time of last access in ISO 8601 basic format
m UTC time of last modification in ISO 8601 basic format
s file size in bytes
n filename
See man 7 inode
for permissions s
(S_ISUID, S_ISGID) and t
(S_ISVTX).
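The a and m attributes use ISO 8601 basic format; if the extended format is preferred, a plain sed substitution over the listing is one option. A minimal sketch, assuming a single line copied from the sample output above; the function name ts_extended is hypothetical.

```shell
# One entry copied from the sample listing above.
line='<f p="rw-r--r--" a="20220704T194637Z" m="20211001T213455Z" s="376" n="docbook-xml.cat"/>'

# ts_extended: rewrite a="..." and m="..." timestamps from basic
# (YYYYMMDDThhmmssZ) to extended (YYYY-MM-DDThh:mm:ssZ) format.
ts_extended() {
  sed 's/ \([am]\)="\([0-9]\{4\}\)\([0-9][0-9]\)\([0-9][0-9]\)T\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)Z"/ \1="\2-\3-\4T\5:\6:\7Z"/g'
}

printf '%s\n' "$line" | ts_extended
```

The leading space in the pattern keeps the substitution from touching other attributes whose names merely end in a or m.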
xmlstarlet transform
The transform
(aka tr
) command is an XSLT processor supporting XSLT 1.0 plus several EXSLT, crypto
, and saxon
extensions.
xmlstarlet transform
returns the same system-property()
values as xmlstarlet select
and xsltproc
:
xsl:vendor libxslt
xsl:vendor-url http://xmlsoft.org/XSLT/
xsl:version 1.0
Caution: xmlstarlet transform
doesn’t flag invalid options (src/xml_trans.c#trParseOptions()).
Caution: The --catalogs
option mentioned in the user’s guide was never implemented, it seems; not listed by xmlstarlet transform --help
.
transform [option …] «xsl-file» [-p|-s «name»=«value» …] [«xml-file-or-uri» …]
-h (--help)
- display help
--omit-decl
- omit XML declaration
See also: XML declaration
-E (--embed)
- allow applying embedded stylesheet
Links: <?xml-stylesheet?>
- W3C recommendation | Embedding stylesheets - W3C XSLT 1.0 Rec
With an e.xml
XML document containing an <?xml-stylesheet type="text/xml" href="e.xsl"?>
processing instruction before the document element, the following command will run the XSLT stylesheet e.xsl
on e.xml
.
xmlstarlet tr -E e.xml > output
This option is mentioned in doc/xmlstarlet.txt but not in the user’s guide.
--show-ext
- show list of extensions
Prints a list of registered XSLT extensions to stderr
and terminates.
--val
- allow validate against DTDs or schemas
--net
- allow fetch DTDs or entities over network
See also: network access
--xinclude
- do XInclude processing on document input
Links: XML Inclusions XInclude - W3C recommendation
See also: the XSLT document()
function
Basic XInclude example: include file2.xml
in file1.xml
.
cat << 'HERE' > 'file1.xml'
<root>
<gs>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="file2.xml"
xpointer="xpointer(//g[@id='items']/*)"
parse="xml"
/>
</gs>
</root>
HERE
cat << 'HERE' > 'file2.xml'
<doc><g id="items"><g1/><g2/><g3/><g4/></g></doc>
HERE
xmlstarlet select -C -t -c / |
xmlstarlet transform --xinclude /dev/stdin 'file1.xml'
xmlstarlet transform --xinclude
does the XInclude processing using an XSLT stylesheet (generated on the fly by xmlstarlet select
) which duplicates its input by copying the root node (/
)
xi:include
elements may appear in both the including and the included file(s)
parse="xml"
is the default and may be omitted; the only other option is parse="text"
(sample output below)
href
attribute must refer to an XML document
if the href
attribute is absent (or empty) when parse="xml"
it refers to the including document, in which case the xpointer
attribute must be present
if no xpointer
attribute is given, or its value is set to xpointer(/)
, the entire inclusion target will be included
xml:base
attributes will appear in output (cf. XML Base - W3C rec) unless the inclusion source and target(s) use a shared include location (hint: use absolute pathnames)
xmlns:xi
namespace node will appear in output if given outside the xi:include
element (and may prove rather sticky)
Output:
<root>
<gs>
<g1/><g2/><g3/><g4/>
</gs>
</root>
If instead parse="text"
:
<root>
<gs>
<doc><g id="items"><g1/><g2/><g3/><g4/></g></doc>
</gs>
</root>
An alternative to XInclude or document()
:
The hxincl
utility from the W3C html-xml-utils
package is HTML/XML-aware and expands certain embedded comments – or prints a makefile
rule listing the dependent include files – e.g. hxincl -x -s incfnm=file2.xml file1.xml
.
--maxdepth value
- increase the maximum depth
Used to detect template loops, cf. variable xsltMaxDepth
in xslt.h
.
--html
- input document(s) are in HTML format
Reads input using the libxml2
HTML 4.0 parser, cf. XML parsing and serialization.
«xsl-file»
- main XSLT stylesheet for transformation
Cf. option -E (--embed)
.
-p
- parameter is an XPath expression
-s
- parameter is a string literal
-p
and -s
are repeatable, up to a maximum of 256 key-value pairs.
«name»=«value»
- name and value of the parameter passed to XSLT processor
E.g. … -p m1='"Hello, XSLT"' -s m2='0xab 0xbb' file.xml
«xml-file»
- input XML document file name (stdin
is used if missing)
This parameter is repeatable.
Links: Understanding XML namespaces - Evan Lenz | Namespaces in XML 1.0 / 1.1 - W3C rec | The “xml:” namespace - W3C memo | Namespaces at Pawson Q&A
xmlstarlet
predefines the namespaces xml
, xsl
, and those used with EXSLT functions and elements minus crypto
plus saxon
. By default (global option --doc-namespace
being in effect) select
and edit
can use the namespaces declared in the input’s root element (document element) without explicit -N «prefix»=«value»
options; if the default namespace is declared there it is bound to the _
(underscore) (aka DEFAULT
) prefix.
A QName (qualified name) with no prefix appearing in an XPath expression uses the null namespace, not the default namespace.
Prefixed namespace:
xmlstarlet select --text -t \
-m 'set:distinct(//mime:mime-type/@type)' -v '.' -n \
recently-used.xbel
Default namespace:
xmlstarlet edit --inplace --pf \
-u '/_:html/_:head/_:link/@href[.="www.css"]' -v 'solarized.css' \
-d '//_:*[contains(@class,"pull-quote")] | //_:aside' \
article.xhtml
Null namespace:
xmlstarlet select -T -t \
-m 'recs/rec' -v '@date' -n \
file.xml
See also: User’s guide ch. 5 | Undefined namespace prefix error | name bound to undefined prefix error
$ cat nspre.xml
<p:r xmlns:p="urn:ns1">r1
<r xmlns="urn:ns2">r2
<p:e>e1</p:e>
<e>e2</e>
</r>
</p:r>
$ :
$ xmlstarlet select -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r1
e1
$ :
$ xmlstarlet select -N p='urn:ns2' -t -m '//p:*' -v 'normalize-space(text())' -n nspre.xml
r2
e2
A namespace declaration cannot be created directly with XSLT 1.0[1]. It’s done by adding element and attribute nodes which have a (possibly null) namespace and a local name. Hint: In the following examples, add -C (--comp)
before select
’s -t
option to list the generated XSLT code.
[1] xmlstarlet edit
isn’t so picky: see edit -N
| Create a SOAP envelope
See also: select -R (--root)
echo '<v/>' |
xmlstarlet select -N m=urn:ssssssssskeyssssstickingagain:local -t \
-e m:doc -a flag -o 1 -b -a m:flag -o 0
<m:doc xmlns:m="urn:ssssssssskeyssssstickingagain:local" flag="1" m:flag="0"/>
echo '<v/>' |
xmlstarlet select -N ''='https://www.example.org' -t \
-e 'doc' -a 'flag' -v '"x"'
<doc xmlns="https://www.example.org" flag="x"/>
printf '<fi/>' |
xmlstarlet select -t \
-e fee -a faw -o fum -b -e fi -e fo -b -o fum
<fee faw="fum"><fi><fo/>fum</fi></fee>
Input file:
<h:rs id="hrs" xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h">
<f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
</h:rs>
Query:
xmlstarlet select -t \
-m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs xmlns="urn:e" id="hrs"/>
Use -N ''=''
for xmlns=""
:
xmlstarlet select -N = -t \
-m 'h:rs' -e '{local-name()}' -c '@*' -b -n \
"${infile:-file.xml}"
<rs id="hrs"/>
Edit:
xmlstarlet edit --omit-decl --pf \
-s 'h:rs' -t elem -n 'foo' -v 'bar' \
-s '$prev' -t attr -n 'xmlns' -v '' \
"${infile:-file.xml}"
<h:rs xmlns="urn:e" xmlns:f="urn:f" xmlns:g="urn:g" xmlns:h="urn:h" id="hrs">
<f:r id="fr"/><g:r id="gr"/><h:r id="hr"/>
<foo xmlns="">bar</foo></h:rs>
xmlstarlet edit -m '//namespace::xsi' '/_:doc/_:el' examples/xml/S0.xml
returns non-zero and the error message FIXME: can't move namespace nodes
.
Links: examples/xml/S0.xml
xmlstarlet edit -d '//namespace::xsi' examples/xml/S0.xml
returns non-zero and the error message FIXME: can't delete namespace nodes
.
Links: examples/xml/S0.xml
See also: null-ns hack
Tools to remove redundant namespace declarations include xmlstarlet format
’s --nsclean
option, xmlstarlet c14n
, the --nsclean
option of xmllint
– all with side effects – but they won’t remove xmlns:xi
nodes left by XInclude
processing.
xml2/2xml
or pyx/depyx
and grep
can do the doctoring (Caution: no questions asked):
xml2 < file.xml | grep -v '^/doc/@xmlns:xi' | 2xml > newfile.xml
Caution: xmlstarlet edit
silently ignores the namespace of an inserted node referencing a previously inserted node having a namespace prefix.
For instance, to insert an element such as <ns1:c class="caveat"/>
it’s logical to say,
xmlstarlet edit \
-s '/a/b' -t elem -n 'ns1:c' \
-s '/a/b/ns1:c' -t attr -n 'class' -v 'caveat' \
file.xml
but the output will not contain the attribute node as the following -s
(or -i
or -a
) option returns an empty nodeset. In other words ns1:c
gets inserted but is not available as such in following edit
actions. This is on the to-do list as hinted by NULL /* TODO: NS */
in src/xml_edit.c#edInsert().
Workaround: Use the $prev
back reference instead, as in … -s '$prev' -t attr …
.
See also: -s (--subnode)
Clark notation
Links: XPath recommendation: Namespace nodes | namespace
axis
<doc xmlns="http://www.example.org"
xmlns:xi="http://www.w3.org/2001/XInclude">
a
<xi:include href="b.xml"/>
b
<c xmlns="urn:my:local"/>
<d xmlns="">In no namespace</d>
</doc>
xmlstarlet select -T -t \
-m 'set:distinct(//namespace::*)' \
-v 'concat("{",.,"}",name())' -n \
"${infile:-file.xml}"
Output:
{http://www.w3.org/XML/1998/namespace}xml
{http://www.w3.org/2001/XInclude}xi
{http://www.example.org}
{urn:my:local}
{}
Links: SOAP
on Wikipedia
printf '%s' '<v/>' |
xmlstarlet select --xml-decl --indent \
-N xsi='http://www.w3.org/2001/XMLSchema-instance' \
-N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
-N my='http://www.example.org/myService' \
-t \
-e 'soapenv:Envelope' \
-e 'soapenv:Header' -o '' -b \
-e 'soapenv:Body' \
-e 'my:Service' \
-e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
-e 'Param2' -a 'xsi:type' -o 'string' -b -o 'message' -b
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header/>
<soapenv:Body>
<my:Service xmlns:my="http://www.example.org/myService">
<Param1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="integer">1</Param1>
<Param2 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="string">message</Param2>
</my:Service>
</soapenv:Body>
</soapenv:Envelope>
To force namespace nodes into the root element (namespace normalization) append dummy attributes there, then strip them:
printf '%s' '<v/>' |
xmlstarlet select \
-N xsi='http://www.w3.org/2001/XMLSchema-instance' \
-N soapenv='http://schemas.xmlsoap.org/soap/envelope/' \
-N my='http://www.example.org/myService' \
-t \
-e 'soapenv:Envelope' -a 'xsi:nslift' -b -a 'my:nslift' -b \
-e 'soapenv:Header' -o '' -b \
-e 'soapenv:Body' \
-e 'my:Service' \
-e 'Param1' -a 'xsi:type' -o 'integer' -b -o '1' -b \
-e 'Param2' -a 'xsi:type' -o 'string' -b -o 'message' -b \
| xmlstarlet edit -d 'soapenv:*/@xsi:nslift | soapenv:*/@my:nslift'
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:my="http://www.example.org/myService">
<soapenv:Header/>
<soapenv:Body>
<my:Service>
<Param1 xsi:type="integer">1</Param1>
<Param2 xsi:type="string">message</Param2>
</my:Service>
</soapenv:Body>
</soapenv:Envelope>
To have xmlstarlet edit
produce the latter version, for example:
printf '%s' '<v/>' |
xmlstarlet edit \
-r '*' -v 'soapenv:Envelope' \
-a '*' -t attr -n 'xmlns:soapenv' -v 'http://schemas.xmlsoap.org/soap/envelope/' \
-a '*' -t attr -n 'xmlns:xsi' -v 'http://www.w3.org/2001/XMLSchema-instance' \
-a '*' -t attr -n 'xmlns:my' -v 'http://www.example.org/myService' \
-s '*' -t elem -n 'soapenv:Header' -v '' \
-s '*' -t elem -n 'soapenv:Body' \
-s '$prev' -t elem -n 'my:Service' \
--var svc '$prev' \
-s '$svc' -t elem -n 'Param1' -v '1' \
-s '$prev' -t attr -n 'xsi:type' -v 'integer' \
-s '$svc' -t elem -n 'Param2' -v 'message' \
-s '$prev' -t attr -n 'xsi:type' -v 'string'
A non-exhaustive list of xmlstarlet
messages follows.
See xmlstarlet
user’s guide ch. 5 “Namespaces and default namespace”.
See also: Use a namespace
xmlstarlet edit
displays its usage reminder)… but offers no other clue
text
node without a -n (--name)
or -v (--value)
clause?
-v (--value)
clause?
See failed to load external entity (re stdin
)
Triggered by:
--net
option
uri
parameter using HTTPS protocol
stdin
add a -
(dash) to the command line to work around parsing issues, e.g. if format
’s -e (--encode) «encoding»
is the last option (explanation: «encoding»
mistaken for filename in src/xml_format.c#foProcess(), similarly in src/xml_validate.c#valParseOptions())
See Delete a namespace.
See Move a namespace.
See xpath
arguments.
The destination operand of xmlstarlet edit
’s -m (--move)
option does not exist or is not a single element node.
This is a message from the XML parser: a warning but not necessarily an error. Recall that :
(colon) in component names is tolerated but unrecommended as it makes a document not namespace-well-formed.
Example:
<doc><div vid="yo" abc:txt="hello"/></doc>
To copy the value of @abc:txt
to @vid
, for example:
xmlstarlet -q edit \
-u '*/*[@*[local-name()="abc:txt"][namespace-uri()=""]]/@vid' \
-x 'string(../@*[local-name()="abc:txt"][namespace-uri()=""])' \
file.xml
where
-q
option suppresses messages from the parser about the missing namespace definition
local-name()
and namespace-uri()
are used as a workaround in the special case where an unprefixed name contains a :
(colon), because @abc:txt
would cause the parser to look for the non-existing abc
namespace
See also: Use a namespace | Undefined namespace prefix
Triggered by
--doc-namespace
and prefix not declared in input’s root element or with -N
option
--no-doc-namespace
and prefix not declared with -N
option
See also: Use a namespace | Namespace prefix «name» … is not defined
edit
’s -N
option must be the last non-action option.
In libxml2
the XML_PARSE_HUGE
option is disabled by default to prevent denial-of-service attacks. This triggers the xmlSAX2Characters: huge text node: out of memory
error when loading a text node larger than 10 MB.
For a workaround see this patch.
Triggered by
select
’s --var «name»=«value»
namespace issue
crypto
namespace prefix; see EXSLT
See also: Use a namespace | select -N
.
xmlstarlet edit
’s xpath
arguments do not support XSLT functions.
See EXSLT.
One -b (--break)
too many was used.
EXSLT is an extension library for XSLT, mainly for XSLT 1.0. It provides missing language features such as functions to handle strings, math, dates, and sets, as well as nodeset coercion, user-defined functions, and dynamic evaluation of strings containing XPath expressions.
Linked with the libexslt library, xmlstarlet
’s XSLT processing commands, select
and transform
, support a larger number of EXSLT functions (and a few elements) whereas xmlstarlet edit
supports a subset.
xmlstarlet
predefines the EXSLT namespaces with prefixes date
, dyn
, exslt
(not exsl
), math
, set
, and str
, as well as the saxon
and test
namespaces. To use crypto
functions (or the func
elements) declare the namespace explicitly with -N
; for the exslt:document
element see example.
Note: In 2012 str:replace
was removed as broken when used in an XPath context (xmlstarlet edit
) but remains available when used in an XSLT context (select
and transform
).
See also: List of XSLT extensions | transform --show-ext
| select -N
Links: EXSLT docs on github.io | EXSLT project on github.com | EXSLT on stackoverflow.com
Observations:
libexslt
’s math:random
appears to use the standard C random generator, returning the same number sequence (beginning with 0.84018771715471
) in every xmlstarlet
session
libexslt
(not EXSLT) limits the length of str:padding
to 100000 (one hundred thousand)
crypto
- 5 functions in namespace http://exslt.org/crypto
(md4
, md5
, rc4_decrypt
, rc4_encrypt
, sha1
), cf. libexslt/crypto.c source code
saxon
- 5 functions in namespace http://icl.com/saxon
(eval
, evaluate
, expression
, line-number
, systemId
), from the saxon 6.5.5 extensions
test
- function and element in namespace http://xmlsoft.org/XSLT/
- echoes its argument
Compute the height of an XML tree as the maximum depth of a branch node of the tree. Root and leaf nodes count as zero.
-t (--template)
makes /
(root node) the current node
dyn:map
function maps each XML element (root node descendants) to its depth: the number of its ancestor elements
math:max
returns the maximum number
xmlstarlet select -T -t \
-v 'math:max(dyn:map(descendant::*,"count(ancestor::*)"))' -n \
"${infile:-file.xml}"
Links: Tree (data structure) on Wikipedia
A simple recordset enclosed in a root element,
<rs>
<r id="1" user="3" name="abc" date="2017-08" flag1="false"/>
<r id="2" user="7" name="defg" date="2019-12" flag1="false"/>
<r id="3" user="9" name="hijkl" date="2020-02" flag1="true"/>
<r id="4" user="11" name="mno" date="2022-01" flag1="false"/>
<r id="5" user="14" name="pqrs" date="2022-01" flag1="false"/>
</rs>
is converted to TSV by:
xmlstarlet select --text -t \
--var ishdr="${hdr:-1}" \
--var ofs -o "$(printf '\t')" -b \
--var ors -n -b \
--var fnhdr='"concat($ofs,name())"' \
--var fnrow='"concat($ofs,string())"' \
-m '*/*[$ishdr and position() = 1]' \
-v 'substring-after(str:concat(dyn:map(@*,$fnhdr)),$ofs)' -v '$ors' \
-b \
-m '*/*' \
-v 'substring-after(str:concat(dyn:map(@*,$fnrow)),$ofs)' -v '$ors' \
-b \
"${infile:-file.xml}"
where:
ofs
and ors
variables hold output field and record separators, respectively
ishdr
flag controls output of a header line with attribute names
fnhdr
is the function argument (as text) to the EXSLT dyn:map
function mapping an attribute node to its name()
preceded by a field separator; dyn:map
returns a nodeset which is stringified by the EXSLT str:concat
function, then stripped of the initial extra separator by substring-after()
fnrow
is the counterpart that maps an attribute node to its text value; in this case .
(dot) can replace string()
*/*
and @*
only (both in 2 places)
TSV output:
id user name date flag1
1 3 abc 2017-08 false
2 7 defg 2019-12 false
3 9 hijkl 2020-02 true
4 11 mno 2022-01 false
5 14 pqrs 2022-01 false
If the data items exist as child elements of /rs/r
(e.g. after: xmlstarlet sel -t -e '{name(*)}' -m '*/*' -e '{name()}' -m '@*' -e '{name()}' -v . "${infile}"
), instead use dyn:map(*,…)
(2 places) to process child::*
rather than attribute::*
.
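Spelled out, the child-element variant might look like the sketch below. It is the TSV command above with dyn:map(*,…) in both places and the ishdr flag dropped for brevity; the function name xs_children2tsv is invented here.

```shell
# Sketch: TSV from child elements of each record, i.e. dyn:map(*,...) rather than dyn:map(@*,...).
# The header line is taken from the element names of the first record.
xs_children2tsv() {
  xmlstarlet select --text -t \
    --var ofs -o "$(printf '\t')" -b \
    --var ors -n -b \
    --var fnhdr='"concat($ofs,name())"' \
    --var fnrow='"concat($ofs,string())"' \
    -m '*/*[position() = 1]' \
      -v 'substring-after(str:concat(dyn:map(*,$fnhdr)),$ofs)' -v '$ors' \
    -b \
    -m '*/*' \
      -v 'substring-after(str:concat(dyn:map(*,$fnrow)),$ofs)' -v '$ors' \
    -b \
    "${1:-file.xml}"
}
```

As with the attribute version, every record is assumed to carry the same fields in the same order.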
Links: packages.debian.org
xml2
h2
sections with foo
titles select div
with the most p
children
Process an HTML document where each h2
element heads a number of div
s:
xmlstarlet select -t \
--var T='//_:div[_:p][contains(preceding::_:h2[1]/text(),"foo")]' \
-c '($T[count(_:p) = math:max(dyn:map($T,"count(_:p)"))])[1]' \
file.xhtml
T
collects the div
nodes with at least one p
child following each h2
title containing the text foo
dyn:map
maps each div
to the count of its p
children, math:max
picks the maximum count
($T[…])[1]
selects the first of possibly more div
s with a maximum p
count
Converting between local time (L
) and UTC time (Z
) in different time zones, the TZ
environment variable selecting an entry in /usr/share/zoneinfo
, putting EXSLT functions date:date-time
, date:add
, date:duration
, and date:seconds
to use.
Links: EXSLT dates-and-times
docs | tz database on Wikipedia
for zone in \
'America/Vancouver' 'Europe/Vatican' 'Asia/Manila' 'Pacific/Chatham'
do
printf '<v/>\n' |
TZ=":$zone" xmlstarlet select --text \
-t \
--var ofs -o "$(printf '\t')" -b \
--var ors -n -b \
--var tz -o "$zone" -b \
--var dttodayL -v 'date:date-time()' -b \
--var tzoffset='substring($dttodayL,20)' \
--var tzseconds='number(concat(
translate(substring($tzoffset,1,1),"-−+","--"),
substring($tzoffset,2,2) * 60 * 60 +
substring($tzoffset,5,2) * 60))' \
--var dtepochL='concat("1970-01-01T00:00:00",$tzoffset)' \
--var dtepochZ -o '1970-01-01T00:00:00+00:00' -b \
--var dttodayZ='concat(substring-before(date:add($dtepochZ,
date:duration(date:seconds($dttodayL))),
"Z"),"+00:00")' \
--var dttoday2L='date:add($dtepochL,
date:duration(date:seconds($dttodayZ)+$tzseconds))' \
-v 'concat(
"# ",$tz,$ors
,"dttodayL", $ofs, $dttodayL, $ors
,"tzoffset", $ofs, $tzoffset, $ors
,"tzseconds",$ofs, $tzseconds,$ors
,"dttodayZ", $ofs, $dttodayZ, $ors
,"dttoday2L",$ofs, $dttoday2L,$ors
)'
done |
expand -t 16
Using … -v 'date:date-time()' -b
(rather than …='date:date-time()'
) to avoid xmlstarlet select
’s EXSLT namespace issue.
Output:
# America/Vancouver
dttodayL 2023-08-15T13:04:07-07:00
tzoffset -07:00
tzseconds -25200
dttodayZ 2023-08-15T20:04:07+00:00
dttoday2L 2023-08-15T13:04:07-07:00
# Europe/Vatican
dttodayL 2023-08-15T22:04:07+02:00
tzoffset +02:00
tzseconds 7200
dttodayZ 2023-08-15T20:04:07+00:00
dttoday2L 2023-08-15T22:04:07+02:00
# Asia/Manila
dttodayL 2023-08-16T04:04:07+08:00
tzoffset +08:00
tzseconds 28800
dttodayZ 2023-08-15T20:04:07+00:00
dttoday2L 2023-08-16T04:04:07+08:00
# Pacific/Chatham
dttodayL 2023-08-16T08:49:07+12:45
tzoffset +12:45
tzseconds 45900
dttodayZ 2023-08-15T20:04:07+00:00
dttoday2L 2023-08-16T08:49:07+12:45
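The tzseconds arithmetic in the script above (sign, then hours * 60 * 60 + minutes * 60) can be cross-checked outside XPath. This is a plain awk sketch, not part of the original script; the function name tzoffset_seconds is made up, and unlike the translate() call it only recognises the ASCII minus sign.

```shell
# Convert an ISO 8601 zone offset such as -07:00 or +12:45 to seconds east of UTC.
tzoffset_seconds() {
  printf '%s\n' "$1" |
  awk -F: '{
    sign  = (substr($1, 1, 1) == "-") ? -1 : 1   # a leading "-" means west of UTC
    hours = substr($1, 2) + 0                     # the digits after the sign
    secs  = hours * 3600 + $2 * 60
    print (secs ? sign * secs : 0)                # avoid printing a negative zero
  }'
}
```

tzoffset_seconds -07:00 prints -25200 and tzoffset_seconds +12:45 prints 45900, matching the tzseconds lines in the output above.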
set:leading
and set:trailing
Links: EXSLT set
functions on github.io
Using an explicit namespace declaration -N str='…'
to avoid xmlstarlet select
’s EXSLT namespace issue.
printf '%s\n' '<v s="/fee/fi/fo/fum"/>' |
xmlstarlet select -R \
-N str='http://exslt.org/strings' \
-t \
--var sep='"/"' \
--var T='str:split(*/@s,$sep)' \
-n -c '$T' -n \
-n -c 'set:leading($T,$T[.="fo"])' -n \
-n -m '$T' -c 'set:leading($T,following-sibling::*[1])' -n -b \
-n -m '$T' -c 'set:trailing($T,preceding-sibling::*[1])' -n -b \
-n -e 'foo' -m 'set:trailing($T,$T[.="fee"])' -v 'concat($sep,.)' -b -b -n
Output:
<xsl-select>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fee</token><token>fi</token>
<token>fee</token>
<token>fee</token><token>fi</token>
<token>fee</token><token>fi</token><token>fo</token>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fee</token><token>fi</token><token>fo</token><token>fum</token>
<token>fi</token><token>fo</token><token>fum</token>
<token>fo</token><token>fum</token>
<token>fum</token>
<foo>/fi/fo/fum</foo>
</xsl-select>
See also: Divide a document into sections | Generate a date sequence
Links: ISO 8601 standard on Wikipedia | Daylight saving time (DST) on Wikipedia | TZ
env.var. on OpenGroup
This is where the EXSLT strings
, sets
, and dates-and-times
modules come together to compute a datetime series from 3 arguments:
start
, default value is today in ISO 8601 extended format
step
, default value is 1 day in ISO 8601 format
maxct
, the maximum count, default value is 100
XSLT doesn’t do loops but EXSLT lets you create a string of any length and str:split
it into a nodeset each member of which contains the step
interval. An initial empty time period (PT0S
) is added to handle the first item. Using set:leading
to collect N step
s, date:sum
to sum them up, then adding the sum to start
, to arrive at a result for each item in the series. It’s an inefficient algorithm so nil points for performance (though probably fast enough for ordinary maxct
values).
xsdateseq0() {
printf '<v start="%s" step="%s" maxct="%s"/>\n' \
"${1:-$(date '+%Y-%m-%d')}" "${2:-P1D}" "${3:-100}" |
xmlstarlet select --text \
-N str='http://exslt.org/strings' \
-t --var start='*/@start' \
--var padlen='(*/@maxct - 1) * (1 + string-length(*/@step))' \
--var D='str:split(concat("PT0S",str:padding($padlen, concat(" ",*/@step))))' \
-m '$D' -v 'date:add($start, date:sum(set:leading($D,following-sibling::*[1])))' -n
}
Using an explicit namespace declaration -N str='…'
to avoid xmlstarlet select
’s EXSLT namespace issue. libexslt
doesn’t support date:format-date
but there’s an implementation (EXSLT function and XSLT template) by Jeni Tennison.
Print 53 dates starting on January 1st with a step value of 7 days.
TZ=':Europe/Vatican' xsdateseq0 '2023-01-01' P7D 53 | pr -t -8 -s' ' -
2023-01-01 2023-02-19 2023-04-09 2023-05-28 2023-07-16 2023-09-03 2023-10-15 2023-11-26
2023-01-08 2023-02-26 2023-04-16 2023-06-04 2023-07-23 2023-09-10 2023-10-22 2023-12-03
2023-01-15 2023-03-05 2023-04-23 2023-06-11 2023-07-30 2023-09-17 2023-10-29 2023-12-10
2023-01-22 2023-03-12 2023-04-30 2023-06-18 2023-08-06 2023-09-24 2023-11-05 2023-12-17
2023-01-29 2023-03-19 2023-05-07 2023-06-25 2023-08-13 2023-10-01 2023-11-12 2023-12-24
2023-02-05 2023-03-26 2023-05-14 2023-07-02 2023-08-20 2023-10-08 2023-11-19 2023-12-31
2023-02-12 2023-04-02 2023-05-21 2023-07-09 2023-08-27
Print 10 datetimes with a step value of 1 day, 1 hour, 1 minute, and 5 seconds.
Note the lack of DST adjustment.
TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T07:30:00+01:00' 'P1DT1H1M5S' 10
2022-10-26T07:30:00+01:00
2022-10-27T08:31:05+01:00
2022-10-28T09:32:10+01:00
2022-10-29T10:33:15+01:00
2022-10-30T11:34:20+01:00
2022-10-31T12:35:25+01:00
2022-11-01T13:36:30+01:00
2022-11-02T14:37:35+01:00
2022-11-03T15:38:40+01:00
2022-11-04T16:39:45+01:00
Print 8 datetimes in email format (RFC 822). Uses GNU date
for -R
and -f
options and DST adjustment.
TZ=':Europe/Vatican' xsdateseq0 '2022-10-26T11:30:00' '' 8 | date -Rf-
Wed, 26 Oct 2022 13:30:00 +0200
Thu, 27 Oct 2022 13:30:00 +0200
Fri, 28 Oct 2022 13:30:00 +0200
Sat, 29 Oct 2022 13:30:00 +0200
Sun, 30 Oct 2022 12:30:00 +0100
Mon, 31 Oct 2022 12:30:00 +0100
Tue, 01 Nov 2022 12:30:00 +0100
Wed, 02 Nov 2022 12:30:00 +0100
xmlstarlet select
doesn’t support xsl:key
but grouping can be done using EXSLT functions. As an example, group repeating fields in each record by element name and merge their texts in document order.
<recs>
<rec>
<fb>fee</fb>
<fa>foo</fa>
<fd>zzz</fd>
<fc>bat</fc>
<fa>bar</fa>
<fb>faw</fb>
<fd>bat</fd>
<fb>fum</fb>
<fa>quux</fa>
</rec>
<rec>
<fa>fee</fa>
<fc>fo</fc>
<fc>fum</fc>
<fa>fi</fa>
</rec>
</recs>
xmlstarlet select --indent -t \
--var sfs="'${sfs:- }'" \
-e '{name(*)}' \
-m '*/*' \
--var rec='.' \
-e '{name()}' \
-m 'set:distinct(dyn:map(*,"name()"))' \
-s 'A:T:-' '.' \
-e '{.}' \
-v 'substring-after(
str:concat(
dyn:map($rec/*[name()=current()],"concat($sfs,text())")
)
,$sfs)' \
"${infile:-file.xml}"
sfs
, takes its value from a shell variable of the same name, defaulting to a single space character
-e (--elem)
opens a named element using an attribute value template, duplicating the input structure
-m (--match)
iterates over rec
elements, the 2nd over unique field names (also used as sort key): dyn:map
maps fields to their name and set:distinct
eliminates duplicates
rec
elements with the same name as the current field dyn:map
adds a sub-field separator to each text, returning a nodeset which is stringified by str:concat
then stripped of the initial extra separator
Output:
<recs>
<rec>
<fa>foo bar quux</fa>
<fb>fee faw fum</fb>
<fc>bat</fc>
<fd>zzz bat</fd>
</rec>
<rec>
<fa>fee fi</fa>
<fc>fo fum</fc>
</rec>
</recs>
See also: Remove all but the latest member of each group example
This section takes xmlstarlet
off the beaten track.
select
as edit
script generator
xmlstarlet select
doesn’t copy its input to output; edit
cannot do xsl:for-each
, xsl:choose
, or use XSLT functions. In tandem they have a wider range – but so does an XSLT stylesheet.
Links: shell quoting | shell word expansions
xmlstarlet edit
’s rename action requires a literal value for the new name so XPath functions are out. But select
can generate the edit command, for example to number elements (here using the XSLT format-number()
function):
<Names>
<Name>fee</Name>
<Name>faw</Name>
<Name>fum</Name>
</Names>
# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
--var sq -o "'" -b \
-o "xmlstarlet edit --omit-decl \\" -n \
-o " --var N 'Names/Name' \\" -n \
-m '*/*' \
-o ' -r ' -v 'concat($sq,"$N[",position(),"]",$sq)' \
-o ' -v ' -v 'concat($sq,name(),format-number(position(),"0000"),$sq)' -o " \\" -n \
-b \
-f -n \
"${infile:-file.xml}"
Output:
xmlstarlet edit --omit-decl \
--var N 'Names/Name' \
-r '$N[1]' -v 'Name0001' \
-r '$N[2]' -v 'Name0002' \
-r '$N[3]' -v 'Name0003' \
file.xml
To execute the output as a shell script:
xmlstarlet-select-command | sh -s > result.xml
Alternatively, replace $N
with (Names/Name)
, or process elements in reverse order by repeatedly renaming Names/Name[last()]
– the predicate […]
binding to the nearest XPath location step.
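The reverse-order alternative can be written by hand for the three-element sample. A sketch under that assumption, with the invented function name xs_rename_reverse: each rename takes the last remaining Name out of the match, so the same XPath selects the previous element on the next action.

```shell
# Sketch: number Name elements back to front for a file with exactly three of them.
# After each rename, Names/Name[last()] moves one element up.
xs_rename_reverse() {
  xmlstarlet edit --omit-decl \
    -r 'Names/Name[last()]' -v 'Name0003' \
    -r 'Names/Name[last()]' -v 'Name0002' \
    -r 'Names/Name[last()]' -v 'Name0001' \
    "${1:-file.xml}"
}
```

No variable is needed here, at the cost of listing the new names in descending order.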
EXSLT functions provide another way to do grouping. Here’s how to create a shell script invoking xmlstarlet edit
to delete all but the latest member of each group. The input file has module ID strings of the form: group ID, _
(underscore), major version number, .
(dot), minor version number – as shown in this snippet:
<mod>mrR_0.9</mod>
<mod>mrR_0.10</mod>
<mod>mrM_0.19</mod>
<mod>mrM_0.2</mod>
<mod>mrM_0.20</mod>
<mod>mrM_0.3</mod>
Method:
dyn:map
a module ID string to its group ID (invoking $fngrpid
)
set:distinct
eliminates duplicates
str:split
the version into major and minor number
dyn:map
each number to an N-digit string (invoking $fnverno
)
str:concat
stringifies the nodeset returned by dyn:map
creating strings of the form 00010011
(i.e. version 1.11
) to be passed to -s (--sort)
/..
(root node has no parent)
edit
actions
-n
before -v
set:difference
($M
minus $keep
)
$
s (dollar signs) to guard against shell word expansions
# shellcheck shell=sh disable=SC2016,SC2064
xmlstarlet select --text -t \
--var dq -o '"' -b \
--var sep1='"_"' \
--var sep2='"."' \
--var fngrpid -o 'substring-before(.,$sep1)' -b \
--var fnverno -o 'format-number(.,"0000")' -b \
--var allm='//_:mods/_:mod' \
-o "xmlstarlet edit \\" -n \
-o " --var M '//_:mods/_:mod' \\" -n \
-o " --var keep '/.. " \
-m 'set:distinct(dyn:map($allm,$fngrpid))' \
--var grpid_='concat(.,$sep1)' \
-m '$allm[starts-with(.,$grpid_)]' \
-s 'D:N:-' '0 + str:concat(dyn:map(str:split(substring-after(.,$sep1),$sep2),$fnverno))' \
--if 'position() = 1' \
-n -v 'concat(" | $M[.=",$dq,current(),$dq,"]")' \
-b \
-b \
-b \
-o "' \\" -n \
-o " --delete 'set:difference(\$M,\$keep)' \\" -n \
-f -n \
"${infile:-file.xml}"
See also: select
’s --var
| -m (--match)
| -s (--sort)
| -i (--if)
| -b (--break)
| -f (--inp-name)
| Group by element name and merge text example
Links: XSLT functions format-number()
| current()
Output:
xmlstarlet edit \
--var M '//_:mods/_:mod' \
--var keep '/..
| $M[.="mrR_1.11"]
| $M[.="mrS_0.7"]
| $M[.="mrE_2.2"]
| $M[.="mrM_0.20"]' \
--delete 'set:difference($M,$keep)' \
file.xml
To execute the output as a shell script:
xmlstarlet-select-command | sh -s > result.xml
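The zero-padded sort key used by the generator above (version 1.11 becomes 00010011) is easy to reproduce in plain shell, which helps when checking the keep list by hand. This sketch is not the author's method: the names verkey and latest_in_group are made up, and group IDs are assumed to contain no underscore.

```shell
# Build a fixed-width sort key from a "major.minor" version string,
# e.g. 1.11 -> 00010011, mirroring format-number(.,"0000") applied to each part.
verkey() {
  printf '%s\n' "$1" | awk -F. '{ printf "%04d%04d\n", $1, $2 }'
}

# Read module IDs (group_major.minor) on stdin, print the latest member of one group.
latest_in_group() {  # usage: latest_in_group GROUP < list-of-ids
  grp=$1
  while IFS= read -r id; do
    case $id in
      "${grp}_"*) printf '%s %s\n' "$(verkey "${id#*_}")" "$id" ;;
    esac
  done | sort -r | head -n 1 | cut -d' ' -f2
}
```

Fed the mrM lines from the snippet above, latest_in_group mrM prints mrM_0.20, whereas a plain lexical sort would put mrM_0.3 last.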
See also: edit
’s --var
| -d (--delete)
This is the basic “update node with result of shell command” use case.
Create a shell script to have xmlstarlet edit
add missing targets to an XLIFF version 2.0 localization data file by invoking translate-shell
to supply translated phrases:
# shellcheck shell=sh disable=SC2016
xmlstarlet select --text -t \
--var sq -o "'" -b \
--var dq -o '"' -b \
--var cmdopt='concat("trans -from ",/_:xliff/@srcLang," -to ",/_:xliff/@trgLang)' \
-o 'xmlstarlet edit --pf '\\ -n \
-m '//_:segment[not(_:target)]' \
--var xpath-a='concat("//_:unit[@id=",$dq,parent::_:unit/@id,$dq,"]/_:segment/_:source")' \
--var src-e -v 'str:replace(_:source,$sq,concat($sq,"\",$sq,$sq))' -b \
-o ' -a ' -v 'concat($sq,$xpath-a,$sq)' -o ' -t elem -n target '\\ -n \
-o " -u '\$xstar:prev'" -o ' -v "$(' -v 'concat($cmdopt," ",$sq,$src-e,$sq)' -o ')" '\\ -n \
-o " -i '\$xstar:prev' -t text -n indent -v '' \\" -n \
-o " -u '\$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=\"\"]' \\" -n \
-o ' '\\ -n \
-b \
-f -n \
"${infile:-file.xml}"
--text
option to generate a shell script
--var
to define quote characters and longer substrings as XPath variables
-m
processes segment
elements not having a target
src-e
variable escapes single quotes in the source
text, str:replace
converting '
to '\''
,--var … -b
to avoid select
’s --var «name»=«value»
namespace issue-o
and -v
generate text quoting correctly for both the shell and XPath, escaping $
(dollar sign) inside double quotes to guard against shell word expansions
Snippets from sample data file:
<source>Über "O'ona"$tra"</source>
<source>&amp;Speichern als &lt;.oona&gt;</source>
Sample output:
xmlstarlet edit --pf \
-a '//_:unit[@id="2"]/_:segment/_:source' -t elem -n target \
-u '$xstar:prev' -v "$(trans -from de -to en 'Über "O'\''ona"$tra"')" \
-i '$xstar:prev' -t text -n indent -v '' \
-u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
\
-a '//_:unit[@id="24"]/_:segment/_:source' -t elem -n target \
-u '$xstar:prev' -v "$(trans -from de -to en '&Speichern als <.oona>')" \
-i '$xstar:prev' -t text -n indent -v '' \
-u '$xstar:prev' -x 'preceding-sibling::node()[2][normalize-space()=""]' \
\
file.xml
--pf
preserves original formatting
-a
appends an empty target
element (after source
)
trans
is invoked through shell command substitution
-u … -v "$(…)"
adds the translated text to the new target
element – this is faster than -a … -v "$(…)"
which requires &
(ampersand) encoded as &amp;
(i.e. avoids adding xmlstarlet escape
to the pipeline)
-i …
and -u … -x …
indent target
to the same column as source
$xstar:prev
reference the newly appended target
element, the 3rd the indentation text
The output can be executed directly as a shell script:
xmlstarlet-select-command | sh -s > result.xlf
Snippets from result.xlf
:
<target>About "O'ona"$tra"</target>
<target>&amp;Save as &lt;.oona&gt;</target>
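The single-quote escaping that str:replace performs in the generator above (each ' becomes '\'') is the usual way to make arbitrary text safe inside a single-quoted shell word. A plain sed sketch of the same rule, with the invented function name shquote:

```shell
# Wrap a string in single quotes for sh, turning each ' into '\''.
# In the double-quoted sed script the shell reduces \\\\ to \\, which sed reads as one backslash.
shquote() {
  printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}
```

shquote "O'ona" prints 'O'\''ona', the same form as the quoted argument to trans in the sample output.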
select
as XSLT stylesheet generator
This section is included for completeness.
Links: shell quoting | shell word expansions
To use XSLT or extension elements not supported by xmlstarlet select
’s options it’s possible to have select
spell out an XSLT stylesheet. This example inserts one document into another, the first xsl:template
is the identity transform.
: "${xml1=z1.xml}" "${xml2=z2.xml}"
test -s "$xml1" || printf '%s\n' '<v><THERE/></v>' > "$xml1"
test -s "$xml2" || printf '%s\n' '<w><x q="what">ever</x></w>' > "$xml2"
echo '<v/>' |
xmlstarlet select -t \
-e xsl:transform -a version -o 1.0 -b \
-e xsl:param -a name -o xdoc -b -o /dev/null -b \
-e xsl:template -a match -o '@*|node()' -b \
-e xsl:copy \
-e xsl:apply-templates -a select -o '@*|node()' -b -b \
-b \
-b \
-e xsl:template -a match -o THERE -b \
-e xsl:copy-of -a select -o 'document($xdoc,/)' -b -b \
-b \
-b |
xmlstarlet transform --omit-decl /dev/stdin -s xdoc="$xml2" "$xml1"
Output: <v><w><x q="what">ever</x></w></v>
Notes:
document($xdoc,/)
keeps the XSLT processor from resolving $xdoc
relative to the stylesheet’s location and attempting to open the probably nonexistent /dev/z2.xml
-b (--break)
s may be omitted as they’re not followed by any template options
$var
– unlike ${var}
– can be an XSLT or a shell variable
See also: -t (--template)
| -e (--elem)
| -a (--attr)
| -o (--output)
| -b (--break)
| transform
Links: XSLT document()
Take this one step further and create a library of shorthand shell functions (causing shellcheck.net a.o. to vociferate):
xslxfm() { ## xsl:transform(); non-closed
printf " -e xsl:transform -a version -v '1.0' -b "
}
xsltpl() { ## xsl:template(match name?); non-closed
printf " -e xsl:template -a match -v '%s' -b%s" \
"${1:?usage: template(match name?)}" "${2:+ -a name -v '$2' -b }"
}
xslapt() { ## xsl:apply-templates(select?); closed
case $# in
(0) printf ' -e xsl:apply-templates -b ' ;;
(1) printf " -e xsl:apply-templates -a select -v '%s' -b -b " "$1" ;;
(*) printf ' usage: xslapt(select?)\n' 1>&2; false ;;
esac
}
xslIDN() { ## xsl:template name=identity; closed
xsltpl '@*|node()' 'identity'
printf " -e xsl:copy "
xslapt '@*|node()'
printf " -b -b "
}
xslhelp() { ## list xsl* functions in this file
sed -n -e '/^\(xsl[^ ]*\)()[ {]*## \(.*\)/ s//\1 \2/p' "${_pnself_:-$0}" |
expand -t 12
}
Produce the same output as above having select
, not transform
, include the external document,
. "${pathto:-./}xsdefs.sh"
echo '<v/>' |
xmlstarlet sel -I -t $(xslxfm) $(xslIDN) $(xsltpl THERE) -c "document('${xml2}',/)" -b |
xmlstarlet tr --omit-decl /dev/stdin "$xml1"
providing the following stylesheet to the XSLT processor,
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="THERE">
<w>
<x q="what">ever</x>
</w>
</xsl:template>
</xsl:transform>
as tee /dev/stderr |
inserted in the pipeline will show.
Links: The XML version of the XSLT 1.0 Rec contains element syntax and function prototypes
Use the EXSLT exslt:document
element as a template to split an XML document into multiple parts and output them as separate files in an existing directory.
Caution: Leaving out the exslt:nop
attribute here triggers an xsl:extension-element-prefix : undefined namespace exslt
error, with or without -N exslt='http://exslt.org/common'
(exslt
is predefined).
printf '%s\n' '<v><x>fee fi</x><y>fo fum</y></v>' |
xmlstarlet select -I -t \
--var part-prefix -o "${outDir:-/tmp/}part" -b \
-e 'xsl:transform' \
-a 'version' -o '1.0' -b \
-a 'exslt:nop' -o '' -b \
-a 'extension-element-prefixes' -o 'exslt' -b \
-e 'xsl:template' \
-a 'match' -o '/' -b \
-m '*/*' \
-e 'exslt:document' \
-a 'href' -v 'concat($part-prefix,format-number(position(),"000"),".xml")' -b \
-a 'method' -o 'xml' -b \
-a 'omit-xml-declaration' -o 'yes' -b \
-e 'part' \
-a 'no' -v 'position()' -b \
-a 'of' -v 'last()' -b \
-c '.' |
{ printf '%s\n' '<v/>' | xmlstarlet transform /dev/fd/3 /dev/stdin ; } 3<&0
Generated XSLT script:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" exslt:nop="" extension-element-prefixes="exslt">
<xsl:template match="/">
<exslt:document href="/tmp/part001.xml" method="xml" omit-xml-declaration="yes">
<part no="1" of="2">
<x>fee fi</x>
</part>
</exslt:document>
<exslt:document href="/tmp/part002.xml" method="xml" omit-xml-declaration="yes">
<part no="2" of="2">
<y>fo fum</y>
</part>
</exslt:document>
</xsl:template>
</xsl:transform>
Generated documents:
<part no="1" of="2"><x>fee fi</x></part>
<part no="2" of="2"><y>fo fum</y></part>
See also: Split XML file example
xmlstarlet
XSLT extensions
List of registered extension functions, elements, and modules:
Registered XSLT Extensions
--------------------------
Registered Extension Functions:
{http://exslt.org/common}node-set
{http://exslt.org/common}object-type
{http://exslt.org/crypto}md4
{http://exslt.org/crypto}md5
{http://exslt.org/crypto}rc4_decrypt
{http://exslt.org/crypto}rc4_encrypt
{http://exslt.org/crypto}sha1
{http://exslt.org/dates-and-times}add
{http://exslt.org/dates-and-times}add-duration
{http://exslt.org/dates-and-times}date
{http://exslt.org/dates-and-times}date-time
{http://exslt.org/dates-and-times}day-abbreviation
{http://exslt.org/dates-and-times}day-in-month
{http://exslt.org/dates-and-times}day-in-week
{http://exslt.org/dates-and-times}day-in-year
{http://exslt.org/dates-and-times}day-name
{http://exslt.org/dates-and-times}day-of-week-in-month
{http://exslt.org/dates-and-times}difference
{http://exslt.org/dates-and-times}duration
{http://exslt.org/dates-and-times}hour-in-day
{http://exslt.org/dates-and-times}leap-year
{http://exslt.org/dates-and-times}minute-in-hour
{http://exslt.org/dates-and-times}month-abbreviation
{http://exslt.org/dates-and-times}month-in-year
{http://exslt.org/dates-and-times}month-name
{http://exslt.org/dates-and-times}second-in-minute
{http://exslt.org/dates-and-times}seconds
{http://exslt.org/dates-and-times}sum
{http://exslt.org/dates-and-times}time
{http://exslt.org/dates-and-times}week-in-month
{http://exslt.org/dates-and-times}week-in-year
{http://exslt.org/dates-and-times}year
{http://exslt.org/dynamic}evaluate
{http://exslt.org/dynamic}map
{http://exslt.org/math}abs
{http://exslt.org/math}acos
{http://exslt.org/math}asin
{http://exslt.org/math}atan
{http://exslt.org/math}atan2
{http://exslt.org/math}constant
{http://exslt.org/math}cos
{http://exslt.org/math}exp
{http://exslt.org/math}highest
{http://exslt.org/math}log
{http://exslt.org/math}lowest
{http://exslt.org/math}max
{http://exslt.org/math}min
{http://exslt.org/math}power
{http://exslt.org/math}random
{http://exslt.org/math}sin
{http://exslt.org/math}sqrt
{http://exslt.org/math}tan
{http://exslt.org/sets}difference
{http://exslt.org/sets}distinct
{http://exslt.org/sets}has-same-node
{http://exslt.org/sets}intersection
{http://exslt.org/sets}leading
{http://exslt.org/sets}trailing
{http://exslt.org/strings}align
{http://exslt.org/strings}concat
{http://exslt.org/strings}decode-uri
{http://exslt.org/strings}encode-uri
{http://exslt.org/strings}padding
{http://exslt.org/strings}replace
{http://exslt.org/strings}split
{http://exslt.org/strings}tokenize
{http://icl.com/saxon}eval
{http://icl.com/saxon}evaluate
{http://icl.com/saxon}expression
{http://icl.com/saxon}line-number
{http://icl.com/saxon}systemId
{http://xmlsoft.org/XSLT/}test
Registered Extension Elements:
{http://exslt.org/common}document
{http://exslt.org/functions}result
{http://xmlsoft.org/XSLT/}test
Registered Extension Modules:
http://exslt.org/functions
http://icl.com/saxon
http://xmlsoft.org/XSLT/
… as output by:
xmlstarlet transform --show-ext 2>&1 |
awk -F '\n' -v S='sort' '/^Registered|^-*$/{ close(S); print; next } { print | S }'
Caution: Missing from the above list is the EXSLT extension element {http://exslt.org/functions}function
(aka func:function
) which works as documented in libexslt
despite element-available("func:function")
returning false
.
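A small sketch of func:function in use through transform, assuming the behaviour described in the caution above. The my prefix, its urn:x-example namespace, the function my:double, and the wrapper xs_funcdemo are all invented for this illustration; /dev/stdin is used for the input document as elsewhere in these notes.

```shell
# Sketch: define an EXSLT function with func:function / func:result and call it from XPath.
xs_funcdemo() {
  xsl=$(mktemp) || return
  cat > "$xsl" <<'EOF'
<xsl:transform version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:func="http://exslt.org/functions"
  xmlns:my="urn:x-example"
  extension-element-prefixes="func">
  <xsl:output method="text"/>
  <!-- my:double(n) returns n * 2 -->
  <func:function name="my:double">
    <xsl:param name="n"/>
    <func:result select="$n * 2"/>
  </func:function>
  <xsl:template match="/">
    <xsl:value-of select="my:double(count(*/*))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:transform>
EOF
  # Three child elements in, so the template should output 6.
  printf '%s\n' '<v><x/><x/><x/></v>' | xmlstarlet transform "$xsl" /dev/stdin
  rm -f "$xsl"
}
```

The func prefix has to be listed in extension-element-prefixes, otherwise func:function is treated as an ordinary top-level element and ignored.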
xmlstarlet
news summary
Snipped from SourceForge news and files sections. Covers versions 1.0.3 through 1.6.1 (in reverse order).
XMLStarlet 1.6.1 Released
1.6.1: August 9, 2014
- handle unicode arguments under Windows
There is no difference for non-Windows platforms.
Posted by Noam Postavsky 2014-08-09
XMLStarlet 1.6.0 Released
Changes:
get rid of "helpful" message about namespaces
update user guide
Enhancements:
add --stop option to val
add global option --no-doc-namespace
Build:
let the make install target succeed even if docs aren't built.
Posted by Noam Postavsky 2014-06-13
XMLStarlet 1.5.0 is released, changes:
Bugs:
avoid segfault on pyx non-existant file
fix unescaping of entities straddling 4K byte boundary (Bug #102)
Enhancements:
unescape hex entities (&#xXX;)
give a helpful message if doc has default namespace and nothing matched
add "_" and "DEFAULT" as names for document's top-level default namespace
Adding a global quiet option
ed: Allow omitting value argument to create empty element.
use default attribute values in sel subcommand
Build:
fix test variables to work with newer automake (1.11 -> 1.13)
fix usage2c.awk for mawk
scripts for building on mingw
Posted by Noam Postavsky 2013-07-07
1.4.2: Dec 28, 2012
- pyx: avoid segfault on documents with multiple attributes (Bug
#3595212)
1.4.1: Dec 8, 2012
- avoid segfault when attempting to edit the document node (Bug
#3575722)
- Packaging:
- include doc/xmlstar-fodoc-style.xsl in the dist so that the
--enable-build-docs option works from the tarball (Bug
#3580667)
- AC_SUBST PACKAGE_TARNAME for automake so that documentation is
installed to the right place (Bug #3561958)
- Test Suite:
- avoid test failures due to XML formatting and whitespace
changes (also fixes Bug #3572789)
- use automake's parallel test suite
- make bigxml tests much faster by using whitespace instead of nodes
- don't test str:replace() with ed: it doesn't work outside of
xslt in new libxslt
- ignore extra errors from libxml 2.9.0 bug
- let tests run using busybox
- add runAllTests.sh to run tests without make
1.4.0: Aug 26, 2012
- Documentation:
- executable name used in documentation now matches
--transform-program-name (Bug #3283713)
- added Makefile rules for generating documentation
(./configure --enable-build-docs)
- ed subcommand:
- relative XPaths are now handled correctly (Bug #3527850)
- the last nodeset inserted by an edit operation can be
accessed as the XPath variable $prev (or $xstar:prev)
- add --var option to define XPath variables
- allow ed -u -x to insert nodesets instead of converting to
string
- remove hard limit for number of edit operations (Bug
#3488240)
- pyx now handles namespaces correctly
1.3.1: Jan 14, 2012
- handle multiple values for --value-of properly (Bug #2563866)
- substitute external entities (Bug #3467320)
- pyx output needs space between attribute name and value (Bug #3440797)
1.3.0: Oct 7, 2011
- avoid ASCII CRs in UTF-16/32 text (reported by Ming Chen)
- --value-of outputs concat values of all nodes (Req #2563866)
- encode special chars for ed -u -x
- allow use of exslt functions in ed -u -x
- add --var to select (allow --var <name>=<value> as well as --var
<name> <value> --break)
- work around libxml bug that passes bogus data to error handler
(Bug #3362217)
Source: README.1.3.0, updated 2011-10-02
1.2.1: July 07, 2011
- check for NULL nodeset result (Bugs #3323189, #3323196)
- "-" was being confused with --elif
- generated XSLT should also have automatic namespaces
- allow -N after other option (Bug #3325166)
- namespace values were being registered as prefixes
- avoid segfault when asked to move namespace nodes
- missing newline in ed --help message
- test scripts portability
- no bashisms allowed in NetBSD sh
- make BRE portable: '+' is not allowed
- deal with msys path conversion properly (Bug #3178657)
- don't use XML_SAVE_WSNONSIG #if libxml < 2.7.8 (Bug #3310475)
Source: README.1.2.1, updated 2011-07-07
1.2.0: June 1, 2011
- implement ed --update --expr
- use top-level namespace definitions from first input file, this
should remove the need to define namespaces on the command line
with -N in most cases.
- select exits with 0 only if result is non-empty (Req #3155702)
- add -Q to select, like grep's -q
- add column number to error messages
- restore input context (lost in version 1.0.3) to error messages
(Bug #3305659)
- print extra string information in error messages
- use entity definitions from dtd (Bug #3305659)
- add --net option to c14n, ed, fo, and val (Req #1071398)
- remove --catalog from tr --help message since it isn't actually supported
- add --elif and --else to sel --help message
Source: README.1.2.0, updated 2011-06-01
1.1.0: Apr 3, 2011
- bug fix for BSD/OSX: check that O_BINARY is declared before
#including io.h (Bug 3211822)
- select improvements
- add --elif and --else options
- sorting on multiple fields
- correct (for English) lexical sorting instead of ASCIIbetical
- only outputs namespaces that are actually used
- only outputs xsl:param inputFile if it's used
- don't make separate templates if there is only 1
- link to shared libxml and libxslt libraries by default
- add library version info to --version output
- add directory argument for ls; exit status indicates
failure/success instead of file count
- stop using old SAX1 interface, xmlstarlet will now link with a
libxml configured --without-sax1 and --without-legacy
Source: README.1.1.0, updated 2011-04-04
1.0.6: Mar 13 2011:
- Bug fixes:
- c14n: set stdout to binary mode on Windows to avoid carriage
returns (Bug 840665)
- fix broken --help options
- put actual behaviour of -P, -S options in --help output (see
Bug/Feature Request 2858514)
- remove unneeded escape of quote in ./configure --help
- don't distribute xmlstarlet.spec: it's generated by ./configure
- add src/xml.o depends on version.h to Makefile.am so compile
will succeed without dependency info (eg after make distclean)
- add test for subcommands' --help option
- Portability fixes:
- yes isn't portable, use an awk program instead
- neither read -r nor xargs -0 are portable, escape the command
lines to xargs instead
- don't use nonportable echo -n option
Source: README.1.0.6, updated 2011-03-13
1.0.5: Feb 11 2011:
- Bug fixes:
- use XSLT_PARSE_OPTIONS, else CDATA nodes can cause corruption (Bug 3158482)
- fix typo in help message
- get rid of warnings in -ansi -pedantic mode
- required libxml2 version is 2.6.23
- usage strings use argv[0] as program name
- --help prints to stdout and exits with success
- double /'s under msys to avoid path conversion
- Portability fixes:
- don't use xargs (-d isn't portable)
- use -Wall only for gcc
-Build system:
- use -ansi in configure, and check for strdup declaration
- seperate list of sources and tests into subdirs
- check git version during make, not just autoconf
- tarball releases of configure.ac have actual version number
instead of querying git
Source: README.1.0.5, updated 2011-02-11
1.0.4: Jan 16 2011:
- Bug fixes:
- encode special XML characters in arguments (can now include quotes in xpath)
- non-zero exit code when input file is not found (Bug 3158488)
- ed with --pf/--ps options doesn't reformat output (Bug 3158490)
- exit() instead of segfault when trying to delete namespace nodes
(Bug 1120417)
- added --disable-static-libs ./configure option to use shared libxml2 and libxslt
- non-recursive make
- use TESTS and XFAIL_TESTS for testing, nicer output
Source: README, updated 2011-01-16
1.0.3: Nov 18 2010:
- Bug fixes:
- escape --value in update mode (Bug 3052978)
- c14n now includes default attributes (Bug 1505579)
- Allow special characters in sel --output literal (Bug 1912978)
- remove warning from xml_trans.c (Bug 1521756)
- Use xmlReader interface so line numbers are 32-bit (Bug 1219072)
- test for error messages on lines past 2^16 (Bug 1219072)
- don't look for embedded dtd if not asked (Bug 1167215)
Source: README, updated 2010-11-18
xmlstarlet wishlist AD 2003
In 2003 Mikhail Grushinskiy posted his xmlstarlet wishlist.
Mikhail Grushinskiy - 2003-05-14
Here is a list of next steps in XmlStarlet on TODO or wishlist:
1. Editing xml documents with xml 'ed' option must be improved.
2. add --recover to fix broken XML documents
3. Document how to use proxy in XmlStarlet with nanohttp/ftp via http_proxy, ftp_proxy environment variables
ex: export http_proxy=http://192.168.0.1:8080/
4. Add ability to specify xpath expression in XmlStarlet 'el' option
5. -u option of XmlStarlet 'xml el' should work with others too. I.e. sort | uniq equivalent should work when attributes and attributes values are printed out.
6. Think about 'join' analogue
7. Something like xml sel -t -m <xpath> --exec <shell-cmd> --args <args> is needed
8. How would be possible to insert one XML fragment into another XML document from command line without XInclude?
9. Make use of regular expressions ex: Make all element names uppercase
10. Start thinking about diff and patch. Several tree diff algorithms could be implemented for ordered and non ordered labeled trees. What about creating context diff? How to define context in XML space? Good luck solving NP-Complete problems.
11. What about XUpdate implementation?
12. How about making output with syntax coloring in case if it is running in terminal (not batch) mode. Similar to GNU ls?
13. Convert XML to Lisp S-expressions
14. XML Namespace normalization process (There is a XSLT stylesheet floating on the web which could do it).
15. Make use of performance updates from libxml2. mmap() for document chunks, XMLReader interface, etc.
16. More regression testing test cases required.
17. Better Documentation User Guide and Tutorial is needed. More good and real-world examples.
If you wish to enhance/add something to this list, please, reply.
XmlStarlet home page,
http://xmlstar.sourceforge.net/
Thanks,
--MG
Mikhail Grushinskiy
Mikhail Grushinskiy - 2003-05-23
Few additions
1. Better namespace support.
2. something like xml head, and xml tail
3. list directories in XML
4. Defining variables in xml sel
Ex: xml sel -t --m / -d var_name -v @elem
-d would translate into
<xsl:variable name="var_name">
</xsl:variable>
and this variable could be referenced as $var_name in XPATH
5. CygWin binaries?
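For comparison, the variable wish in item 4 appears to be covered today by the `--var` option of `xmlstarlet select` rather than by `-d`. A minimal sketch, assuming `xmlstarlet` 1.6.1 is on the PATH; the sample document, the variable name `n` and the function name `count_items` are invented for illustration.

```shell
#!/bin/sh
# Sample input made up for this sketch.
xml='<list><item>a</item><item>b</item></list>'

# --var n=«xpath» binds the result of the XPath expression to $n
# (an xsl:variable in the generated stylesheet);
# -v '$n' prints its value and -n adds a newline.
count_items() {
    printf '%s\n' "$xml" |
        xmlstarlet sel -t --var n='count(/list/item)' -v '$n' -n
}

count_items
```

Adding `-C` to the `sel` options displays the generated XSLT instead of running it, which is a quick way to see how the variable is declared.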
xmlstarlet usage notes
xmlstarlet commands
xmlstarlet elements
xmlstarlet select
select [option …] template … [«xml-file» …]
-h (--help) - display help
-Q (--quiet) - do not write anything to standard output
-C (--comp) - display generated XSLT
-R (--root) - print root element <xsl-select>
-T (--text) - output is text (default is XML)
-I (--indent) - indent output
-D (--xml-decl) - do not omit XML declaration line
-B (--noblanks) - remove nonsignificant whitespace from XML tree
-E (--encode) «encoding» - output in the given encoding
-N «prefix»=«value» - declare namespaces
--net - allow fetch DTDs or entities over network
-t (--template) is <xsl:template match="/">
-m (--match) is <xsl:for-each select="xpath-expr">
-s (--sort) is <xsl:sort …/>
--var «name» «value» --break is <xsl:variable name="…">«value»</xsl:variable>
--var «name»=«value» is <xsl:variable name="…"/>
--var «name»=«value» namespace issue
-o (--output) is <xsl:text>«value»</xsl:text>
-e (--elem) is <xsl:element name="…">
-a (--attr) is <xsl:attribute name="…">
-c (--copy-of) is <xsl:copy-of select="xpath-expr"/>
-v (--value-of) is string-join((xpath-expr),newline)
-i (--if) [--elif …] [--else] is <xsl:when> … [<xsl:otherwise>]
-b (--break) ends current container element
-n (--nl) prints a newline
-f (--inp-name) prints pathname / URI of current input
xmlstarlet edit
edit option […] [action …] [«xml-file-or-uri» …]
-i (--insert) - add node before
-a (--append) - add node after
-s (--subnode) - add node as child
$prev variable (aka $xstar:prev)
--var name 'xpath'
-u (--update) 'xpath' -v (--value) 'value'
-u (--update) 'xpath' -x (--expr) 'xpath'
-d (--delete) 'xpath'
-r (--rename) 'xpath' -v (--value) 'new-name'
-m (--move) 'xpath1' 'xpath2'
xpath arguments
xpath arguments
xmlstarlet format
format [option …] [«xml-file»]
-h (--help) - display help
-e (--encode) «encoding» - output in the given encoding
-n (--noindent) - do not indent
-o (--omit-decl) - omit XML declaration
-s (--indent-spaces) «N» - indent output with N spaces
-t (--indent-tab) - indent output with tabulation
-C (--nocdata) - replace CDATA section with text nodes
-D (--dropdtd) - remove the DOCTYPE of the input doc
-H (--html) - input is HTML
-N (--nsclean) - remove redundant namespace declarations
-Q (--quiet) [undocumented] - suppress error output
-R (--recover) - try to recover what is parsable
--net - allow network access
xmlstarlet c14n
xmlstarlet validate
validate [option …] [«xml-file-or-uri» …]
-w (--well-formed) - validate well-formedness only (default)
-d (--dtd) «dtd-file» - validate against DTD
--net - allow network access
-s (--xsd) «xsd-file» - validate against XSD schema
-E (--embed) - validate using embedded DTD
-r (--relaxng) «rng-file» - validate against Relax-NG schema
-e (--err) - print verbose error messages on stderr
-S (--stop) - stop on first error
-b (--list-bad) - list only files which do not validate
-g (--list-good) - list only files which validate
-q (--quiet) - do not list files (return result code only)
xmlstarlet pyx, depyx
xmlstarlet escape, unescape
xmlstarlet list
xmlstarlet transform
transform [option …] «xsl-file» [-p|-s «name»=«value» …] [«xml-file-or-uri» …]
-h (--help) - display help
--omit-decl - omit XML declaration
-E (--embed) - allow applying embedded stylesheet
--show-ext - show list of extensions
--val - allow validate against DTDs or schemas
--net - allow fetch DTDs or entities over network
--xinclude - do XInclude processing on document input
--maxdepth value - increase the maximum depth
--html - input document(s) are in HTML format
xmlstarlet edit displays its usage reminder)
xmlstarlet XSLT extensions
xmlstarlet news summary
xmlstarlet wishlist AD 2003