xmlstarlet-notes news
Link: https://martin7th.github.io/xmlstarlet-notes/
repology.org links
(link) Added links to the versions pages at repology.org for xmlstarlet, libxml2, and libxslt.
--doc-namespace reads from first input file
(link)(link) Clarified that by default (--doc-namespace being in effect) namespaces declared in the root element of the first input file need not be declared with -N options. select and edit both support multiple input files.
(link) New entry: process output from xmlstarlet elements with awk and graphviz’ dot.
select section) (link) xsl:sort was missing in the intro.
(link) select="«value»" was missing in the --var header.
select section) (It seemed a good idea at the time.)
generate-id() fixed in libxslt 1.1.38
A long-standing issue with libxslt has been fixed in version 1.1.38: The result of generate-id() is now deterministic across multiple transformations, fixing many issues with reproducible builds.
This means that xmlstarlet’s select and transform commands can do variable-based grouping without using xsl:key, axes, or extensions, and even group across multiple files in one pass. For examples see postings by G. Ken Holman on stackoverflow.com and xsl-list.
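As a rough illustration of the variable-based grouping mentioned above, the following sketch groups items by an attribute using only a variable and generate-id(). The file list.xml, its element and attribute names, and the variable name items are invented for this example and are not taken from the postings referred to. It assumes an xmlstarlet binary on the PATH and skips the run otherwise.

```shell
#!/bin/sh
# Hypothetical sample data: items carrying a "cat" attribute to group by.
workdir=$(mktemp -d)
cat > "$workdir/list.xml" <<'EOF'
<list>
  <item cat="a">one</item>
  <item cat="b">two</item>
  <item cat="a">three</item>
</list>
EOF

groups=''
if command -v xmlstarlet >/dev/null 2>&1; then
  # $items holds the population. An item starts a group when it is the
  # first member of $items that shares its @cat value.
  groups=$(xmlstarlet select -T -t \
    --var items='/list/item' \
    -m '$items' \
    --if 'generate-id() = generate-id($items[@cat = current()/@cat][1])' \
    -v '@cat' -o ': ' -v 'count($items[@cat = current()/@cat])' -n \
    "$workdir/list.xml")
  printf '%s\n' "$groups"
else
  echo 'xmlstarlet not found; skipping' >&2
fi
```

The sketch is meant to print one line per distinct cat value together with the size of its group. The temporary directory in $workdir is left in place for inspection and can be removed afterwards.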
(link) Added error message. Triggered by missing --net option.
(link) Added error message. Only select, transform, and canonic seem to understand an entity reference.
(link) Added error message. select fails with a run-time error if input file pathnames on the command line contain single quotes (known bug).
func:function (Appendix A)
(link) Deleted phrase: despite element-available("func:function") returning false.
element-available isn’t supposed to return true for func:function as this is a top-level element, not an instruction, per the XSLT 1.0 recommendation sections 14.1 and 15.
The cautionary note about the missing listing of {http://exslt.org/functions}function has been removed because libxslt since version 1.1.39 registers and lists it as a top-level extension element.
(link) Added example: CDATA can protect against default serialization.
(link) New entry, with examples of libxml2 numbers in scientific notation.
-L (--inplace) - edit input file(s) in-place
(link) Added: -P (--pf) is given
(link) New entry, demonstrating if-then-else logic with xmlstarlet edit.
--xinclude - do XInclude processing on document input
(link) The entire entry on XInclude has been rewritten.
(link) Added error message.
--huge and --big-lines options to xmlstarlet
Recently the author of the xmlstarlet-notes document has provided a patch to make xmlstarlet use ‘huge’ nodes and ‘big lines’. It requires the program to be rebuilt from source code.
Quoting from the README:
This patch adds 2 global options to xmlstarlet, available in the subcommands elements, select, edit, format, canonic, validate, and transform:
--huge
Load XML files with libxml2’s XML_PARSE_HUGE parser option. Without this option the parser will fail with a “xmlSAX2Characters: huge text node: out of memory” error when loading a text node larger than 10 MB. (xmlstarlet’s pyx subcommand uses the SAX API which has no such limitation.) In libxml2 the XML_PARSE_HUGE option is disabled by default to prevent denial-of-service attacks.
--big-lines
Load XML files with libxml2’s XML_PARSE_BIG_LINES parser option. This allows line numbers larger than 65535 to be reported correctly in error messages and (for select and transform) in output from the saxon:line-number extension. There is currently an open issue on this option suggesting it’s limited to text nodes, however, it appears to have been resolved by now as line numbers are output as expected for all node types except the root node (/) which is fixed at line -1.
(link) edit newline variable example changed:
--var nl 'substring-before('"$(printf '"\nA"')"',"A")'
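The quoting in this expression is dense, so here is a sketch of how the shell assembles it. It runs without xmlstarlet. The variable names nl_expr and bare are made up for this illustration.

```shell
#!/bin/sh
# Three adjacent shell words are concatenated into one argument:
#   'substring-before('    single-quoted, taken literally
#   "$(printf '"\nA"')"    printf emits: double quote, newline, A, double quote
#   ',"A")'                single-quoted, taken literally
nl_expr='substring-before('"$(printf '"\nA"')"',"A")'

# The resulting XPath carries a literal newline inside its first string.
printf '%s\n' "$nl_expr"

# Why the trailing A is needed: command substitution strips trailing
# newlines, so a bare newline would not survive.
bare=$(printf '\n')
printf 'length of bare substitution: %s\n' "${#bare}"
```

The XPath processor then receives substring-before() with a two-character string (newline, A) and the delimiter "A", which evaluates to a single newline.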
(link) Added:
The xmlEscapeEntities function in libxml2’s xmlsave.c serialization module gives special treatment to characters &<> (output as &amp;, &lt;, and &gt;) but neither apostrophe nor double quote ('"). xmlstarlet has no option to override this.
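A quick way to observe this behaviour is to pass a text node containing all five characters through xmlstarlet format. This is only a sketch: the one-element document is invented for the illustration, and the run is skipped when no xmlstarlet binary is on the PATH.

```shell
#!/bin/sh
# Hypothetical one-element document whose text holds all five characters.
doc='<a>&amp; &lt; &gt; &apos; &quot;</a>'

out=''
if command -v xmlstarlet >/dev/null 2>&1; then
  # --omit-decl drops the XML declaration; the trailing - reads stdin.
  out=$(printf '%s\n' "$doc" | xmlstarlet format --omit-decl -)
  printf '%s\n' "$out"
else
  echo 'xmlstarlet not found; skipping' >&2
fi
```

If the description above holds, the first three characters should come back as entity references while the apostrophe and the double quote appear as plain characters.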
$prev variable
(link) Changed: The $prev variable refers to a nodeset (changed from ‘node’).
(link) Added: a pyx | depyx pipeline does not guarantee an accurate roundtrip.
(link) New entry, dealing with : (colon) in a name when not indicating a namespace prefix.
(links): select -t (--template) | select -m (--match) | ex4: external lookup table | extract subtree example | xmlstarlet edit | tree height example
Changed:
The current node – part of the XSLT processing model – was referred to as context node – part of the XPath evaluation context. Changed to ‘current node’ as this is more in line with XSLT 1.0 terminology.
SourceForge now accepts the HTTPS protocol so all remaining HTTP links (xmlstar.sourceforge.net and saxon.sourceforge.net) have been changed to HTTPS.
-q (--quiet): suppress error output
(link) Changes:
Removed: ‘Caution: this option is ignored by xmlstarlet edit.’
Added: Link to format -Q (--quiet) local option.
-Q (--quiet) [undocumented] - suppress error output
(link) New entry: Does what the -q (--quiet) global option does.
-b (--break) ends current container element
(link) Changes:
Added to the --var without = list item: any nested --vars are local to the enclosing --var but must have unique variable names.
(link) 2nd and 3rd list item replaced with:
xmlstarlet on SourceForge: homepage | docs | user’s guide | news | source | files | discussion | bugs
doc/xmlstarlet.txt there is not the latest version as it doesn’t mention $prev, --var, -L (--inplace), and -E (--embed) – the user’s guide is still silent on these
xmlstarlet on Fossies – an accessible presentation of source, examples, and more: xmlstarlet-1.6.1.tar.gz contents | xmlstarlet user’s guide (1-page) | doc/xmlstarlet.txt (latest version)
Links to doc/xmlstarlet changed to the latest version at Fossies here: $prev, --var, -L (--inplace), and -E (--embed).
(link) Added after 1st paragraph:
An input filename starting with - (dash) – unless it’s short for stdin – must be prefixed with ./ (dot slash) otherwise it will be parsed as an option, possibly causing select (Caution) to ignore the file.
Beware of known bugs for filenames containing (#123) ' (single quote), or (#110) urlencoded characters, e.g. %20.
See also: couldn’t read file | failed to load external entity
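The dash rule above can be handled before a filename ever reaches the command line. The following is a minimal sketch in plain shell. safe_path is a hypothetical helper name used only here, and the sample filenames are made up.

```shell
#!/bin/sh
# safe_path is a hypothetical helper for this sketch only.
# It prefixes ./ to names that start with a dash, but leaves a lone
# dash untouched because that stands for stdin.
safe_path() {
  case $1 in
    -)  printf '%s\n' "$1" ;;
    -*) printf './%s\n' "$1" ;;
    *)  printf '%s\n' "$1" ;;
  esac
}

safe_path '-draft.xml'   # prints ./-draft.xml
safe_path '-'            # prints -
safe_path 'plain.xml'    # prints plain.xml
```

A caller would then pass "$(safe_path "$f")" instead of "$f" as the input file argument. The sketch does nothing about the single-quote and %20 bugs mentioned above.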
-L (--inplace) - edit input file in-place
(link) Added as last list item: %20, cf. Global options and parameters
(new entry) See failed to load external entity (re stdin)
(link) 3rd list item changed to:
stdin add a - (dash) to the command line to work around parsing issues, e.g. if format’s -e (--encode) «encoding» is the last option (explanation: «encoding» mistaken for filename in src/xml_format.c#foProcess(), similarly in src/xml_validate.c#valParseOptions())
Initial release.