tech-userlevel archive


Testing xmlgrep


As part of this year's NetBSD SoC, I'm working on XML command-line
tools. The first of these, xmlgrep, has now reached a somewhat usable
state, and I think it's time I made some kind of public report.

The code should still be considered (very?) experimental, given that
I'm the only user/tester so far, and more testing ought to be done.
Some of you may well have real-world test cases I wouldn't think of;
if you do, please do not hesitate to send them to me. Even if you do
not have the time to learn the pattern language right now, providing
me with the idea and the data should be enough and would be
appreciated.

My xmlgrep features include:
- stream-oriented model;
- generally small memory footprint;
- compact simple pattern syntax.
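
To make the stream-oriented idea concrete, here is a minimal sketch in
Python, using its standard binding to expat. It is not xmlgrep's code and
it does not use xmlgrep's pattern language: the function name
stream_match and the slash-separated path are assumptions made up for
this illustration. It only shows how a matcher can keep just the stack of
open element names, plus the text of the element being matched, instead
of the whole tree.

```python
import io
import xml.parsers.expat


def stream_match(source, target_path):
    """Yield the text of each element whose path from the root equals
    target_path (hypothetical slash-separated form, e.g. "lib/book/title").

    source is a binary file-like object. Only the stack of open element
    names and the text of the current match (including the text of its
    descendants) are held in memory.
    """
    target = target_path.split("/")
    stack = []      # names of the currently open elements
    buffer = None   # text pieces of the current match, or None
    results = []    # matches completed but not yet yielded

    def start(name, attrs):
        nonlocal buffer
        stack.append(name)
        if buffer is None and stack == target:
            buffer = []

    def end(name):
        nonlocal buffer
        if buffer is not None and stack == target:
            results.append("".join(buffer))
            buffer = None
        stack.pop()

    def chars(data):
        if buffer is not None:
            buffer.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars

    # Feed the input in chunks so the document is never loaded whole.
    while True:
        chunk = source.read(4096)
        if not chunk:
            break
        parser.Parse(chunk, False)
        while results:
            yield results.pop(0)
    parser.Parse(b"", True)  # signal end of input
    yield from results


doc = (b"<lib><book><title>A</title></book>"
       b"<book><title>B</title><note>x</note></book></lib>")
print(list(stream_match(io.BytesIO(doc), "lib/book/title")))
```

Matching a path near the root (here "lib") would buffer the text of the
whole document, which is the kind of pattern that defeats the small
memory footprint.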

By simple syntax I don't mean easy-to-read, but rather that the syntax
is clean and the grammar small, unlike e.g. XPath. By generally small
memory footprint, I mean you *can* make a pattern such that the whole
input tree will end up in memory, but xmlgrep provides you with the
means to avoid that in most cases (at least I hope so). I've run some
toy benchmarks which show memory usage and speed of xmlgrep and two of
its direct competitors: xmlstarlet and Twig xmlgrep. You can view the
charts (and get more news on the development) from my xmltools blog:

You can get the thing by pulling from my Git repository:


or if you don't have Git, you can download a snapshot from my Gitweb:;a=summary

(Just click `snapshot' in the rightmost column to get a tarball.)

To build, cd to the xmlgrep subdirectory first and then run make as
usual. You will need expat installed to build (by default, the
makefiles use the one in /usr/pkg but this can be overridden; see

I've included a man page which describes the full pattern syntax and
behavior of xmlgrep. Tests are not yet included in the repository
since I have not yet decided how to write them.

I would very much appreciate feedback on the tool itself as well
as on the pattern language.

Nhat Minh
