3. Tutorial¶
3.1. Basics + Core Interface¶
Let’s begin with some very minimal GNDStk-based example codes, explain our arrangement of major C++ namespaces, and then move on to more-interesting and useful examples.
3.1.1. Minimal GNDStk-Aware Code¶
Here’s the most minimal GNDStk “application”, albeit one that doesn’t do anything:
#include "GNDStk.hpp"
int main()
{
}
The takeaway: to use GNDStk, you should #include its one primary header file, GNDStk.hpp. In contrast to the manner in which some C++ libraries are designed, GNDStk provides just this one main header file for user consumption. That header, in turn, includes all of GNDStk’s other headers, and in the correct manner with respect to namespaces and such.
Put another way, GNDStk is not designed, as some libraries are, so that you selectively choose what headers to #include. When we use the C++ Standard Library, for instance, we’ll pick and choose: include iostream, most likely, and perhaps vector, and any number of additional specific chosen headers. The entire C++ Standard Library is very large, of course, and such selectivity is important so that compile times are kept under control. GNDStk is much smaller, and we’ve judged that compilation times aren’t significantly impacted by putting forth our simple, easily followed rule: just #include GNDStk.hpp – nothing more, nothing less – in whichever of your own source files need it.
3.1.2. Recommended Starting Point¶
For most users, most of the time, we suggest this starting point:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main( /* argc,argv as necessary */ )
{
}
This merely adds a specific using directive to the previous example. To understand what this directive does for us, we need to understand GNDStk’s namespace hierarchy.
3.1.3. Namespace Hierarchy¶
GNDStk, like many C++ libraries, places its various constructs into a moderate number of C++ namespaces. For simplicity’s sake, omitting details that most users won’t care about, consider that GNDStk’s namespace hierarchy looks something like this:
// Outer namespaces that surround everything else.
namespace njoy {
namespace GNDStk {
   // Namespace for basic, generic "GNDS query objects". There are
   // other sets of query objects; don't worry about those for now.
   namespace basic {
      // Query objects specific for GNDS metadata.
      namespace meta {
      }
      // Query objects specific for GNDS child nodes.
      namespace child {
      }
      // Bring in meta:: and child:: above.
      using namespace meta;
      using namespace child;
      // Bring in common:: below.
      using namespace common;
   }
   // Some query objects that we wish to be shared across basic::
   // as well as other query-object namespaces that we don't list here.
   namespace common {
   }
   // Logging capabilities, including, for example, support for errors,
   // warnings, and informational messages.
   namespace log {
   }
   // Our "core" namespace; see discussion.
   namespace core {
      // Bring in GNDStk:: itself, and basic:: above
      using namespace GNDStk;
      using namespace basic;
   }
}
}
A few things are happening here, so bear with us.
For starters, everything is in njoy::GNDStk::. That’s easy enough to understand. Notably, however: considering that a typical C++ library will often begin, at global scope, with a namespace that matches the library’s name, it’s laughably easy to forget the njoy:: part. We’ve made that mistake several times, ourselves. We have the njoy:: only because GNDStk is just one element of Los Alamos National Laboratory’s broader NJOY suite of software projects. If – well, when – you ever forget the njoy::, a modern C++ compiler may suggest, in its initial error message (before its flood of spurious ones), that you probably meant the GNDStk:: in njoy::.
Fundamentally important in GNDStk are its “query objects”. Think of these as small modules of information that facilitate a concise notation for getting (or setting) GNDS data. (Much more on all of this later.) For now, suffice it to say that the query objects of most interest to users are our basic ones in – drum roll, please – basic::. Within basic::, query objects are further placed into meta:: or child::, depending on whether they’re designed for getting and setting metadata, or for getting and setting child nodes. Names for our query objects equate, except in certain rare cases, to the GNDS names of the metadata and child nodes with which they interact. For example, a GNDS label metadatum is called label in GNDStk’s query objects, and a GNDS styles node is called styles.
Strangely – at first glance – basic:: has using directives for its own meta:: and child:: sub-namespaces! Why not place the contents of those directly into basic:: to begin with, and dispense with the sub-namespaces altogether? It turns out, in fact, that the GNDS standard has a small amount of overlap between its names for metadata, and its names for nodes. Two examples are parity and spin. If, for example, you look through the currently available XML-format GNDS files, you’ll see XML spin="something" metadata, and also XML <spin> elements. Our arrangement for basic::, meta:: and child:: is such that if you’re using namespace basic, you can dispense with a meta:: or child:: prefix where names are unique (styles, label, and most other names), or prefix appropriately in the occasional cases where they aren’t: meta::spin for spin metadata, child::spin for spin nodes, and so forth.
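The effect of this arrangement can be reproduced in a few lines of standard C++. What follows is a minimal sketch, not GNDStk code: the demo:: namespace, and the string constants that stand in for query objects, are hypothetical, and are chosen only to mirror the basic::, meta::, and child:: layout described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GNDStk's layout. Real query objects are
// not strings; strings just make the lookup results easy to see.
namespace demo {
namespace basic {
   namespace meta {
      const std::string label = "label (metadatum)";
      const std::string spin  = "spin (metadatum)";
   }
   namespace child {
      const std::string styles = "styles (node)";
      const std::string spin   = "spin (node)";
   }
   // Same idea as in the hierarchy sketch: expose both sub-namespaces.
   using namespace meta;
   using namespace child;
}
}

using namespace demo::basic;

// Names that are unique need no prefix.
std::string unique_names()
{
   return label + ", " + styles;
}

// Names that overlap must be qualified. A bare spin here would be
// ambiguous, and the compiler would reject it.
std::string overlapping_names()
{
   return meta::spin + ", " + child::spin;
}
```

If the two sub-namespaces were flattened into one, the two spin declarations would collide at their point of definition. Keeping them apart defers the conflict to the (rare) point of use, where a short prefix resolves it.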
basic:: is one of two namespaces (at the time of this writing) into which we’ve placed full sets of query objects for GNDS metadata and child nodes. (Don’t worry, for now, about the other set. We may even remove it, as other capabilities of GNDStk have made it less worthwhile to have than it once was.) A third namespace, common::, contains a small handful (not a complete set) of query objects that are intended for use with both of the two full sets.
In addition to using its own meta:: and child:: sub-namespaces for the reasons we described above, basic:: also brings in the contents of common::, so that no common:: prefix is needed when you’re using namespace basic.
A namespace log:: also exists in GNDStk. We’ll discuss it elsewhere, but mention it here only because (1) you may occasionally find its contents to be useful for your own purposes; and (2) it serves, in contrast to the other namespaces being discussed here, as an example of something that isn’t included automatically by our core:: namespace. We don’t consider it to be useful enough, for the average user, to justify cluttering core:: with its contents. If and when you need it, log:: is short, and easy to type.
3.1.4. Core Interface¶
That brings us, finally, to the core:: namespace that we called out, in our example code, as being precisely what we suggest that most users bring in:
using namespace njoy::GNDStk::core;
core:: is little more than this:
namespace core {
   using namespace GNDStk;
   using namespace basic;
}
So, core:: brings in basic::, which as we saw above brings in its own meta:: and child:: sub-namespaces, as well as the (modest but useful) content in common::. On top of that, core:: actually brings in GNDStk:: (that is, njoy::GNDStk::), even though core:: itself is inside of njoy::GNDStk::! (The language does allow that.)
All things considered, then, the single directive using namespace njoy::GNDStk::core brings in all content from:
njoy::GNDStk::
njoy::GNDStk::basic::
njoy::GNDStk::basic::meta::
njoy::GNDStk::basic::child::
njoy::GNDStk::common::
with the single caveat we spoke of already in regard to basic:: – that in the rare but occasional cases of overlap (meta::spin vs. child::spin, for instance, or meta::parity vs. child::parity), you must disambiguate. And the compiler will tell you as much, as it’ll be an error until you do.
The combined content of the above-listed namespaces constitutes what we consider to be a good set of core GNDStk capabilities. Hence, our motivation for creating a core:: namespace that brings all of them into your code, together, via the one convenient directive that we’ve recommended.
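For readers who’d like to see the “namespace that nominates its own parent” trick in isolation, here’s a small sketch in plain C++. Everything in it is hypothetical (lib::, extra::, and the three functions are ours, invented for illustration); it only demonstrates that the pattern is legal, and what a single using directive then makes visible.

```cpp
#include <cassert>

// Hypothetical library layout; none of these names are GNDStk's.
namespace lib {
   int version() { return 1; }

   namespace basic {
      int query() { return 20; }
   }

   // Plays the role of log::, and is deliberately left out of core.
   namespace extra {
      int rarely_used() { return 300; }
   }

   namespace core {
      using namespace lib;     // a nested namespace may nominate its parent
      using namespace basic;
   }
}

int via_core()
{
   using namespace lib::core;
   // version() comes from lib:: and query() from lib::basic::, both
   // unqualified. rarely_used() isn't visible unqualified, but the
   // extra:: prefix is short.
   return version() + query() + extra::rarely_used();
}
```

Because using directives are transitive, the one directive in via_core() behaves as though the directives inside core had been written there as well.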
You can consider the phrases core namespace and core interface to be essentially interchangeable. Which term we use, and where, depends on whether we’re referring to the namespace in particular, or to the functionality it exposes.
We’ll note, finally, that having (and recommending) our core:: namespace is helpful from the standpoint of software maintainability. If we decide at some future time that GNDStk needs a refactor, and/or a rearrangement of its functionality into a different overarching namespace scheme, we anticipate being able to update the contents of core:: in such a way that the codes that use it – like, we hope, yours – will need few if any changes, even if the GNDStk constructs that the codes employ have been moved to entirely new or different locations.
3.2. Read and Write GNDS¶
3.2.1. Read XML¶
Here’s a simple code that reads the XML format GNDS file named n-094_Pu_239.xml:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree pu239("n-094_Pu_239.xml");
}
Obviously, but worth a reminder, this assumes that the file resides right there, in the directory from which this code is run. If it doesn’t, include a suitable absolute or relative path in the file name string. We, and no doubt everyone reading this, have probably made this mistake often enough over the years.
Tree is GNDStk’s data structure for holding an entire GNDS hierarchy, a.k.a. GNDS tree. GNDS is, indeed, a tree-like structure, and that’s reflected in the name of our C++ class. Once loaded, you’ll be able to do great things – most likely, in this particular example, data queries from an existing GNDS file – with the object, pu239 here, into which the GNDS data in the loaded file were placed.
A large collection of XML-format GNDS files can be downloaded from here:
That’s where we got our example’s n-094_Pu_239.xml, and many other GNDS files. At around 24MB in size, it’s one of the larger GNDS files from the above site, but it isn’t among the absolute largest. We’ll often use it in our examples; its modest size (by today’s standards) should still allow for fast reading, and we believe that its contents make for good examples. The same, of course, could probably be said about any GNDS file, depending on what data are of interest.
While we didn’t say so directly, a C++ programmer will have realized what the above code tells us: that Tree has a constructor from a character string (in fact, from a std::string), and for which the behavior is: “interpret the string as the name of a GNDS file, and load the file.” You could write this instead:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree pu239;
   pu239.read("n-094_Pu_239.xml");
}
which is more explicit in its purpose, but slightly less concise. (And the earlier, direct-constructed Tree could be const, if that matters.)
3.2.2. Files, Streams, Types¶
In terms of what we saw above, Tree has four similar constructors. (It also has several additional, completely different constructors that are outside the scope of the present discussion.)
The first argument is either a file name, or a C++ std::istream from which to read the “file.” The second argument allows you to stipulate the file format explicitly, and can either be something from GNDStk’s FileType enumeration:
enum class FileType {
   null, // Default, automagick, etc.
   tree, // <== Don't use this for reading; just writing
   // Generally use one of these:
   xml, XML = xml,
   json, JSON = json,
   hdf5, HDF5 = hdf5
};
or can be a direct string: "xml", etc. A direct string is shorter and slightly easier to type – but, if mistyped, would lead to a run-time error, not a compile-time error, if that matters to you in this simple context.
HDF5 is not supported at this time! Just XML and JSON.
You should seldom, if ever, need to provide the second argument. Absent the second argument, GNDStk determines the file type automatically, and we doubt that you’ll have any objections to that. If you do choose to provide the second argument, then it, not GNDStk’s automatic file type determination, will be used, but you’ll see a warning if your directly-given value contradicts GNDStk’s automatic determination, which it still performs for diagnostic purposes. Of course, if you try to force reading in one format, and the file’s actual format is something else, you’ll soon be seeing a flood of errors, not mere warnings, as we attempt to read the file pursuant to the (incorrect) forced format.
GNDStk uses the “file magic number,” not the file name, to determine file type automatically. The file magic number really means the first byte, or bytes, of the file. XML files always begin with a < character. HDF5 files (not supported yet) begin with the byte value 137 and a few other specific bytes. If the first byte is neither of those values, then GNDStk assumes JSON format.
A nice thing about using the file magic number, not the file name, is that it works for std::istream, for which a “file name” isn’t even available. Moreover, it tells us what’s actually in the file or the stream, independent of what any name might imply. If you provide an XML file but call it something.JSON, then that would be a rather strange thing to do, but GNDStk will correctly determine the actual type – XML – and thus read the file correctly. In cases like that, GNDStk will do an additional good deed: it’ll warn you that the file’s name contradicts the file’s type as implied by the file magic number.
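To make the first-byte rule concrete, here’s a rough sketch of how such a check might look in standard C++. It is not GNDStk’s actual implementation; the Detected enumeration and the detect() functions are hypothetical names of our own, and the sketch looks at the first byte only.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical names; GNDStk's own enumeration is FileType.
enum class Detected { xml, json, hdf5 };

// Guess a format from the first byte of a stream, without consuming it.
Detected detect(std::istream &is)
{
   const int first = is.peek();   // leaves the byte in the stream
   if (first == '<') return Detected::xml;
   if (first == 137) return Detected::hdf5;   // first byte of the HDF5 signature
   return Detected::json;   // anything else, per the rule described above
}

// Convenience wrapper for in-memory content.
Detected detect(const std::string &content)
{
   std::istringstream iss(content);
   return detect(iss);
}
```

Because peek() doesn’t remove anything from the stream, a reader for the detected format can then start from the very beginning of the input. That’s what makes the approach workable for a std::istream, which can’t always be rewound.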
3.2.3. Read & Write XML¶
Here’s a simple example in which we read our trusty example GNDS XML file, then write it back out to another XML file:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree pu239;
   pu239.read("n-094_Pu_239.xml");
   indent = 2; // <== not necessary; just for illustration
   pu239.write("pu239");
}
You probably already guessed, correctly, that if Tree has a read that reads, then it probably has a write that writes. We’ve endeavored to make GNDStk’s design as intuitive and consistent as reasonably possible.
We wrote earlier that GNDStk uses the file magic number, not the file name, to determine the file type when reading. When writing, there is no file magic number – the file to be written doesn’t exist, yet, or if it does, then its present contents are meaningless because the file is about to be replaced.
When you call write, therefore, GNDStk does use the file name to determine the format in which to write, except that you can provide that second argument again – a value from our FileType enumeration, or a string like "xml" or "json" – to specify the type you want directly. As it does for read, GNDStk makes some consistency checks. If you write, for instance,
pu239.write("pu239.xml", "json");
then GNDStk will write file pu239.xml in JSON format, as you asked for in the second argument, but will warn that the file extension is inconsistent with the format you asked for.
What if the file name extension isn’t given, or isn’t recognized, and a format isn’t forced with a second argument? That is, what if we wrote, for example, pu239.write("pu239")? In that case, write writes the Tree into a simple output format that we created largely for debugging purposes. You probably won’t have much use for this format, and we don’t provide the ability to read from it, but you’ll no doubt notice the problem quickly and be able to fix it.
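The decision logic just described (a forced format wins, otherwise the extension decides, otherwise fall back to the debugging format) can be summarized in a short sketch. The names Format, from_extension(), and choose() are hypothetical, not GNDStk’s API, and the case-insensitive matching of extensions is an assumption on our part.

```cpp
#include <cassert>
#include <cctype>
#include <iostream>
#include <string>

// Hypothetical names. "guess" means no format was forced; "debug" stands
// for the simple debugging format used when nothing else can be determined.
enum class Format { guess, debug, xml, json };

// Format implied by the file name's extension, or debug if none is recognized.
Format from_extension(const std::string &file)
{
   const std::string::size_type dot = file.rfind('.');
   if (dot == std::string::npos) return Format::debug;
   std::string ext = file.substr(dot + 1);
   for (char &c : ext)   // assumption: extensions are matched case-insensitively
      c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
   if (ext == "xml")  return Format::xml;
   if (ext == "json") return Format::json;
   return Format::debug;
}

// A forced format wins; a conflicting extension only triggers a warning.
Format choose(const std::string &file, const Format forced = Format::guess)
{
   const Format implied = from_extension(file);
   if (forced == Format::guess) return implied;
   if (implied != Format::debug && implied != forced)
      std::cerr << "warning: extension of \"" << file
                << "\" is inconsistent with the requested format\n";
   return forced;
}
```

With this sketch, choose("pu239.xml", Format::json) returns Format::json and prints a warning, which parallels the pu239.write("pu239.xml", "json") case discussed above.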
In the above code, what’s indent all about? We didn’t really need to clutter the example by including it, as it isn’t required at all, but we wanted to illustrate something minor but perhaps of interest. indent is one of a small handful of useful “global variables” (not truly global, but in namespace njoy::GNDStk::) that GNDStk provides to you for fun and profit. Fun, at least. For XML and JSON output files, as well as for a few other things throughout GNDStk’s vast array of features, indent tells how many spaces you’d like indentation to be. GNDStk’s default is 3 spaces, which this author happens to prefer. In the example, we’re saying (before the write, of course) that we’d like 2 spaces to be used. At present, behavior is undefined if you give a negative number, and of course the output will look ridiculous if you give a huge number. Most people prefer 2-5 spaces for indentation. In case you’re wondering, GNDStk has no facility for using tabs – an evil creation, quite arguably – for this purpose.
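If the idea of a “global” setting that lives in a namespace is unfamiliar, this tiny sketch shows the pattern. The demo:: namespace and the indented() helper are hypothetical; only the general idea, a namespace-scope variable consulted by output code, is taken from the discussion above. Unlike what’s described for GNDStk, the sketch chooses to treat a non-positive value as zero.

```cpp
#include <cassert>
#include <string>

namespace demo {
   // Namespace-scope setting, analogous in spirit to GNDStk's indent.
   int indent = 3;

   // Prefix text with indent*level spaces. Non-positive values are
   // treated as zero; that's this sketch's choice, nothing more.
   std::string indented(const std::string &text, const int level)
   {
      const int spaces = (indent > 0 && level > 0) ? indent * level : 0;
      return std::string(spaces, ' ') + text;
   }
}
```

Setting demo::indent = 2 before calling demo::indented() changes all subsequent output, in the same way that the example’s indent = 2 affects the write that follows it.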
Finally, we note that write can write to a std::ostream, not just a file, in much the same way that read can read from a std::istream, not just a file. (Always remember: ostream for writes, istream for reads.) Bear in mind, again, that with output a file magic number isn’t available, and if you use std::ostream, then a file name, from which we might guess the format, isn’t available either. So, you’ll specifically want to give the second argument – "xml", say, or "json" – if you write to a std::ostream.
3.2.4. More Reads & Writes¶
We hope that GNDStk’s basic facilities for reading and writing GNDS files are clear enough at this point, but we’ll provide a few more examples nonetheless. A simple XML to JSON conversion can be done like this:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree pu239;
   pu239.read("n-094_Pu_239.xml");
   pu239.write("n-094_Pu_239.json");
}
Here’s a more compact version of the same thing:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree("n-094_Pu_239.xml").write("n-094_Pu_239.json");
}
Just as we can write JSON, we can read it, too. If we’ve produced the output .json file as with the above example, we can read it thus:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree pu239("n-094_Pu_239.json");
}
Here, as you can see, we’ve returned to using a “read by constructor,” as in our original XML example, instead of employing a read call. It’s just more concise, in our opinion. Of course, you’ll use – and should – whichever variation you prefer.
3.2.5. Read, Write, Compare¶
We’ll wrap up our set of read/write examples with a code that reads our favorite GNDS XML file, writes it to a JSON file, independently reads the JSON back into another Tree object, and then also compares the new Tree to the original:
#include "GNDStk.hpp"
#include <cassert>
using namespace njoy::GNDStk::core;
int main()
{
   // Read from XML
   Tree FromXML;
   FromXML.read("n-094_Pu_239.xml");
   // Write to JSON
   FromXML.write("test.json");
   // Read back from JSON
   Tree FromJSON;
   FromJSON.read("test.json");
   // Compare
   assert(FromXML == FromJSON);
}
Several remarks are in order here.
The comparison operator for Tree compares the two GNDS trees in an order-agnostic manner. GNDS fundamentally provides data in two places: nodes (think XML “elements”) in its overall tree structure, and metadata (think XML “attributes”). The GNDS standard does not, however, consider ordering to be important. One tree node’s child nodes or metadata, anywhere or everywhere throughout the entire tree structure, could be reordered arbitrarily, but if each remains equivalent – in the same respect that we consider two mathematical sets to be equivalent – then the two GNDS trees are to be considered equivalent. So it is with our Tree comparison.
Interestingly enough, the above example’s FromXML and FromJSON objects will, in fact, have quite different orderings, across the board, of child nodes and metadata! Why is this the case? GNDStk makes use of an external library called pugixml for reading and writing XML files, and an external library nlohmann/json, on GitHub, for reading and writing JSON files:
It turns out that pugixml preserves the existing ordering of XML elements and attributes when it reads a file, while nlohmann/json lexicographically sorts the JSON name/value pairs by name. The latter library’s behavior could be considered unfortunate if we’d rather see ordering preserved; while the former library’s behavior could be considered unfortunate if we wish to fully respect GNDS’ “no-ordering” rule and discourage the creation of code that might inadvertently depend on data ordering. In any event, our operator== for Tree respects order-independent equivalence, as it should.
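To illustrate what “order-independent equivalence” means in practice, here’s a minimal sketch for one node’s metadata. It is not GNDStk’s operator==; the same_metadata() function is a hypothetical helper, and a full tree comparison would have to apply the same idea, recursively, to child nodes as well.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using metaPair = std::pair<std::string, std::string>;

// Compare two metadata lists as unordered collections: sort copies of
// both (the arguments are taken by value), then compare element-wise.
bool same_metadata(std::vector<metaPair> a, std::vector<metaPair> b)
{
   std::sort(a.begin(), a.end());
   std::sort(b.begin(), b.end());
   return a == b;
}
```

Two lists holding the same key/value pairs compare equal here no matter how a file reader happened to order them, which is the property we’d want given one library that preserves order and another that sorts.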
What’s assert about? If you’re not familiar with assert, it’s actually a longstanding and quite useful macro that C++ inherited from the C language. assert simply checks that its argument is true, and causes the code to exit immediately, with an error message that says something along the lines of “assertion failure …”, if it isn’t true.
This documentation uses assert throughout its examples. It’s great for that purpose: an expression like assert(foo) can be read, simply and concisely, as: “we’re asserting to you, the reader, that foo is true here.” In the above code, for instance, we’re telling you directly that FromXML and FromJSON equal each other when we’re at the assert line.
GNDStk’s test suite, by the way, makes use of the CATCH library, described online as a “test framework for unit-tests.” CATCH, on the one hand, has far more macros and other testing machinery than we get with just assert by itself. However, a philosophy we’re keeping in mind with GNDStk is that we don’t want to require potential users to learn more than they need to. A system like CATCH, while more powerful, can obscure, to the uninitiated user, where even a basic construct like a main() is located, or how different source files are fitted together to create a group of tests. Therefore, for your benefit throughout this manual, we’ll keep things simple. We’ll provide complete, working, compile-able codes except where it’s obvious that something is just a code fragment; and employ the simple but clear assert where we wish to draw attention to the fact that the asserted expression is true.
3.2.6. Regarding JSON Files¶
The specifications for Version 1.9 of the GNDS standard can be found here:
which, at the time of this writing, is the latest available standard. Section 2.4 of the downloadable PDF document discusses limitations of some “meta-languages” (roughly speaking, file formats) such as JSON, in comparison with what XML is able to represent.
Here are three points quoted directly from the document:
1. for meta-languages that do not support attributes, either group all attributes together under a child node called attributes or convert each attribute to a child node and add a suffix like _attr to the node name.
2. for meta-languages that do not support multiple child nodes with the same name, add a unique suffix to each node name. For example, if multiple reaction elements appear in the file, they become reaction0, reaction1, etc. To avoid parsing strings to determine the original node name, a nodeName attribute (or child node) should also be added indicating the original unmodified node name.
3. for meta-languages that do not preserve the order of child elements, an attribute or child node with the (0-based) index should be added to the node. For example, in HDF5 the attribute nodeIndex could be added to each child in a group.
For (1), GNDStk does the first suggested action: it groups all of a node’s attributes under a child node called attributes. We consider that to be cleaner than using an _attr suffix.
For (2), GNDStk does exactly as illustrated: multiple elements of the same name are suffixed with 0, 1, etc. And, then, a JSON name/value pair with the name nodeName, as suggested, is created in order to preserve the original unsuffixed element’s name.
For (3), GNDStk does nothing in particular right now. Our understanding of GNDS is that it’s designed so that elements – nodes – can appear in any order. Here’s a small XML fragment taken directly from the n-094_Pu_239.xml example GNDS file that we’ve been using for our examples:
<axes>
   <axis index="1" label="energy_in" unit="eV"/>
   <axis index="0" label="radius" unit="fm"/>
</axes>
Those axis child nodes already contain a 0-based index attribute, so perhaps the specification’s admonishment #3 is something we can consider to have been satisfied already by whoever has created an existing, valid GNDS file (so that no further treatment is required); or something that we must satisfy if we are to create our own valid GNDS files.
The GNDS document then puts forth the following example XML fragment – slightly reformatted here for clarity, and with a proper XML declaration node (the first line) added for completeness:
<?xml version="1.0" encoding="UTF-8"?>
<employees>
   <employee>
      <name first="Doc" last="Jones"/>
   </employee>
   <employee>
      <name first="Grumpy" last="Smith"/>
   </employee>
   <employee>
      <name first="Happy" last="Earp"/>
   </employee>
</employees>
A viable JSON equivalent is then suggested.
Calling the above XML file employees.xml, let’s bring forth our tried-and-true GNDStk methodology for converting from XML to JSON:
#include "GNDStk.hpp"
using namespace njoy::GNDStk::core;
int main()
{
   Tree("employees.xml").write("employees.json");
}
and see what happens. Here’s exactly the output JSON file that the above code, applied to the sample XML, creates:
{
   "employees": {
      "employee0": {
         "name": {
            "attributes": {
               "first": "Doc",
               "last": "Jones"
            }
         },
         "nodeName": "employee"
      },
      "employee1": {
         "name": {
            "attributes": {
               "first": "Grumpy",
               "last": "Smith"
            }
         },
         "nodeName": "employee"
      },
      "employee2": {
         "name": {
            "attributes": {
               "first": "Happy",
               "last": "Earp"
            }
         },
         "nodeName": "employee"
      }
   }
}
This illustrates how GNDStk creates JSON files, consistent with the suggestions in the GNDS specification.
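The suffixing visible in that output (employee0, employee1, employee2, but a plain name) can be mimicked with a few lines of standard C++. This is only a sketch of the naming rule as it appears in the output above, not GNDStk’s implementation, and json_keys() is a hypothetical helper of our own.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Given sibling node names in document order, produce JSON keys: a name
// that occurs once is kept as-is; repeated names get 0, 1, 2, ... appended.
std::vector<std::string> json_keys(const std::vector<std::string> &names)
{
   std::map<std::string, int> total, seen;
   for (const std::string &n : names)
      ++total[n];

   std::vector<std::string> keys;
   for (const std::string &n : names)
      keys.push_back(total[n] > 1 ? n + std::to_string(seen[n]++) : n);
   return keys;
}
```

For each suffixed key, a writer following the scheme above would also emit a nodeName entry that holds the original name, so that a reader can recover the name without parsing the suffix.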
If you try the above code, on the given input, you’ll see that GNDStk prints two warnings. The same warning twice, actually: once during input, and once during output. The warning tells us that <employees> – the top-level node of the above XML document – is not recognized as a valid GNDS top-level node. (Valid GNDS top-level nodes, per the standard, are reactionSuite, covarianceSuite, PoPs, thermalScattering, and fissionFragmentData.) It’s just a warning, not an error, so don’t worry about it for now.
Naturally, GNDStk reverses the modifications when we read from a JSON file into our internal format. Specifically: values in an attributes block are transformed into metadata in the enclosing node, and values from nodeName name/value pairs replace index-suffixed names.
At this time, GNDStk provides no other options, such as the _attr suffix that the GNDS specification suggested as a possibility, for handling JSON. Neither can it read JSONs that may have been created in a different manner. We’re not aware, at the time of this writing, of the existence of any official JSON-format GNDS files. If and when such files come into existence, and if such files use a different scheme than we do for addressing the issues described above, then we’ll provide capabilities at least for reading those files, and perhaps for writing them in that manner as well.
3.3. Data Structure “Direct”¶
In this section, we’ll talk about some of the basic internal constructs of some of GNDStk’s classes: more importantly, Tree and Node; less importantly, XML and JSON. We’ll describe member data – with an important message to users first – and also some of the member functions that you’re likely to find useful. For now, here, we won’t cover the plethora of member functions that support our “smart query system.” Those need their own dedicated, and detailed, discussion.
3.3.1. About Direct Access¶
Right away, we strongly suggest that most users avoid direct access of member data in these classes! An exception is the name string in Tree and Node, which you might well wish to access. It’s rather innocuous. Other member data, at this time, consists of containers for metadata and child nodes.
In most cases, we hope you’ll prefer to use GNDStk’s rich variety of powerful higher-level capabilities, in our “smart query system,” for pulling data from, or pushing data to, the metadata and child-node containers. (We’re talking right now about capabilities that are still in our core interface – not in our “high-level” interface that provides classes tailored to specific versions of the GNDS standard.) The basics of our “smart query system” are described in an upcoming section of this document.
Our query system was designed precisely so that you’ll have something much more concise and powerful than you will by directly accessing the containers in question. And, most likely, also safer to use, insofar as working directly with the internals of data structures, especially those that were designed by other people, invariably runs some risks. C++ containers aren’t rocket science (and, besides, some GNDStk users may well be rocket scientists), so we do in fact provide public access to these structures, in the interest of supporting users who are comfortable and capable with the C++ language.
Finally, we believe that if you understand the basic internal data format, then you may find the behavior of the higher-level capabilities, and our motivation for creating them, to be more clear.
3.3.2. Tree vs. Node¶
We’ve already seen Tree in some examples. It’s the class to use when you want to read or write an entire GNDS tree. Tree is derived from another important class: Node. At the time of this writing, Tree contains no additional data beyond what it gets from Node. It does, however, contain some additional member functions, and it makes some slight changes to some of the member functions that it otherwise gets from its Node base.
Here’s a short sketch of our arrangement:
class Node
{
   using metaPair = std::pair<std::string,std::string>;
   using childPtr = std::unique_ptr<Node>;
public:
   std::string name;
   std::vector<metaPair> metadata;
   std::vector<childPtr> children;
   // constructors, member functions, ...
};
class Tree : public Node
{
   // a few additional and/or different member functions
};
The GNDS standard is essentially a tree structure, and this is reflected in our classes, with Tree being intended for the top-level (root) node, and Node for all others.
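To get a feel for how the three data members fit together, here’s a simplified, self-contained stand-in for the sketch above. It is not GNDStk’s actual Node class, and add_child(), count_nodes(), and make_axes() are hypothetical helpers written only for this illustration. The data come from the axes fragment shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the Node sketch; not GNDStk's actual class.
struct Node {
   std::string name;
   std::vector<std::pair<std::string, std::string>> metadata;
   std::vector<std::unique_ptr<Node>> children;
};

// Append a new, empty child with the given name; return a reference to it.
Node &add_child(Node &parent, const std::string &name)
{
   parent.children.push_back(std::unique_ptr<Node>(new Node));
   parent.children.back()->name = name;
   return *parent.children.back();
}

// Count a node plus all of its descendants.
std::size_t count_nodes(const Node &node)
{
   std::size_t n = 1;
   for (const std::unique_ptr<Node> &child : node.children)
      n += count_nodes(*child);
   return n;
}

// Build the <axes> fragment: one parent node with two <axis> children.
Node make_axes()
{
   Node axes;
   axes.name = "axes";
   add_child(axes, "axis").metadata =
      { {"index", "1"}, {"label", "energy_in"}, {"unit", "eV"} };
   add_child(axes, "axis").metadata =
      { {"index", "0"}, {"label", "radius"}, {"unit", "fm"} };
   return axes;
}
```

Even in this toy version, pulling out one value means indexing into containers by position (for example, children[1]->metadata[1].second for the second axis’s label), which hints at why a more concise query mechanism is attractive.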
Some readers may realize, correctly, that a typical tree structure’s top-level node could be treated in exactly the same way as all of its other nodes. One doesn’t generally need different data types for a tree’s root node and its other nodes, including leaves. Roughly speaking, tree nodes all “look the same,” with similar contents as well as relationships to their child nodes.
That’s all true, and it could be described as the theoretical/mathematical view of tree structures. From a practical/engineering standpoint, some utility can sometimes be had in treating a top-level node differently from the others. That’s the reasoning for Tree versus Node.
As a derived class, Tree automatically inherits most of its functionality from Node, as we want it to. In a handful of respects, however, Tree will reflect the fact that it’s there to represent an entire GNDS hierarchy, not just a portion thereof. For example, it tries to ensure that the top-level GNDS node isn’t just any valid GNDS node, but one of the few that’s valid as a top-level GNDS node. (GNDStk, it turns out, emits a warning, but not an error, if you try to write a Tree that doesn’t have a top-level GNDS node with a valid name.)
One could also imagine extra functionality that a Tree, but not a Node, could be equipped with. In the typical case that a GNDS tree is read from a file, for example, we could have the Tree structure store the file name. Then, perhaps, we could equip Tree with a member function like overwrite() or rewrite() that would replace the original file (say, after a user has made changes that they wished to make to the GNDS data) without requiring that the file name be repeated. (Analogy: an image-editing GUI that provides, in its File menu, an item like Overwrite <original.jpg>, in addition to a Save As... and an Export.) GNDStk does not, at the time of this writing, provide this particular capability. By making Tree different from Node, however, we allow for the possibility of such things being added, painlessly, at a later time.
3.3.3. Content Preservation¶
An important initial design decision that we made for our Tree and Node classes is that they faithfully represent precisely the content from any GNDS file we may read into them. The fundamental motivation here is simple: data evaluators work hard to create good data, and we don’t want to take any actions that might, in any way, change or lose anything.
Consider, as a simple example, this small fragment of content from near the beginning of our favorite n-094_Pu_239.xml example GNDS file:
<mass>
   <double label="eval" value="1.00866491574" unit="amu"/>
</mass>
We could probably all agree that the label "eval"
and unit "amu"
should
be stored as strings. But what about the value "1.00866491574"
? We could
store it as a double
, if we’re presumptuous enough to assume that a user
intends to use it as a double
– not a float
, say, or
a long double
. We’d also be assuming, there, that a user doesn’t mind the
expense of presumptively “floating-point” content from GNDS files being
converted en masse from the original XML character strings to floating-points,
regardless of which GNDS content the user might actually access. On top of
that, we’d be glossing over the various complexities that can (and do) arise
when decimal representations of floating-point numbers are converted to internal
binary floating-points, and back again. (The “back again” part is especially
relevant if someone plans, say, to read a GNDS file, add new data and/or fix
old data in selected areas, and then write the entire GNDS file back out again.)
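To make the "and back again" concern concrete, here is a minimal sketch in standard C++ only. The function name roundTrip is ours, chosen for illustration, and nothing here is GNDStk code. It converts decimal text to a double and writes it back with default stream formatting.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Convert decimal text to a double, then back to text using the default
// output stream formatting (six significant digits).
std::string roundTrip(const std::string &text)
{
   const double value = std::stod(text);
   std::ostringstream oss;
   oss << value;
   return oss.str();
}

// Usage: none of these strings survives the round trip unchanged.
void demoRoundTrip()
{
   assert(roundTrip("3e4") == "30000");             // same value, new text
   assert(roundTrip("20000000.0") == "2e+07");      // same value, new text
   assert(roundTrip("1.00866491574") == "1.00866"); // digits are lost
}
```

A higher output precision would avoid the lost digits in the last case, but the written text would still not be guaranteed to match what the evaluator originally wrote.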
Rather than make such assumptions, we opt to preserve original content – that is, to respect precisely what exists in a GNDS file to begin with.
To this end, all individual data, regardless of what they may appear to be
(string, floating-point, integer, single character, etc.), are stored as
strings. More precisely, as C++ std::strings
. Node names ("mass"
) are
stored as strings. Metadata key/value pairs are stored as C++ std::pairs
of strings; think {"label","eval"}
. Even the content in GNDS values
nodes, like this one (the first in n-094_Pu_239.xml
):
<values>
2500 8.9172 2550 8.9155 2650 8.9139 ...
... 28500 8.4901 29500 8.4741 3e4 8.4659
</values>
is stored, in a Node
, as long strings. (We could reasonably split such
things into std::vector<std::string>s
, too, but decided to not even do that.
To perform such a split everywhere, automatically, would take time, and a user
might not even intend to access any specific portion of GNDS data.)
No worries, though: our core interface, and especially the smart query system that we’ve spoken of, has plenty of functionality for serving its internal strings to you as floating-points, for instance; or for re-forming long strings, like the ones just described, into vectors of strings, or vectors of floating-points, or vectors of just about anything you may wish to create. When we speak of content preservation, then, we’re saying that an input text file – XML or JSON, for now – is factored into its underlying tree structure, but with its individual meaningful parts (neglecting, as usual, whitespace) still stored as text, with no modifications.
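As a rough picture of what converting stored strings on demand involves, consider the following sketch. It uses only the C++ Standard Library; the helper names convertTo and splitTo are hypothetical and are not part of GNDStk's interface, whose query system is described elsewhere.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not GNDStk's API): convert one stored string to a T.
template<class T>
T convertTo(const std::string &text)
{
   std::istringstream iss(text);
   T value{};
   iss >> value;
   return value;
}

// Hypothetical helper: split whitespace-separated text into a vector<T>.
template<class T>
std::vector<T> splitTo(const std::string &text)
{
   std::istringstream iss(text);
   std::vector<T> result;
   for (T value; iss >> value; )
      result.push_back(value);
   return result;
}

// Usage: the same stored text, served as strings or as doubles.
void demoConversions()
{
   assert(convertTo<int>("2500") == 2500);

   const auto words = splitTo<std::string>("2500 8.9172 3e4 8.4659");
   assert(words.size() == 4 && words[2] == "3e4");

   const auto numbers = splitTo<double>("2500 8.9172 3e4 8.4659");
   assert(numbers.size() == 4 && numbers[0] == 2500.0 && numbers[2] == 30000.0);
}
```

In this arrangement the original text stays untouched, and a conversion happens only when, and in the form that, a caller asks for it.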
A given user’s application code will almost certainly have its own internal
classes that contain GNDS data, or data computed from GNDS data, in ways that
work well for the user’s application. Someone may also have classes specifically
intended to mirror the content in various GNDS nodes, just in a different way.
(GNDStk’s own “high-level interface” will provide precisely such classes.) Such
classes can certainly make assumptions we didn’t want GNDStk to make – like,
for example, that we do want double
for that numerical value above. Or, for
that matter, that perhaps the unit
, "amu"
above, should be an entry in
some C++ enumeration of allowable units – no longer a string at all. We’re
happy to report that our core interface, and in particular our smart query
system, is designed to help you interact well, and easily, with GNDStk’s
internal string storage.
We’ll write more about the above considerations elsewhere. For now, let’s return to the main point of this chapter, and describe GNDStk’s two major classes that store GNDS data.
3.3.4. Node
We’ll write first about Node
(for general nodes), because Tree
(for the
root node only) derives from Node
. Recall that the member data in Node
looks like this:
class Node
{
using metaPair = std::pair<std::string,std::string>;
using childPtr = std::unique_ptr<Node>;
public:
std::string name;
std::vector<metaPair> metadata;
std::vector<childPtr> children;
// constructors, member functions, ...
};
In short, inlining the metaPair
and childPtr
types and omitting the
std::
prefix for brevity:
// Node's data members
string name;
vector< pair<string,string> > metadata;
vector< unique_ptr<Node> > children;
The above evinces a simple tree structure that’s entirely sufficient for representing the contents of any GNDS node.
Let’s provide a short but concrete example. Here’s some XML content from near
the top of the n-094_Pu_239.xml
GNDS file:
<evaluated label="eval" date="2017-12-01" library="ENDF/B" version="8.0.5">
<temperature value="0.0" unit="K"/>
<projectileEnergyDomain min="1e-05" max="20000000.0" unit="eV"/>
</evaluated>
Here, an outer evaluated
node (XML “element”) contains four metadata
key/value pairs (XML “attributes”) and two child elements. The first child
element, temperature
, contains two metadata pairs but no further child
nodes. The second child element, projectileEnergyDomain
, contains three
metadata pairs but no further child nodes.
At the risk of continuing a narrative of statements that are no doubt obvious,
here’s precisely how the above evaluated
node is represented in a Node
:
name: "evaluated"
metadata[0]: {"label", "eval"}
metadata[1]: {"date", "2017-12-01"}
metadata[2]: {"library", "ENDF/B"}
metadata[3]: {"version", "8.0.5"}
children[0]: pointer to another Node, with:
   name: "temperature"
   metadata[0]: {"value", "0.0"}
   metadata[1]: {"unit", "K"}
children[1]: pointer to another Node, with:
   name: "projectileEnergyDomain"
   metadata[0]: {"min", "1e-05"}
   metadata[1]: {"max", "20000000.0"}
   metadata[2]: {"unit", "eV"}
Here, {"foo", "bar"}
is a C++ std::pair<std::string,std::string>
,
and is thus accessible in the customary manner: .first
for the "foo"
and .second
for the "bar"
.
We use C++ std::unique_ptr<Node>s
for the pointers to child nodes.
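The representation above can be built by hand, as in the following sketch. The Node here is a simplified stand-in that has only the three data members shown earlier; it is not GNDStk's actual class, and the function names are ours.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for Node: just the three data members.
struct Node {
   using metaPair = std::pair<std::string,std::string>;
   using childPtr = std::unique_ptr<Node>;
   std::string name;
   std::vector<metaPair> metadata;
   std::vector<childPtr> children;
};

// Build the "evaluated" node from the XML fragment, one piece at a time.
Node makeEvaluated()
{
   Node evaluated;
   evaluated.name = "evaluated";
   evaluated.metadata.emplace_back("label", "eval");
   evaluated.metadata.emplace_back("date", "2017-12-01");
   evaluated.metadata.emplace_back("library", "ENDF/B");
   evaluated.metadata.emplace_back("version", "8.0.5");

   auto temperature = std::make_unique<Node>();
   temperature->name = "temperature";
   temperature->metadata.emplace_back("value", "0.0");
   temperature->metadata.emplace_back("unit", "K");
   evaluated.children.push_back(std::move(temperature));

   auto domain = std::make_unique<Node>();
   domain->name = "projectileEnergyDomain";
   domain->metadata.emplace_back("min", "1e-05");
   domain->metadata.emplace_back("max", "20000000.0");
   domain->metadata.emplace_back("unit", "eV");
   evaluated.children.push_back(std::move(domain));

   return evaluated;
}

// Usage: direct access through .first, .second, and the child pointers.
void demoEvaluated()
{
   const Node evaluated = makeEvaluated();
   assert(evaluated.metadata[1].first == "date");
   assert(evaluated.metadata[1].second == "2017-12-01");
   assert(evaluated.children[1]->name == "projectileEnergyDomain");
   assert(evaluated.children[1]->metadata[0].second == "1e-05");
   assert(evaluated.children[0]->children.empty());
}
```

Note that every value, including "0.0" and "1e-05", is held as text, exactly as it appears in the file.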
3.3.5. Pointers about Pointers
A couple of early users asked us about the motivation for using pointers, so we’ll briefly address, here, the concerns that they raised, in case other users wonder the same things.
One person wondered why children
is a vector of pointers – not a vector
of Nodes, which would appear at least to be simpler. Of course, a Node
can’t directly contain another Node
– C++ wouldn’t allow it – but
could indeed contain a vector
of Nodes
. (C++ vectors
themselves
involve pointers, so pointers would still be involved; they just wouldn’t be explicit.)
Without delving into a discussion that’s well beyond the scope of this
document, we’ll say only that implementing a Node’s children
as a vector
of Nodes
would likely wreak havoc on efficiency, both in space (memory)
and in time, when objects like Tree
and Node
are being read from a
file or otherwise created or modified. Considerable memory fragmentation
could also come about.
Another user wanted to write code that copied some of children's
pointers.
The attempt to do so was stymied due to std::unique_ptr's
intentional
lack of a copy constructor, as unique_ptr
is designed to be the exclusive
“owner” of the object to which it points. GNDStk uses unique_ptr
quite
intentionally, precisely to deal with the ownership issue cleanly and clearly
while also benefitting from unique_ptr's
automatic handling of an object’s
memory footprint.
A code shouldn’t attempt to take any actions that would break the ownership
rules unique_ptr
manifests, and a C++ compiler will say so loudly if one
tries. Anyone who really wishes to make their own pointer – say, a raw
pointer – to an object to which one of our unique_ptrs
already refers,
can always dereference the unique_ptr
(giving a reference to a const
or non-const
Node, and effectively losing the unique_ptr
aspect), then
take the address to get a pointer again: basically &(*uptr)
, where uptr
is a unique_ptr
in one of our children
vectors. (Do not, of course,
delete
the Node through such a pointer; leave its management to the original
unique_ptr
!) We recommend that anyone who does this, or anything similar,
be sufficiently familiar with the C++ language, as well as justifiably confident
that there isn’t a better way to accomplish the goal at hand.
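A short sketch of the non-owning pointer idea follows. The Node is again a simplified stand-in with only the three data members, and findChild is a hypothetical helper written for this illustration, not a GNDStk function. For a non-null unique_ptr, &(*uptr) and the standard member function uptr.get() yield the same raw pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for Node: just the three data members.
struct Node {
   std::string name;
   std::vector<std::pair<std::string,std::string>> metadata;
   std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical helper: return a non-owning pointer to the first child with
// the given name, or nullptr if there is none. The unique_ptr in children
// remains the owner; the caller must never delete the returned pointer.
const Node *findChild(const Node &parent, const std::string &name)
{
   for (const auto &uptr : parent.children)
      if (uptr && uptr->name == name)
         return &(*uptr); // same address as uptr.get()
   return nullptr;
}

// Usage: observe a child without copying or taking over its unique_ptr.
void demoFindChild()
{
   Node parent;
   parent.name = "evaluated";
   parent.children.push_back(std::make_unique<Node>());
   parent.children.back()->name = "temperature";

   const Node *found = findChild(parent, "temperature");
   assert(found == parent.children[0].get());
   assert(findChild(parent, "mass") == nullptr);
}
```

Such a pointer is valid only while the owning Node and its children vector entry remain alive and unchanged.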
3.3.6. Tree
Tree
derives from Node
, so what we’ve already spoken about, in terms
of member data, still applies. Some additional points are in order, however,
owing to Tree's
status as the root node in our internal representation
of a GNDS hierarchy.
3.3.7. Direct-Access Examples
3.3.8. XML and JSON
3.4. Smart Query System
3.5. GNDS Creation
node ctors tree ctors add()s
3.6. Advanced Examples
Largely continue query system discussion. Not sure about “advanced examples” characterization.