An important aspect of a language used for Big Data is the ability to identify data.
Because sections are hierarchical, each section and each value in a context can be identified through a path.
Each section object can be identified by an index (count) with respect to its parent. For example, the third section_single_configuration_calculation in the second section_run has local index 2, written 2 or 2l (l = local; indices start at 0). Alternatively, an index (count) with respect to the containing context can be used: the same section object has index 8c (c = context) if the first section_run contains 6 section_single_configuration_calculation sections, since 6 + 2 = 8.
Thus the following paths
section_run/1l/section_single_configuration_calculation/2l
section_run/1/section_single_configuration_calculation/2
section_run/1c/section_single_configuration_calculation/8c
are different ways to refer to the same section object.
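The relation between local and context indices can be sketched in a few lines of Python. This is only an illustration, not part of the NOMAD tooling: the function names and the child_counts list (the number of child sections under each parent, in context order) are assumptions made here, and an index without a suffix is treated as local, as the second path above suggests.

```python
import re

# An index token is a non-negative integer with an optional suffix:
# "l" (local, relative to the parent) or "c" (relative to the context).
_INDEX_RE = re.compile(r"^(\d+)([lc]?)$")


def parse_index(token):
    """Split an index token such as '2l', '2' or '8c' into (number, kind).

    Assumption: a token without a suffix is a local index.
    """
    match = _INDEX_RE.match(token)
    if match is None:
        raise ValueError(f"not a valid index token: {token!r}")
    return int(match.group(1)), (match.group(2) or "l")


def to_context_index(child_counts, parent_index, local_index):
    """Convert a local index into a context index.

    child_counts[i] is the number of child sections under the i-th parent
    (hypothetical input; a real implementation would read it from the data).
    """
    if not 0 <= local_index < child_counts[parent_index]:
        raise IndexError("local index out of range for this parent")
    # All children of earlier parents come first in the context-wide count.
    return sum(child_counts[:parent_index]) + local_index


# The example from the text: the first section_run has 6 calculations,
# the second (index 1) has at least 3; the third one has local index 2.
number, kind = parse_index("2l")
print(to_context_index([6, 3], parent_index=1, local_index=number))  # 8
```

The context index is simply the local index plus the number of sibling sections stored under all earlier parents, which is why 2l under the second section_run becomes 8c.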
One can also refer to a single value, such as energy_total, by appending /energy_total/8c to the previous paths. In the same way, one can construct paths that identify related information.
With a unique identifier for the context (NOMAD Gid) and a path to identify any sub-element, we can refer to any piece of data in a fully context-free way that does not depend on where and how it is stored. This is a very important property: no matter whether the data is stored in memory, in a database, or in any file format, we have a clear way to identify it and its related data.
A NOMAD URI is a URI as defined in RFC 3986 that starts with the prefix nmd:// and consists of one or more context identifiers (nested contexts are listed in full), optionally followed by a path.
The exact meaning of the path depends on the type of the context. For contexts that refer to parsed or normalized data (identifiers starting with S or N, respectively, or so the ordering suggests), in which data is organized with the NOMAD meta info, URIs use the paths and identifiers described previously.
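As a rough sketch of how such a URI could be assembled, the following Python function joins context identifiers and an optional path behind the nmd:// prefix. Several things here are assumptions rather than part of the specification: the function name, the use of "/" to separate nested context identifiers, and the placeholder identifier NexampleGid, which is not a real NOMAD Gid.

```python
NMD_PREFIX = "nmd://"


def build_nomad_uri(context_ids, path=""):
    """Assemble a NOMAD-style URI from context identifiers and a path.

    Assumption: nested context identifiers are separated by "/",
    and the optional path follows after another "/".
    """
    if not context_ids:
        raise ValueError("at least one context identifier is required")
    uri = NMD_PREFIX + "/".join(context_ids)
    if path:
        # Avoid doubled or trailing slashes when the path is appended.
        uri += "/" + path.strip("/")
    return uri


# "NexampleGid" is a made-up placeholder for a normalized-data context.
print(build_nomad_uri(
    ["NexampleGid"],
    "section_run/1c/section_single_configuration_calculation/8c",
))
```

Because the path uses context indices, the resulting URI identifies the section object without reference to how or where the data is stored.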
We are evaluating formal registration of the nmd scheme as defined in RFC 4395.