NOMAD URIs

An important aspect of a language used for Big Data is the ability to identify data.

Because sections are hierarchical, each section and each value in a context can be identified by a path.

Each section object can be identified by an index (count) relative to its parent. For example, the third section_single_configuration_calculation in the second section_run has local index 2, also written 2l (l = local; indices start at 0). Alternatively, an index (count) relative to the containing context can be used: the same section object has index 8c (c = context) if the first section_run contains 6 section_single_configuration_calculation sections, since 6 + 2 = 8.

Thus the following paths

section_run/1l/section_single_configuration_calculation/2l
section_run/1/section_single_configuration_calculation/2
section_run/1c/section_single_configuration_calculation/8c

are different ways to refer to the same section object. One can also refer to a single value such as energy_total by appending /energy_total/0l or /energy_total/8c to the previous paths; paths that identify related information can be constructed in the same way.
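The relation between local and context indices can be sketched in a few lines of Python. This is only an illustration: the function names and the list of per-parent child counts are hypothetical helpers, not part of any NOMAD library, and the count of 3 for the second section_run is an assumed value (the text only requires that it has at least three).

```python
from typing import List, Tuple


def to_context_index(children_per_parent: List[int], parent: int, local: int) -> int:
    """Convert a local index (e.g. 2l under parent 1) into a context index (e.g. 8c).

    children_per_parent[i] is the number of child sections under parent i.
    """
    if not 0 <= parent < len(children_per_parent):
        raise IndexError("parent index out of range")
    if not 0 <= local < children_per_parent[parent]:
        raise IndexError("local index out of range")
    # All children of earlier parents come first in context order.
    return sum(children_per_parent[:parent]) + local


def to_local_index(children_per_parent: List[int], context: int) -> Tuple[int, int]:
    """Convert a context index back into (parent index, local index)."""
    if context < 0:
        raise IndexError("context index out of range")
    for parent, count in enumerate(children_per_parent):
        if context < count:
            return parent, context
        context -= count
    raise IndexError("context index out of range")


# First section_run has 6 calculations (from the text); the 3 is assumed.
counts = [6, 3]
ctx = to_context_index(counts, parent=1, local=2)
back = to_local_index(counts, ctx)
```

Here ctx is the context index of the section written as 1l/2l, and back recovers the (parent, local) pair from it.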

With a unique identifier for the context (NOMAD Gid) and a path to identify any sub-element, we can refer to any piece of data in a fully context-free way that does not depend on where or how it is stored. This is a very important property: no matter how the data is stored (in memory, in a database, or in any file format), we have a clear way to identify it and its related data.

A NOMAD URI is a URI (RFC 3986) that starts with the prefix nmd:// and consists of one or more context identifiers (nested contexts are listed in full), optionally followed by a path. The exact meaning of the path depends on the type of the context. For contexts that refer to parsed and normalized data (identifiers starting with either S or N), in which data is organized with the NOMAD meta info, URIs use the paths and identifiers described previously:

nmd://NWGq3on-jgxo8pw5cr-nEclMNqYW2/C-Gl21uIOi_OQQQo8NSKu3FRHSH5m/section_run/0c/section_single_configuration_calculation/8c
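As a rough sketch of how such a URI could be taken apart, the Python snippet below splits it into context identifiers and path steps. It is not an official parser: the names parse_nomad_uri and Step are hypothetical, and because the text does not specify how to tell context identifiers from path segments, the caller is assumed to supply the number of context identifiers. A bare index is treated as local, matching the equivalence of 1 and 1l in the paths shown earlier.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

PREFIX = "nmd://"
# An index segment: digits with an optional l (local) or c (context) suffix.
_INDEX_RE = re.compile(r"([0-9]+)([lc]?)")


class Step(NamedTuple):
    name: str             # section or value name, e.g. "section_run"
    index: Optional[int]  # None if no index segment follows the name
    kind: Optional[str]   # "l" (local) or "c" (context), None if no index


def parse_nomad_uri(uri: str, n_contexts: int) -> Tuple[List[str], List[Step]]:
    """Split a NOMAD URI into context identifiers and path steps.

    n_contexts is an assumption of this sketch: the caller states how many
    leading segments are context identifiers.
    """
    if not uri.startswith(PREFIX):
        raise ValueError("not a NOMAD URI: missing nmd:// prefix")
    parts = uri[len(PREFIX):].split("/")
    if n_contexts < 1 or len(parts) < n_contexts:
        raise ValueError("URI has fewer segments than context identifiers")
    contexts, rest = parts[:n_contexts], parts[n_contexts:]

    steps: List[Step] = []
    i = 0
    while i < len(rest):
        name, index, kind = rest[i], None, None
        # Consume the following segment if it looks like an index.
        if i + 1 < len(rest):
            match = _INDEX_RE.fullmatch(rest[i + 1])
            if match:
                index = int(match.group(1))
                kind = match.group(2) or "l"  # bare index means local
                i += 1
        steps.append(Step(name, index, kind))
        i += 1
    return contexts, steps


uri = ("nmd://NWGq3on-jgxo8pw5cr-nEclMNqYW2/C-Gl21uIOi_OQQQo8NSKu3FRHSH5m"
       "/section_run/0c/section_single_configuration_calculation/8c")
contexts, steps = parse_nomad_uri(uri, n_contexts=2)
```

For the example URI, contexts holds the two identifiers and steps holds section_run with index 0c followed by section_single_configuration_calculation with index 8c.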

We are evaluating a formal registration of the URI scheme as defined in RFC 4395.
