NASA Ames Format for Data Exchange
Quick link: FFI Summary (3 tables summarising the nine NASA Ames Format
styles and displaying all possible file headers).
Note: The BADC recommends NASA-Ames format version 1.3.
The NASA Ames Format for Data Exchange, often referred to as NASA Ames Format,
grew out of NASA aircraft campaigns and was first formalised at the Ames
Research Center, California, during the 1987 Stratosphere Troposphere Exchange
Project (STEP), when uniform rules for recording data were needed to facilitate
data exchange between the participants and to allow shared use of a minimal
amount of software to analyse and display different datasets.
The adopted data format had to meet the following requirements:
- it had to be portable (readable on any machine by any programming
language);
- it had to be self-describing (that is, the data had to include an
attachment containing all the information needed to read, understand and
interpret them, thus ensuring the reader's autonomy);
- it had to be readable by humans (to retain the benefit of its
self-description!).
The first and third requirements implied the adoption of a text format (namely
ASCII).
The second condition was met by including in each data file a header
containing the descriptive information (metadata).
Very well suited to field campaigns involving several teams that need to share
their observations, the NASA Ames Format is not well adapted to very voluminous
datasets.
In this case, although less portable, a binary format is recommended.
Any set of functions of 1 to 4 variables can be recorded using the NASA Ames
format, which makes it particularly suitable for atmospheric datasets, whether
modelled or observed.
Some NASA Ames file format indices (see below) are better adapted to airborne
platforms (balloons, aircraft).
The number of values taken by one (and only one) of the independent variables
is not defined a priori in the data file so that the data provider
does not need to know in advance how many values it takes.
This single independent variable is called unbounded, although it is
bounded in a mathematical sense.
All other independent variables, if they exist, are bounded, meaning that
the number of values they take is explicitly defined in the data file
and implying that the data provider has determined these numbers before
formatting the data.
The NASA Ames Format nomenclature distinguishes between:
- up to four independent variables (usually but not necessarily time and/or
space), which are real numbers in most cases, one FFI allowing the use of one
alphanumeric independent variable (i.e. a piece of text);
- the primary dependent variables (functions of the former), which are real
numbers;
- the auxiliary dependent variables (depending solely on the unbounded
independent variable), which are real numbers or character strings.
The fact that dependent variables are functions of the independent variables
implies that only one dependent variable value is associated with any given
set of values of the independent variables.
For real numbers, this means that the independent variable values can be
ordered in strictly increasing or decreasing order (the variable is
monotonic), which is indeed a requirement of the NASA Ames format.
There is always a way to order a finite number of objects, so that this is a
condition that can always be met (if necessary, by choosing an appropriate
independent variable such as a subset of the integers).
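The monotonicity requirement described above can be checked before a file is written. The following is a minimal sketch in Python, not part of the NASA Ames specification; the function name is_strictly_monotonic is a hypothetical helper introduced for illustration.

```python
from typing import Sequence


def is_strictly_monotonic(values: Sequence[float]) -> bool:
    """Return True if values are strictly increasing or strictly decreasing.

    Hypothetical helper: sequences with fewer than two values are
    treated as monotonic, since there is nothing to compare.
    """
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing
```

A repeated value (for example two records with the same time stamp) makes the check fail, which is the case where a different independent variable, such as a record index, would have to be chosen.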
Except in a few instances, auxiliary variables are optional.
The underlying philosophy is that data are stored in a file-based system, a
dataset being formed out of a series of numbered files.
Typically, files belonging to the same dataset share some common feature such
as the people who issued the data, the experiment, the platform, etc.
This, however, is not a requirement and it is up to the data provider to
organise the data files into datasets (possibilities ranging from one single
dataset including all files to one file per dataset).
The number of files in the dataset and the number of the file within the
dataset are two elements of information that appear in the file header.
Whilst the first definitions of NASA Ames Format included rules regarding file
names, these have been dropped from the most recent versions, which now allow
any naming convention or no convention at all.
File names and their extensions may of course include elements of information
on the data (e.g. site name, date, etc.) or provide a way to sort the files.
File name rules have been set up for specific NERC thematic programmes
(Polluted Troposphere, UFAM, ACSOE, SOAPEX, UTLS-Ozone, URGENT).
The NASA Ames Format is actually a set of nine formats that comply with an
overall common structure but provide different features adapted
to various cases (depending on the number of independent variables, whether
their values are regularly spaced out, etc).
Each of the nine formats is uniquely identified by a File Format Index (FFI),
which is a 4-digit number.
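As an illustration, the nine FFI values can be held in a small lookup. This is a sketch in Python under stated assumptions: the list of values is taken from the version 1.3 specification rather than from this page, the names VALID_FFIS and n_independent_variables are hypothetical, and the rule that the leading digit gives the number of independent variables is a reading of that list, not an official API.

```python
# The nine File Format Index values, as listed in the version 1.3
# specification (assumption: not enumerated on this page).
VALID_FFIS = frozenset({1001, 1010, 1020, 2010, 2110, 2160, 2310, 3010, 4010})


def n_independent_variables(ffi: int) -> int:
    """Return the number of independent variables implied by an FFI.

    Assumes the leading digit of the FFI is the number of independent
    variables (1 to 4), which holds for all nine values above.
    """
    if ffi not in VALID_FFIS:
        raise ValueError(f"{ffi} is not one of the nine NASA Ames FFIs")
    return ffi // 1000
```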
Each file is made of two parts.
At the top of the file, the file header includes information on the data
(metadata).
The actual data are recorded in the lines that follow the header.
In many cases, some of the independent variables are defined in the header and
are not repeated in the data section (e.g. for a regular grid).
An accurate description of each format is provided in the BADC NASA Ames FFI
Summary (see Formatting your data
below).
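The two-part layout can be sketched in Python as follows. The sketch assumes, following the version 1.3 specification rather than this page, that the first value on the first line of the file is the number of header lines (called NLHEAD in the specification). The function name split_header_and_data is hypothetical.

```python
from typing import List, Tuple


def split_header_and_data(text: str) -> Tuple[List[str], List[str]]:
    """Split the text of a NASA Ames file into header and data lines.

    Assumption: the first whitespace-separated token of line 1 is the
    number of header lines, counting line 1 itself.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty file")
    nlhead = int(lines[0].split()[0])
    if nlhead > len(lines):
        raise ValueError("file is shorter than its declared header")
    return lines[:nlhead], lines[nlhead:]
```

Because the header length is declared up front, a reader can skip straight to the data section without interpreting the rest of the header.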
Here is a brief description of the contents of the two file sections.
Header or Metadata section
The header includes, in a defined order and format, all the information
needed to read and understand the data, namely:
- the number of the file within the dataset;
- the number of lines in the header;
- the FFI (which unambiguously defines the structure of the data
section) and additional required information on the data format;
- the number, nature and units of all types of variables
(independent, primary and, when used, auxiliary), ordered as in the
data section;
- information on the source of the data (name and affiliation of data
providers, experiment, instrumentation, model used, etc.);
- information on the data, data processing and data quality
(location, date, revision date, etc.).
Comment fields are provided at the end of the header for any type of
additional information that would not fit in the predefined formatted
lines.
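To make the "defined order" concrete, here is a minimal Python sketch that reads the first seven header lines, which are common to all FFIs. The line order and the mnemonics (NLHEAD, FFI, ONAME, ORG, SNAME, MNAME, IVOL, NVOL, DATE, RDATE) are taken from the version 1.3 specification, not from this page; the class and function names are hypothetical, values are assumed to be separated by whitespace, and the sample text is invented for illustration.

```python
import datetime
from dataclasses import dataclass


@dataclass
class CommonHeader:
    nlhead: int                    # number of lines in the header
    ffi: int                       # File Format Index
    oname: str                     # name(s) of the data originator(s)
    org: str                       # originator's organisation
    sname: str                     # source of the data (instrument, model, ...)
    mname: str                     # mission or experiment name
    ivol: int                      # number of this file within the dataset
    nvol: int                      # total number of files in the dataset
    data_date: datetime.date       # date at which the data begin
    revision_date: datetime.date   # date of last revision


def parse_common_header(text: str) -> CommonHeader:
    """Parse header lines 1 to 7 (assumed layout, see lead-in)."""
    lines = text.splitlines()
    if len(lines) < 7:
        raise ValueError("expected at least 7 header lines")
    nlhead, ffi = (int(tok) for tok in lines[0].split())
    ivol, nvol = (int(tok) for tok in lines[5].split())
    y, m, d, ry, rm, rd = (int(tok) for tok in lines[6].split())
    return CommonHeader(
        nlhead=nlhead,
        ffi=ffi,
        oname=lines[1].strip(),
        org=lines[2].strip(),
        sname=lines[3].strip(),
        mname=lines[4].strip(),
        ivol=ivol,
        nvol=nvol,
        data_date=datetime.date(y, m, d),
        revision_date=datetime.date(ry, rm, rd),
    )


# Invented sample values, for illustration only.
SAMPLE = "\n".join([
    "25 1001",
    "Doe, Jane",
    "Example University",
    "Example ozone sonde",
    "Example campaign",
    "1 1",
    "1998 03 17 1998 04 02",
])
header = parse_common_header(SAMPLE)
```

The remaining header lines (variable descriptions, scale factors, missing values, comments) depend on the FFI and are not handled by this sketch.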
Data section
The data proper are subdivided into a hierarchy of two-dimensional blocks,
the last independent variable (which is always the unbounded variable; see
"NASA Ames Format: for which type of data?" above) being the most slowly
varying one.
Note that error margins can be supplied as either primary or auxiliary
variables, if needed.
Each file format index is illustrated by one or several examples.
Format Specification for Data Exchange, Version
1.3 (Gaines and Hipskind, 1998) is the primary reference for NASA Ames
formatting.
As far as possible, the BADC documentation keeps the same nomenclature and
notation system as in this original document.
The NASA Ames Format checker is an interactive facility provided by the BADC
that allows you to check your NASA Ames formatted files online.
It is based on a program written by S. Gaines, NASA Ames Research Center.
For programmes currently submitting NASA Ames formatted files, the BADC provides
a web-based file uploader.
In the process, files are checked for compliance with the NASA Ames standard.