MB-System Unix Manual Page

mbdatalist

Section: MB-System 5.0 (1)
Updated: 18 April 2017
Index
 

NAME

mbdatalist - parses recursive datalist structures, performing one or more tasks on the swath files referenced in the datalist(s). The default action is to print out out the complete list of data files, formats, and file weights; other possible tasks include creating ancillary (fbt, fnv, and inf) files as part of setting up an MB-System processing environment, removing orphaned lock files, and creating convenience datalists that referenced processed swath files.

 

VERSION

Version 5.0

 

SYNOPSIS

mbdatalist
[
--verbose {-V}
--help {-H}
--copy {-C}
--report {-D}
--format=FORMATID {-FFORMATID}
--input=FILE {-IFILE}
--make-ancilliary {-N}
--update-ancilliary {-O}
--processed {-P}
--problem {-Q}
--bounds=W/E/S/N {-RW/E/S/N}
--status {-S}
--raw {-U}
--unlock {-Y}
--datalistp {-Z}
]

 

DESCRIPTION

MBdatalist is a utility for parsing datalist files. Datalist files, or lists of swath data files and their format ids, are used by a number of MB-System programs. These lists may contain references to other datalists, making them recursive. See the MB-System manual page for details on the format and structure of datalists. The program mbdatalist outputs each swath data filename, format id, and file weight encountered as it descends through the input datalist tree. If a swath data file rather than a datalist is provided as input, the same swath data filename and format will be the sole output.

This program can be used in shellscripts to read datalists in the same fashion as MB-System programs like mbgrid and mbprocess. This program can also be used to check and debug complex recursive datalist structures.

The program mbprocess operates on "raw" swath data files, producing a "processed" swath data file (see the mbprocess man page for explanation). The MB-System algorithm for reading datalists will, if a flag is set, replace a swath file name with the associated "processed" file name when that "processed" file exists. This flag may be set by embedding "$PROCESSED" as a line in a datalist or it may be set first by the calling program. The flag may also be set to preclude reporting "processed" file names (embedding "$RAW" in a datalist accomplishes this). When setting this flag within datalists, the first encounter of a $PROCESSED or $RAW tag will prevail over later instances of either tag. The --processed and --raw options force mbdatalist to output processed file names when they exist (--processed) or to only output unprocessed (raw) file names (--raw).

Programs such as mbgrid try to check statistics or "inf" files to see if the corresponding data files include data within the specified geographic bounds. Other programs look for "fast bathymetry" or "fast navigation" ("fbt" or "fnv") files in order to read the data more quickly. The --make-ancilliary option causes mbdatalist to create these three types of ancillary files for each swath data file. The --update-ancilliary option causes mbdatalist to create the "inf", "fbt", and "fnv" files only when they don't already exist or are out of date (older than the data file).

Datalists may also contain a third value, called the grid weight, which is used by mbgrid to priortize data. The larger the grid weight, the more importance mbgrid attaches to the related bathymetry data. Grid weights can be applied to datalist entries which are themselves datalist files, causing these weights to be associated with all of files referenced therein. However, the default behavior is for any grid weight in a particular datalist entry to override values derived from higher levels in the recursive structure. This behavior can be reversed if a $NOLOCALWEIGHT tag is placed in the datalist, or in a datalist higher up in the structure. See the MB-System manual page for a more complete description.

The --boundsW/E/S/N option causes the program to check each data file with an "inf" file for overlap with the desired bounds, and only report those files with data in the desired area (or no "inf" file to check). This behavior mimics that of mbgrid, allowing users to check what data files will contribute to gridding some particular area.

The --problem option causes the program to check each data file for the existence of any ancillary files (e.g. navigation files, edit save files, etc.) referenced in its mbprocess parameter file (if the parameter file exists). The program will list any problem found with the processing parameters, and will also list any data problem noted in the "inf" files. The possible data problems include:
        No survey data found
        Zero longitude or latitude in survey data
        Instantaneous speed exceeds 25 km/hr
        Average speed exceeds 25 km/hr
        Sounding depth exceeds 11000 m
        Unsupported Simrad datagram

The --datalistp option causes the program to generate a datalist file named "datalistp.mb-1" and then exit. This datalist has the following form:

        $PROCESSED

        datalist.mb-1 -1

This file is a commonly used convenience because it allows users to easily reference the swath files listed (directly or recursively) through the datalist "datalist.mb-1" with the $PROCESSED flag on. So, in order to grid the processed bathymetry rather than the raw bathymetry, run mbgrid with "datalistp.mb-1" as the input rather than "datalist.mb-1".

The --status option causes mbdatalist to report the status of the files it lists, including whether the file is up to date or needs reprocessing, and if the file is locked. MBprocess sets locks while operating on a swath file to prevent other instances of mbprocess from simultaneously operating on that same file. This allows one to run mbprocess multiple times simultaneously on a single datalist, either on a single multiprocessor machine or on multiple computers mounting the same filesystem. The consists of creating a small text file named by appending ".lck" to the swath filename; while this file exists other programs will not modify the locked file. The locking program deletes the lock file when it is done. Orphaned lock files may be left if mbprocess crashes or is interrupted. These will prevent reprocessing by mbprocess, but can be both detected with the --status option and removed using the --unlock option.

The --report option causes mbdatalist to list the datalist files rather than the swath files referenced through the datalists. Each datalist file path is preceded by its recursion level within the overall datalist structure.

Finally, this program can be used to copy the swath files referenced in a datalist structure to a single directory and to create a datalist there (names "datalist.mb-1") that references those swath files. This is accomplished using the --copy option. The --copy copy function will not be done if the --make-ancilliary, --update-ancilliary, or --problem options are specified, but is compatible with the --processed, --bounds, and --raw options.

 

MB-SYSTEM AUTHORSHIP

David W. Caress

  Monterey Bay Aquarium Research Institute
Dale N. Chayes

  Center for Coastal and Ocean Mapping

  University of New Hampshire
Christian do Santos Ferreira

  MARUM - Center for Marine Environmental Sciences

  University of Bremen

 

OPTIONS

--copy

Causes the swath files referenced in the input datalist structure to be copied to the current directory and creates a datalist (names "datalist.mb-1") that references the copied swath files. The copy function will not be done if the --make-ancilliary, --update-ancilliary, or --problem options are specified. If the --processed, --bounds, and --raw options are specified these functions will modify which swath files are copied. Any ancillary files (e.g. *inf metadata files) will also be copied, but processed data files derived from the target copied files will not be copied.
--report

Causes a listing to be printed of the unique datalist files referenced through the recursive datalist structure. Each line begins with the recursion level of that datalist file within the overall structure followed by the full path of the datalist file indented by a number of tabs equal to the recursion level.
--format
format
Sets the data format associated with the datalist or swath data file specified with the --input option. By default, this program will attempt to determine the format from the input file suffix (e.g. a file ending in .mb57 has a format id of 57, and a file ending in .mb-1 has a format id of -1). A datalist has a format id of -1.
--help
This "help" flag cause the program to print out a description of its operation and then exit immediately.
--input
FILE
Sets the input filename. If format > 0 (set with the -f option) then the swath data filename specified by infile is output along with its format and a file weight of 1.0. If format < 0, then infile is treated as a datalist file containing a list of the input swath sonar data files to be processed and their formats. The program will parse the datalist (recursively, if necessary) and output each swath filename and the associated format and file weight.
--make-ancilliary
This argument causes MBdatalist to generate three types of ancillary data files ("inf", "fbt", and "fnv"). In all cases, the ancillary filenames are just the original filename with ".inf", ".fbt", or ".fnv" appended on the end. MB-System makes use of ancillary data files in a number of instances. The most prominent ancillary files are metadata or "inf" files (created from the output of mbinfo). Programs such as mbgrid and mbm_plot try to check "inf" files to see if the corresponding data files include data within desired areas. Additional ancillary files are used to speed plotting and gridding functions. The "fast bath" or "fbt" files are generated by copying the swath bathymetry to a sparse, quickly read format (format 71). The "fast nav" or "fnv" files are just ASCII lists of navigation generated using mblist with a --update-ancilliarytMXYHSc option. Programs such as mbgrid, mbswath, and mbcontour will try to read "fbt" and "fnv" files instead of the full data files whenever only bathymetry or navigation information are required.
--update-ancilliary
This argument causes MBdatalist to generate the three ancillary data files ("inf", "fbt", and "fnv") if these files don't already exist or are out of date.
--processed
Normally, mbdatalist allows $PROCESSED and $RAW tags within the datalist files to determine whether processed file names are reported when available ($PROCESSED) or only raw file names are reported ($RAW). The --processed option forces mbdatalist to output processed file names when they exist.
--problem
This option causes the program to check each data file for the existence of any ancillary files referenced in its mbprocess parameter file (if the parameter file exists). The relevant ancillary files include edit save files generated by mbedit or mbclean, navigation files generated by mbnavedit or mbnavadjust, tide files, and svp files. An error message is output for each missing ancillary file.
--bounds
W/E/S/N
The bounds of the desired area are set in longitude and latitude using w=west, e=east, s=south, and n=north. This option causes the program to check each data file with an "inf" file for overlap with the desired bounds, and only report those files with data in the desired area (or no "inf" file to check). This behavior mimics that of mbgrid, allowing users to check what data files will contribute to gridding some particular area.
--status
This option causes mbdatalist to report the status of the files it lists, including whether the file is up to date or needs reprocessing, and if the file is locked. MBprocess sets locks while operating on a swath file to prevent other instances of mbprocess from simultaneously operating on that same file. Locking consists of creating a small text file named by appending ".lck" to the swath filename; while this file exists other programs will not modify the locked file. The locking program deletes the lock file when it is done. Orphaned lock files may be left if mbprocess crashes or is interrupted. These will prevent reprocessing by mbprocess, but can be both detected and removed using mbdatalist.
--raw
Normally, mbdatalist allows $PROCESSED and $RAW tags within the datalist files to determine whether processed file names are reported when available ($PROCESSED) or only (raw) unprocessed file names are reported ($RAW). The --raw option forces mbdatalist to only output raw file names.
--verbose
Normally, mbdatalist only prints out the filenames and formats. If the --verbose flag is given, then mbinfo works in a "verbose" mode and outputs the program version being used.
--unlock
This option causes mbdatalist to remove any processing locks on files it parses. MBprocess and other programs may set locks while operating on a swath file to prevent other programs from simultaneously operating on that same file.The consists of creating a small text file named by appending ".lck" to the swath filename; while this file exists other programs will not modify the locked file. The locking program deletes the lock file when it is done. Orphaned lock files may be left if MB-System programs crash or are interrupted. These can be detected using the --status option of mbdatalist.
--datalistp
The --datalistp option causes the program to generate a datalist file that will first set a $PROCESSED flag and then reference the input file specified using the --input=FILE option. The output datalist is named by adding a "p.mb-1" suffix to the root of the input file (the root is the portion before any MB-System suffix).
By default, the input is assumed to be a datalist named datalist.mb-1, resulting in an output datalist named datalistp.mb-1 with the following contents:

        $PROCESSED

        datalist.mb-1 -1

If the input file is specified as a datalist like datalist_sslo.mb-1, then the output datalist datalist_sslop.mb-1 will have the following contents:

        $PROCESSED

        datalist_sslo.mb-1 -1

If the input file is specified as a swath file like 20050916122920.mb57, then the output datalist 20050916122920p.mb-1 will have the following contents:

        $PROCESSED

        20050916122920.mb57 57

 

EXAMPLES

Suppose we have two swath data files from an EM3000 multibeam and another two from an Hydrosweep MD multibeam. We might construct two datalist files. For the EM3000 we might have a file datalist_em3000.mb-1 containing:
        0004_20010705_165004_raw.mb57 57

        0005_20010705_172010_raw.mb57 57

For the Hydrosweep MD data we might have a file datalist_hsmd.mb-1 containing:
        al10107051649.mb102 102

        al10107051719.mb102 102

Further suppose that we have found it necessary to edit the bathymetry in 0005_20010705_172010_raw.mb57 and al10107051719.mb102 using mbedit, and that mbprocess has been run on both files to generate processed files called 0005_20010705_172010_rawp.mb57 and al10107051719p.mb102.

If we run:
        mbdatalist --input=datalist_em3000.mb-1

the output is:
        0004_20010705_165004_raw.mb57 57 1.000000

        0005_20010705_172010_raw.mb57 57 1.000000

Here the file name is followed by the format and then by a third column containing the default file weight of 1.0.

Similarly, if we run:
        mbdatalist --input=datalist_hsmd.mb-1

the output is:
        al10107051649.mb102 102 1.000000

        al10107051719.mb102 102 1.000000

If we insert a line
        $PROCESSED

at the top of both datalist_hsmd.mb-1 and datalist_em3000.mb-1, then the output of mbdatalist changes so that:
        mbdatalist --input=datalist_em3000.mb-1

yields:
        0004_20010705_165004_raw.mb57 57 1.000000

        0005_20010705_172010_rawp.mb57 57 1.000000
and:
        mbdatalist --input=datalist_hsmd.mb-1

yields:
        al10107051649.mb102 102 1.000000

        al10107051719p.mb102 102 1.000000

Now suppose we create a datalist file called datalist_all.mb-1 that refers to the two datalists shown above (without the $PROCESSED tags). If the contents of datalist_all.mb-1 are:
        datalist_em3000.mb-1 -1 100.0

        datalist_hsmd.mb-1   -1   1.0

where we have specified different file weights for the two datalists, then:
        mbdatalist --input=datalist_all.mb-1

yields:
        0004_20010705_165004_raw.mb57 57 100.000000

        0005_20010705_172010_raw.mb57 57 100.000000

        al10107051649.mb102 102 1.000000

        al10107051719.mb102 102 1.000000

Now, if we use the --processed option to force mbdatalist to output processed data file names when possible, then:
        mbdatalist --input=datalist_all.mb-1 --processed

yields:
        0004_20010705_165004_raw.mb57 57 100.000000

        0005_20010705_172010_rawp.mb57 57 100.000000

        al10107051649.mb102 102 1.000000

        al10107051719p.mb102 102 1.000000

To demonstrate the datalist file listing function, consider the datalist file named datalist.mb-1 that is located at the top of MBARI's shipboard swath mapping database structure. This file references datalists under directories for each of the institutions that we have sourced survey data from (e.g. CCOM, GEOMAR, IFREMER, etc.), and each of those datalists reference datalist files in directories for individual surveys or expedition legs, which in turn reference swath files for those surveys (or in some cases reference more datalists if the expedition leg is organized into multiple surveys). We use the --report option to obtain the following listing (which actually runs a lot longer than shown here):
yields:
        <00> datalist.mb-1

        <01>    CCOM/datalist.mb-1

        <02>            CCOM/NR07-1/datalist.mb-1

        <01>    GEOMAR/datalist.mb-1

        <02>            GEOMAR/SONNE100/datalist.mb-1

        <02>            GEOMAR/SONNE47/datalist.mb-1

        <02>            GEOMAR/SO108/datalist.mb-1

        <02>            GEOMAR/GEOMETEP/datalist.mb-1

        <02>            GEOMAR/SO83/datalist.mb-1

        <02>            GEOMAR/SO92/datalist.mb-1

        <02>            GEOMAR/SO99/datalist.mb-1

        <02>            GEOMAR/SO109-1/datalist.mb-1

        <02>            GEOMAR/SO109-2/datalist.mb-1

        <02>            GEOMAR/SO111/datalist.mb-1

        <02>            GEOMAR/SO112/datalist.mb-1

        <02>            GEOMAR/SO141/datalist.mb-1

        <02>            GEOMAR/SO142/datalist.mb-1

        <01>    IFREMER/datalist.mb-1

        <02>            IFREMER/CHARCOT/datalist.mb-1

        <02>            IFREMER/FOUNDATION/datalist_mb71.mb-1

        <02>            IFREMER/GEOMETEP4/datalist.mb-1

        <02>            IFREMER/MANZPA/datalist.mb-1

        <02>            IFREMER/NOUPA/datalist.mb-1

        <02>            IFREMER/OLIPAC/datalist.mb-1

        <02>            IFREMER/PAPNOU87/datalist.mb-1

        <02>            IFREMER/PAPNOU99/datalist.mb-1

        <02>            IFREMER/POLYNAUT/datalist.mb-1

        <02>            IFREMER/SEAPOS/datalist.mb-1

        <02>            IFREMER/ZEPOLYF1/datalist.mb-1

        <02>            IFREMER/ZEPOLYF2/datalist.mb-1

        <02>            IFREMER/ZEPOLYF3/datalist.mb-1

        <02>            IFREMER/BENTHAUS/datalist.mb-1

        <02>            IFREMER/SISMITA/datalist.mb-1

        <02>            IFREMER/ACT/datalist.mb-1

 

SEE ALSO

mbsystem(1)

 

BUGS

No true bugs here, only distantly related arthropods... Yum. Seriously, it would be better if the copy function preserved the modification times of the copied swath files and ancillary files. Copying of processed files should also be an option.


 

Index

NAME
VERSION
SYNOPSIS
DESCRIPTION
MB-SYSTEM AUTHORSHIP
OPTIONS
EXAMPLES
SEE ALSO
BUGS


Last Updated: 18 April 2017