Writing HDF files Using CAPS - Scientific Data Conversion
The HDF file format offers a number of different options for storing data. The software described here supports only HDF Scientific Data Sets (SDS) and HDF attributes.
CAPS is an extension of Tcl/Tk. Tcl provides a powerful set of text manipulation facilities, and CAPS adds numeric array support and access to HDF. The software is simple to set up, and executables for a number of platforms can be found in the download area. Each executable includes the Tcl and HDF libraries and the CAPS extension. A README comes with the download, and its instructions should be followed closely. Once the software has been installed, it can be run by typing wish or via a screen icon.
Suppose we want to write a set of tabulated data in ASCII into an HDF file in binary
format. The first stage of the process is to use Tcl to open the ASCII data file
and read in the data. Once the data has been read, it is converted to a binary array that
can, in an object oriented manner, write itself into an HDF file. To execute the examples,
wish or tclsh must be running. At the command line
prompt type: source <filename> where <filename> is the name of
the file in which the Tcl script is written.
Example 1.
Creating an HDF file called science.hdf with an SDS called science_data.
#
# science.dat is an existing ASCII file containing
# data in rows and columns (see end of this example).
#
# Use standard Tcl commands to read and format the data.
#
if [catch {open science.dat r} fileId] {
    puts stderr "Cannot open science.dat: $fileId"
} else {
    #
    # Read the data into a Tcl variable.
    #
    set data [read $fileId]
    #
    # Format so that it can be converted to a binary array.
    #
    set fdata ""
    foreach line [split $data \n] {
        if {[string length $line] > 0} {
            lappend fdata $line
        }
    }
    set fdata [list $fdata]
    #
    # Convert the data into a 2 dimensional binary array using
    # CAPS nap (numeric array processor) commands.
    #
    nap bdata = $fdata
    #
    # Now we can write the data to an existing or new HDF file with a new
    # Scientific Data Set (SDS name).
    #
    # object  method  data file name  SDS name
    #
    $bdata    hdf     science.hdf     science_data
}
#
# The contents of the HDF file can be checked using
# the CAPS HDF browser selected from the CAPS browser menu.
# wish rather than tclsh should be run to access this menu.
The above example expects the data in science.dat to be in ASCII numeric form, in rows and columns, such as:
1.23 4.56 1.0e10
-23 11.2 35.6
1 2 3
This data is formatted in the example as:
{{1.23 4.56 1.0e10}
{-23 11.2 35.6}
{1 2 3}}
which is the form that the nap command uses to specify a 2 dimensional
array. The numeric array object (nao) created by the nap command and referenced by bdata
is also a Tcl command, and its hdf method allows it to write its data to an HDF
file. The code for example 1, without the comments, is:
if [catch {open science.dat r} fileId] {
    puts stderr "Cannot open science.dat: $fileId"
} else {
    set fdata ""
    foreach line [split [read $fileId] \n] {
        if {[string length $line] > 0} {
            lappend fdata $line
        }
    }
    set fdata [list $fdata]
    nap bdata = $fdata
    $bdata hdf science.hdf science_data
}
Note that this shorter version combines the file read and line split steps into:
foreach line [split [read $fileId] \n] {
Getting the braces {} in position is best done using the Tcl list command
rather than trying to explicitly insert them. The Tcl parser tries to process explicit
braces and it can be quite difficult to get the desired outcome.
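As an illustration of this point, the following sketch uses only core Tcl commands (no CAPS) to build the nested form from a string. The inline sample text standing in for science.dat and the variable names rows and nested are assumptions made for this illustration.

```tcl
# Sample text standing in for the contents of science.dat (assumed data).
set data "1.23 4.56 1.0e10\n-23 11.2 35.6\n1 2 3\n"

# Collect each non-empty line as one list element.
set rows {}
foreach line [split $data \n] {
    if {[string length $line] > 0} {
        lappend rows $line
    }
}

# Wrapping the list of rows with the list command adds the outer braces.
set nested [list $rows]
puts $nested
```

The lappend and list commands add the inner and outer braces, so none are typed by hand.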
Example 2.
HDF SD dimension scale variables
The HDF output file from example 1 should be removed before trying this example.
#
# CAPS uses the term coordinate variable for HDF dimension scales.
# If a CAPS numeric array object has coordinate variables these
# will be written to the HDF file as an HDF dimension scale automatically
# when the data is written. For example, if the data were in a latitude
# and longitude grid the dimension scales can be written to define that grid.
#
# Suppose our data has an associated latitude and longitude
# grid
#
#            5.0     5.5     6.0     longitude
#
#   30.0     1.23    4.56    1.3e5
#   31.0     -23     11.2    35.6
#   32.0     1       2       3
#
# latitude
if [catch {open science.dat r} fileId] {
    puts stderr "Cannot open science.dat: $fileId"
} else {
    set fdata ""
    foreach line [split [read $fileId] \n] {
        if {[string length $line] > 0} {
            lappend fdata $line
        }
    }
    set fdata [list $fdata]
    nap bdata = $fdata
    #
    # Now we create coordinate variables for latitude and longitude.
    # Note that this information could also be read from the data file.
    # nap "latitude = {30.0 31.0 32.0}" would create a similar
    # numeric variable, but the default type would be double.
    # nap latitude = float(latitude) converts latitude to 32 bit float.
    #
    nap latitude = ap(30.0f,32.0f,1.0f)
    nap longitude = ap(5f,6f,0.5f)
    #
    # Here "f" means 32 bit floating point. Other options are
    # d double precision, s short, l long, i int, b byte.
    #
    # Now we attach the coordinate variables to the data object.
    #
    $bdata set coordinate latitude longitude
    #
    # The latitude and longitude will be written as HDF
    # dimension scales.
    #
    $bdata hdf science.hdf science_data
}
The nap command can be used to create n dimensional arrays which can be written to an HDF file in a similar way to that already demonstrated.
nap "vector = {1 3 4 6 7 9}"
creates a vector with 6 elements.
nap "array = { {1 2 3 4} {5 6 7 8}}"
creates a 2x4 array.
nap "threed = {{{1 2 3 4} {5 6 7 8}} {{5 6 7 8} {1 2 3 4}}}"
creates a 2x2x4 array. Note that the quotes are necessary to prevent the Tcl parser removing the left brace before the nap command parser can process the expression. In cases where no braces are present in an expression the quotes can be omitted.
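The effect of the quotes can be seen with core Tcl alone. In this sketch, last_word is a hypothetical helper written for the illustration; it stands in for any command that, like nap, receives its expression as ordinary Tcl arguments, and it is not part of CAPS.

```tcl
# Hypothetical helper: returns the last argument exactly as the command receives it.
proc last_word {args} {
    return [lindex $args end]
}

# Unquoted: the Tcl parser treats the braces as grouping and strips them.
set unquoted [last_word vector = {1 3 4}]

# Quoted: the whole expression arrives as one argument, braces intact.
set quoted [last_word "vector = {1 3 4}"]

puts "unquoted: $unquoted"
puts "quoted:   $quoted"
```

The unquoted call delivers 1 3 4 with the braces already removed, while the quoted call delivers the expression unchanged.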
Example 3.
Writing an attribute to an HDF file.
#
# You can write text to an HDF file
#
nap text1 = 'Hello this is test message 1'
#
# The text can be written to the HDF file
#
$text1 hdf science.hdf :text1
#
# The : indicates an attribute is to be written.
# A global attribute starts with a :.
# The attribute can also be associated with an SDS name by preceding
# the : with the SDS name
#
nap "text2 = 'Hello this is test message 2
and it goes over
several lines'"
$text2 hdf science.hdf science_data:text2
Binary I/O
Binary files can be created using the nap object oriented command. A small binary file can be created:
set tclChan [open test.file w+]
nap array = reshape({1 2 3 4 5 6 7 8},{2 4})
$array write $tclChan
close $tclChan
Note that open and close are standard Tcl commands. Binary files can be read with the nap_get command. The syntax is:
nap_get binary <tcl channel number> [<data type>] [<shape>]
The Tcl channel number is returned from the standard Tcl open command. The Tcl channel should be closed after using the nap_get command. The data type refers to the standard NAP data types: character, byte, short, integer, long, float and double. The <shape> is any nap expression, but as with any nap expression, be careful to include quotation marks to prevent Tcl removing outer braces, etc. For example:
set tclChan [open test.file r]
[nap_get binary $tclChan long "{2 4}"]
1 2 3 4
5 6 7 8
close $tclChan
or to read the binary data and write to an HDF file:
set tclChan [open test.file r]
nap bdata = [nap_get binary $tclChan long "{2 4}"]
$bdata hdf science.hdf new_data
close $tclChan
It is important to specify the data type. The input data type defaults to byte if not
specified. If the data needs to be read again, use the Tcl seek command to rewind the
channel to the start (seek $c 0 start, where $c is the channel). There is also a Tcl library
called bin_io.tcl that provides scripts for reading certain standard Fortran file types.
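The need to rewind can be sketched with core Tcl commands only. This example uses the standard binary format and binary scan commands in place of the nap write method and nap_get, and seek_demo.bin is a hypothetical scratch file name; both are assumptions made for the illustration.

```tcl
# Core Tcl only; seek_demo.bin is a hypothetical scratch file name.
set path seek_demo.bin
set chan [open $path w+]
fconfigure $chan -translation binary

# Write eight 32 bit little-endian integers, then rewind to read them back.
puts -nonewline $chan [binary format i8 {1 2 3 4 5 6 7 8}]
seek $chan 0 start
binary scan [read $chan] i8 firstPass

# The channel is now at end of file, so a second read returns nothing.
set leftover [string length [read $chan]]

# Rewinding makes the same data available again.
seek $chan 0 start
binary scan [read $chan] i8 secondPass

close $chan
file delete $path

puts "first pass:  $firstPass"
puts "bytes read without rewinding: $leftover"
puts "second pass: $secondPass"
```

Without the second seek, the read after the first pass returns no data because the channel position is already at the end of the file.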
All the examples have been tested and work under the Sun version of CAPS. However, we recommend that you always attempt to verify the contents of the HDF files created are correct. HDF provides a utility called hdp for this purpose.
More details showing how CAPS can be used to write HDF files can be found in the CAPS documentation. Please contact us if you need access to this area (Peter.Turner@csiro.au). There are also many books available describing the Tcl/Tk language.
WARNING: In Tcl spaces are important:
if{[string length $line] > 0} {
is not valid, there must be a space between the if and the left brace.
if {[string length $line] > 0} {
is OK.
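The difference can be checked safely with the standard Tcl catch command, which traps the error instead of stopping the script. The variable names status, message and ok are assumptions made for this sketch.

```tcl
set line "1 2 3"

# Missing space: Tcl reads "if{..." as the name of an unknown command.
set status [catch {if{[string length $line] > 0} { set ok 1 }} message]
puts "status: $status"
puts "message: $message"

# With the space the same test runs normally.
set ok 0
if {[string length $line] > 0} { set ok 1 }
puts "ok: $ok"
```

A status of 1 from catch indicates that the first form raised an error; the second form sets ok to 1.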
Also note: The CAPS HDF browser automatically truncates data displayed as text including attributes.
updated 6 June 2000
© Copyright, CSIRO Australia 2001