Package overview

Revision as of 13:51, 5 October 2008 by Nick (transfer contents)

Madagascar (formerly known as RSF) is an open-source software package for geophysical data analysis and reproducible numerical experiments. Its mission is to provide

  • a convenient and powerful environment
  • a convenient technology transfer tool

for researchers working with digital image and data processing. The technology developed using the Madagascar project management system is transferred in the form of recorded processing histories, which become "computational recipes" to be verified, exchanged, and modified by users of the system.

Specifically, Madagascar includes three parts:

  • a computational part
  • a display part
  • a project management (literate programming/technology transfer/reproducibility) part

While each of these is potentially useful in its own right, Madagascar is most powerful when the components are used together.

Much of the documentation requires a preliminary grasp of all three parts. This tutorial provides the required overview, briefly introducing the three parts, with links to more detailed descriptions. The reader new to Madagascar is encouraged to read through this entire document before following the links.

Madagascar Computation

Madagascar computations involve:

  • a data format suitable for very large data sets
  • a set of executables (Madagascar components) suitable for composing very large computations
  • an API for developing new components

Madagascar data format

Madagascar computations use RSF-formatted data. RSF represents regularly sampled arrays, rectangles, ... hyperrectangles. Irregularly sampled data can be handled as a pair of datasets, one containing data and the second containing corresponding

RSF metadata is treated as "the data"; one of the metadata components is a pointer to the raw binary data, normally in machine native format. It is possible to append the data to the metadata. RSF metadata is in ASCII format for human readability.
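To make the header idea concrete, the sketch below parses a small header-like text into a dictionary. It assumes the header is a sequence of whitespace-separated `key=value` tokens whose values may be double-quoted, and that a later assignment overrides an earlier one. The sample text, the path after `in=`, and the function name `parse_rsf_header` are hypothetical illustrations, not part of the Madagascar API.

```python
import shlex

def parse_rsf_header(text):
    """Collect key=value tokens from header text into a dict.

    Assumption: tokens are whitespace-separated, values may be
    double-quoted, and a later assignment of a key overrides an
    earlier one.
    """
    params = {}
    for token in shlex.split(text):
        key, sep, value = token.partition("=")
        if sep and key:
            params[key] = value
    return params

# Hypothetical header text; the path after in= is made up for illustration.
sample = '''
n1=512 n2=513
data_format="native_uchar"
in="/tmp/lena.rsf@"
'''

header = parse_rsf_header(sample)
n1, n2 = int(header["n1"]), int(header["n2"])
print(header["in"], header["data_format"], n1, n2)
```

The point of the sketch is that a program reading the metadata needs only the ASCII text to learn the array dimensions and where the binary values live.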

Madagascar components

Madagascar components may be implemented in multiple languages, including Fortran, Matlab and Python. However, the majority are implemented in C.

Madagascar components take a file name or a list of file names (along with key-value pairs specific to the program) as command line input. If stdin is piped, it is treated as the input, or as the first of multiple inputs. Madagascar components produce RSF headers as output to stdout.

Madagascar components are self-documenting. When invoked without any command line inputs they output their own manual page.

Almost all Madagascar components take RSF as input. Most Madagascar components produce RSF as output, redirected to stdout. In the case where the output of a component is piped to another program, the concatenated header/value format of RSF is invoked automatically.

Further information about components can be found in

  • Programs describes the most commonly used components in detail
  • Task-centric program list categorizes and briefly describes all components
  • On-disk HTML documentation (automatically generated) in <path to Madagascar>/doc/

Madagascar API

This introductory document does not cover extending Madagascar. Developers wishing to add Madagascar components are referred to API and Demo.

Madagascar Display: VPlot

In contrast to most other Madagascar components, graphics components produce vplot data as output.

VPlot is a program that implements a device-independent graphics format allowing both vector and raster components (as such, it is comparable to PostScript). VPlot also implements a number of output devices. In typical usage an XTerm window is available. The user gets immediate feedback from relatively small calc

Unfortunately, vplot documentation is out of date. The closest thing to a manual is http://sepwww.stanford.edu/theses/sep60/60_25.pdf, which describes a slightly different form of vplot. Updating the vplot documentation is in progress.

Fortunately, the beginning user does not need to know vplot. A wide range of Madagascar graphics components are available. These typically sit at the end of a chain of pipes.

See [[1]] for a list of these modules.

Example

Here is an example of a Madagascar pipe. In this case it takes a subsection of a file, low-pass filters it, and saves the result:

<python>

< data.rsf sfwindow n1=100 | sfbandpass fhi=60 > data2.rsf 

</python>

In this more elaborate case, the final output is passed to a graphics program and plotted.

<python>

< data.rsf sfwindow n1=100 | sfbandpass fhi=60 | sfcontour | xtpen

</python>
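The same pipes can be assembled programmatically. The sketch below uses a hypothetical helper (`build_pipe` is not a Madagascar function) that turns a list of program names and key-value parameters into the command-line form used above. It only builds the string and does not run anything.

```python
def build_pipe(source, stages, target=None):
    """Return a shell pipe string in the '< in prog key=val | ... > out' form.

    stages is a list of (program, params) pairs, where params is a dict
    of key-value parameters. Nothing is executed here.
    """
    commands = []
    for program, params in stages:
        args = " ".join(f"{key}={value}" for key, value in params.items())
        commands.append(f"{program} {args}".strip())
    pipe = f"< {source} " + " | ".join(commands)
    if target is not None:
        pipe += f" > {target}"
    return pipe

# Rebuild the two command lines shown above.
save = build_pipe(
    "data.rsf",
    [("sfwindow", {"n1": 100}), ("sfbandpass", {"fhi": 60})],
    target="data2.rsf",
)
plot = build_pipe(
    "data.rsf",
    [("sfwindow", {"n1": 100}), ("sfbandpass", {"fhi": 60}),
     ("sfcontour", {}), ("xtpen", {})],
)
print(save)
print(plot)
```

Keeping each stage as a (program, parameters) pair mirrors how the components are invoked: a program name followed by its own key-value pairs, with data flowing through stdin and stdout.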

More extensive examples are seen at Programs. The novice reader should probably read the material below before proceeding to that page.

Madagascar Reproducibility and Project Management

Madagascar uses and extends SCons, an open-source software construction package, to document and maintain data processing flows. Documented projects become computational recipes that can be easily exchanged among Madagascar users.

SCons is a rule-based package in Python typically used as a build system. Familiarity with any build system will be helpful in understanding SCons. SCons statements, as Python statements, are executed in the sequence they are written, but as such they only define rules. The rules are invoked in accordance with a dependency graph which SCons builds based on those rules. Components regarded as "up-to-date" are not rebuilt.

SCons allows user-contributed Builders (meta-rule categories) and Madagascar uses this capability extensively. The idea is that building an output file based on a workflow chain is very much analogous to building a software package based on a software tool chain. The calculation is seen simply as a build with dependencies. This is a considerable benefit in developing alternative workflows using a given dataset. The system maintains an awareness of already completed calculations. Without user intervention, redundant calculations are avoided.
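The rule-then-build behavior can be illustrated with a toy model. The sketch below is plain Python, not the SCons API: the class `ToyBuild` and its methods are hypothetical. Rules are only recorded when declared, and a target is recomputed only if it has never been built or one of its sources has changed since the last build.

```python
class ToyBuild:
    """Toy model of a rule-based build; not the SCons API."""

    def __init__(self):
        self.rules = {}       # target -> (sources, action)
        self.values = {}      # name -> current value
        self.stamps = {}      # name -> version counter, bumped on change
        self.built_from = {}  # target -> source stamps seen at last build
        self.log = []         # targets actually rebuilt, in order

    def set_input(self, name, value):
        """Define or change a raw input (a leaf of the graph)."""
        self.values[name] = value
        self.stamps[name] = self.stamps.get(name, 0) + 1

    def rule(self, target, sources, action):
        """Record a rule; nothing is computed yet."""
        self.rules[target] = (list(sources), action)

    def build(self, target):
        """Bring target up to date, rebuilding only what is stale."""
        if target not in self.rules:
            return  # raw input, nothing to do
        sources, action = self.rules[target]
        for src in sources:
            self.build(src)
        seen = [self.stamps[src] for src in sources]
        if self.built_from.get(target) == seen:
            return  # up to date, skip the calculation
        self.values[target] = action(*[self.values[s] for s in sources])
        self.stamps[target] = self.stamps.get(target, 0) + 1
        self.built_from[target] = seen
        self.log.append(target)

b = ToyBuild()
b.set_input("data", [3, 1, 2])
b.rule("sorted", ["data"], sorted)
b.rule("top", ["sorted"], lambda xs: xs[-1])
b.rule("count", ["data"], len)

b.build("top")    # builds 'sorted', then 'top'
b.build("top")    # nothing to do
b.build("count")  # builds only 'count'
b.set_input("data", [5, 4])
b.build("top")    # input changed: 'sorted' and 'top' are rebuilt
print(b.log, b.values["top"])
```

In this toy, the order in which rules are declared does not affect what gets computed; only the dependency graph and the staleness check do.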

Madagascar calculations are thus expressed as SCons scripts. SCons extensions follow SCons conventions in beginning with an uppercase letter. The most common Madagascar extensions are Flow(), Result() and End(). A Flow() invocation wraps one or more Madagascar computational components, while a Result() usually encapsulates a graphical output. Finally, an End() actually invokes the rules defined in the file.

Finally, Madagascar enables a collection of reproducible documents, organized in living books. Each reproducible book contains a collection of Madagascar recipes (SConstruct files) used to generate book figures. The recipes cover a variety of data processing and imaging tasks described in the books. Figures and recipes serve a dual purpose with respect to Madagascar maintenance. They provide demos for introducing new users to the functionality of the package and, at the same time, regression tests for assuring the system stability under change.

How it All Comes Together

Here is an example script, described in detail on the SCons page.

<python>
from rsfproj import *

# Download the input data file
Fetch('lena.img','imgs')

# Create RSF header
Flow('lena.hdr','lena.img',
     'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
     stdin=0)

# Convert to floating point and window out first trace
Flow('lena','lena.hdr','dd type=float | window f2=1')

# Display
Result('lena',
       '''
       sfgrey title="Hello, World!" transp=n color=b bias=128
       clip=100 screenratio=1
       ''')

# Wrap up
End()
</python>

Getting Madagascar

Madagascar runs on Unix/Linux platforms as well as on Microsoft Windows. It depends on Python, C, SCons, and vplot. In practice the user will also want an X Window System on their desktop. See download and Installation instructions.

License

The Madagascar package is released in open-source form under the standard GNU GPL license. In simple words, there are no restrictions on the use of the software (including copying, modifying, selling, etc.). However, there are restrictions on redistribution of the software, intended to prevent the package from losing its open-source status. Users are encouraged to submit their modifications back to the original distribution for the benefit of the whole user community.

Madagascar Community

Madagascar seeks to become an active and widely supported open-source platform. Active mailing lists are maintained and annual meetings take place. See RSF-user mailing list and RSF-devel mailing list. Your participation is welcome.