Difference between revisions of "SCons"

From Madagascar

Revision as of 16:16, 19 November 2010


SCons (from Software Construction) is a superior alternative to the classic make utility.

SCons is implemented as a Python script, and its "configuration files" (SConstruct files) are also Python scripts. Madagascar uses SCons to compile software, to manage data processing flows, and to assemble reproducible documents.

Useful SCons options

  • scons -h (help) displays a help message.
  • scons -Q (quiet) suppresses progress messages.
  • scons -n (no exec) outputs the commands required for building the specified target (or the default targets if no target is specified) without actually executing them. It can be used to generate a shell script from an SConstruct script, as follows:

<bash> scons -nQ [target] > script.sh </bash>

Compilation

SCons was designed primarily for compiling software code. An SConstruct file for compilation may look like
<python>
env = Environment()
env.Append(CPPFLAGS=['-Wall','-g'])
env.Program('hello',['hello.c', 'main.c'])
</python>

and produce something like

bash$ scons -Q
gcc -o hello.o -c -Wall -g hello.c
gcc -o main.o -c -Wall -g main.c
gcc -o hello hello.o main.o

to compile the hello program from the source files hello.c and main.c.

Madagascar uses SCons to compile its programs from the source. The more frequent usage, however, comes from adopting SCons to manage data processing flows.

Data processing flows with rsf.proj

The rsf.proj module provides SCons rules for Madagascar data processing workflows. An example SConstruct file is shown below and can be found in bei/sg/denmark
<python>
from rsf.proj import *

Fetch('wz.35.H','wz')

Flow('wind','wz.35.H','dd form=native | window n1=400 j1=2 | smooth rect1=3')
Plot('wind','pow pow1=2 | grey')

Flow('mute','wind','mutter v0=0.31 half=n')
Plot('mute','pow pow1=2 | grey')

Result('denmark','wind mute','SideBySideAniso')

End()
</python>

Note that SConstruct by itself does not do any job other than setting rules for building different targets. The targets get built when one executes scons on the command line. Running scons produces

bash$ scons
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
retrieve(["wz.35.H"], [])
< wz.35.H /RSF/bin/sfdd form=native | /RSF/bin/sfwindow n1=400 j1=2 | /RSF/bin/sfsmooth rect1=3 > wind.rsf
< wind.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > wind.vpl
< wind.rsf /RSF/bin/sfmutter v0=0.31 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
scons: done building targets.

Obviously, one could also run similar commands with a shell script. What makes SCons convenient is the way it behaves when we make changes in the input files or in the script. Let us change, for example, the mute velocity parameter in the second Flow command. You can do that with an editor or on the command line as

sed -i s/v0=0.31/v0=0.32/ SConstruct

Now let us run scons again:

bash$ scons -Q
< wind.rsf /RSF/bin/sfmutter v0=0.32 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl

We can see that scons executes only the parts of the data processing flow that were affected by the change. By keeping track of dependencies, SCons makes it easier to modify existing workflows without the need to rerun everything after each change.
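The selective rebuild seen above can be illustrated with a toy model. This is a minimal sketch and not how SCons itself is implemented: the ToyBuild class and its methods are hypothetical names, and a target's signature is assumed to be just a hash of its command string combined with the signatures of its sources. A target is rerun only when that signature differs from the one recorded at the previous build.

```python
import hashlib


def signature(text):
    """Hash a string; stands in for a file-content or command signature."""
    return hashlib.md5(text.encode()).hexdigest()


class ToyBuild:
    """Toy dependency tracker (hypothetical, for illustration only)."""

    def __init__(self):
        self.rules = {}   # target -> (list of sources, command string)
        self.stored = {}  # target -> signature recorded at the last build

    def rule(self, target, sources, command):
        self.rules[target] = (list(sources), command)

    def current_signature(self, target):
        if target not in self.rules:
            # Leaf input: the name stands in for a hash of the file contents.
            return signature(target)
        sources, command = self.rules[target]
        parts = [command] + [self.current_signature(s) for s in sources]
        return signature('\n'.join(parts))

    def build(self, target, log=None):
        """Return the list of targets that had to be rebuilt, in order."""
        log = [] if log is None else log
        if target not in self.rules:
            return log
        sources, _ = self.rules[target]
        for source in sources:
            self.build(source, log)
        sig = self.current_signature(target)
        if self.stored.get(target) != sig:
            log.append(target)          # a real tool would run the command here
            self.stored[target] = sig
        return log


build = ToyBuild()
build.rule('wind.rsf', ['wz.35.H'],
           'dd form=native | window n1=400 j1=2 | smooth rect1=3')
build.rule('wind.vpl', ['wind.rsf'], 'pow pow1=2 | grey')
build.rule('mute.rsf', ['wind.rsf'], 'mutter v0=0.31 half=n')
build.rule('mute.vpl', ['mute.rsf'], 'pow pow1=2 | grey')
build.rule('Fig/denmark.vpl', ['wind.vpl', 'mute.vpl'], 'SideBySideAniso')

first = build.build('Fig/denmark.vpl')

# Change the mute velocity, as the sed command does in the SConstruct file.
build.rule('mute.rsf', ['wind.rsf'], 'mutter v0=0.32 half=n')
second = build.build('Fig/denmark.vpl')

print(first)
print(second)
```

In this model the first build touches all five targets, while the second touches only mute.rsf, mute.vpl and Fig/denmark.vpl, which mirrors the three commands that scons reran after the change.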

SConstruct commands

  • Fetch(<file[s]>,<directory>,[options]) defines a rule for downloading data files from the specified directory on an external data server (by default) or from another directory on disk. The optional parameters that control its behavior are summarized below.
{|class="wikitable" align="center" cellspacing="0" border="1"
|+ Fetch options
! Name !! Default !! Meaning
|-
| private || None || if the data file is private
|-
| server || $RSF_DATASERVER or http://www.reproducibility.org || remote data server (or local for local files)
|-
| top || data || name of the top data directory on the data server
|}

In the example above, Fetch specifies the rule for getting the file wz.35.H: connect to the default data server and download the file from the data/wz directory.
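How the server, top and directory settings might combine into a download location can be sketched as follows. The fetch_url helper is hypothetical and is not part of rsf.proj, and the server/top/directory/file layout is an assumption inferred from the description of the data/wz directory.

```python
import os


def fetch_url(filename, directory, server=None, top='data'):
    """Compose a download URL (hypothetical helper, assumed layout)."""
    if server is None:
        # Defaults follow the Fetch options: $RSF_DATASERVER if set,
        # otherwise http://www.reproducibility.org
        server = os.environ.get('RSF_DATASERVER',
                                'http://www.reproducibility.org')
    return '/'.join([server.rstrip('/'), top, directory, filename])


# The server is passed explicitly so the result does not depend on the
# local environment.
url = fetch_url('wz.35.H', 'wz', server='http://www.reproducibility.org')
print(url)
```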

  • Flow(<target[s]>,<source[s]>,<command>,[options]) defines a rule for creating targets from sources by running the specified command through the Unix shell. The optional parameters that control its behavior are summarized below.
{|class="wikitable" align="center" cellspacing="0" border="1"
|+ Flow options
! Name !! Default !! Meaning
|-
| stdout || 1 || if output to standard out (0 for output to /dev/null, -1 for no output)
|-
| stdin || 1 || if take input from standard in (0 for no input)
|-
| rsfflow || 1 || if using Madagascar commands
|-
| suffix || '.rsf' || default suffix for output files
|-
| prefix || 'sf' || default prefix for programs
|-
| src_suffix || '.rsf' || default suffix for input files
|-
| split || [] || split the flow for data parallel processing
|-
| reduce || 'cat' || how to reduce the output from data parallel processing
|-
| local || 0 || if execute on the local node when using data parallel processing on a cluster
|}

In the example above, there are two Flow commands. The first one...
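The way the prefix, suffix, stdin and stdout options shape the shell command can be sketched in a few lines of Python. The expand_flow helper below is hypothetical and is not the rsf.proj implementation; it handles a single source and a single target only. The /RSF/bin directory is taken from the scons output shown earlier, and the rule that a name already containing a dot (such as wz.35.H) receives no suffix is an assumption inferred from that output.

```python
def add_suffix(name, suffix):
    # Assumption: names that already contain a dot are left unchanged.
    return name if '.' in name else name + suffix


def expand_flow(target, source, command, bindir='/RSF/bin', prefix='sf',
                suffix='.rsf', src_suffix='.rsf', stdin=1, stdout=1):
    """Expand a Flow-style command into a shell line (illustrative sketch)."""
    stages = []
    for stage in command.split('|'):
        words = stage.split()
        # Each program in the pipeline gets the directory and the prefix.
        words[0] = bindir + '/' + prefix + words[0]
        stages.append(' '.join(words))
    line = ' | '.join(stages)
    if stdin:
        line = '< ' + add_suffix(source, src_suffix) + ' ' + line
    if stdout == 1:
        line = line + ' > ' + add_suffix(target, suffix)
    elif stdout == 0:
        line = line + ' > /dev/null'
    return line


wind = expand_flow('wind', 'wz.35.H',
                   'dd form=native | window n1=400 j1=2 | smooth rect1=3')
mute = expand_flow('mute', 'wind', 'mutter v0=0.31 half=n')
print(wind)
print(mute)
```

Under these assumptions the two printed lines match the corresponding commands in the scons output for the wind.rsf and mute.rsf targets.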

  • Plot(<target>,<source[s]>,<command>,[options]) or Plot(<target>,<command>,[options]) is similar to Flow but generates a graphics file (Vplot file) instead of an RSF file. If the source file is not specified, it is assumed that the name of the output file (without the .vpl suffix) is the same as the name of the input file (without the .rsf suffix).
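{|class="wikitable" align="center" cellspacing="0" border="1"
|+ Plot options
! Name !! Default !! Meaning
|-
| suffix || '.vpl' || default suffix for the output file
|-
| vppen || None || additional options to pass to vppen
|-
| view || None || if set, show the output on the screen instead of saving it in a file
|}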

In the example above, there are two Plot commands.
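The two calling forms of Plot differ only in whether the source is given. A minimal sketch of that argument handling follows; plot_args is a hypothetical helper and is not taken from rsf.proj, and it covers only the case where the third argument is a processing command.

```python
def plot_args(target, *rest, suffix='.vpl', src_suffix='.rsf'):
    """Resolve Plot-style arguments into (output, inputs, command)."""
    if len(rest) == 1:
        # Plot(<target>,<command>): the source shares the target's name.
        sources, command = target, rest[0]
    elif len(rest) == 2:
        # Plot(<target>,<source[s]>,<command>)
        sources, command = rest
    else:
        raise TypeError('expected (target, command) or (target, sources, command)')
    inputs = [name + src_suffix for name in sources.split()]
    return target + suffix, inputs, command


short_form = plot_args('wind', 'pow pow1=2 | grey')
long_form = plot_args('wind', 'wind', 'pow pow1=2 | grey')
print(short_form)
```

Both forms resolve to the same rule here, reading wind.rsf and writing wind.vpl.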

  • Result(<target>,<source[s]>,<command>,[options]) or Result(<target>,<command>,[options]) is similar to Plot, except that the output graphics file is placed in a separate directory (./Fig by default) rather than in the current directory. The output is intended for inclusion in papers and reports.
{|class="wikitable" align="center" cellspacing="0" border="1"
|+ Result options
! Name !! Default !! Meaning
|-
| suffix || '.vpl' || default suffix for the output file
|}

In the example above, Result defines a rule that combines the results of two Plot rules into one plot by arranging them side by side. The rules for combining different figures together (which apply to both Plot and Result commands) include:

    • SideBySideAniso
    • OverUnderAniso
    • SideBySideIso
    • OverUnderIso
    • TwoRows
    • TwoColumns
    • Overlay
    • Movie
  • End() takes no arguments and signals the end of data processing rules. It provides the following targets, which operate on all previously specified Result figures:
    • scons view displays the results on the screen.
    • scons print sends the results to the printer (specified with the PSPRINTER environment variable).
    • scons lock copies the results to a location inside the DATAPATH tree.
    • scons test compares the previously "locked" results with the current results and aborts with an error in case of mismatch.

The default target is set to be the collection of all Result figures.

Command-line options

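{|class="wikitable" align="center" cellspacing="0" border="1"
|+ Command-line options
! Name !! Meaning
|-
| TIMER || Whether to time execution
|-
| CHECKPAR || Whether to check parameters
|-
| ENVIRON || Additional environment settings
|-
| CLUSTER || Nodes available on a cluster
|-
| MPIRUN || mpirun command
|}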

Seismic Unix data processing flows with rsf.suproj

Document creation with rsf.tex

SConstruct commands

  • Paper
  • End

Default targets

Book and report creation with rsf.book

SConstruct commands

  • Book
  • Papers
  • End([options]) signals the end of book processing rules. It provides the following targets:
    • scons pdf
    • scons read
    • scons print
    • scons html
    • scons www

The default target is set to be scons pdf.