2008 Implementation Workshop

Revision as of 13:45, 5 October 2008 by Nick

Madagascar 2008 Implementation Workshop: Towards full automation and better robustness


UNDER CONSTRUCTION. This page is being changed from an invitation into a description of the meeting.

The workshop took place over the Memorial Day weekend, from Friday, May 23 to Monday, May 26, 2008, at the Geophysics Department of the Colorado School of Mines in Golden, Colorado.

Participants worked together on implementing features in Madagascar.


The "deliverables" of this meeting are:

  • A stronger geophysical open-source community
  • A short-term Madagascar road map, created by discussing, extending, and prioritizing the feature request list
  • A real move towards version 1.0

Pool of suggested features (to be moved to Feature requests/another page)

The m8r features under consideration can be grouped into several categories, which are listed below. The grouping attempts to describe the software equivalent of Maslow's pyramid of needs. First the basics (reproducibility, I/O, parallelization, graphics) should be consolidated, then efforts should move up to numerical tools like solvers, FFTs and transposes, then up to widely-used geophysical algorithms. Interested parties are invited to brainstorm below!

  1. Features needed for a minimal, fully automated m8r project setup
    1. Vplot diffs: These would allow the m8r project to fulfill one of its main goals – having fully automatic regression tests. See Feature request tracker
    2. rsfbook completion, also from the Feature request tracker
    3. Moving the wiki to DreamHost so that it actually functions as a wiki again
    4. Automatically syncing the wiki "Guide to programs" with the self-doc
    5. Saving a static copy of the wiki, so that consulting important parts of the documentation does not require a centralized, brittle software stack (local internet connectivity + remote web server + PHP + SQL + MediaWiki subject to constant spam attacks) to be running at that very moment. That would also allow splitting the "Guide to Programs" into one-page-per-program files, to be concatenated into one searchable page in the static copy of the wiki, and would allow user-contributed parts to be included in the HTML self-doc
  2. Features that make m8r more user-friendly
    1. A sane configuration script for the tex2pdf reproducible paper engine that pulls a complete list of missing dependencies and has a way of extracting them from TeX Live (or instructing the user to do so). Right now the huge size of TeX Live (>1 GB) and its fast pace of advancement preclude its installation on legacy systems (i.e. RHEL 2, 3, 4). The goal is to make the PDF paper-generating mechanism as easy to install as the rest of m8r, i.e. configuration either fails with a helpful error message or everything installs fine and it Just Works.
    2. sfdatadoc – see Feature request tracker
    3. Go through all the programs to make sure that there are no undocumented or unclearly specified parameters. Also look at inconsistencies (when parameters with similar meaning get different names in different programs). Establish and implement coding conventions.
    4. Binary packages – see Feature request tracker (Note 1: there has been noise since 2006 on the SCons developer mailing lists about providing a SCons command that would automatically create RPMs and maybe debs as well. Does anyone know the implementation status of that? Note 2: Packages are typically split into a "package core" providing the executables and their help, and a "package-devel" providing a software development kit consisting of development libraries, headers and other include files, API documentation, etc. Doing things this way would require deep changes to the current Madagascar build mechanisms)
    5. man pages – see Feature request tracker
    6. Decide how to ensure that the Task-centric program list is always synced with the source code (keywords picked up by rsfdoc?)
    7. Create a "Migrant's dictionary" for newcomers from other packages, by using their task-centric pages such as this one to show which m8r programs correspond to which programs from their "home country"
    8. TKSU-based GUI – see Feature request tracker (Note: the Madagascar plugin for OpendTect is also on track)
  3. Features that extend m8r's capabilities
    1. M8r-based programming
      1. Java API (more details on the GSOC2008 page)
      2. Extending the Python interface (more details on the GSOC2008 page)
      3. Finishing the Octave interface
      4. A tool to convert a "normal" rsf file to a "transfer-ready" rsf file: the binary is converted to portable XDR and then gzipped, the ASCII header records the number of bytes of the compressed file and the MD5 sums of both the binary and the compressed binary, and the format is indicated by a keyword like form=xdr.gz . I checked with various types and sizes of data, and even in the worst-case scenario a compression factor of 2 is attained. This would allow safe, fully automatic data transfers.
      5. Introduce a sane way to control the optimization level for all languages in an m8r build, using a single flag. A researcher using Madagascar should not need to be an expert in compiler usage; they should just set the level. For example:
        • Level A, have all warnings and debugging info turned on. Link to rsflib version compiled similarly.
        • Level B, compile with -O2 and link against rsflib version compiled similarly.
        • Level C, same as level B, and also detect whether the compiler supports interprocedural optimization. If it does, compile everything with the appropriate flags. If the compiler cannot optimize across multiple files, have Python concatenate the source files that make up each program into a single file and feed that to the compiler, instead of relying on "include" statements.
        • Level D, same as level C, and also have a special SCons "Optiflow" rule type, in which programs to be run are actually recompiled with parameters hardwired into them, so that optimizations such as dead branch elimination and loop unrolling are feasible. Use SCons to perform compilations in parallel. A fully-optimized compilation of most of m8r for a single flow may seem like overkill, but is necessary in the case of huge datasets that take weeks and months to process.
        • Level E, same as level D, but compile with -O3 and also, where available, use special alternative versions of the codes optimized by hand and with less error checking.
    2. Graphics
      1. Bezier curves in vplot – see Feature request tracker
      2. xtpen antialiasing – see Feature request tracker
      3. bargraph – see Feature request tracker
      4. graph3 completion – see Feature request tracker
    3. Geophysical/numerical tools
      1. Harlan's CG – see Feature request tracker
      2. conjgrad extensions – see Feature request tracker
      3. kirmod in layered media – see Feature request tracker
      4. minidds – see Feature request tracker
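
The "transfer-ready" rsf idea from the M8r-based programming list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an existing Madagascar tool: the function names and the header keys other than form=xdr.gz are hypothetical, the data are assumed to be single-precision floats, and the header is modeled as a plain dictionary rather than a real rsf header.

```python
import gzip
import hashlib
import struct


def pack_transfer_ready(values):
    """Encode floats as XDR, gzip them, and build the header entries."""
    # XDR stores a float as a 4-byte big-endian IEEE 754 value.
    xdr = struct.pack(">%df" % len(values), *values)
    # mtime=0 keeps the gzip stream identical from run to run.
    packed = gzip.compress(xdr, mtime=0)
    header = {
        "form": "xdr.gz",
        "nbytes": len(packed),  # hypothetical key name
        "md5_binary": hashlib.md5(xdr).hexdigest(),  # hypothetical key name
        "md5_compressed": hashlib.md5(packed).hexdigest(),  # hypothetical key
    }
    return header, packed


def unpack_transfer_ready(header, packed):
    """Check size and both checksums, then decode the floats."""
    if len(packed) != header["nbytes"]:
        raise ValueError("compressed size mismatch")
    if hashlib.md5(packed).hexdigest() != header["md5_compressed"]:
        raise ValueError("compressed checksum mismatch")
    xdr = gzip.decompress(packed)
    if hashlib.md5(xdr).hexdigest() != header["md5_binary"]:
        raise ValueError("binary checksum mismatch")
    return list(struct.unpack(">%df" % (len(xdr) // 4), xdr))
```

Checking the byte count and the checksum of the compressed file first lets a receiver reject a damaged transfer before spending time on decompression; the checksum of the uncompressed binary then guards the decoding step.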

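
The single-flag optimization levels A to E proposed above could be mapped to compiler flags roughly as follows. This is a sketch under assumptions: the flags are GCC-style, `-flto` stands in for "interprocedural optimization", hardwired parameters for level D are modeled as `-D` preprocessor defines, and the function name and arguments are hypothetical. Linking against a matching rsflib build and selecting hand-optimized code variants at level E are not covered.

```python
LEVELS = ("A", "B", "C", "D", "E")


def flags_for_level(level, has_ipo=False, hardwired=None):
    """Return a list of GCC-style flags for one build level.

    has_ipo   -- whether the compiler was detected to support
                 interprocedural optimization (assumed probed elsewhere)
    hardwired -- parameters to compile into the program at levels D and E
    """
    level = level.upper()
    if level not in LEVELS:
        raise ValueError("unknown level: %r" % level)
    if level == "A":
        # Debugging build: debug info and warnings, no optimization.
        return ["-g", "-Wall"]
    flags = ["-O3" if level == "E" else "-O2"]
    if level >= "C" and has_ipo:
        flags.append("-flto")
    if level >= "D":
        # Hardwire flow parameters so the compiler can drop dead branches.
        for name, value in sorted((hardwired or {}).items()):
            flags.append("-D%s=%s" % (name, value))
    return flags
```

In an SCons-based build, the returned list could be handed to the construction environment as `CCFLAGS`, so that a researcher only ever chooses the level letter.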