Job Provenance Demo
Multiple Ligand Trajectory Docking - A Case Study of Complex Grid Jobs Management
EGEE User Forum
We performed a live demonstration in booth 42. A copy of the poster we presented is available here: Image:JPDemo-poster.pdf.
Screenshots
More screenshots with description and user manual
Project objectives
The main aim of the project is to provide a case study of using a GRID infrastructure service called Job Provenance in the context of a selected application domain.
- This case study focuses on the Job Provenance service, but the whole GRID infrastructure (in our case gLite based) is involved.
- The aim is not only to demonstrate the Job Provenance service but also to gain real-world experience for its further development and dissemination.
- The software developed during this project (the end user's GUI) is designed as an example of a strongly customized tool. It is not only application-domain dependent but also optimized for the requirements and procedures of a particular team. The important part is therefore how to design and build this kind of application using available services and tools, not the application itself.
The story
- Each user community (application domain, research approach, even each team) has its own strategy for organizing its work. Sometimes it is very formal and strict (typically for security reasons), sometimes very relaxed. But usually each team has some tool to organize experiments, share resources and share experience.
- The kind of tool that collects and keeps this information may vary widely, but where there are valuable resources, lab appliances, and long or expensive experiments, there is usually at least something like a logbook.
- Many teams these days do not use only simple logbooks but create complex software tools to keep all the information and even to help prepare new experiments based on logged data (records of previous experiments). These tools have become an important part of scientific work, as they make it possible not only to share resources effectively but also to improve collaboration inside the research team.
- If the resources used are computational resources, typically managed by GRID middleware, the software tool not only maintains and organizes the collected information but can also be directly involved in the experiment itself, preparing and controlling the experiment lifecycle. Moreover, it is very often the only interface the researchers use for experiment planning and evaluation.
- Such a tool is typically built on top of a general database engine and interacts with the GRID infrastructure services only to submit and control computations (in terms of jobs) and to harvest all available information about those jobs. All interpretation of the collected data and evaluation of the experiments is done in the local database. This includes preparing new jobs based on the evaluation of previous experiments (for example by screening or expert interpretation).
- The Job Provenance service is a new GRID infrastructure service whose aim is to provide a universal, customizable backend service for building tools with the functionality described above. It collects and stores, for the long term, all the data about the job lifecycle in a GRID and allows users to attach their own information to the jobs. The Job Provenance service also provides a query interface to access all the stored data.
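The three roles described in the last point (collecting lifecycle data, letting users attach their own information, and answering queries) can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the class `ToyProvenanceStore`, its method names, and the `system:` / `user:` prefixes are hypothetical and are not the Job Provenance API.

```python
from collections import defaultdict


class ToyProvenanceStore:
    """In-memory stand-in for a provenance backend (NOT the real JP API)."""

    def __init__(self):
        # job id -> {attribute name -> value}
        self._attrs = defaultdict(dict)

    def record_system(self, job_id, name, value):
        """Store a lifecycle attribute, as the infrastructure would."""
        self._attrs[job_id][f"system:{name}"] = value

    def tag(self, job_id, name, value):
        """Attach a user-defined attribute to a job."""
        self._attrs[job_id][f"user:{name}"] = value

    def query(self, conditions):
        """Return sorted job ids whose attributes equal every condition."""
        return sorted(
            job_id
            for job_id, attrs in self._attrs.items()
            if all(attrs.get(key) == value for key, value in conditions.items())
        )


if __name__ == "__main__":
    store = ToyProvenanceStore()
    store.record_system("job-1", "status", "done")
    store.tag("job-1", "ligand", "ligA")
    store.record_system("job-2", "status", "done")
    store.tag("job-2", "ligand", "ligB")
    # Find finished jobs that used a given (hypothetical) ligand.
    print(store.query({"system:status": "done", "user:ligand": "ligA"}))
```

The point of the sketch is the separation of concerns: a customized front end only writes tags and issues queries, while storage and long-term retention are left to the backend.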
Facts and references
- The application domain is computational chemistry, reactivation of acetylcholinesterase in particular (see this link).
- The application problem solved: Development of Means against Combat Chemical Substances - find the reactivator for acetylcholinesterase inhibited by nerve agents (sarin, cyclosarin, soman, tabun or VX).
- The user group (application experts) comes from NCBR, the National Centre for Biomolecular Research.
- The demo is being prepared for presentation at the EGEE User Forum (agenda).
- The demo abstract.
- Job Provenance service
- The hosting environment for the demo is VOCE, the Virtual Organization for Central Europe, a gLite-based GRID environment.
Design
Job types
- Snapshot analysis jobs
  - objective: the input trajectory is divided into snapshots; each snapshot is then analyzed and selected characteristics of the snapshot are logged as JP attributes. The snapshots are represented by files stored at the SE.
  - the input trajectory is divided into snapshots by a local process and prepared as files in the SE; the snapshot analysis jobs register the snapshots in JP (they compute selected characteristics of each snapshot and log them as the job's JP attributes)
  - input: snapshot file, in the current implementation pre-chopped from the trajectory
  - output: snapshot file, characteristics of the snapshot in the form of JP tags
- Ligand preparation jobs
  - objective: an interactive job of ligand preparation (ligand file, ligand description expressed as JP tags); it is a model example of an interactive job track stored in JP
  - input: N/A
  - output: ligand file, JP tags describing its properties and the preparation process of that particular ligand instance (by whom, method used)
- Docking jobs
  - objective: the docking process
  - input: snapshot, ligand, docking process parameters (grid size, algorithm used, etc.)
  - output: conformers (a file describing the potential dockings found), docking protocol file (to be transformed into JP tags by a JP plug-in) with a description of the docking process
  - submit: parametric job based on a job template and the ligand-snapshot matrix
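The submission step for docking jobs, a parametric job expanded from a job template and a ligand-snapshot matrix, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the template text, the placeholder names, the `docking` command and the file names are made up for illustration and are neither the demo's actual job templates nor gLite JDL.

```python
from itertools import product
from string import Template

# Hypothetical job template; the demo's real templates are kept in its own
# CVS and are not reproduced here.
JOB_TEMPLATE = Template(
    "docking --snapshot $snapshot --ligand $ligand --grid-size $grid_size"
)


def expand_matrix(snapshots, ligands, grid_size):
    """Return one job description per (ligand, snapshot) cell of the matrix."""
    jobs = []
    for ligand, snapshot in product(ligands, snapshots):
        jobs.append(
            {
                "ligand": ligand,
                "snapshot": snapshot,
                "command": JOB_TEMPLATE.substitute(
                    snapshot=snapshot, ligand=ligand, grid_size=grid_size
                ),
            }
        )
    return jobs


if __name__ == "__main__":
    # Two snapshots x two ligands -> four docking jobs.
    for job in expand_matrix(
        ["snap_001.pdb", "snap_002.pdb"], ["ligA.mol2", "ligB.mol2"], 60
    ):
        print(job["command"])
```

Keeping the ligand and snapshot names alongside each generated command mirrors the idea that every docking job should remain traceable to its inputs, which in the demo is the role of the JP tags.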
Job submit
Infrastructure
JP index server
- on skurut68-1.cesnet.cz:8903
- using HEAD
su glite
screen -x
export GLITE_LOCATION=/home/glite/etics_HEAD/stage
ulimit -c unlimited
$GLITE_LOCATION/bin/glite-jp-indexd -q both -c /home/glite/.certs/hostcert.pem -k /home/glite/.certs/hostkey.pem -n -m jpis/@localhost:jpis_ncbr -p 8903 -x /opt/glite/etc/glite-jpis-config-ncbr.xml -d | /usr/local/bin/tscat | tee jpis-ncbr.log
Technical resources
Job Provenance source tree - this CVS branch (HEAD) contains the JP source code used for the demo
Local UF 2007 CVS - this CVS contains code and job templates specific to the demo


