Appendix B: good practice ideas and examples

This appendix contains ideas and examples of good practice in Grid support, gathered from conferences and meetings with organisations that are making effective use of Grid technologies. It is hoped to expand the collection in future editions: please send suggestions for inclusion to Andrew.Cormack@ja.net.

Where possible, references to the source of each idea are included, though some of the case studies were provided in personal communications or on condition of anonymity.

Service Definitions

  • Documentation of the services provided by the UK’s National Grid Service is at: http://www.grid-support.ac.uk/content/category/7/29/49/
  • A Security Policy for UK e-Science projects is at: http://www.pparc.ac.uk/Rs/Fs/Es/New_Security_Policy.asp
  • The LHC Computing Grid project is carrying out a series of ‘Security Challenges’ to ensure that security measures implemented on components of the Grids are working as intended. The security challenges are designed to test both the technical and human implementation of security measures – for example, running a job on the Grid and then asking sites to trace which users, processes and files were related to the job. This activity should help to ensure that when a truly malicious job appears on the Grid, service managers both have the tools to identify and deal with it and are familiar with how to use them: https://twiki.cern.ch/twiki/bin/view/LCG/LCGSecurityChallenge
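
As an illustration of the tracing step in such a challenge, the sketch below pulls out every log entry that mentions a given job identifier. The log location and line format here are hypothetical assumptions – each Grid middleware has its own formats – so this shows the approach rather than working tooling for any particular system.

    # Minimal sketch: trace a Grid job through a site's middleware log.
    # The log path and line layout are hypothetical assumptions.
    import re
    import sys

    LOG_FILE = "/var/log/grid/gatekeeper.log"   # hypothetical location

    def trace_job(job_id, log_file=LOG_FILE):
        """Yield (timestamp, user, detail) for log lines mentioning job_id."""
        pattern = re.compile(r"^(\S+ \S+) user=(\S+) (.*)$")  # assumed layout
        with open(log_file) as f:
            for line in f:
                if job_id in line:
                    match = pattern.match(line)
                    if match:
                        yield match.groups()

    if __name__ == "__main__":
        for when, user, detail in trace_job(sys.argv[1]):
            print(when, user, detail)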

Network Support

Network Design

A wide range of technical measures for adapting networks to the requirements of Grids can be found in the companion Technical Guide Deploying Grids. In some cases there will be a number of possible measures to choose from, working at different levels of the transmission system. If lack of bandwidth or congestion is the problem, the solution is likely to involve re-arranging or improving the physical infrastructure. Security requirements may be addressed at the physical layer, at the link layer (for example by using VLANs) or at the application layer by using tunnels. Application-layer gateways can often simplify other security measures: for example, a number of sites are planning to use Condor-G to link their Campus Grids together using Globus protocols, rather than running the more complex Condor protocols over wide area networks.

In some cases it may be necessary to connect a demanding Grid resource directly to the external network, if its traffic would be too disruptive to existing security measures. A resource connected in this way must have its own security measures, as it will have no other protection from the hostile ‘background noise’ of the Internet. Any internal systems that need to connect to the exposed Grid server may also need special security measures to prevent them acting as bridges for hostile activity.
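
One simple check on such an exposed resource is to probe it from outside and confirm that only the intended services answer. The sketch below does this with plain TCP connection attempts; the host name and port lists are illustrative assumptions (port 2119 is the traditional Globus gatekeeper port), and a real security review would of course go much further than a port scan.

    # Minimal sketch: confirm an exposed Grid server answers only on the
    # ports it is meant to. Host and port lists are illustrative.
    import socket

    HOST = "grid-head.example.ac.uk"            # hypothetical exposed resource
    EXPECTED_OPEN = {22, 2119}                  # SSH and the Globus gatekeeper
    SHOULD_BE_CLOSED = {23, 25, 80, 139, 445}   # services with no business here

    def is_open(host, port, timeout=2.0):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for port in sorted(EXPECTED_OPEN | SHOULD_BE_CLOSED):
        state = "open" if is_open(HOST, port) else "closed"
        expected = "open" if port in EXPECTED_OPEN else "closed"
        flag = "" if state == expected else "  <-- unexpected!"
        print(f"port {port}: {state}{flag}")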

Network Performance

The GN2 project included a Performance Enhancement and Response Team (PERT) that investigated network performance problems and built up a knowledge base of common problems and solutions, as well as strategies for investigating problem reports. The results are published as a wiki (http://pace.geant2.net/cgi-bin/twiki/view/PERTKB/WebHome) and a report (http://www.geant2.net/upload/pdf/GN2-05-176v4.1.pdf).
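
Many of the problems in the knowledge base are first noticed as poor end-to-end throughput, so a reproducible memory-to-memory transfer test is a common first diagnostic step. The sketch below is a minimal version of such a test, in the spirit of tools like iperf; the port number and transfer size are illustrative assumptions.

    # Minimal sketch of a memory-to-memory TCP throughput test.
    # Run with no arguments on the receiver, then with the receiver's
    # host name on the sender. Port and transfer size are illustrative.
    import socket
    import sys
    import time

    PORT = 5001           # assumed free port
    CHUNK = 64 * 1024     # 64 KB send/receive buffer
    TOTAL = 100 * 2**20   # send 100 MB in total

    def server():
        with socket.create_server(("", PORT)) as srv:
            conn, addr = srv.accept()
            received, start = 0, time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:          # sender closed the connection
                    break
                received += len(data)
            elapsed = time.time() - start
            print(f"{received / elapsed / 2**20:.1f} MB/s from {addr[0]}")

    def client(host):
        with socket.create_connection((host, PORT)) as conn:
            sent, chunk = 0, b"\0" * CHUNK
            while sent < TOTAL:
                conn.sendall(chunk)
                sent += len(chunk)

    if __name__ == "__main__":
        if len(sys.argv) > 1:
            client(sys.argv[1])   # e.g. python perftest.py recv.example.ac.uk
        else:
            server()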

System Support

Campus Grids are often constructed where there is an existing managed workstation service – for example, covering open access terminal rooms. Since the software image run on these systems is managed centrally and installed automatically on all workstations, it is relatively straightforward to install Grid software as part of the standard image. This also allows upgrades and patches to be rolled out relatively easily:

http://www.nesc.ac.uk/action/esi/download.cfm?index=2457
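
A side benefit of the managed-image approach is that drift from the standard image is easy to detect. The sketch below shows one way a workstation might compare its installed Grid packages against what the central image is supposed to contain; the manifest URL, file locations and format are all hypothetical.

    # Minimal sketch: report Grid packages that differ from the central
    # image. The manifest URL, record path and format are hypothetical.
    import json
    import urllib.request

    MANIFEST_URL = "http://images.example.ac.uk/grid-manifest.json"  # hypothetical
    LOCAL_RECORD = "/var/lib/grid/installed.json"                    # hypothetical

    def check_image():
        expected = json.load(urllib.request.urlopen(MANIFEST_URL))
        with open(LOCAL_RECORD) as f:
            installed = json.load(f)
        stale = {pkg: ver for pkg, ver in expected.items()
                 if installed.get(pkg) != ver}
        for pkg, ver in stale.items():
            print(f"{pkg}: expected {ver}, found {installed.get(pkg, 'missing')}")
        return not stale   # True when the workstation matches the image

    if __name__ == "__main__":
        raise SystemExit(0 if check_image() else 1)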

Care may be needed where the workstations forming a Campus Grid are (or are perceived to be) intended for use by a particular faculty or department, rather than as a general university facility. In these circumstances there may be objections to running jobs from other departments or organisations unless local users and system managers are consulted. In some cases it may be appropriate to have separate Grids incorporating central and departmental machines, or at least to apply different access rules or priorities (if the software allows this).
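
The kind of access rule involved can be very simple. The sketch below illustrates the idea in isolation; the departments, pool names and policy are invented for the example, not drawn from any particular scheduling software, which would normally express such rules in its own configuration language.

    # Minimal sketch of a departmental access rule for a Campus Grid.
    # Departments, pool names and the policy itself are illustrative.
    ALLOWED_POOLS = {
        "chemistry": {"chemistry-lab", "central-openaccess"},
        "physics":   {"physics-lab", "central-openaccess"},
        "external":  {"central-openaccess"},   # visitors on central kit only
    }

    def may_run(job_department, workstation_pool):
        """Return True if a job from the given department may run on a
        workstation in the given pool."""
        return workstation_pool in ALLOWED_POOLS.get(job_department, set())

    assert may_run("chemistry", "chemistry-lab")
    assert not may_run("external", "physics-lab")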

User and Application Support

The computing service at one organisation holds regular meetings with departmental users. At these meetings the opportunity of using the Campus Grid was discussed and promising projects or activities were identified. Computing service staff then worked with those projects to help them port their applications to the Grid. Although it will not be possible to offer this type of one-to-one assistance indefinitely, it is hoped that enough experience and good practice information will be gathered that future projects will need less intensive support.

Successful early adopters should be used to promote the use of the Grid. Where researchers have had a good experience and achieved things that would not have been possible without the use of Grids, they are likely to report this to colleagues both within and outside the organisation so that use can grow by word of mouth. Internal newsletters and published papers can be effective ways to promote Grid use.

Plan for your Grid to be successful! Know in advance what you will do when demand fills the available resources, and don’t make promises to early adopters that you will not be able to keep once they are no longer the only users of the Grid.

A common problem for Campus Grids is that the available workstations run Windows® operating systems, whereas large applications are often written for various flavours of UNIX®. Rebooting machines into ‘Grid mode’ overnight is one possible solution, but it needs to be scheduled carefully to avoid disrupting the primary use. Operating system virtualisation, in which a UNIX® virtual machine runs on a Windows® platform, may be preferable:

http://www.nesc.ac.uk/action/esi/download.cfm?index=2462

[Note that since the presentation at the above URL was given, the VMware Player software has been made available free of charge.]
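
Where the reboot approach is used, the decision about when to switch a workstation into Grid mode is simple to express. The sketch below captures the two conditions mentioned above – out of hours and nobody at the console – with illustrative times; a real deployment would also need to honour bookings for terminal rooms and the like.

    # Minimal sketch of the overnight 'Grid mode' decision. The quiet-period
    # hours and the console check are illustrative assumptions.
    from datetime import datetime, time

    GRID_HOURS = (time(21, 0), time(7, 0))   # assumed quiet period 21:00-07:00

    def in_grid_hours(now):
        start, end = GRID_HOURS
        t = now.time()
        # The window wraps past midnight, so test the two halves separately.
        return t >= start or t < end

    def should_enter_grid_mode(now, console_in_use):
        """Reboot into Grid mode only out of hours and only if nobody is
        sitting at the workstation."""
        return in_grid_hours(now) and not console_in_use

    assert should_enter_grid_mode(datetime(2007, 1, 10, 23, 30), False)
    assert not should_enter_grid_mode(datetime(2007, 1, 10, 14, 0), False)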