Server side installation instructions: setting up a NorduGrid resource

General notes:


Pre-installation steps:

General requirements for equipment

Hardware, operating system etc

The NorduGrid middleware, also known as the Advanced Resource Connector (ARC), does not impose heavy requirements on hardware. Any 32-bit architecture will do, as well as some 64-bit ones (Alpha). CPUs from 400 MHz and RAM from 128 MB up have been tested. Disk space required for the ARC installation is 35 MB, while external software (most notably, the Globus Toolkit 2) requires 150 MB. Servers (front-ends, gatekeepers, database servers, storage arrays etc.) require both outbound and inbound network connectivity. In case you are behind a firewall, a range of ports will have to be opened. For clusters, the worker nodes can be on either a private or a public network.

A shared file system, such as NFS, is desired (due to simplicity) but not required if the local resource management system provides means for file staging to/from computing nodes or if execution happens on the same machine (like it does with fork). Local non-Unix user authentication is supported through callouts to external executables or functions in dynamically loadable libraries. Actual implementation (e.g., for AFS) requires site-specific modules to be provided.

The NorduGrid ARC middleware is expected to run on any system supported by Globus. At the moment, only the following GNU/Linux distributions have been tested: RedHat 6.2 through 9, Fedora 1, Mandrake 8.0 and 9.1, SuSE 8.1 through 9.0 and Debian 3.0.

DNS Requirements for GSI (Globus security infrastructure)

In order for the authentication of a server's host certificate to be successful, the reverse DNS lookup of the IP address of the server must result in the hostname given in the host certificate.

This means that the reverse DNS lookup for a host running a GSI enabled service must be configured properly - a "host not found" result is not acceptable. When a server has several hostnames/aliases the host certificate should be requested with the hostname that is used in the reverse lookup table in the DNS.

This reverse lookup must work for all clients trying to connect to the server, including clients running on the machine itself. Even if the host is a dedicated server and no user interface commands are being run on it, other clients such as the uploader and downloader processes run by the Grid Manager require GSI authentication to work.

Since the hostname in the host certificate is fully qualified the reverse lookup must yield the fully qualified hostname. If the /etc/hosts file is used for local lookups instead of DNS make sure that the fully qualified hostname is listed before any shortnames or aliases for the server host.

If e.g. the /etc/hosts file of the server looks like this

1.2.3.4    somename    somename.domain.com

any clients running on that machine can NOT contact servers on the machine itself since the result of a reverse lookup will be the unqualified hostname "somename" which will not match the fully qualified hostname in the host certificate. Such an /etc/hosts file should be modified to read

1.2.3.4   somename.domain.com   somename
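
A quick way to verify the setup from the server itself is sketched below; the hostname and IP address are placeholders for your own server, and the commands only query the live resolver configuration:

```shell
# Check that forward and reverse lookups agree on the fully qualified name.
hostname --fqdn          # should print the FQDN, e.g. somename.domain.com
getent hosts 1.2.3.4     # the first name listed must be the FQDN
host 1.2.3.4             # reverse lookup must not return "host not found"
```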

Time synchronization

Since authorization on the Grid relies on temporary proxies, it is very important to synchronize the clocks of your machines with a reliable time server. If the clock on a cluster is off by 3 hours, then depending on the direction of the offset, the cluster will either reject a newly created user proxy for the first 3 hours of its lifetime and then accept it for 3 hours longer than intended, or start rejecting it 3 hours too early.
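
On the Linux distributions listed above this is typically done with NTP; a minimal sketch (the time server name is just an example, pick a reliable one for your site):

```shell
# One-off adjustment against a public time server
ntpdate pool.ntp.org
# Keep the clock synchronized permanently
chkconfig ntpd on
service ntpd start
```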

Clusters:

  1. First, you have to create (some) UNIX accounts on your cluster dedicated for Grid. These local UNIX accounts will be used to map Grid users locally and every Grid job or Grid activity will take place via these accounts. In the simplest scenario, it is enough to create a single account, e.g. a user called grid, but you can also have separate accounts for the different Grid user groups. You may group the created Grid accounts into UNIX groups and use the local UNIX authorization methods to restrict the Grid accounts.
  2. Create disk areas on the front-end which will be used by the Grid services. A typical setup is given in the table below with example locations indicated. NFS means that the directory has to be available on the nodes. It is recommended to put the grid area and the cache directory onto separate volumes (partitions, disks). The cache can be split into 2 subdirectories:
    1. a "control" subdirectory for control files and
    2. a "data" subdirectory for the cached data itself.
    For security reasons, the control directory should not be remotely accessible.


  3. The Grid directories, with example locations:
    • grid area (required; NFS): the directory which accommodates the session directories of the Grid jobs. Example: /scratch/grid
    • cache directory (optional; NFS): the place where the shared input files of the Grid jobs are kept. Example: /scratch/cache
    • runtime environment scripts (optional; NFS): the place for the initialization scripts of the pre-installed software environments. Example: /SOFTWARE/runtime
    • control directory (required; local to the front-end): the directory for the internal control files of the Grid Manager. Example: /var/spool/nordugrid/jobstatus


    Further notes on the Grid directories: some of the NFS requirements can be relaxed with a special cluster setup and configuration. For the possible special setups please consult the Grid Manager documentation.
  4. Check the network connectivity of the computing nodes. For the NorduGrid middleware, internal cluster nodes are NOT required to be fully available on the public internet (however, user applications may eventually require it). Nodes can have inbound, outbound, both or no network connectivity. This nodeaccess property should be set in the configuration (see below).
  5. Make your firewall Grid-friendly: there are certain ports and port ranges which need to be opened in case your Grid resource is behind a firewall. All of the requirements come from the Globus internals (you can read more on Globus and firewalls). NorduGrid ARC needs the following ports to be opened:
    • For MDS, default 2135
    • For GridFTP, default 2811
    plus a range of ports for GridFTP data channels. gridftpd by default handles 100 connections simultaneously; each connection should not use more than 1 additional TCP port. Taking into account that Linux tends to keep ports in use for some time even after the handle is closed, it is a good idea to triple that amount, hence about 300 ports should be enough for the default configuration. If you are using the globus-config package, set GLOBUS_TCP_PORT_RANGE=9000,9300 in /etc/sysconfig/globus and open that range together with ports 2135 and 2811. The above ports for the MDS and GridFTP services are the recommended defaults and can be modified in nordugrid.conf, globus.conf and /etc/sysconfig/globus.
  6. Configure the PBS batch system for Grid use. In a typical scenario, a Grid queue (or queues) has to be created, all or some of the cluster nodes assigned to the Grid queues, and PBS queue and user limits set for the Grid queues and the Grid accounts. The NorduGrid PBS configuration instructions are a good starting point. DO NOT use PBS routing queues as Grid queues – they are not supported.
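
The account, directory and PBS steps above might look as follows; all names, paths and limits here are examples taken from the table above, not requirements:

```shell
# Step 1: a single mapped Grid account in its own UNIX group
groupadd gridusers
useradd -m -g gridusers grid

# Step 2: disk areas (example locations from the table)
mkdir -p /scratch/grid /scratch/cache /SOFTWARE/runtime
mkdir -p /var/spool/nordugrid/jobstatus   # must stay local to the front-end

# Step 6: a dedicated PBS execution queue (NOT a routing queue)
qmgr -c "create queue gridlong queue_type=execution"
qmgr -c "set queue gridlong resources_max.walltime=48:00:00"
qmgr -c "set queue gridlong enabled=true"
qmgr -c "set queue gridlong started=true"
```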

Storage Element:

  1. Install a standard Linux box with a dedicated disk storage area. In case the SE wants to serve several Grid user groups (or Virtual Organizations) it is preferable to dedicate separate disks (volumes, partitions, etc.) for the different Grid user groups.
  2. Creating Grid accounts: you have to create (some) UNIX accounts dedicated for Grid. These local UNIX accounts will be used to map Grid users locally and the data stored on the storage element will be owned by these accounts. In the simplest scenario, it is enough to create a single account, e.g. a user called grid, but you can also have separate accounts for the different Grid user groups. You may find it useful to put all the Grid accounts into the same UNIX group.
  3. Make your firewall Grid-friendly: there are certain ports and port ranges which need to be opened in case your Grid resource is behind a firewall. All of the requirements come from the Globus internals (you can read more on Globus and firewalls). NorduGrid ARC needs the following ports to be opened:
    • For MDS, default 2135
    • For GridFTP, default 2811
    plus a range of ports for GridFTP data channels. gridftpd by default handles 100 connections simultaneously; each connection should not use more than 1 additional TCP port. Taking into account that Linux tends to keep ports in use for some time even after the handle is closed, it is a good idea to triple that amount, hence about 300 ports should be enough for the default configuration. If you are using the globus-config package, set GLOBUS_TCP_PORT_RANGE=9000,9300 in /etc/sysconfig/globus and open that range together with ports 2135 and 2811. The above ports for the MDS and GridFTP services are the recommended defaults and can be modified in nordugrid.conf, globus.conf and /etc/sysconfig/globus.
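
If the firewall is managed with iptables, opening the ports above could look like the sketch below; the 9000:9300 range matches the GLOBUS_TCP_PORT_RANGE example and should be adjusted to whatever range you actually configure:

```shell
iptables -A INPUT -p tcp --dport 2135 -j ACCEPT        # MDS
iptables -A INPUT -p tcp --dport 2811 -j ACCEPT        # GridFTP control
iptables -A INPUT -p tcp --dport 9000:9300 -j ACCEPT   # GridFTP data channels
# tell Globus to use the same data-channel range
echo 'GLOBUS_TCP_PORT_RANGE=9000,9300' >> /etc/sysconfig/globus
```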

Collecting & Installing the Grid software (middleware):

The same basic server software is needed for both cluster and storage resources. The NorduGrid download area contains all the required software, including the necessary external packages, both as precompiled binaries for many Linux distributions and as source distributions. Binaries are available either as relocatable RPMs or as tarballs. RPMs are the preferred installation method for most distributions; if you are not familiar with RPM, read our "RPM for everybody" guide.

NorduGrid ARC middleware depends on several external packages, most notably, the Globus Toolkit 2 (GT2). This manual addresses only Globus installation; other dependencies are described in the ARC build instructions. Globus Toolkit itself depends on Grid Packaging Tools (GPT).

  1. install the NorduGrid version of the Globus Toolkit 2 (preferably the latest release) onto your system by getting and installing the following packages in this order:
    1. Grid Packaging Tools
    2. Globus Toolkit 2
    3. Globus configuration files
    To do this, go to the NorduGrid Downloads area and in External software select gpt, globus, globus-config. NorduGrid provides the Globus Toolkit as precompiled binaries (RPMs or tarballs) for a variety of Linux systems, as well as the Globus source distribution, suitable for re-building on a new system. Such a re-build would need a set of external packages, as described in notes on NorduGrid distribution of GT2. Unless it is relocated, GPT and Globus are installed under /opt/gpt and /opt/globus respectively. In any case, check that the variables GLOBUS_LOCATION and GPT_LOCATION are set according to your Globus installation.
    On some systems, some Perl packages may be required; they are also available in the download area's External Software section. Although you can use any other Globus distribution, we recommend using the NorduGrid Globus distribution, since it contains some critical fixes and allows you to install the NorduGrid ARC middleware from binary packages. The modifications within the NorduGrid Globus distribution are collected and described in Specifics of the NorduGrid release of the Globus Toolkit 2. Furthermore, you may run into unforeseen problems if not using the globus-config configuration used within NorduGrid. If you choose to use another Globus installation, you will need to get the NorduGrid source distribution and recompile it against your Globus installation, and you will also have to fall back on the original Globus configuration files and documentation.
  2. Check that all the necessary external dependencies are satisfied, or download and install missing packages, if any.
  3. Download and install the required NorduGrid middleware packages from the NorduGrid Downloads area, "NorduGrid middleware" section, "releases". The same section contains nightly and unstable development tags. For a stable production site ONLY use releases. You definitely need the nordugrid-server package, while the nordugrid-client, nordugrid-devel and the nordugrid-doc packages are optional but recommended. It is useful to have the client installed on a server for testing purposes. You may need to install some of the external packages in order to satisfy dependencies.
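
For an RPM-based system, the whole sequence might look like the sketch below; the exact package file names vary per release and distribution, so the wildcards are illustrative:

```shell
# external software, in dependency order
rpm -Uvh gpt-*.rpm
rpm -Uvh globus-*.rpm
rpm -Uvh globus-config-*.rpm
# NorduGrid middleware; the client is optional but handy for testing
rpm -Uvh nordugrid-server-*.rpm nordugrid-client-*.rpm
```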

Re-building NorduGrid ARC middleware on top of an existing Globus installation

See also detailed instructions on how to build the NorduGrid ARC middleware.

  1. Check that the variables GLOBUS_LOCATION and GPT_LOCATION are set according to your Globus installation. Unless you forced the installation into a specific location, default locations should be /opt/globus and /opt/gpt respectively.
  2. Check that all the necessary external dependencies are satisfied, or download and install missing packages, if any.
  3. Get the NorduGrid ARC source RPM nordugrid-<x.y.z-1>.src.rpm from the NorduGrid Downloads area ("NorduGrid Toolkit" section, "releases" or "tags" – "source") and rebuild it: rpm --rebuild nordugrid-<x.y.z-1>.src.rpm
  4. Alternatively, you can get a tarball nordugrid-<x.y.z>.tar.gz, and follow the usual procedure: tar xvzf nordugrid-<x.y.z>.tar.gz
    cd nordugrid-<x.y.z>
    ./configure
    make
    make install

Setting up the Grid Security Infrastructure: Certificates, Authentication & Authorization

Read the following section carefully, as your resource will not be able to function if it has improper or outdated credentials.

The following considerations apply for both clusters and storage elements. You may find useful our certificate mini Howto.

  1. Your site needs certificates for the Grid services, issued by your regional Certificate Authority (CA). The minimum is a host certificate, but we recommend having an LDAP (or MDS) certificate as well. In order to generate a certificate request, you will need to get and install the credentials (public key, configuration files etc.) of the CA to which you are going to submit the request, e.g., from the NorduGrid Downloads area, "CA certificates" section. In case your resource is in a Nordic country, install the certrequest-config package from the same section; it contains the default configuration for generating certificate requests for Nordic-based services and users. If you are located elsewhere, contact your local CA for details. Generate a host certificate request with grid-cert-request -host <my.host.fqdn> and an LDAP certificate request with grid-cert-request -service ldap -host <my.host.fqdn>, and send the request(s) by e-mail to the corresponding CA for signing. Upon receipt of the signed certificates, place them into the proper location (by default, /etc/grid-security). Check that the files are owned by root, that the private keys are readable only by root, and that none of the files has executable permissions.
  2. Set up your authentication policy: decide which certificates your site will accept. You need the credentials of all the CAs which certified the services you plan to use and the users you plan to accept. For example, if your host certificate is issued by the NorduGrid CA, your user has a certificate issued by the Estonian CA, and she is going to transfer files between your site and Slovakia, you need the NorduGrid, Estonian and Slovak CA credentials. You are strongly advised to obtain credentials from each CA by contacting them. To simplify this task, the NorduGrid Downloads area, "CA certificates" section, has a non-authoritative collection of most known CA credentials. As soon as you decide on the list of trusted certificate authorities, you simply download and install the packages containing their public keys etc. Before installing any CA package, you are advised to check the credibility of the CA and verify its policy!
  3. The Certificate Authorities are responsible for maintaining lists of revoked personal and service certificates, known as CRLs (Certificate Revocation Lists). It is the site's (that is, your) responsibility to check the CRLs regularly and deny access to Grid users presenting a revoked certificate. An outdated CRL will render your site unusable. NorduGrid provides an automatic tool for regular CRL check-ups: we recommend installing the nordugrid-ca-utils package from the NorduGrid Downloads area, "NorduGrid Toolkit" section. The utility periodically keeps track of the CA revocation lists.
  4. Set up your authorization policy: decide which Grid users or groups of Grid users (Virtual Organizations) are allowed to use your resource, and define the Grid mappings (Grid users to local Unix users). The Grid mappings are listed in the so-called grid-mapfile. Within NorduGrid, there is an automatic tool which keeps the local grid-mapfiles synchronized with a central user database. If your site joins NorduGrid, you are advised to install the nordugrid-gridmap-utils from the NorduGrid Downloads area, "NorduGrid Toolkit" section. After installation you need to edit /etc/grid-security/nordugridmap.conf and optionally create /etc/grid-security/local-grid-mapfile (the latter file name is configurable in nordugridmap.conf) containing your local mappings. For further info on authorization, read the NorduGrid VO documentation. IMPORTANT: you either maintain the Grid mappings by hand, editing /etc/grid-security/grid-mapfile directly, or use the nordugrid-gridmap-utils (the nordugridmap script run through cron) to create and maintain the mappings file for your site. In the latter case, the utility keeps the grid-mapfile synchronized with the central NorduGrid user list. If you install the nordugrid-gridmap-utils, you ONLY have to edit nordugridmap.conf and optionally the local-grid-mapfile: /etc/grid-security/grid-mapfile is periodically overwritten by the nordugridmap script!
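
The certificate steps above, condensed into commands; replace my.host.fqdn with your host's fully qualified name, and note that the chmod/chown values reflect the ownership rules stated in step 1:

```shell
# generate the requests to be e-mailed to your CA
grid-cert-request -host my.host.fqdn
grid-cert-request -service ldap -host my.host.fqdn

# after the signed certificates are back in /etc/grid-security:
chown root:root /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem
chmod 644 /etc/grid-security/hostcert.pem   # the certificate is public
chmod 400 /etc/grid-security/hostkey.pem    # the key is readable by root only
```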

Configuring the Grid resource:

The next step is the configuration of your resource. Some files have to be edited, both for the computing cluster and for the Storage Element (or for a combined cluster-and-SE resource). The configuration templates temporarily serve as configuration documentation, with a detailed description of the configuration parameters and options. The configuration file consists of dedicated blocks for cluster- and SE-related services; omitting a block means not running the corresponding service on the resource.

  1. Create your /etc/nordugrid.conf by using the configuration template nordugrid.conf.template from the /opt/nordugrid/share/doc directory (provided you installed the NorduGrid ARC middleware under /opt/nordugrid/). With the nordugrid.conf you can configure the basic services and processes like the GridFTP server, the Grid Manager, the job submission interface, Grid storage areas and the information providers.
  2. Create your /etc/globus.conf by using the configuration template nordugrid-globus.conf.template from the /opt/nordugrid/share/doc directory (provided you installed the NorduGrid ARC middleware under /opt/nordugrid/). With globus.conf you basically set up your core information system services (configure the OpenLDAP server and back-ends together with the registration processes). If your site is going to provide resources via the NorduGrid production grid, you will need to check the latest NorduGrid GIIS Information for the list of country-level and core NorduGrid Grid Information Index Services to which your host will have to register.
  3. Optionally, you can setup Runtime Environments on your computing cluster. Setting up a Runtime Environment means installing a specific application software package onto the cluster in a centralized and shared manner (the software package is made available for the worker nodes as well!), and placing a Runtime Environment initialization script (named after the Runtime Environment) into the dedicated directory.
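
As an illustration, a Runtime Environment initialization script is typically a small shell fragment that is sourced before the job runs; everything below (the Runtime Environment name, paths and variables) is hypothetical:

```shell
# /SOFTWARE/runtime/APPS/EXAMPLE-1.0 -- hypothetical Runtime Environment script
# Makes the pre-installed package visible to the job on the worker node.
export EXAMPLE_HOME=/SOFTWARE/apps/example-1.0
export PATH=$EXAMPLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$EXAMPLE_HOME/lib:$LD_LIBRARY_PATH
```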

Startup scripts, services, logfiles, debug mode, test-suite:

  1. After a successful installation and configuration of a NorduGrid resource, the following services must be started:
    • Launch the GridFTP server: /etc/init.d/gridftpd start
    • Launch the Information System (LDAP server) and the registration processes: /etc/init.d/globus-mds start
    • Launch the Grid Manager daemon (not needed for a Storage Element): /etc/init.d/grid-manager start
    The Grid Manager and gridftpd can be run under any account (configurable in nordugrid.conf), while the LDAP server currently requires system administrator privileges. Moreover, the current startup scripts require root privileges (although the Grid Manager and gridftpd daemons themselves can happily run under a non-privileged account). Make sure that the host and service certificates are owned by the corresponding users (those in whose name the services are started).
  2. The log files can be used to check the services:
    • the Information System uses the /var/log/infoproviders and the /var/log/globus-mds files
    • gridftpd writes its log to /var/log/gridftpd.log; the debug level can be set in nordugrid.conf
    • the Grid Manager uses /var/log/grid-manager.log for general information and /var/log/gm-jobs.log for logging job information; the debug level is set in nordugrid.conf.
    Please note that the log files are not fully relocatable yet, and log rotation can be problematic too. The startup scripts log failures to the syslog. Once the server is up and running, you should consult the corresponding server's log file. If a service fails even to start up, check the syslog file of your system, normally /var/log/messages. Globus MDS also reports debug messages to syslog; these can be directed to a file by inserting a line like
    local4.*       /var/log/mds-debug.log
    into /etc/syslog.conf
  3. Debug information: in both nordugrid.conf and globus.conf, different debug levels can be set and core dump files can be enabled. If you experience service crashes, please try to produce core files. The Grid Manager and gridftpd daemons are compiled non-stripped, while slapd is stripped. Please note that enabling debugging results in serious performance losses (especially in the case of the MDS LDAP server), therefore use the default level of debugging on a production system.
  4. The NorduGrid client comes with the ngtest utility; use it to test the basic functionality of the computing resource. The utility includes several tests, e.g., simple upload and download tests. A complete list of test cases is obtained by issuing
    ngtest -list-cases
    Prior to submitting test jobs, make sure you possess a valid user certificate, have generated a valid Grid proxy and have the credentials of all the necessary CAs installed. Consult the User Guide for detailed information on certificates, proxies and CA credentials. For a quick installation validation, run the default test (number 0) against your resource:
    ngtest -c <my.host.fqdn> -d 1
    This will execute a complex Grid job, including staging of files to the computing resource (downloading input files from several locations and caching), compiling a small binary executable and running a test calculation on the resource. We recommend running at least this default test against a newly installed resource, and fetching the job output with:
    ngget -a -d 1
    See the ngtest description or man page for more details on the test suite. The gsincftp GridFTP client (which comes with the NorduGrid Globus) can be used for testing the Storage Element setup:
    gsincftp gsiftp://<my.host.fqdn>
    This opens an FTP connection using Grid access control methods. You should be able to browse the remote site using the usual FTP commands (as implemented in the ncftp client).