	PBS Configuration Instructions for the NorduGrid Testbed
		

Introduction:
-------------

PBS is a very powerful Local Resource Management System (batch system)
with dozens of configurable options. Server, queue and node attributes
can be used to configure the cluster's behaviour.

In order to interface PBS correctly with the NorduGrid architecture
(mainly the information provider scripts), the local system administrator
is asked to implement a few configuration requirements.


Required configuration:
------------------------

1, The computing nodes MUST be declared as cluster nodes (job-exclusive);
at the moment time-shared nodes are not supported by the NorduGrid setup.
If you intend to run more than one job on a single processor, you can use
the virtual processor feature of PBS.

2, For each queue you MUST set at least one of the max_user_run or max_running
attributes, and its value SHOULD AGREE with the number of available
resources (i.e. don't set max_running = 10 if you have only six (virtual)
processors in your system). If you set both max_running and max_user_run,
then max_user_run has to be less than or equal to max_running.

3, For the time being, do NOT set server-level limits such as max_running;
please use queue-based limits instead.

4, Avoid using the max_load and ideal_load directives. The node's mom
config file (PBS_LOCATION/mom_priv/config) should not contain any
max_load or ideal_load directives.
PBS closes down a node (no jobs are allocated to it) when the load on the node
reaches the max_load value. The max_load value is meant for controlling
time-shared nodes. For job-exclusive nodes there is no need to set
these directives; moreover, incorrectly set values can close
down your node.
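
The queue-based limits described above can be sketched as qmgr input.
This is only an illustration: it assumes a queue named "short" on a
cluster with six (virtual) processors, as in the example configuration
at the end of this document. Adjust the names and values to your site.

```
# Queue-based limits: max_user_run <= max_running <= (virtual) processors
set queue short max_running = 6
set queue short max_user_run = 3

# Remove a server-level limit in case one was set earlier
unset server max_running
```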


Optional configuration, hints:
-------------------------------

1, If possible, please use queue-based attributes instead of server-level
ones (for the time being, do not use server-level attributes at all).

2, You may set "acl_user_enable = True" together with
"acl_users = user1,user2" to enable user access control for a queue.

3, It is advisable to set the "max_queuable" attribute in order to avoid
a painfully long dead queue.

4, Assigning nodes to a queue: In case of an inhomogeneous cluster
it is preferable to group identical nodes together and create a
queue for each group. With the settings below you can have a
queue serving the "Athlon" nodes while another queue could be created
for the "dual" nodes. In order to assign nodes to a queue
you can use node properties from the $PBS/server_priv/nodes file
together with the "resources_default.neednodes" queue attribute.
The example nodes file below (which sets the "Athlon" node property
for node1 and node4) together with the
"set queue pc resources_default.neednodes = Athlon" qmgr setting
results in a PBS configuration where the queue pc is assigned to the
nodes with Athlon processors. By default the jobs from the pc queue will
execute on Athlon nodes. With PBSPro this assignment is even easier:
simply set "queue=pc" in the nodes file for the required nodes; after that
there is no need to set the "pc" queue's resources_default.neednodes
qmgr attribute.
For OpenPBS, however, you need both the node "property" and
resources_default.neednodes to be set.
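
The node-to-queue assignment and the optional queue attributes described
above can be sketched as qmgr input. The queue name "pc" and the "Athlon"
and "dual" properties come from the text; the second queue name "smp",
the user names and all numeric values are hypothetical and only serve as
an illustration.

```
# Queue for the nodes carrying the "Athlon" property
# (OpenPBS: the property must also be set in the nodes file)
create queue pc
set queue pc queue_type = Execution
set queue pc resources_default.neednodes = Athlon
set queue pc max_running = 2
set queue pc enabled = True
set queue pc started = True

# Hypothetical second queue for the nodes carrying the "dual" property
create queue smp
set queue smp queue_type = Execution
set queue smp resources_default.neednodes = dual
set queue smp max_running = 4
set queue smp enabled = True
set queue smp started = True

# Optional: restrict the pc queue to selected users and cap its length
set queue pc acl_user_enable = True
set queue pc acl_users = user1,user2
set queue pc max_queuable = 50
```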



Checking Your Configuration:
----------------------------

1, The node definition can be checked by

$PBS_PATH/bin/pbsnodes -a

All the nodes MUST have "ntype=cluster"


2, The required queue attributes can be checked with:

$PBS_PATH/bin/qstat -f -Q  queuename

There MUST be a max_user_run or a max_running attribute listed with
a REASONABLE value.
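
The two checks above could be automated along the following lines. This
is a rough sketch, not a supported tool: the helper name check_ntype is
hypothetical, and it assumes that "pbsnodes -a" prints each node name on
an unindented line followed by indented "attribute = value" lines.

```shell
# Hypothetical helper: reads "pbsnodes -a" style output on stdin and
# prints every node whose ntype is not "cluster".
# Exit status is 0 when all listed nodes are cluster nodes, 1 otherwise.
check_ntype() {
    awk '
        BEGIN { bad = 0 }
        # A non-empty, unindented line is assumed to start a node record.
        NF > 0 && $0 !~ /^[ \t]/ { node = $1; next }
        # Indented attribute line, e.g. "ntype = cluster" or "ntype=cluster".
        /^[ \t]+ntype[ \t]*=/ {
            v = $0
            sub(/^[^=]*=[ \t]*/, "", v)   # drop everything up to the "="
            sub(/[ \t]+$/, "", v)         # drop trailing whitespace
            if (v != "cluster") { print node ": ntype = " v; bad = 1 }
        }
        END { exit bad }
    '
}

# Usage on a live system (requires PBS):
#   $PBS_PATH/bin/pbsnodes -a | check_ntype
#   $PBS_PATH/bin/qstat -f -Q queuename | grep -E 'max_(user_run|running)'
```

An empty result from check_ntype means that no node was reported with a
different ntype; the grep on the qstat output should list at least one of
the two attributes.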

  

Example configuration files:
-----------------------------

#$PBS/server_priv/nodes  file:
-------------------------------------------------------------
node1      np=1    Athlon single
node2      np=2    PIII   dual
node3      np=2    PIII   dual
node4      np=1    Athlon single
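
If a single-processor node should accept two jobs at once (the virtual
processor feature mentioned under the required configuration), its entry
in the nodes file could look like the sketch below. The node name node5
is hypothetical; the entry assumes that np sets the number of virtual
processors and that omitting the ":ts" suffix keeps the node a cluster
(job-exclusive) node.

```
node5      np=2    Athlon single
```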




#pbs.conf file
--------------------------------------------------------------
#Example PBS configuration with a short (default) and a long queue
#
#cut & save as pbs.conf  then use qmgr < pbs.conf
#

#
# Set server attributes.
#
 
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 60
set server default_queue = short
set server query_other_jobs = true
set server node_pack = false
set server scheduling = true
set server resources_default.neednodes = 1
 
#
# Create and define queue short
#
 
create queue short
set queue short queue_type = Execution
set queue short enabled = True
set queue short started = True
set queue short Priority = 100
set queue short resources_max.cput = 02:00:00
set queue short resources_default.cput = 01:00:00
set queue short max_user_run = 3
set queue short max_running = 6
set queue short max_queuable = 100

#
# Create and define queue long
#
 
create queue long
set queue long queue_type = Execution
set queue long enabled = True
set queue long started = True
set queue long Priority = 50
set queue long resources_default.cput = 06:00:00
set queue long resources_min.cput = 02:00:00
set queue long resources_default.nice = 19
set queue long max_user_run = 2
set queue long max_running = 3
set queue long max_queuable = 20

