Last modified on 11/20/08 11:22:51

Advanced "site-info.def" for YAIM/dCacheConfigure.sh

YAIM is the standard installation tool for the gLite/EGEE/LHC grid project. The dCacheConfigure.sh scripts are derived from the code originally used in YAIM to configure dCache. DESY also maintains repositories and basic instructions for installing dCache via YUM and dCacheConfigure.sh. This page attempts to help you set up dCache on larger production services.

YAIM philosophy

YAIM was originally just an admin script to document the installation of gLite middleware from a common configuration file. While the use of a single configuration file has remained, YAIM has grown into a highly modular set of bash scripts controlled via the command line and site-info.def, eventually forking into the subproject dCacheConfigure.sh. The single configuration file uses an "ssh"-like format. An example is shown below:

DCACHE_POOLS="dublin.desy.de:23:/pool-path1 dublin.desy.de:33:/pool-path2 cork.desy.de:100:/pool-path"

This line says that three pools should be set up: two on dublin.desy.de and one on cork.desy.de. The pools are 23 GB, 33 GB, and 100 GB respectively. If the size is omitted, a pool defaults to the full partition size, so if two pools reside on the same partition you MUST set the sizes explicitly.
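To make the hostname:size:path format concrete, the following is a minimal sketch of how such a value can be split with plain shell parameter expansion. The function name describe_pools is hypothetical and is not part of YAIM or dCacheConfigure.sh. How an omitted size is written is also an assumption here: the sketch accepts both "host:/path" and "host::/path".

```shell
# Hypothetical helper (not part of YAIM): print one line per pool entry.
describe_pools() {
    local entry host rest size path
    for entry in $1; do                 # entries are separated by spaces
        host=${entry%%:*}               # text before the first colon
        rest=${entry#*:}                # text after the first colon
        case $rest in
            *:*) size=${rest%%:*}       # a second colon: size is present
                 path=${rest#*:} ;;
            *)   size=""                # assumption: size was left out
                 path=$rest ;;
        esac
        echo "host=${host} size=${size:-full-partition} path=${path}"
    done
}

DCACHE_POOLS="dublin.desy.de:23:/pool-path1 dublin.desy.de:33:/pool-path2 cork.desy.de:100:/pool-path"
describe_pools "$DCACHE_POOLS"
```

Running this prints one line per pool, which is a quick way to spot two unsized pools that share a host before the configuration script is run.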

Planning your dCache layout

dCache is a cluster file system service that can be organized in many ways. While dCache can be run on a single computer, it is usually important to split the dCache service across multiple computers. It is generally advisable to separate CPU-intensive and IO-intensive components, so that no single component is overloaded. For most LHC sites, we would recommend using multiple computers: an "admin node", a "nameserver node", an "SRM node", and a series of "pool nodes". Some of the pool nodes should run "doors" (dCache jargon for data access protocols).
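As a rough illustration of such a split, a site-info.def fragment might look like the sketch below. All host names are placeholders under example.org, and the pool sizes are invented for the example. Only variables described on this page are used.

```shell
# Hypothetical multi-node layout; every host name below is a placeholder.
DCACHE_ADMIN="admin.example.org"          # admin node
DCACHE_NAME_SERVER="names.example.org"    # name server node
DCACHE_DOOR_SRM="srm.example.org"         # SRM node
# Two pool nodes, each entry given as hostname:size:path.
DCACHE_POOLS="pool1.example.org:500:/pool/1 pool2.example.org:1000:/pool/1"
# Run the GsiFTP door on one of the pool nodes.
DCACHE_DOOR_GSIFTP="pool1.example.org"
```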

Example Settings for dCache Layout

MY_DOMAIN=desy.de
#
#  Please use j2sdk 1.5; j2sdk 1.4 will also work, but performance will not be as good.
#
JAVA_LOCATION="/usr/java/jdk1.5.0_11/"
#
# Please make sure you are using fully qualified host names.
#
DCACHE_ADMIN="dcache.desy.de"
#
# the pools : hostname:size:path
#
DCACHE_POOLS="dcache.desy.de:7:/dCachePools/pool1 dcache.desy.de:7:/dCachePools/pool2"
#

# This option sets the name server. It defaults to the admin node if
# not stated.
#
DCACHE_NAME_SERVER="dcache.desy.de"

# dCache supports two name servers, PNFS and Chimera, but only one
# can be used at a time. PNFS is the mature solution, while Chimera
# is both simpler and faster. For these reasons we provide
# support for both PNFS and Chimera. We recommend that fresh installs
# use Chimera, but we do not intend to force sites to change to
# Chimera.
#
# If your site does not specify an explicit name server, the
# name server will default to Chimera. If your site wishes to
# explicitly use PNFS or Chimera, it is recommended to set the
# variable DCACHE_PNFS_SERVER or DCACHE_CHIMERA_SERVER explicitly.
#
# DCACHE_PNFS_SERVER=${DCACHE_ADMIN}
# DCACHE_CHIMERA_SERVER=${DCACHE_ADMIN}

# the various components
#
# 
# The following values have defaults and are optional.
#
# 
#  DCACHE_DOOR_SRM  : admin node
#  DCACHE_DOOR_GSIFTP,DCACHE_DOOR_GSIDCAP,DCACHE_DOOR_DCAP : pool nodes
#
DCACHE_DOOR_SRM="dcache.desy.de"
DCACHE_DOOR_GSIFTP="dcache.desy.de"
DCACHE_DOOR_GSIDCAP="dcache.desy.de"
DCACHE_DOOR_DCAP="dcache.desy.de"
#
# For the first installation, the following variables have
# to be set to 'yes'.
#
#  DO NOT set these values to yes on existing production services,
#  dCache internal databases will be deleted.
#
# RESET_DCACHE_CONFIGURATION=yes
# RESET_DCACHE_PNFS=yes
# RESET_DCACHE_RDBMS=yes
#
#
# Check that the right VOs are defined.
#
VOS="ops dteam"
#
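Before running the configuration scripts against settings like those above, it can help to sanity-check the file. The sketch below is a hypothetical pre-flight check, not something shipped with YAIM or dCacheConfigure.sh. It verifies that DCACHE_ADMIN and DCACHE_POOLS (the two variables this page lists as required) are set, and it treats any host name without a dot as not fully qualified, which is only a rough heuristic.

```shell
# Hypothetical pre-flight check; not part of YAIM or dCacheConfigure.sh.
check_site_info() {
    local status=0 entry host
    if [ -z "${DCACHE_ADMIN:-}" ]; then
        echo "DCACHE_ADMIN is not set" >&2
        status=1
    fi
    if [ -z "${DCACHE_POOLS:-}" ]; then
        echo "DCACHE_POOLS is not set" >&2
        status=1
    fi
    # Each entry is either a bare host or hostname:size:path.
    for entry in ${DCACHE_ADMIN:-} ${DCACHE_POOLS:-}; do
        host=${entry%%:*}
        case $host in
            *.*) ;;   # contains a dot: assumed to be fully qualified
            *)   echo "'$host' is not a fully qualified host name" >&2
                 status=1 ;;
        esac
    done
    return $status
}

# Values taken from the example settings on this page.
DCACHE_ADMIN="dcache.desy.de"
DCACHE_POOLS="dcache.desy.de:7:/dCachePools/pool1 dcache.desy.de:7:/dCachePools/pool2"
check_site_info && echo "site-info.def passes the basic checks"
```

In practice the two assignments would be replaced by sourcing the real site-info.def before calling check_site_info.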

Default values and "site-info.def" for dCache

site-info.def for dCache tries to default as many configuration values as possible, but a few values are required. These are broken down by variable in the following table.

| Variable | Required | Defaults to | Example value | Explanation |
|---|---|---|---|---|
| DCACHE_ADMIN | Yes | N/A | ford.desy.de | Sets the dCache cluster controller (admin node). |
| DCACHE_POOLS | Yes | N/A | clinton.desy.de:500:/pool/1 | Sets the host, size, and path of each storage pool. |
| DCACHE_NAME_SERVER | No | DCACHE_ADMIN | roosevelt.desy.de | Sets the host that provides the dCache name server. Defaults to the admin node if not set. |
| DCACHE_CHIMERA_SERVER | No | DCACHE_ADMIN | roosevelt.desy.de | Sets the host that provides the dCache name server as Chimera. Should be set when the name server does not run on the admin node. |
| DCACHE_PNFS_SERVER | No | DCACHE_ADMIN | roosevelt.desy.de | Sets the host that provides the dCache name server as PNFS. Do not set this unless you need PNFS. |
| DCACHE_DOOR_LDAP | No | DCACHE_ADMIN | nixon.desy.de | Sets the host for the dCache information provider for Glue. Defaults to the admin node if unset and should be integrated with the GIP project. |
| DCACHE_DOOR_SRM | No | DCACHE_ADMIN | bush.desy.de:8443 | Sets host and port for the SRM door. The host defaults to the admin node if unset; the port defaults to 8443 if not set. |
| DCACHE_DOOR_GSIFTP | No | DCACHE_POOLS | kennedy.desy.de:2811 | Sets host and port for the GsiFTP door. The host defaults to each pool node if unset; the port defaults to 2811 if not set. |
| DCACHE_DOOR_GSIDCAP | No | DCACHE_POOLS | kennedy.desy.de:22128 | Sets host and port for the gsidcap door. The host defaults to each pool node if unset; the port defaults to 22128 if not set. |
| DCACHE_DOOR_DCAP | No | N/A | kennedy.desy.de:22125 | Sets host and port for the dcap door. The host defaults to none if unset; the port defaults to 22125 if not set. |
| DCACHE_DOOR_XROOTD | No | N/A | kennedy.desy.de:1094 | Sets host and port for the xrootd door. The host defaults to none if unset; the port defaults to 1094 if not set. |
| DCACHE_PORT_RANGE_PROTOCOLS_SERVER_GSIFTP | No | 50000,52000 | 20000,24000 | Sets the dCache gsiFTP server's port range. The example values are the ones I use for the certification process. |
| DCACHE_PORT_RANGE_PROTOCOLS_SERVER_MISC | No | 60000,62000 | 24000,24100 | Sets the dCache (GSI)dCap and xrootd servers' port range. The example values are the ones I use for the certification process. |
| DCACHE_PORT_RANGE_PROTOCOLS_CLIENT_GSIFTP | No | 60000,62000 | 24100,24200 | Sets the dCache gsiFTP client port range. Through the SRM method srmcp, the SRM can direct dCache to act as a gridFTP client and transfer files directly to the dCache server from a gsiFTP server. The example values are the ones I use for the certification process. |
| DCACHE_PNFS_VO_DIR | No | "/pnfs/${MY_DOMAIN}/data" | "/pnfs/desy.de/data" | Sets the root path within the dCache name server (PNFS or Chimera) under which data is presented. |
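The host and port defaulting described for the door variables can be sketched in a few lines of shell. The helper door_endpoint is hypothetical and only illustrates the rule; it is not how dCacheConfigure.sh is known to implement it. The fallback of the SRM host to DCACHE_ADMIN follows the "Defaults To" column of the table.

```shell
# Hypothetical helper: split a "host[:port]" door setting and fill in
# the default port when none is given.
door_endpoint() {
    local value="$1" default_port="$2" host port
    host=${value%%:*}                    # text before the first colon
    case $value in
        *:*) port=${value#*:} ;;         # explicit port after the colon
        *)   port=$default_port ;;       # no colon: use the default
    esac
    echo "${host}:${port:-$default_port}"
}

DCACHE_ADMIN="ford.desy.de"
# DCACHE_DOOR_SRM is left unset here, so the host falls back to the admin node.
door_endpoint "${DCACHE_DOOR_SRM:-$DCACHE_ADMIN}" 8443
door_endpoint "kennedy.desy.de:22125" 22125
```

The same `${VAR:-fallback}` expansion covers the other rows that default to DCACHE_ADMIN, such as DCACHE_NAME_SERVER.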

How can I run these scripts on a production server?

dCacheConfigure.sh is primarily designed to aid automated installs. It was initially aimed at small sites, and options were added later to support larger and larger sites. It is designed to get dCache up and running with little effort for sites with tight deadlines, so it was essential that dCacheConfigure.sh could be rerun on a misconfigured dCache setup. Storage services need to persist their data over upgrades, yet it is sometimes necessary to wipe the configuration and set up from scratch. For this reason dCache YAIM provides options in site-info.def to define the nature of an upgrade.

The following table describes the function of each key when it is set to yes. Any other value, including yes with different capitalization, is interpreted as no, as this is the safest option. On your first install, all of these values should be set to yes. Once data is stored within dCache, it is strongly recommended to set these values to no.

| Key | Function and effect if set to yes |
|---|---|
| RESET_DCACHE_CONFIGURATION | When set to yes, dCache YAIM will delete all the configuration files, copy the configuration templates into their place, and then set the configuration correctly. This is especially useful on first installs of dCache, and useful for major version upgrades such as 1.7.0.X -> 1.8.0.X. |
| RESET_DCACHE_PNFS | If set to yes, running YAIM will delete your PNFS name space service, so all file paths will be lost. Setting this to yes is NOT recommended once data is stored in dCache. It is separate from RESET_DCACHE_RDBMS because pnfs-gdbm is not postgresql based. |
| RESET_DCACHE_RDBMS | If set to yes, running YAIM will delete your local postgresql database and potentially with it your PNFS/Chimera name space service or your SRM database, if PNFS/Chimera/SRM is installed locally and running postgresql. Setting this to yes is NOT recommended once data is stored in dCache. |
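The rule that only the exact lowercase string yes enables a reset can be expressed as a simple string comparison. The function reset_enabled below is a hypothetical illustration of that rule, not the code used by dCache YAIM.

```shell
# Hypothetical helper: only the exact lowercase string "yes" counts as yes.
reset_enabled() {
    [ "${1:-}" = "yes" ]
}

for value in yes YES Yes no ""; do
    if reset_enabled "$value"; then
        echo "'$value' -> reset"
    else
        echo "'$value' -> keep"
    fi
done
```

Only the first value is reported as a reset. Treating everything else as no means a typo in site-info.def leaves existing data untouched rather than deleting it.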