Last modified on 06/09/08 18:35:59

Advanced "site-info.def" for YAIM

YAIM is an installation script for the gLite/EGEE/LHC project and the standard gLite installation tool for the gLite/EGEE/LHC grid. The package glite-yaim-dcache contains the dCache configuration component for YAIM. DESY also maintains repositories and basic instructions for installing dCache via YAIM. Those instructions are fine for small sites, but larger installations may need further help. This page aims to help you set up dCache for larger production services.

YAIM philosophy

YAIM was originally just an admin script that documented the installation of gLite middleware from a common configuration file. While the single configuration file has remained, YAIM has grown into a highly modular set of bash scripts controlled via the command line and site-info.def. The configuration file uses an "ssh"-like format. An example is shown below:

DCACHE_POOLS="dublin.desy.de:23:/pool-path1 dublin.desy.de:33:/pool-path2 cork.desy.de:100:/pool-path"

This line says that three pools should be set up: two on dublin.desy.de and one on cork.desy.de. The pools are 23 GB, 33 GB, and 100 GB respectively. If the size is omitted, a pool defaults to the full partition size.
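
To make the entry format concrete, here is a minimal shell sketch that splits each DCACHE_POOLS entry into host, size, and path. It is not the parsing code that YAIM itself uses. The function name parse_pool_entry is hypothetical, and the sketch assumes that an entry with the size omitted is written as host:/path.

```shell
# Illustrative only: not the parser used by glite-yaim-dcache.
DCACHE_POOLS="dublin.desy.de:23:/pool-path1 dublin.desy.de:33:/pool-path2 cork.desy.de:100:/pool-path"

# parse_pool_entry (hypothetical helper) prints "host size path" for one entry.
# Assumption: an entry without a size has the form host:/path.
parse_pool_entry() {
    local entry="$1"
    local host rest size path
    host="${entry%%:*}"      # text before the first colon
    rest="${entry#*:}"       # text after the first colon
    case "$rest" in
        /*) size="full-partition"; path="$rest" ;;       # size omitted
        *)  size="${rest%%:*}"; path="${rest#*:}" ;;     # size:path
    esac
    echo "$host $size $path"
}

# The entries are separated by spaces, so plain word splitting is enough here.
for entry in $DCACHE_POOLS; do
    parse_pool_entry "$entry"
done
```

Running the loop prints one line per pool, which is a quick way to check that a long DCACHE_POOLS value says what you intended before handing it to YAIM.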

Planning your dCache layout

dCache is a cluster file system service that can be organized in many ways. While dCache can run on a single computer, it is usually important to split the dCache service across multiple computers. It is generally advisable to separate CPU-intensive and I/O-intensive components so that no single component is overloaded. For most LHC sites, we recommend using multiple computers: an "admin node", a "nameserver node", an "SRM node", and a series of "pool nodes". Some of the pool nodes should run "doors" (dCache jargon for data access protocols).

Default values and YAIM for dCache

YAIM for dCache tries to provide defaults for as many configuration values as possible, but a few values are required.

| Name | Required | Defaults To | Example Value | Explanation |
|---|---|---|---|---|
| DCACHE_ADMIN | Yes | N/A | ford.desy.de | Sets the dCache cluster controller. |
| DCACHE_POOLS | Yes | N/A | clinton.desy.de:500:/pool/1 | Sets the path to the storage pool. |
| DCACHE_PNFS_SERVER | No | DCACHE_ADMIN | roosevelt.desy.de | Sets the host that provides the dCache name server. Defaults to the admin node if not set. |
| DCACHE_DOOR_LDAP | No | DCACHE_ADMIN | nixon.desy.de | Sets the host for the dCache information provider for Glue. Defaults to the admin node if unset, and should be integrated with the GIP project. |
| DCACHE_DOOR_SRM | No | DCACHE_ADMIN | bush.desy.de:8443 | Sets the host and port for the SRM door. The host defaults to the admin node if unset; the port defaults to 8443 if not set. |
| DCACHE_DOOR_GSIFTP | No | DCACHE_POOLS | kennedy.desy.de:2811 | Sets the host and port for the GsiFTP door. The host defaults to each pool node if unset; the port defaults to 2811 if not set. |
| DCACHE_DOOR_GSIDCAP | No | DCACHE_POOLS | kennedy.desy.de:22128 | Sets the host and port for the gsidcap door. The host defaults to each pool node if unset; the port defaults to 22128 if not set. |
| DCACHE_DOOR_DCAP | No | N/A | kennedy.desy.de:22125 | Sets the host and port for the dcap door. The host defaults to none if unset; the port defaults to 22125 if not set. |
| DCACHE_DOOR_XROOTD | No | N/A | kennedy.desy.de:1094 | Sets the host and port for the xrootd door. The host defaults to none if unset; the port defaults to 1094 if not set. |
| DCACHE_PORT_RANGE_PROTOCOLS_SERVER_GSIFTP | No | 50000,52000 | 20000,24000 | Sets the port range of the dCache GsiFTP server. The example is the range used for the certification process. |
| DCACHE_PORT_RANGE_PROTOCOLS_SERVER_MISC | No | 60000,62000 | 24000,24100 | Sets the port range of the dCache (GSI)dCap and xrootd servers. The example is the range used for the certification process. |
| DCACHE_PORT_RANGE_PROTOCOLS_CLIENT_GSIFTP | No | 60000,62000 | 24100,24200 | Sets the port range of the dCache GsiFTP client. Because of the SRM method srmcp, the SRM can direct dCache to act as a GridFTP client and transfer files directly to the dCache server from a GsiFTP server. The example is the range used for the certification process. |
| DCACHE_PNFS_VO_DIR | No | "/pnfs/${MY_DOMAIN}/data" | "/pnfs/desy.de/data" | Sets the root path within the dCache name server (PNFS or Chimera) under which data is presented. |
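
As a sketch of how these values can combine for a multi-node layout, the fragment below places the admin, name server, and SRM roles on separate hosts and runs the GsiFTP and gsidcap doors on one pool node. All host names, the domain, the pool paths, and the pool sizes are hypothetical placeholders. The port numbers are the defaults listed in the table.

```shell
# Hypothetical site-info.def fragment: hosts, domain, paths, and sizes are placeholders.
MY_DOMAIN="example.org"

# Required values
DCACHE_ADMIN="admin.${MY_DOMAIN}"
DCACHE_POOLS="pool1.${MY_DOMAIN}:500:/pool/1 pool2.${MY_DOMAIN}:500:/pool/1"

# Optional values: move the name server and the SRM door off the admin node
DCACHE_PNFS_SERVER="pnfs.${MY_DOMAIN}"
DCACHE_DOOR_SRM="srm.${MY_DOMAIN}:8443"

# Run the GsiFTP and gsidcap doors on a single pool node
DCACHE_DOOR_GSIFTP="pool1.${MY_DOMAIN}:2811"
DCACHE_DOOR_GSIDCAP="pool1.${MY_DOMAIN}:22128"

# Root path under which data is presented in the name server
DCACHE_PNFS_VO_DIR="/pnfs/${MY_DOMAIN}/data"
```

Values that are left out, such as DCACHE_DOOR_LDAP here, fall back to the defaults in the table.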

How can I run YAIM again on a production server?

dCache YAIM is primarily designed to aid automated installs. It initially targeted small sites, and options were added later to support larger and larger sites. dCache YAIM was also designed to get dCache up and running with little effort for sites on tight deadlines, so it was essential that YAIM could be rerun on a misconfigured dCache setup. Storage services must preserve their data across upgrades, yet fixing a broken setup can mean wiping the configuration and starting from scratch. dCache YAIM therefore provides options in site-info.def that define the nature of an upgrade.

The following table describes the function of each key when it is set to yes. Any other value, including yes written with capital letters, is interpreted as no, as this is the safest option. On your first install all of these values should be set to yes. Once data is stored within dCache, it is strongly recommended to set them to no.

| Key | Function and effect if set to yes |
|---|---|
| RESET_DCACHE_CONFIGURATION | dCache YAIM deletes all the configuration files, copies the configuration templates into their place, and then sets the configuration correctly. This is especially useful on first installs of dCache and for major version upgrades such as 1.7.0.X -> 1.8.0.X. |
| RESET_DCACHE_PNFS | Running YAIM deletes your PNFS name space service, so all file paths will be lost. Setting this to yes is NOT recommended once data is stored in dCache. It is separate from RESET_DCACHE_RDBMS because pnfs-gdbm is not PostgreSQL based. |
| RESET_DCACHE_RDBMS | Running YAIM deletes your local PostgreSQL database and potentially with it your PNFS/Chimera name space service or your SRM database, if PNFS/Chimera/SRM is installed locally and uses PostgreSQL. Setting this to yes is NOT recommended once data is stored in dCache. |
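
The strict matching rule for these keys, where only the exact lowercase string yes enables a reset, can be illustrated with a short shell sketch. The helper reset_enabled is hypothetical and is not taken from the YAIM sources. The three assignments show the recommended settings for a system that already holds data.

```shell
# Recommended settings once data is stored in dCache.
RESET_DCACHE_CONFIGURATION="no"
RESET_DCACHE_PNFS="no"
RESET_DCACHE_RDBMS="no"

# reset_enabled (hypothetical helper): succeeds only for the exact string "yes".
# Any other value, including "Yes", "YES", or an empty string, counts as no.
reset_enabled() {
    [ "$1" = "yes" ]
}

# Show how a few candidate values would be interpreted.
for value in yes Yes YES no ""; do
    if reset_enabled "$value"; then
        echo "'$value' -> reset"
    else
        echo "'$value' -> keep"
    fi
done
```

Treating everything except the exact string as no means that a typo in site-info.def leaves existing data and configuration untouched.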

Now I have set up my site-info.def

You should then configure your dCache cluster with dCacheConfigure.sh or the superseded YAIM. These pages will give instructions on how to use them.