
dCache Hands-on Taipei

Prerequisites

We would like to ask every participant to bring a notebook with some kind of SSH client installed. Please check this on your machine beforehand.

You need to log in to your virtual machines (VMs). You have two VMs, one server and one client:

  • <user number>_ws_server.grid.sinica.edu.tw
  • <user number>_ws_client.grid.sinica.edu.tw

You will have to use two consoles, one for the server and one for the client. Use your SSH client to connect to the machines using the following commands.

Client:

ssh dcache-user@<user number>_ws_client.grid.sinica.edu.tw -A -X

Server:

ssh root@<user number>_ws_server.grid.sinica.edu.tw -A -X

After this step you should have two consoles with SSH connections, one to the server and one to the client. Make sure you know which console is the client and which is the server.
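If you do not want to retype the full host names every time, you can optionally add aliases to the ~/.ssh/config on your notebook. This is just a convenience sketch; ws-server and ws-client are made-up alias names:

# ~/.ssh/config on your notebook
Host ws-server
    HostName <user number>_ws_server.grid.sinica.edu.tw
    User root
    ForwardAgent yes   # same as -A
    ForwardX11 yes     # same as -X

Host ws-client
    HostName <user number>_ws_client.grid.sinica.edu.tw
    User dcache-user
    ForwardAgent yes
    ForwardX11 yes

With this in place, ssh ws-server and ssh ws-client are enough.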

Content of the hands-on

In this hands-on we will cover the following topics:

  • Installation of dCache server
  • Interaction with dCache
  • Certificates and dCache
  • Hardware Lifecycle (adding storage, migrating)
  • Interfacing with Tertiary Storage

Installation of dCache server

(SERVER)

In this part we install dCache as a single-node instance, which means that all our services will run on the same host; this shows us the basics and the necessary setup. At the end of this part of the hands-on we will have a running dCache that can be mounted locally as well as on your client.

  1. Find the dCache server rpm on your server in the root's home directory /root/:
[root]# ls dcache-*
dcache-2.5.0-1.noarch.rpm
  2. Install the rpm on your server machine:
[root]# rpm -ivh dcache-2.5.0-1.noarch.rpm
Preparing...                ########################################### [100%]
   1:dcache                 ########################################### [100%]
...

The dCache server has two dependencies: java-1.7.0-openjdk and postgresql-server (version > 8.4). You will find that openjdk and postgresql-server are already installed on your machine, and the PostgreSQL setup has been done for you as well. Run the following command to check for the java and postgresql packages.

[root]# rpm -qa |grep -E "postgresql-server|java-1.7.0-openjdk"
postgresql-server-8.4.13-1.el6_3.i686
java-1.7.0-openjdk-1.7.0.9-2.3.7.1.el6_3.i686

Look this up later: for more information please see dCache Book - Prerequisites. There are also several trivial steps required to set up PostgreSQL; they are not part of this tutorial and can be found in dCache Book - Readying the PostgreSQL server for the use with dCache. Since we do not focus on the detailed configuration of the PostgreSQL server here, an already prepared configuration script is provided:

[root]# ./configurepgsql.sh
Initializing database:                                     [  OK  ]
Starting postgresql-9.2 service:                           [  OK  ]
Stopping postgresql-9.2 service:                           [  OK  ]
Starting postgresql-9.2 service:                           [  OK  ]
psql:/usr/share/dcache/chimera/sql/create.sql:23: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t_inodes_pkey" for table "t_inodes"
CREATE TABLE
......

many more lines

......

CREATE TRIGGER

Configuration files

We make use of flat files to define and change the settings for the layout and the behaviour of dCache on different levels (per host, per domain or per cell). There are three main places for the configuration files:

  • /usr/share/dcache/defaults

This directory is filled with files defining the default settings for all dCache services, as they are shipped by dCache.org. Do not modify these files, as they will be replaced by subsequent updates!

  • /etc/dcache/dcache.conf

The central configuration file, which ideally should be nearly identical on all nodes of the dCache setup; the only difference between the nodes should be the parameter pointing to the layout file. To find out what settings can be made in dcache.conf, look through dcache.properties in the defaults directory.

  • /etc/dcache/layouts

Layout files are the place to define the actual topology of the dCache services/domains on this node. Typically, therefore, no two nodes in a setup have identical layout files.

dCache.org provides us with premade layout files that state a possible distribution of services over domains: head.conf, pool.conf and single.conf. Right now you could start dCache and it would use the empty dcache.conf file and the fallback layout file single.conf. With this the most important core services will be configured to run in one single domain with default settings. Alternatively, head.conf has predefined the mandatory services in a decent number of domains to be run on the headnode of your setup. Of course, you will need at least one other node using pool.conf to provide some disk space to dCache.
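To get a feel for these files, you can browse the shipped defaults and the premade layouts; the paths below are the ones installed by the dCache RPM:

[root]# less /usr/share/dcache/defaults/dcache.properties
[root]# ls /etc/dcache/layouts
head.conf  pool.conf  single.conf

Every property documented in the defaults can be overridden in dcache.conf or, per domain and per service, in a layout file.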

We would like to have our own layout file just for this hands-on, therefore we create it from a template:

[root]# cp /etc/dcache/layouts/{single,isgc2013_ws}.conf 

We need to tell dCache to use the layout file we just created by adding the following line to /etc/dcache/dcache.conf. Do not make the mistake of entering the entire file name there; the value is the layout name without the .conf extension.

dcache.layout=isgc2013_ws

Adjust the layout file

Firstly we need to tell dCache that there will be many domains communicating with each other:

[root]# vi /etc/dcache/layouts/isgc2013_ws.conf

As we described in the introduction, dCache services run inside so-called domains. Since we will have many domains, we need some mechanism for these domains to communicate. The cells framework is used for this communication, which is why we activate it by changing the following line in /etc/dcache/layouts/isgc2013_ws.conf:

broker.scheme=none

to

broker.scheme=cells

As mentioned, we want to be able to mount dCache locally using NFSv41. Therefore a service called nfsv41 needs to be started, which we will keep in a separate domain for the convenience of restarting it separately from the rest of dCache. Add the following lines at the end of /etc/dcache/layouts/isgc2013_ws.conf; they add a domain to dCache - [nfs-Domain] - that holds the nfsv41 service.

[nfs-Domain]
[nfs-Domain/nfsv41]
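After these edits the layout file should look roughly like this; the dCacheDomain service declarations inherited from single.conf are abbreviated here:

broker.scheme=cells

[dCacheDomain]
# ... service declarations from single.conf, unchanged ...

[nfs-Domain]
[nfs-Domain/nfsv41]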

Before we can start dCache we have to empty /etc/dcache/gplazma.conf as there is no security configured in a blank dCache:

[root]# echo "" > /etc/dcache/gplazma.conf

Check that everything is set up properly before starting dCache:

[root]# dcache check-config
No problems found.

Now start dCache by:

[root]# dcache start
Starting dCacheDomain done
Starting nfs-Domain done

Check whether dCache's domains have started up correctly:

[root]# dcache status
DOMAIN       STATUS  PID   USER
dCacheDomain running 18314 dcache
nfs-Domain   running 18359 dcache

and services:

[root]# dcache services
DOMAIN       SERVICE         CELL            LOG
dCacheDomain admin           alm             /var/log/dcache/dCacheDomain.log
dCacheDomain broadcast       broadcast       /var/log/dcache/dCacheDomain.log
dCacheDomain poolmanager     PoolManager     /var/log/dcache/dCacheDomain.log
dCacheDomain loginbroker     LoginBroker     /var/log/dcache/dCacheDomain.log
dCacheDomain spacemanager    SrmSpaceManager /var/log/dcache/dCacheDomain.log
dCacheDomain pnfsmanager     PnfsManager     /var/log/dcache/dCacheDomain.log
dCacheDomain cleaner         cleaner         /var/log/dcache/dCacheDomain.log
dCacheDomain dir             dirLookupPool   /var/log/dcache/dCacheDomain.log
dCacheDomain gplazma         gPlazma         /var/log/dcache/dCacheDomain.log
dCacheDomain pinmanager      PinManager      /var/log/dcache/dCacheDomain.log
dCacheDomain billing         billing         /var/log/dcache/dCacheDomain.log
dCacheDomain srm-loginbroker srm-LoginBroker /var/log/dcache/dCacheDomain.log
dCacheDomain httpd           httpd           /var/log/dcache/dCacheDomain.log
dCacheDomain topo            topo            /var/log/dcache/dCacheDomain.log
dCacheDomain info            info            /var/log/dcache/dCacheDomain.log
nfs-Domain   nfsv41          NFSv41-vt-021   /var/log/dcache/nfs-Domain.log

Then also check the log files:

[root@vt-021 data]# tail -F /var/log/dcache/*
==> /var/log/dcache/dCacheDomain.log <==
01 Mar 2013 18:22:01 (gPlazma) [] NodeList has 1 entries
01 Mar 2013 18:22:01 (gPlazma) [] examining plugin with class class org.dcache.gplazma.plugins.JaasPlugin
01 Mar 2013 18:22:01 (gPlazma) [] Adding plugin [jaas, org.dcache.gplazma.plugins.JaasPlugin]
01 Mar 2013 18:22:01 (gPlazma) [] Created 1 plugin metadata entries
01 Mar 2013 18:22:03 (PinManager) [] [AspectJ] javax.* types are not being woven because the weaver option '-Xset:weaveJavaxPackages=true' has not been specified
INFO 3/1/13 6:22 PM:liquibase: Successfully acquired change log lock
INFO 3/1/13 6:22 PM:liquibase: Reading from databasechangelog
INFO 3/1/13 6:22 PM:liquibase: Reading from databasechangelog
INFO 3/1/13 6:22 PM:liquibase: Successfully released change log lock
INFO 3/1/13 6:22 PM:liquibase: Successfully released change log lock

==> /var/log/dcache/nfs-Domain.log <==

2013-03-01 18:21:52 Launching /usr/bin/java -server -Xmx512m -XX:MaxDirectMemorySize=512m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=20000,25000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=33115:33145 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/nfs-Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.1.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start nfs-Domain
01 Mar 2013 18:21:54 (System) [] Created : nfs-Domain

This should show you that the domains were created and that no error messages are flying around. The output shown by tail should stop at some point after starting dCache, although some output informing about successful domain creation is normal.

First Contact

Now that we have a running dCache that does nothing, we would like to make first contact. This will be achieved by mounting dCache locally via NFSv41.

dCache needs several things to come together to be able to store data. It needs a way to authenticate and authorize people who want to store data (gPlazma takes care of this), and it needs a logical structure in which file names can be found in a directory hierarchy: the namespace, which also keeps the files' metadata. The actual files are stored by a service called a pool, which writes data to block devices and allows a hierarchical storage management to be employed (more about this in the tertiary storage part of this hands-on).

dCache needs something that holds the actual data: the pools. So we first create a pool that stores its data under a certain path in the local file system, and we set its maximum size. Execute the following command in your server console:

[root]# dcache pool create --size=419430400 /pools/nfsPool nfsPool poolDomain
Created a pool in /pools/nfsPool. The pool was added to poolDomain in
file:/etc/dcache/layouts/isgc2013_ws.conf.
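If you are curious what the command did, have a look at the end of the layout file; the appended pool definition will look roughly like the following sketch (exact property names can vary between dCache versions):

[poolDomain]
[poolDomain/pool]
name=nfsPool
path=/pools/nfsPool
maxDiskSpace=419430400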

After we created the pool we need to start the domain that was created by executing:

[root]# dcache status
DOMAIN       STATUS  PID   USER
dCacheDomain running 22269 dcache
nfs-Domain   running 22313 dcache
poolDomain   stopped       dcache
[root]# dcache start poolDomain
Starting poolDomain done

This now enables us to actually store files in dCache, but how? ... for example via a mounted NFSv41.

We need to set the NFS domain to make sure the NFS server and client are inside the same namespace. This is done by adding the following line to /etc/dcache/dcache.conf:

nfs.domain = taipei-domain
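For the NFSv4 id mapping to work, the domain on the machine performing the mount has to match this value. On most Linux distributions it is configured in /etc/idmapd.conf; a sketch, adjust to your client's setup:

# /etc/idmapd.conf on the mounting machine
[General]
Domain = taipei-domain

After changing it, restart the id mapping service (e.g. service rpcidmapd restart on EL6) or simply remount.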

As mentioned, dCache also needs a namespace where the files' metadata is stored. We need to create a directory in the dCache namespace by executing:

chimera-cli mkdir /data

Now we have to give the directory the right permissions:

chimera-cli chown /data dcache

After this we can mount dCache locally by doing the following:

mkdir /nfs4
mount -o intr,minorversion=1 localhost:/data /nfs4

If you now switch to /nfs4 you can create a file, e.g.:

vi /nfs4/myFirstNfsFile00001

Write something inside and quit vi with the following command. Be careful that you are not in edit mode.

:wq

Congratulations, you wrote your first file into dCache.
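If you are curious where the data ended up, you can peek into the pool's data directory on the server; assuming the directory structure created by dcache pool create, there is one entry per replica, named by its internal PNFS ID:

[root]# ls /pools/nfsPool/data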

Interaction with dCache

In this section we will look at the different ways of interacting with dCache to make use of its functionality. There is a multitude of ways to access dCache; for now we want to have a look at the simple and straightforward ones. One way you have seen in the chapter before - NFSv41. A different way to access dCache is WebDAV, using the so-called dcache.kpwd file for authorization.

You will have to have a WebDAV cell running, so please add the following lines to your layout file:

[webdavDomain]
[webdavDomain/webdav]

Now you need to start your webdavDomain:

dcache start webdavDomain

Check your WebDAV settings in the layout file: enable HTTP access, disallow anonymous access, disable requesting and requiring client authentication, and activate basic authentication:

webdavProtocol=http
webdavAnonymousAccess=NONE
webdavWantClientAuth=false
webdavNeedClientAuth=false
webdavBasicAuthentication=true

Adjust the /etc/dcache/gplazma.conf to use the kpwd plug-in (for more information see also the section called “Plug-ins”).

It will look something like this:

auth     optional kpwd
map      optional kpwd
session  optional kpwd
identity optional nsswitch

The /etc/dcache/dcache.kpwd file is the place where you can specify the username/password record. It should contain the username and the password hash, as well as UID, GID, access mode and the home, root and fsroot directories:

# set passwd
passwd dcacheuser <some hash> read-write 500 100 / / /

The passwd record can be generated automatically by the dCache kpwd utility, for example:

[root] # dcache kpwd dcuseradd -u 2000 -g 2000 -h / -r / -f / -w read-write -p test dcacheuser

Since you changed both the WebDAV settings and the gPlazma configuration, restart dCache:

dcache restart
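A quick way to test the WebDAV door from your client console is to fetch the file you wrote earlier via NFS, using the account created above (a sketch; 2880 is the default WebDAV port):

curl -u dcacheuser:test http://<user number>_ws_server.grid.sinica.edu.tw:2880/data/myFirstNfsFile00001

Uploading works the same way with curl -T <localfile> <url>, provided the directory permissions allow the mapped UID to write.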

Certificates and dCache

This part will have a look at the security mechanisms that are used within the Grid.

For certificate based access we need one certificate that identifies the server, one for the client and we need to know whether we can trust the authority that issued the certificate. The server certificate was installed on your machine in /etc/grid-security. The client certificate is also already present in /home/dcacheuser0xx/.globus on the CLIENT.

To use WebDAV with certificates, change the entries in /etc/dcache/layouts/isgc2013_ws.conf from the ones that were used for kpwd to the entries below:

[webdavDomain]
[webdavDomain/webdav]
webdavAnonymousAccess=NONE
webdavRootPath=/data/world-writable
webdavProtocol=https

Then you will need to import the host certificate into the dCache keystore using the command

[root] # dcache import hostcert

and initialise your truststore by

[root] # dcache import cacerts

Now you need to restart the WebDAV domain

[root] # dcache restart webdavDomain

and access your files via https://<dcache.example.org>:2880 with your browser.

Important

If the host certificate contains an extended key usage extension, it must include the extended usage for server authentication. Therefore you have to make sure that your host certificate is either unrestricted or it is explicitly allowed as a certificate for TLS Web Server Authentication. 
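You can verify this with openssl on the server; the host certificate is in the standard grid location mentioned above:

[root]# openssl x509 -in /etc/grid-security/hostcert.pem -noout -purpose | grep 'SSL server'
[root]# openssl x509 -in /etc/grid-security/hostcert.pem -noout -text | grep -A 1 'Extended Key Usage'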

Hardware Lifecycle (adding storage, migrating)

Here you will see how to maintain a dCache that is in production. This part includes adding storage, balancing data between different pools and migrating data. It happens quite often that new hardware comes in to increase storage, or that old hardware needs to be decommissioned. In such cases dCache behaves very nicely, as all of the commissioning can be done without disturbing the functionality of the system. The component that takes care of this is the so-called migration module. In this part we will simulate such an everyday process by adding a pool, migrating the data and decommissioning the old pool. Creating a new pool (you already know this) is done by:

dcache pool create --size=419430400 /pools/newPool newPool poolDomain

What you need to do next is to replicate the files. Log in to the dCache admin interface:

ssh -l admin -p 22223 -c blowfish vt-021.grid.sinica.edu.tw -1

Enter the password that you will get from us during the hands-on. Switch into the PoolManager cell:

cd PoolManager

Then trigger the replication of files between pools:

rebalance pgroup default
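Once the rebalance has finished, you can check which replicas the new pool holds; a sketch of the admin shell commands, where rep ls lists the replicas stored on a pool:

[] (PoolManager) admin > ..
[] (local) admin > cd newPool
[] (newPool) admin > rep ls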

The rebalance command distributes the files evenly amongst the pools in the pool group. After this step we have the files distributed between nfsPool and newPool. Remember, our goal was to decommission nfsPool, therefore we need to migrate the files away from it. First we set the pool we want to empty to read-only so that no new files are written to it; leave the PoolManager with .. and switch to the pool:

..
cd nfsPool
pool disable -rdonly

After this we want to migrate the data completely to the new pool (newPool):

migration move newPool

We can watch the progress of the migration process with:

[] (nfsPool) admin > migration info 1
Command    : null
State      : FINISHED
Queued     : 0
Attempts   : 177
Targets    : newPool
Completed  : 177 files; 36462 bytes
Total      : 36462 bytes
Concurrency: 1
Running tasks:

When the state is FINISHED, as can be seen in the output above, the migration process is complete and we can disable the pool entirely.

pool disable
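Once the pool is disabled and empty it can be removed from the setup entirely. A sketch of the cleanup on the server, assuming the layout file from this hands-on (note that stopping poolDomain briefly affects newPool as well, since both pools live in the same domain here):

[root]# dcache stop poolDomain
[root]# vi /etc/dcache/layouts/isgc2013_ws.conf     (remove the nfsPool stanza)
[root]# dcache start poolDomain
[root]# rm -rf /pools/nfsPool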

Interfacing with Tertiary Storage

Finally, we will look at how dCache can be used to access archive media (tape archives).