Last modified on 01/11/11 16:14:27

Migration from PNFS to Chimera

This page describes how to migrate namespace information from an existing PNFS instance to a Chimera instance. This is typically a necessary step when decommissioning a PNFS instance; often the Chimera instance is freshly installed.

There is some additional information on Directory Iterators. This is not required but may be useful if you have already migrated or are experiencing problems migrating.

Some general recommendations when migrating from PNFS to Chimera:

  • always use the latest version of pnfsDump; new versions are released often,
  • practise the process before attempting it on a production system, and use both verification methods,
  • switch off dCache: don't attempt to migrate with the databases merely set to read-only,
  • when doing the real migration, verify the result with at least the md5sum check and as much of the StorageInfo check as time allows,
  • ensure you have a fall-back, should the migration prove unsuccessful; i.e., leave PNFS installed and its database(s) intact until the migration is successfully verified,
  • if you are running a 64-bit version of PNFS then you must upgrade to 3.1.18 or pnfsDump will not work; 32-bit PNFS instances do not suffer from this issue.

1. Preparing Chimera

This procedure assumes you have a working Chimera instance in addition to the existing PNFS instance. If you do not have Chimera available, please follow steps 1 to 7 of the instructions on how to install Chimera, but be aware that step 7 will fail if PNFS is still running. Please note that PostgreSQL v8.2 is known to be slow at importing SQL generated by pnfsDump; consider upgrading PostgreSQL to v8.3 or newer before migrating.

The migration process will keep the existing PNFS IDs as the corresponding IDs in Chimera. Doing this removes the need for any migration of the pools. However, you must prepare the Chimera DB schema to accept the shorter PNFS IDs. This process is easy: the SQL commands to achieve it are stored in the prep-chimera-for-migration.sql file, which is supplied within the pnfsDump RPM. The following command demonstrates how to run them:

psql -U postgres -f /opt/pnfs/share/sql/prep-chimera-for-migration.sql chimera

This process is needed only once per Chimera instance.

2. Extracting information from PNFS

NB These instructions have changed. If you are using a version of pnfsDump older than v1.0.19, you are recommended to upgrade to the latest version. The instructions for generating the output needed for migration with pnfsDump v1.0.18 or earlier are still available.

To successfully migrate the namespace, you will need:

  1. the appropriate SQL for updating Chimera so it contains the PNFS name-space (required),
  2. output suitable for verifying the migration with the md5sum(1) program (recommended),
  3. a list of file PNFS IDs suitable for the StorageInfo verification (recommended).

All three outputs can be generated by the pnfsDump utility in a single pass. This utility talks directly to the shared memory of PNFS, so it does not use the NFS interface. You may run pnfsDump without mounting PNFS, but the utility must be run on whichever machine runs the dbserver daemons. The README file supplied with pnfsDump (/opt/pnfs/docs/README.pnfsDump) contains a complete description of how to use pnfsDump; READ THIS FILE.

NB The examples below show pnfsDump with the -d0 option. This switches off a safety delay, making pnfsDump as fast as possible; doing so is not recommended for production systems. Simply removing the -d0 option will result in the default delay being used, which should be safe. However, since the name-space should be migrated during a down-time, the delay mainly matters when practising the procedure on a live system.

Only the Chimera SQL is required; however, it is strongly recommended that verification is done to confirm that the migration was successful. There are two (complementary) verifications that may be done: the first uses the md5sum(1) application and the second uses dCache to verify the StorageInfo for each file. Both checks are described in more detail below.

To support these verification steps, additional output files are needed. pnfsDump may be run so it generates all three outputs at the same time. This is recommended as it reduces the time needed for gathering the information, leaving more time for verification when a site has only a limited amount of down-time.

The three sections below describe the three different file outputs in more detail.

Generating the output

The general format for running pnfsDump is:

/opt/pnfs/tools/pnfsDump [<general options>] <output defn> [<output defn> ...]

pnfsDump can generate multiple output streams concurrently. It will generate a stream for each '<output defn>'.

Each <output defn> has the form -o <output format> <filename> [<output options>]. So, -o chimera /tmp/chimera.sql -2 is an <output defn> that uses the chimera output format with the -2 option and saves the result in /tmp/chimera.sql.

The following is a template of how to use pnfsDump to generate the required output. There are three <output defn>s, so three streams of information are recorded: /tmp/chimera.sql, /tmp/verify.md5 and /tmp/files.lst.

/opt/pnfs/tools/pnfsDump -r <source> -vv -d0 -o chimera /tmp/chimera.sql -2 -p <dest> -o verify /tmp/verify.md5 -r -o files /tmp/files.lst -f

The different <output defn>s are described in more detail below. First, an explanation of the generic options.

The -r <source> option indicates in which directory pnfsDump should start. If this option isn't specified then the whole name-space will be migrated. PNFS uses some directories for accounting purposes, so migrating these is unnecessary and distracting. Most people have the dCache portion of the name-space starting with the directory /pnfs, so this would be a natural point to use.

To discover <source>, PNFS must be running and mounted. The ID is discoverable using the id dot-command; e.g., for the file example-file.txt, run 'cat .(id)(example-file.txt)' and the ID is returned. The following illustrates how this may be achieved:

# Start PNFS
root> /opt/pnfs/tools/pnfs.server start

# Mount PNFS
root> mount -overs=2,udp,noac localhost:/fs /pnfs/fs

# Identify the PNFS ID of the atlas directory
root> cat '/pnfs/example.org/data/.(id)(atlas)'
000200000000000000001060

So, to migrate everything underneath the /pnfs/example.org/data/atlas directory, pnfsDump is run with the option -r 000200000000000000001060.

The -v options control how much information pnfsDump produces. Without any -v, only critical errors are reported. A single -v provides some summary information, two -vs (as above) provide information about non-critical warnings, and three -vs provide periodic information about the speed of processing PNFS.

The -d0 option is described above.

Generating the SQL

In the above template for running pnfsDump, there is a section -o chimera /tmp/chimera.sql -2 -p <dest>.

The Chimera database schema has changed once since Chimera was first made available: between the dCache v1.9.1 series of releases and the v1.9.2 and later series. Up to and including all dCache v1.9.1 releases, the Chimera supplied with dCache used "Chimera Schema v1". The Chimera that comes with dCache v1.9.2-1 and later versions uses "Chimera Schema v2".

The two schemata differ only subtly but, when generating the SQL, pnfsDump must know which Chimera schema is being used. This is specified by using the "-1" or "-2" option for Chimera Schema v1 and Chimera Schema v2 respectively. In the examples below, Chimera Schema v2 will be assumed (so, "-2").

In the above, <dest> is the ID of a directory in Chimera; the migration will copy all entries underneath (the directory corresponding to) <source> (specified as -r <source> in the general options) so that these entries appear in Chimera under (the directory corresponding to) <dest>.

To discover <dest>, mount Chimera and use the same id dot-command; for example:

# Unmount PNFS
root> umount /pnfs/fs

# Stop PNFS
root> /opt/pnfs/tools/pnfs.server stop

# Start Chimera
root> /opt/d-cache/libexec/chimera/chimera-nfs-run.sh start

# Mount Chimera
root> mount localhost:/pnfs /pnfs

# Create destination directory
root> mkdir -p /pnfs/example.org/data/atlas

# Ensure destination directory has same permissions as in PNFS
root> chmod ??? /pnfs/example.org/data
root> chmod ??? /pnfs/example.org/data/atlas

# Discover the directory's ID
root> cat '/pnfs/example.org/data/.(id)(atlas)'
000093FDA63E6DD04C85BC20256A22C7DD9B

# Unmount Chimera
root> umount /pnfs

# Stop Chimera
root> /opt/d-cache/libexec/chimera/chimera-nfs-run.sh stop

# Start PNFS (for pnfsDump)
root> /opt/pnfs/tools/pnfs.server start

Given the ID of the destination directory of 000093FDA63E6DD04C85BC20256A22C7DD9B, the Chimera SQL <output defn> would look like

-o chimera /tmp/chimera.sql -2 -p 000093FDA63E6DD04C85BC20256A22C7DD9B

In the above example, the destination directory is /pnfs/example.org/data/atlas. In general the destination directory may be any directory within Chimera; however, entries that are registered in an external catalogue (LFC or experiment-specific catalogue) cannot be moved without altering all corresponding catalogue entries. For this reason, it is generally impractical to migrate entries from PNFS to a different directory within Chimera.

The following is typical SQL generated by pnfsDump (as the file /tmp/chimera.sql in the above example):

---
--- BEGIN of Dump
---
--- Output generated by pnfsDump v1.0.19
---
--- using command-line
---
---     /opt/pnfs/tools/pnfsDump -vv -o/tmp/pnfs2chimera.sql -d0 -r 000200000000000000001060 chimera -2 -p 000093FDA63E6DD04C85BC20256A22C7DD9B
---
--- taken on Fri Jan 30 18:05:48 2009
---

BEGIN;
    INSERT INTO t_tags_inodes VALUES('000200000000000000001080',32768,1,0,0,16,to_timestamp(1181230321),to_timestamp(1181230321),to_timestamp(1181230321), E'StoreName atlas\012');
    SELECT update_tag('000093FDA63E6DD04C85BC20256A22C7DD9B','OSMTemplate','000200000000000000001080',1);


BEGIN;
    INSERT INTO t_tags_inodes VALUES('000200000000000000001088',32768,1,0,0,10,to_timestamp(1181230321),to_timestamp(1182266735),to_timestamp(1182266735), E'generated\012');
    SELECT update_tag('000093FDA63E6DD04C85BC20256A22C7DD9B','sGroup','000200000000000000001088',1);

Generating the output for md5sum(1) verification

To generate output suitable for the md5sum verification, the following <output defn> is needed

-o verify /tmp/verify.md5 -r

The output (/tmp/verify.md5) when pnfsDump is run like this is:

#
#              pnfsDump md5sum verification script
#              -----------------------------------
#

#  Generated using pnfsDump v1.0.19 on Sat Jan 31 09:37:31 2009
#
#  Command-line:
#
#     /opt/pnfs/tools/pnfsDump -vv -o/tmp/pnfs-md5sum-verify -d0 -r 000200000000000000001060 verify -r
#
#  To verify, make sure the namespace is mounted (for example,
#  localhost:/pnfs mounted at /pnfs) and, if the PNFS root
#  directory is /pnfs/path/to/root, run:
#
#      cd /pnfs/path/to/root
#      md5sum -c this-file | grep -v ': OK$'
#
#  Where "this-file" is the path to this file.
#
#  This test is successful if the line above (starting "md5sum -c ...")
#  produces no output.
#
#  Some additional statistics are available at the end of this file.
#
55e9558b5b5f60f098563f6be07baef7  .(tag)(OSMTemplate)
329ffba8af828e6cff655df25b259694  .(tag)(sGroup)
805f908ef8bae91a09fff14e20b8b430  .(id)(test)
0d5433f15743d2851360b6a7b4b966b7  .(use)(2)(test)
17fcab618de7b4041a72d60c249713ac  .(id)(dump-atlas.syncat-20080911)
152d5d42f3078b53055ccba5164930dc  .(use)(2)(dump-atlas.syncat-20080911)
3d3ad5a79ffdfb20982f24fbb08c2cc0  .(id)(dump-atlas.syncat-20080930)
fde6d18c4b5a57054c8299c9351776ba  .(use)(2)(dump-atlas.syncat-20080930)
61950e90848d04e5295ca72fa44492a3  .(id)(archive)
61a18e2aebb6333e774ebde486ce0773  archive/.(tag)(OSMTemplate)
23600b0278e5bf8eceb59eae30571a29  archive/.(tag)(sGroup)
[many other similar lines follow]

Details on how to use this file are given below.

Generating the output for StorageInfo verification

The following command illustrates the <output defn> needed to generate the file necessary for the StorageInfo verification:

-o files /tmp/files.lst -f

Typical output when pnfsDump is run like this is:

00020000000000000001C808
00020000000000000001C910
00020000000000000001CCF0
000C000000000000000010C8
000D0000000000000019ABF0
000D0000000000000019E0A8
000D00000000000000193698
[many other similar lines follow]

Details on how to use this file are below.
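As a quick sanity check before using this list, one can verify that every line looks like a PNFS ID; a sketch, assuming the 24-character upper-case hexadecimal format shown in the sample above:

```shell
# Count lines that do not look like a 24-character upper-case hex PNFS ID.
# ("|| true" keeps the pipeline benign when the file is absent or clean.)
bad=$(grep -cvE '^[0-9A-F]{24}$' /tmp/files.lst 2>/dev/null || true)
echo "Malformed lines: ${bad:-file not found}"
```

A non-zero count suggests the file was truncated or corrupted and pnfsDump should be re-run.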

3. Injecting the SQL

Once pnfsDump has completed, the SQL is ready to be inserted into Chimera. The following command demonstrates how this may be achieved, assuming the database used to store the Chimera information is called chimera and the PostgreSQL user postgres exists; both values are the defaults.

psql -U postgres -f /tmp/chimera.sql chimera

This process should be left to run to completion. Depending on the size of the namespace, this may take several hours.

The SQL is designed to be safe: you may interrupt the insertion process and the Chimera namespace should remain consistent. Rerunning the command (so injecting the complete SQL file) at a later date is also safe.
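Since the injection can take hours and is safe to interrupt and re-run, it can be convenient to run it detached from the terminal with its output logged. A minimal sketch (the log path is arbitrary):

```shell
# Run the import detached from the terminal, logging all output;
# if interrupted, re-running the same command later is safe.
nohup psql -U postgres -f /tmp/chimera.sql chimera > /tmp/chimera-import.log 2>&1 &

# Progress can then be followed with: tail -f /tmp/chimera-import.log
```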

Note: this process has been observed to be very slow when using PostgreSQL 8.2 (and probably with all versions below 8.3). There is also some indication that running within a virtualised environment can greatly slow down the injection process.

4. Convert StorageInfo locations to URIs

If dCache is used to store files in an HSM then some information is held in one of the PNFS levels (typically, level 1). The format of this information may be specific to the script that interacts with the HSM, but there is a de facto standard for OSM and TSM information.

With Chimera, the location of a file on tape is described by one or more URIs: one for each tape location. Therefore, there must be a migration process that converts the StorageInfo data into the URIs.

Some example scripts, such as osm2chimera.sql, are available. The following is an example of this SQL procedure being run:

#  Convert how StorageInfo locations are stored.
psql -U postgres -f osm2chimera.sql chimera
echo "select osm2chimera();" | psql -U postgres chimera

NB. This process is only needed if an HSM backend is in use. For disk-only dCache instances this step may be skipped.

5. Verifying that the migration was successful

There are two methods to check that the migration was successful: the md5sum check and the StorageInfo check.

The md5sum(1) check

This check will use the command md5sum(1) on the mounted filesystem to verify the contents. This command has two modes of operation: it can generate MD5-checksum values and it can verify the integrity of files given a list of files and their corresponding MD5-checksums. This check uses the second mode based on the output generated by pnfsDump.

The verify output from pnfsDump lists various "dot commands" that emit elements of the PNFS structure, along with the expected MD5-checksum values. The check verifies:

  • the IDs of all files, directories and symbolic-links,
  • the contents of all stored data (the "levels") for each file,
  • that each directory has the correct number of correctly named tags,
  • the value of the tags stored in each directory.

To run these tests, simply invoke the md5sum command with the -c option and the file generated by running pnfsDump with the verify output (/tmp/verify.md5 in the above example). Unfortunately, md5sum will generate a line of output for each file that passes. Since the majority will succeed (we hope!), it is better to filter out those OK messages so any error messages are not lost.

The following illustrates how this may be done:

# Stop PNFS, if running.
/opt/pnfs/tools/pnfs.server stop

# Start Chimera
/opt/d-cache/libexec/chimera/chimera-nfs-run.sh start

# Mount Chimera
mount localhost:/pnfs /pnfs

# Change directory to <dest>
cd /pnfs/example.org/data/atlas

# Run md5sum -c, but filter out the results that are OK.
md5sum -c /tmp/verify.md5 | grep -v ': OK$'

When run like this, the check is successful if no output is generated.
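The same filter can also be turned into a numeric summary; a sketch, assuming the verification file from the earlier example:

```shell
# Count entries that did NOT verify; 0 means the migrated namespace
# matches PNFS.  ("|| true": grep exits non-zero when every line is OK.)
failures=$(md5sum -c /tmp/verify.md5 2>/dev/null | grep -vc ': OK$' || true)
echo "Failing entries: $failures"
```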

The StorageInfo check

NB You will need dCache v1.9.5-17 or later to run this test. Using an earlier version of dCache will result in all files failing.

The StorageInfo check uses the components of dCache to read, for each file in a supplied list, the information from both Chimera and PNFS, and to verify that dCache will behave identically when supplied with the Chimera namespace instead of the PNFS namespace.

When using the Chimera namespace, dCache extracts data directly from the chimera PostgreSQL database. When using PNFS, the PNFS file-system must be mounted. Therefore, to run the StorageInfo verification, the PNFS daemons must be running and the PNFS NFS file-system mounted; the Chimera NFS daemons must not be running, so that the PNFS NFS daemons can run.

The configuration file /opt/d-cache/config/chimera-config.xml describes how dCache contacts the PostgreSQL server that holds the Chimera data. If the database is established on a remote machine or with a non-default database name, this file must be adjusted accordingly.

As of dCache v1.9.5-11, the code to run the StorageInfo check is supplied with dCache (previously, it was supplied separately). To conduct the tests, run the migration-check.sh script (located in /opt/d-cache/libexec) and supply the file-name containing the list of file IDs. For example:

/opt/d-cache/libexec/migration-check.sh /tmp/files.lst

The output will show progress by listing line numbers and dots. If any discrepancy is found, it is reported and the check halts. If you wish the check to process all of the supplied list whether or not errors are found, the -k option may be supplied:

/opt/d-cache/libexec/migration-check.sh -k /tmp/files.lst 2> errors.txt

A list of available command-line options is printed if no arguments are given to migration-check.sh.
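When running with -k, the discrepancy reports collected on stderr can be summarised afterwards. A sketch (file names are arbitrary; it assumes discrepancies are reported one per line, and the "|| true" lets the summary run even if the check exits non-zero on finding errors):

```shell
# Check the whole list, keeping any discrepancy reports from stderr.
/opt/d-cache/libexec/migration-check.sh -k /tmp/files.lst 2> /tmp/storageinfo-errors.txt || true

# Summarise afterwards; zero reported lines means no discrepancies.
echo "Discrepancy lines: $(wc -l < /tmp/storageinfo-errors.txt)"
```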

6. Establish cache-location information

After completing the above steps, Chimera will have the namespace and tape storage information. However, it will not have any information about which files are stored on the pools.

There are two ways to add this information to Chimera: have the pools register their contents or migrate the information from PNFS's companion database.

Option 1: pools register their contents

The process of registering pools requires a working dCache instance.

Connect to the dCache instance via the admin interface and instruct each pool to register its files with the cache-location storage (i.e., Chimera). This is done with the pnfs register command.

ssh -c blowfish -p 22223 admin@admin.example.org

    dCache Admin (VII) (user=admin)


[admin.example.org] (local) admin > cd pool_1
[admin.example.org] (pool_1) admin > pnfs register
[admin.example.org] (pool_1) admin > ..
[admin.example.org] (local) admin > cd pool_2
[admin.example.org] (pool_2) admin > pnfs register
[admin.example.org] (pool_2) admin > ..
[admin.example.org] (local) admin > cd pool_3
[admin.example.org] (pool_3) admin > pnfs register
...etc...
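For instances with many pools, the per-pool commands can be generated and piped into the admin interface in one session. A hedged sketch — the pool names are examples, and it assumes the ssh admin interface accepts commands on standard input as in the transcript above:

```shell
# Build the command stream: visit each pool in turn and issue "pnfs register".
build_register_commands() {
    for pool in "$@"; do
        printf 'cd %s\npnfs register\n..\n' "$pool"
    done
    echo 'logoff'
}

# Pipe the commands into the admin interface in one session
# (substitute the real admin host and pool names):
# build_register_commands pool_1 pool_2 pool_3 | ssh -c blowfish -p 22223 admin@admin.example.org
```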

Option 2: import companion into Chimera

Instead of the lengthy manual process of pool registration, the cache-location information may be imported from the companion database directly into the Chimera tables.

a. Make a simple copy of the cache-location data

Copy the companion data into the chimera database.

pg_dump -U postgres -t cacheinfo companion | psql -U postgres chimera

b. Convert the format

With the help of the conversion script companion2chimera.sql, populate the t_locationinfo table with the imported data:

psql -U postgres -f companion2chimera.sql chimera
echo "select companion2chimera();" | psql -U postgres chimera

c. Drop cacheinfo table

Once the t_locationinfo table has been populated, the companion's cacheinfo table is no longer needed.

echo "drop table cacheinfo;" |  psql -U postgres chimera
