Last modified on 11/06/09 19:07:27

Generating output with pnfsDump < 1.0.19

Since release v1.0.19, pnfsDump has been able to generate all the needed output in a single pass. With earlier releases, pnfsDump must be run three times to generate the same output.

The procedure for using pnfsDump v1.0.19 or later is described within the overall description of the migration process. The instructions for using pnfsDump v1.0.18 or earlier are kept on this separate page for reference.


To successfully migrate the name-space, you will need:

  1. the appropriate SQL for updating Chimera so it contains the PNFS name-space (required),
  2. output suitable to verify the migration using the md5sum(1) program (optional),
  3. a list of file PNFS IDs suitable for the StorageInfo verification (optional).

All three outputs can be generated using the pnfsDump utility. This utility talks directly to the shared memory of PNFS and so does not use the NFS interface. You may run pnfsDump without mounting PNFS, but the utility must be run on whichever machine is running the dbserver daemons. The README file supplied with pnfsDump (/opt/pnfs/docs/README.pnfsDump) contains a complete description of how to use pnfsDump; READ THIS FILE.

NB The examples below show pnfsDump with the -d0 option. This switches off a safety delay, making pnfsDump as fast as possible. Doing this is not recommended for production systems. Simply removing the -d option will result in the default value being used, which should be safe; however, since the namespace should be migrated during a down-time, this is only useful when practising the procedure.

Only the Chimera SQL is required; however, it is strongly recommended that some verification is done to confirm that the migration was successful. There are two (complementary) verifications that may be done: the first uses the md5sum(1) application and the second uses dCache to verify the StorageInfo for each file. Both checks are described in more detail below.

To support these verification steps, additional output files are needed. To generate these additional files, pnfsDump must be run with different output formats: replace the chimera keyword and its subsequent options (used for the SQL output) with the keyword and options for the desired format, and be sure to change the file in which the output is stored (the -o option).
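
As a rough sketch of how the three invocations relate, the following shell function prints (but does not run) the three command lines described in the sections below for a given source and destination ID. The function name print_dump_commands is made up for this illustration; the paths, keywords and options are the ones used in the examples on this page, including -d0 and Chimera Schema v2 (-2), which may not suit your site.

```shell
# Hypothetical helper: print the three pnfsDump command lines for one
# source directory. Nothing is executed; review the output before use.
# $1 = PNFS ID of the source directory, $2 = Chimera ID of the destination.
print_dump_commands() {
    src=$1
    dest=$2
    dump=/opt/pnfs/tools/pnfsDump

    # Only the output file (-o) and the trailing format keyword with its
    # options differ between the three runs.
    printf '%s\n' "$dump -r $src -o /tmp/pnfs-2-chimera.sql -vv -d0 chimera -2 -p $dest"
    printf '%s\n' "$dump -r $src -o /tmp/pnfs-verify-md5sum -vv -d0 verify -r"
    printf '%s\n' "$dump -r $src -o /tmp/pnfs-verify-storageinfo -vv -d0 files -f"
}
```

For example, print_dump_commands 000200000000000000001060 <dest> lists the commands for the atlas directory used later on this page.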

The three sections below describe how to run pnfsDump to generate the SQL to inject into Chimera and how to generate output suitable for the two verification processes. These three invocations of pnfsDump may be run concurrently; however, the underlying database may not be fast enough to cope with the demand. If so, you will see SCL_TIMEOUT errors (error code -335). If you encounter this problem, try either running fewer concurrent pnfsDump instances or increasing the delay (-d) to a larger number. The following is an example of an SCL_TIMEOUT error:

mdmGetRecord failed for id 000F0000000000000128AC08: -335
Failed to obtain inode record; item will be skipped.
        item inode: 000F0000000000000128AC08
        item name: AOD.020566._00270.pool.root.2
        parent directory inode: 000F0000000000000126E210

If you see errors like this, the output may be incomplete and so the dump must be repeated.
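
One way to check for this is to keep pnfsDump's messages in a log file and search it for the error code. The sketch below assumes the messages were captured to a file (for example by redirecting both standard output and standard error; which stream pnfsDump uses is not confirmed here) and that each timeout produces a line of the form shown above. The function names are illustrative only.

```shell
# Count SCL_TIMEOUT failures (error code -335) in a captured pnfsDump log.
# Assumption: each failure appears as
#   "mdmGetRecord failed for id <ID>: -335"
# $1 = path of the log file.
count_scl_timeouts() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are none, which is ignored because the printed 0 is the answer.
    grep -c -E 'mdmGetRecord failed for id [0-9A-Fa-f]+: -335[[:space:]]*$' "$1" || true
}

# Succeed only if the log contains no SCL_TIMEOUT failures.
dump_is_complete() {
    [ "$(count_scl_timeouts "$1")" -eq 0 ]
}
```

A run could then be followed by something like: dump_is_complete /tmp/pnfsDump.log || echo "dump must be repeated", where /tmp/pnfsDump.log is a hypothetical log location.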

Generating the SQL

The following command illustrates how the SQL may be generated:

/opt/pnfs/tools/pnfsDump -r <source> -o /tmp/pnfs-2-chimera.sql -vv -d0 chimera -2 -p <dest>

Over the period since Chimera was first available, the Chimera schema has changed once. This was between dCache v1.9.1 series and the v1.9.2 series. Up to and including all dCache v1.9.1 releases the version of Chimera supplied with dCache used "Chimera Schema v1". The Chimera that comes with dCache v1.9.2-1 and later versions of dCache uses "Chimera Schema v2".

The two schemata differ only subtly but, when generating the SQL, pnfsDump must know which Chimera schema is being used. This is specified by using the "-1" or "-2" option for Chimera Schema v1 and Chimera Schema v2 respectively. In the examples below, Chimera Schema v2 will be assumed (so, "-2").
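
The rule above can be expressed as a small shell function. This is only a sketch: the name chimera_schema_flag is hypothetical, and it assumes dCache version strings of the form MAJOR.MINOR.PATCH with an optional leading "v" and an optional "-RELEASE" suffix (for example 1.9.2-1).

```shell
# Hypothetical helper: print the pnfsDump schema option for a dCache version.
# Versions up to and including the 1.9.1 series use Chimera Schema v1 ("-1");
# 1.9.2-1 and later use Chimera Schema v2 ("-2").
chimera_schema_flag() {
    version=${1%%-*}        # drop the release suffix: "1.9.2-1" -> "1.9.2"
    version=${version#v}    # drop a leading "v", if present

    # Split the version on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $version
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    if [ "$major" -gt 1 ] ||
       { [ "$major" -eq 1 ] && [ "$minor" -gt 9 ]; } ||
       { [ "$major" -eq 1 ] && [ "$minor" -eq 9 ] && [ "$patch" -ge 2 ]; }; then
        printf '%s\n' '-2'
    else
        printf '%s\n' '-1'
    fi
}
```

The printed option could then be placed after the chimera keyword on the pnfsDump command line.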

In the above, <source> is the PNFS ID of the source directory in PNFS and <dest> is the ID of the destination directory in Chimera; the migration will copy all entries underneath (the directory corresponding to) <source> so that these entries appear in Chimera under (the directory corresponding to) <dest>. If either is not specified then the corresponding root directory is used.

To discover <source> and <dest>, mount the appropriate filesystem (PNFS or Chimera) and look at the output from the id dot-command; i.e., for file example-file.txt, run cat '.(id)(example-file.txt)' and the ID is returned (the quotes stop the shell from interpreting the parentheses). The following illustrates how this may be achieved when migrating all entries underneath /pnfs/example.org/data/atlas from PNFS to Chimera:

# Start PNFS
root> /opt/pnfs/tools/pnfs.server start

# Mount PNFS
root> mount -overs=2,udp,noac localhost:/fs/pnfs /pnfs

# Identify the PNFS ID of the atlas directory
root> cat '/pnfs/example.org/data/.(id)(atlas)'
000200000000000000001060

# Unmount PNFS
root> umount /pnfs

# Stop PNFS
root> /opt/pnfs/tools/pnfs.server stop

# Start Chimera
root> /opt/d-cache/libexec/chimera/chimera-nfs-run.sh start

# Mount Chimera
root> mount localhost:/pnfs /pnfs

# Create destination directory
root> mkdir -p /pnfs/example.org/data/atlas

# Discover the directory's ID
root> cat '/pnfs/example.org/data/.(id)(atlas)'
000093FDA63E6DD04C85BC202

# Unmount Chimera
root> umount /pnfs

# Stop Chimera
root> /opt/d-cache/libexec/chimera/chimera-nfs-run.sh stop

# Start PNFS
root> /opt/pnfs/tools/pnfs.server start

# Build SQL
/opt/pnfs/tools/pnfsDump -r 000200000000000000001060 -o /tmp/pnfs-2-chimera.sql -vv -d0 chimera -2 -p 000093FDA63E6DD04C85BC202
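
The two ID lookups above follow the same pattern, so they could be wrapped in a small helper. This is only a sketch: the name id_of is hypothetical, and it assumes the relevant namespace (PNFS or Chimera) is currently mounted so that the .(id)(name) dot-command is readable.

```shell
# Hypothetical helper: print the ID of a file or directory by reading the
# ".(id)(<name>)" dot-command in its parent directory.
# $1 = path of the file or directory inside the mounted namespace.
id_of() {
    path=${1%/}                     # drop a trailing slash, if any
    parent=$(dirname "$path")
    name=$(basename "$path")
    # The double quotes keep the shell from interpreting the parentheses.
    cat "$parent/.(id)($name)"
}
```

With PNFS mounted as in the example, id_of /pnfs/example.org/data/atlas would be expected to print the same ID as the corresponding cat command shown above.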

In the above example, the destination directory is /pnfs/example.org/data/atlas. In general the destination directory may be any directory within Chimera; however, entries that are registered in an external catalogue (LFC or experiment-specific catalogue) cannot be moved without altering all corresponding catalogue entries. For this reason, it is generally impractical to migrate entries from PNFS to a different directory within Chimera.

The following is typical SQL generated by pnfsDump:

---
--- BEGIN of Dump
---
--- Output generated by pnfsDump v1.0.7
---
--- using command-line
---
---     /opt/pnfs/tools/pnfsDump -vv -o/tmp/pnfs2chimera.sql -d0 -r 000200000000000000001060 chimera -2 -p 000093FDA63E6DD04C85BC20256A22C7DD9B
---
--- taken on Fri Jan 30 18:05:48 2009
---

BEGIN;
    INSERT INTO t_tags_inodes VALUES('000200000000000000001080',32768,1,0,0,16,to_timestamp(1181230321),to_timestamp(1181230321),
to_timestamp(1181230321), E'StoreName atlas\012');
    SELECT update_tag('000093FDA63E6DD04C85BC20256A22C7DD9B','OSMTemplate','000200000000000000001080',1);


BEGIN;
    INSERT INTO t_tags_inodes VALUES('000200000000000000001088',32768,1,0,0,10,to_timestamp(1181230321),to_timestamp(1182266735),
to_timestamp(1182266735), E'generated\012');
    SELECT update_tag('000093FDA63E6DD04C85BC20256A22C7DD9B','sGroup','000200000000000000001088',1);

Generating the output for md5sum(1) verification

The following command illustrates how to generate the file necessary for the md5sum verification:

/opt/pnfs/tools/pnfsDump -r <source> -o /tmp/pnfs-verify-md5sum -vv -d0 verify -r

The output when pnfsDump is run like this is:

#
#              pnfsDump md5sum verification script
#              -----------------------------------
#

#  Generated using pnfsDump v1.0.7 on Sat Jan 31 09:37:31 2009
#
#  Command-line:
#
#     /opt/pnfs/tools/pnfsDump -vv -o/tmp/pnfs-md5sum-verify -d0 -r 000200000000000000001060 verify -r
#
#  To verify, make sure the namespace is mounted (for example,
#  localhost:/pnfs mounted at /pnfs) and, if the PNFS root
#  directory is /pnfs/path/to/root, run:
#
#      cd /pnfs/path/to/root
#      md5sum -c this-file | grep -v ': OK$'
#
#  Where "this-file" is the path to this file.
#
#  This test is successful if the line above (starting "md5sum -c ...")
#  produces no output.
#
#  Some additional statistics are available at the end of this file.
#
55e9558b5b5f60f098563f6be07baef7  .(tag)(OSMTemplate)
329ffba8af828e6cff655df25b259694  .(tag)(sGroup)
805f908ef8bae91a09fff14e20b8b430  .(id)(test)
0d5433f15743d2851360b6a7b4b966b7  .(use)(2)(test)
17fcab618de7b4041a72d60c249713ac  .(id)(dump-atlas.syncat-20080911)
152d5d42f3078b53055ccba5164930dc  .(use)(2)(dump-atlas.syncat-20080911)
3d3ad5a79ffdfb20982f24fbb08c2cc0  .(id)(dump-atlas.syncat-20080930)
fde6d18c4b5a57054c8299c9351776ba  .(use)(2)(dump-atlas.syncat-20080930)
61950e90848d04e5295ca72fa44492a3  .(id)(archive)
61a18e2aebb6333e774ebde486ce0773  archive/.(tag)(OSMTemplate)
23600b0278e5bf8eceb59eae30571a29  archive/.(tag)(sGroup)
[many other similar lines follow]

Details on how to use this check are given below.
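
The instructions embedded in the generated file can also be wrapped in a small function. The following is a minimal sketch: the name verify_namespace is hypothetical, it assumes GNU md5sum, and the checksum file must be given as an absolute path because the function changes directory before checking.

```shell
# Hypothetical wrapper around the check described in the generated file.
# $1 = mounted directory corresponding to the dumped root
# $2 = absolute path of the file written by "pnfsDump ... verify -r"
# Prints every result line that is not "OK"; returns 0 only if md5sum
# itself reported success. Runs in a subshell so the caller's working
# directory is left unchanged.
verify_namespace() (
    [ -r "$2" ] || exit 2
    cd "$1" || exit 2

    # LC_ALL=C keeps md5sum's "OK" message untranslated so the filter works.
    status=0
    report=$(LC_ALL=C md5sum -c "$2") || status=$?

    if [ -n "$report" ]; then
        printf '%s\n' "$report" | grep -v ': OK$'
    fi
    [ "$status" -eq 0 ]
)
```

As with the manual procedure, the check is successful when nothing is printed and the function returns zero.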

Generating the output for StorageInfo verification

The following command illustrates how to generate the file necessary for the StorageInfo verification:

/opt/pnfs/tools/pnfsDump -r <source> -o /tmp/pnfs-verify-storageinfo -vv -d0 files -f

Typical output when pnfsDump is run like this is:

00020000000000000001C808
00020000000000000001C910
00020000000000000001CCF0
000C000000000000000010C8
000D0000000000000019ABF0
000D0000000000000019E0A8
000D00000000000000193698
[many other similar lines follow]
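
Before using the list for the StorageInfo verification, it can be worth a quick sanity check that the file contains only well-formed IDs. The sketch below assumes PNFS IDs are 24 hexadecimal digits, as in the sample above; the function name check_id_list is hypothetical.

```shell
# Hypothetical sanity check for the file written by "pnfsDump ... files -f".
# Assumption: every line should be a PNFS ID of exactly 24 hexadecimal digits.
# Prints malformed lines (prefixed with their line number) and returns 1 if
# any are found; otherwise prints the number of IDs and returns 0.
# $1 = path of the ID list.
check_id_list() {
    [ -r "$1" ] || return 2
    # grep -v selects the lines that do NOT look like an ID.
    if grep -n -v -E '^[0-9A-Fa-f]{24}$' "$1"; then
        return 1
    fi
    # Count the lines; an empty file is reported as 0 IDs.
    grep -c '' "$1" || true
}
```

For the file generated above, this would be called as check_id_list /tmp/pnfs-verify-storageinfo.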