Last modified on 05/11/09 12:21:30

pnfsDump and Directory Iterators

This page describes Directory Iterators (DIs) in pnfsDump and their implications for PNFS-to-Chimera migration. Information on DIs is also available in the pnfsDump documentation (/opt/pnfs/docs/README.pnfsdump).

Summary: if you have not yet migrated to Chimera, make sure you are using the latest version of pnfsDump. If you have migrated and your PNFS database has no inconsistencies then all files will have been migrated. If your PNFS database was in an inconsistent state then there is a (small) risk that a (very small) number of files have not been migrated. These additional files are now reachable because pnfsDump includes support for overcoming limitations in PNFS's error recovery. A method for migrating those files is described below.


What is a DI?

The pnfsDump tool scans a PNFS instance by first listing the files and directories in a directory. This process is repeated for each sub-directory that was discovered, and so on. By doing this, it scans the complete name-space.

The Directory Iterator (DI) is the method pnfsDump uses to obtain a list of a directory's contents: the files, sub-directories and symbolic links. It has the same effect as doing 'ls' on a directory.
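
The recursive scan described above can be sketched in shell. This is an illustration of the idea only, not pnfsDump's implementation: the hypothetical list_dir function plays the role of the DI (here it simply calls 'ls'), and the hypothetical scan function walks the tree using whatever list_dir returns. File names containing newlines are not handled.

```shell
# Illustration only: not pnfsDump code. list_dir and scan are hypothetical names.

# The "directory iterator": return the names found in one directory.
# Swapping this function changes how directories are listed without
# touching the tree walk below.
list_dir() {
    ls -A "$1"
}

# Walk a tree: list a directory, print each entry, then repeat for
# every sub-directory found. The body runs in a subshell, ( ... ),
# so each level of recursion keeps its own variables.
scan() (
    dir=$1
    list_dir "$dir" | while IFS= read -r name; do
        path="$dir/$name"
        printf '%s\n' "$path"
        # Descend into real directories only; do not follow symbolic links.
        if [ -d "$path" ] && [ ! -L "$path" ]; then
            scan "$path"
        fi
    done
)
```

Because the tree walk only ever calls list_dir, a different listing method could be substituted without changing scan; this is the same separation that lets pnfsDump offer more than one DI.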

What DIs are available?

Since v1.0.12 pnfsDump supports two DIs: readdir and dbscan.

The readdir DI uses PNFS's support for the 'ls' command: typing 'ls' within a mounted directory and the readdir DI both rely on the same code in PNFS. The readdir DI has always been available in pnfsDump: prior to v1.0.12 it was the only DI available.

With the dbscan DI pnfsDump reads the PNFS database records and analyses them directly. This requires pnfsDump to have knowledge of how PNFS structures directories. The resulting directory listing is independent of PNFS's support for the 'ls' command.

The advantage of the dbscan DI is that it is more robust: it gives better error reporting and may be able to recover more data if the PNFS database is in an inconsistent state. The disadvantage of the dbscan DI is that it may be slower.

How are DIs specified?

The DI may be specified explicitly on the command line using the "-i" generic option; for example, the option "-i readdir" tells pnfsDump to use the readdir DI:

pnfsDump -vv -d0 -i readdir -o/tmp/output files

If no "-i" option is specified then a default DI is used. Prior to v1.0.13 the default DI was readdir. With v1.0.13 and later, the default DI is dbscan.
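
Because the default changed between versions, a script that must behave the same everywhere can state the DI explicitly. A minimal sketch, assuming only the "-i" option described above; the wrapper name run_pnfsdump is hypothetical and is not part of pnfsDump:

```shell
# Hypothetical wrapper (not shipped with pnfsDump): always pass "-i"
# so the DI in use never depends on the installed version's default.
run_pnfsdump() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_pnfsdump DI [pnfsDump options...]" >&2
        return 2
    fi
    di=$1
    shift
    case $di in
        readdir|dbscan|dbscan/) ;;   # DI names that appear on this page
        *)
            echo "run_pnfsdump: unknown DI '$di'" >&2
            return 2
            ;;
    esac
    # Everything after the DI name is handed to pnfsDump unchanged.
    pnfsDump -i "$di" "$@"
}
```

For example, "run_pnfsdump readdir -vv -d0 -o/tmp/output files" runs pnfsDump with the same options as the command shown above, with "-i readdir" placed first.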

Why implement different DIs?

It's possible for a PNFS instance to have subtle inconsistencies in its database that do not affect dCache's ability to serve files yet do affect directory listing.

A file that suffers from this problem will be present when explicitly requested:

paul@zitpcx6184> ls -l foo
-rw-r--r-- 1 paul paul 177240 2009-04-12 20:42 foo

However, the file will not be present when asking for a directory list:

paul@zitpcx6184> ls -l |grep foo

dCache read and write operations are unaffected by this because dCache doesn't do 'ls' on the directory; therefore, it is unaware of the inconsistency.
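
The two 'ls' commands above can be combined into a single check for one known file name. This is a sketch under the assumption that the name-space is mounted and the file name is already known; the function name hidden_from_listing is hypothetical:

```shell
# Hypothetical helper: report whether NAME can be reached directly in DIR
# but is missing from DIR's listing (the symptom shown above).
# Exit status: 0 = reachable but not listed,
#              1 = reachable and listed (the normal case),
#              2 = not reachable at all.
hidden_from_listing() {
    dir=$1
    name=$2
    # Direct lookup by name; no directory listing is involved.
    [ -e "$dir/$name" ] || return 2
    # Lookup through the directory listing; -F -x matches the whole
    # name literally rather than as a pattern.
    if ls -a "$dir" | grep -F -x -q -- "$name"; then
        return 1
    fi
    return 0
}
```

For example, "hidden_from_listing . foo && echo hidden" prints "hidden" only when foo exists but does not appear in the listing of the current directory. It cannot discover such files by itself, since their names do not show up in any listing.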

What are the implications for migration?

If the PNFS databases are in a consistent state then there are no issues for migration.

If the readdir DI is used to migrate a PNFS instance in an inconsistent state then there may be some files that were not migrated.

Our experience has shown that the number of entries not migrated is very low. NDGF have investigated this and found that no files failed to be migrated due to the readdir DI issue; an investigation of the DESY ATLAS instance found that less than 0.014% of the files were affected.

How do I discover if the database is inconsistent?

The "paranoid mode" (the "-p" option) has been modified to check both the readdir and dbscan DI methods and compare the available files. If there is a discrepancy between the two DI methods then it is reported (with "-vv" verbosity). This makes paranoid mode rather slow.

Since v1.0.12, pnfsDump has a "difference" selection mode. When this selection mode is chosen, the output will only include those entries present in one DI and absent from the other DI. The difference mode is selected by appending '/' to the DI name; for example, specifying "-i dbscan/" on the command-line selects all entries present in dbscan that are absent in readdir.

The following command obtains a simple list of files that the dbscan DI finds that the readdir DI fails to find:

pnfsDump -i dbscan/ -vv -o/tmp/files-diff.txt -d0 files
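
Conceptually, the difference selection is a set difference between two listings. The sketch below illustrates that idea on two saved listings, for instance one produced with "-i dbscan" and one with "-i readdir". It assumes each listing holds exactly one entry per line, which is an assumption about the output rather than something stated here, and the function name entries_only_in is hypothetical. It is not a substitute for "-i dbscan/".

```shell
# Hypothetical helper: print the lines that appear in the first file
# but not in the second (a set difference on whole lines).
entries_only_in() {
    first_sorted=$(mktemp) || return 1
    second_sorted=$(mktemp) || { rm -f "$first_sorted"; return 1; }
    # comm needs sorted input; -u also drops duplicate lines.
    sort -u "$1" > "$first_sorted"
    sort -u "$2" > "$second_sorted"
    # -2 hides lines unique to the second file, -3 hides common lines,
    # leaving only the lines unique to the first file.
    comm -23 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}
```

With a dbscan listing as the first argument and a readdir listing as the second, the output corresponds to the entries that "-i dbscan/" is described as selecting.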

What should I do?

If you have not migrated yet, be sure to use the dbscan DI. In practice, this simply means using the latest version of pnfsDump. In all other respects, the migration procedure remains unchanged.

If you have already migrated you may wish to identify which files are missing: there may be none! The "files" output with the difference selection mode will give a list of these files. If you limited the migration to a particular directory (the "-r" generic option) then you may specify that here, too:

pnfsDump -r000200000000000000001060 -i dbscan/ -o/tmp/missing-files -d0 files

If there are missing files, they may be migrated by specifying the chimera output, as with the previous migration, but with the "-i dbscan/" option added. For example, if the migration output was generated using the command:

pnfsDump -r000200000000000000001060 -d0 -o/tmp/chimera.sql chimera -p0000CA087D4435E241509D5A6A92A0E3577A

then the following command will generate the SQL to migrate the additional files:

pnfsDump -i dbscan/ -r000200000000000000001060 -d0 -o/tmp/chimera-diff.sql chimera -p0000CA087D4435E241509D5A6A92A0E3577A

The resulting SQL file (/tmp/chimera-diff.sql) may be injected into Chimera as with the regular migration process.

If you don't remember the command used to generate the Chimera output, it is recorded in the first few lines of the SQL output.

As always, be sure to take a backup before injecting data into the Chimera database.
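
One way to make the backup step hard to forget is to tie it to the injection. The sketch below assumes the Chimera database is held in PostgreSQL and uses the standard pg_dump and psql tools; the database name, any connection options (user, host) and the function name inject_with_backup are site-specific or hypothetical, and your regular migration procedure takes precedence over this outline.

```shell
# Hypothetical helper: back up the database first, and inject the SQL
# only if the backup succeeded.
inject_with_backup() {
    db=$1       # name of the Chimera database (site-specific)
    sql=$2      # SQL file produced by pnfsDump
    backup=$3   # where to write the backup
    if ! pg_dump "$db" > "$backup"; then
        echo "backup of '$db' failed; nothing injected" >&2
        return 1
    fi
    # Stop at the first SQL error rather than continuing past it.
    psql -v ON_ERROR_STOP=1 -d "$db" -f "$sql"
}
```

For example, "inject_with_backup chimera /tmp/chimera-diff.sql /tmp/chimera-backup.sql" would dump a database named chimera and then inject the difference file; the database name and backup path here are placeholders.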