wiki:gridftp-checksum-proposal
Last modified on 05/01/07 12:17:01

Changes suggested by Andrew

Progress as of May 1 (Andrew)

I have committed the first round of changes to enable dynamic checksumming. All changes are on the gridftp2_checksums branch of the util, pools, movers, doors and vehicles subpackages.

cvs up -r gridftp2_checksums util pools movers doors vehicles

For the moment, I have used the plain file system to stub out storage of the different checksum types. That is how I tested the changes in integration mode.

Next, I'm going to look into the CoG toolkit and the changes needed there to enable end-to-end testing of the gridftp2 checksumming features. None of that depends on multi-checksum-type support being implemented in pnfs. However, the deliverable as a whole will clearly require that missing piece. I don't think we've outlined the interface for the pnfs part. It might be a good idea to do that during upcoming meetings.

Goal : have the pool/mover code support the checksum type requested by the client, and be able to propagate the checksum algorithm from the door to the pool/mover.

Phase I

New class : ChecksumFactory

  • abstract factory that encapsulates creation of the factory object responsible for constructing, retrieving (and storing) checksums of a particular type. The motivation for the class is to give us a degree of freedom in how we build and retrieve checksum objects (it also offloads checksum serialization from pools/ChecksumModuleV5.java and doors/AbstractGridftp.java)
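To make the idea concrete, here is a rough sketch of what such a factory could look like. It covers only checksum construction (not the pnfs retrieval and storage part), and every name in it (ChecksumFactorySketch, forType, compute, the two subclasses) is made up for illustration; it is not the code on the gridftp2_checksums branch.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.zip.Adler32;

// Illustrative only: class and method names here are assumptions,
// not the API committed on the gridftp2_checksums branch.
abstract class ChecksumFactorySketch {

    // Name of the checksum type this factory handles, e.g. "ADLER32".
    abstract String getType();

    // Compute the checksum of the given bytes as a lowercase hex string.
    abstract String compute(byte[] data);

    // Pick the concrete factory from the type string sent by the client.
    static ChecksumFactorySketch forType(String type) {
        switch (type.toUpperCase(Locale.ROOT)) {
            case "ADLER32":
                return new Adler32Factory();
            case "MD5":
                return new Md5Factory();
            default:
                throw new IllegalArgumentException("Unsupported checksum type: " + type);
        }
    }
}

class Adler32Factory extends ChecksumFactorySketch {
    @Override String getType() { return "ADLER32"; }

    @Override String compute(byte[] data) {
        Adler32 adler = new Adler32();
        adler.update(data, 0, data.length);
        // Adler-32 is a 32-bit value, so pad to 8 hex digits.
        return String.format("%08x", adler.getValue());
    }
}

class Md5Factory extends ChecksumFactorySketch {
    @Override String getType() { return "MD5"; }

    @Override String compute(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                // Mask to avoid sign extension of negative bytes.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With this shape, a caller that only knows the client's type string would write something like ChecksumFactorySketch.forType("adler32").compute(bytes) and never touch the concrete checksum class.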

Changed classes: AbstractGridftpDoor

  • uses the checksum factory to store checksums in pnfs
  • sets the checksum type string in GFtpProtocolInfo when requesting file storage

GFtpProtocolInfo.java

  • sets / gets the checksum type as a string (could use the factory itself)

ChecksumMover

  • new method : getChecksumFactory(ProtocolInfo protocol) -- encapsulates casting of the ProtocolInfo within the mover implementation

MultiProtocolPools

  • use getChecksumFactory to obtain the factory object that will be used to transparently build and store the correct checksum type
  • setDigest with a checksum created via the above factory object
  • pass the factory object to ChecksumModuleV5.setMoverChecksum

ChecksumModuleV5

  • setMoverChecksum : override the default checksum creation with the factory object from above -- store and retrieve the pnfs checksum using that object as needed

Implementations of the ChecksumMover

  • trivial getChecksumFactory override that casts ProtocolInfo to GFtpProtocolInfo and returns the factory if available
  • possible improvement : new class ChecksumMoverImpl to take over checksum-related ops. Reference ChecksumMoverImpl from each mover implementation (eliminates code duplication; pimpl concept)
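A minimal sketch of the cast-and-delegate idea, assuming heavily simplified stand-ins for the real classes. ProtocolInfo and GFtpProtocolInfo below are stubs that only mimic the names used above; OtherProtocolInfo, ChecksumFactoryStub, GFtpMoverSketch and the getChecksumType accessor are hypothetical and exist only to keep the example self-contained.

```java
// Stand-ins for the types named above; the real classes carry far more state.
interface ProtocolInfo {}

class GFtpProtocolInfo implements ProtocolInfo {
    private final String checksumType; // null when the client requested none

    GFtpProtocolInfo(String checksumType) { this.checksumType = checksumType; }

    // Hypothetical accessor for the checksum type string set by the door.
    String getChecksumType() { return checksumType; }
}

// Any non-gridftp protocol, used here to show the "not available" path.
class OtherProtocolInfo implements ProtocolInfo {}

// Placeholder for the real factory; it only remembers the requested type.
class ChecksumFactoryStub {
    final String type;

    ChecksumFactoryStub(String type) { this.type = type; }
}

// Shared helper (the "ChecksumMoverImpl" idea): the cast lives in one place.
class ChecksumMoverImpl {
    ChecksumFactoryStub getChecksumFactory(ProtocolInfo protocol) {
        if (!(protocol instanceof GFtpProtocolInfo)) {
            return null; // protocol carries no checksum type
        }
        String type = ((GFtpProtocolInfo) protocol).getChecksumType();
        return type == null ? null : new ChecksumFactoryStub(type);
    }
}

// Each mover implementation just delegates instead of repeating the cast.
class GFtpMoverSketch {
    private final ChecksumMoverImpl checksums = new ChecksumMoverImpl();

    ChecksumFactoryStub getChecksumFactory(ProtocolInfo protocol) {
        return checksums.getChecksumFactory(protocol);
    }
}
```

The point of the delegate is that a second mover would hold its own ChecksumMoverImpl and forward to it in the same one-line way, so the instanceof check and cast are written once.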

Issues :

Implementations of the abstract ChecksumFactory have to know how to set / retrieve checksums of a particular type in pnfs. The existing mechanism assumes a single flag which, at the moment, is not designed to hold multiple checksum values. The reason: if the flag is a container for several checksum types, each store attempt has to retrieve its full value just to change the relevant subpart, so the store operation becomes non-atomic and thus easy prey for a race condition.
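The lost-update problem can be shown with a small single-threaded simulation. Everything here is an assumption made for illustration: FlagStoreSketch is an in-memory stand-in for the pnfs flag, and the "TYPE=value;TYPE=value" encoding is invented, not the format pnfs uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for a single pnfs flag holding several checksums.
// The "TYPE=value;TYPE=value" encoding is an assumption for this sketch.
class FlagStoreSketch {
    private String flag = "";

    String read() { return flag; }

    void write(String value) { flag = value; }

    // Return a copy of the flag value with one checksum type added or replaced.
    static String withChecksum(String current, String type, String value) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String part : current.split(";")) {
            if (part.isEmpty()) {
                continue; // an empty flag splits into a single empty string
            }
            int eq = part.indexOf('=');
            entries.put(part.substring(0, eq), part.substring(eq + 1));
        }
        entries.put(type, value);

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Two writers both read before either writes: the second write
    // is based on a stale value and silently drops the first checksum.
    static String lostUpdateDemo() {
        FlagStoreSketch store = new FlagStoreSketch();
        String seenByA = store.read();
        String seenByB = store.read();
        store.write(withChecksum(seenByA, "ADLER32", "11e60398"));
        store.write(withChecksum(seenByB, "MD5", "d41d8cd98f00b204e9800998ecf8427e"));
        return store.read();
    }
}
```

After lostUpdateDemo() the flag holds only the MD5 entry; the ADLER32 value written by the first writer is gone. Avoiding that would need either one flag per checksum type or some form of locking or compare-and-set around the read-modify-write.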

Config tweaks to force checksum checks even if mover config does not enable them explicitly.

Phase II

More changes to the AbstractGridftp (and gt4client) classes to allow for the new gftp2 checksum ops.

  • gt4 client work