Last modified on 10/22/08 19:38:01

dCache development phone conference Oct 22

Participants: Paul, Irina, Owen, Gerd, Timur, Vladimir, [Alex?]

Agenda

  • Update on 1.9.0-3 release,
  • Update on 1.9.1 release,
  • Fermi problems with checksums,
  • Problem with gsidcap doors in trunk,
  • Who's maintaining (/opt/d-cache/bin/)dcache?,
  • LIST command to dCache gridftp server (#3394),
  • AOCB.

Update on 1.9.0-3 release

Currently committed changes in 1.9.0-branch since the 1.9.0-2 release:

  • Pin manager: change to timeouts,
  • Added support in dCacheConfigure for DNS VO names,
  • Update to dcache-core dcache-pool scripts,
  • SRM client updates.

Pending work on the 1.9.0 branch

  • fixing inconsistency between dcache script and {dcache-core, dcache-pool},
  • regression fixes in healer.

There was a discussion on changing "AdminDomain" to "AdminDoorDomain" in the dcache script so that it matches the value in dcache-core and dcache-pool.

The immediate plan for dCache-1.9.0 branch is:

  • release 1.9.0-3 now,
  • get functionality into 1.9.0 branch,
  • release 1.9.0-4 "soon": either Friday or Monday next week.

Update on 1.9.1 release

The 1.9.1 release has slipped. Paul apologised for not having reviewed Gerd's patches fast enough. The issue is with the new "metatable"-compatible usage of the GLUE schema, which places different requirements on dCache's info provider: the existing info provider is no longer suitable, and the existing recommendation for how spaces should be published will likely change. However, consensus has not yet been achieved.

Gerd: is this a WLCG requirement?

Paul: yes, but it's a rather indirect process. There is a requirement to collect storage accounting information; this is to identify whether sites are fulfilling their pledged storage. It was decided this information should come from info-service, so WLCG's usage of GLUE was reviewed and changes made. One of these changes is that GlueSA must not overlap and must cover all spaces.

Gerd: do they realise that having non-overlapping storage, publishing SRM reservations and having dynamically allocated reservations are mutually inconsistent requirements?

Paul: yes, this is the process I am going through. I've proposed some changes to the WLCG usage so SRM spaces are published somewhat separately. For DPM and Castor, this isn't an issue.

Gerd: Castor and DPM don't support dynamic reservations.

Pragmatically, Gerd suggested we aim to release 1.9.1 as soon as possible and add the missing functionality on the 1.9-branch, so that it is included in the next minor release (i.e., 1.9.3-1). This would include:

  • migration tool
  • support in the info service for the extra information needed for new info provider.

A few patches remain that need reviewing.

[Paul: after the meeting, I found these patches: 3829, 3836 and 3857]

Timur: can you pass me the ticket numbers?

Procedure:

  • Final patches are reviewed,
  • A 1.9.1 branch is made,
  • The process of migrating the gPlazma changes can start,
  • A 1.9.1-1 release process is started.

Fermi problems with checksums

Question: should this go into 1.9.0-3?

[Paul: (note from after the meeting) I don't think a definitive conclusion was reached on this point]

The issue is that problems were encountered when adding support for clients supplying the checksum (a feature of GridFTP v2). The specific scenario is that:

  • the client uploads a file with GridFTP v2 and also supplies an MD5 checksum,
  • the server stores this checksum in PNFS level-2 metadata (the "c" field),
  • HSM client tools parse this level-2 metadata incorrectly; specifically, they assume that the checksum is ADLER32 (type = 1), so they look for a field like "c=1:<chksum>". MD5 (type = 2) has entries like "c=2:<chksum>". This is particularly an issue with EntStore.
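A minimal sketch of a parser that reads the type code instead of assuming ADLER32. The function name `parse_c_field` is hypothetical. The input is assumed to be only the semi-colon separated fragment quoted in these notes, not a complete level-2 record, and the checksum values are placeholders.

```python
# Type codes as given in the notes: 1 = ADLER32, 2 = MD5.
CHECKSUM_TYPES = {1: "ADLER32", 2: "MD5"}


def parse_c_field(flags):
    """Return (type_name, checksum) for the "c" field, or None if absent.

    `flags` is assumed to look like "c=<type>:<chksum>;l=..." (an
    illustrative fragment, not the full level-2 layout).
    """
    for entry in flags.split(";"):
        key, sep, value = entry.partition("=")
        if sep and key == "c":
            type_code, colon, checksum = value.partition(":")
            if not colon:
                raise ValueError(f"malformed c field: {value!r}")
            code = int(type_code)
            # Do not assume type 1; report whatever type was stored.
            return CHECKSUM_TYPES.get(code, f"UNKNOWN({code})"), checksum
    return None
```

A tool written this way can detect "c=2:..." and either verify with MD5 or refuse to treat the value as an ADLER32 checksum.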

New dCache behaviour is:

  • if no chksum is specified, then ADLER32 is calculated,
  • if a chksum is specified, then it is stored.

There is an issue with end clients reading a file:

If an end-user wishes to read a file that has an MD5 checksum stored, the pool should calculate the Adler32 *and* MD5 checksums concurrently. The Adler32 is passed to the client and the MD5 is used to check the file's consistency.

There was discussion on how to support calculating multiple checksums concurrently. Further investigation was required.
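One possible reading of "concurrently" is a single pass over the data that feeds every chunk to both algorithms. The sketch below illustrates that idea in Python using the standard `zlib` and `hashlib` modules; it is not dCache pool code, and the function name `adler32_and_md5` is hypothetical.

```python
import hashlib
import io
import zlib


def adler32_and_md5(stream, chunk_size=64 * 1024):
    """Read `stream` once and return (adler32_hex, md5_hex)."""
    adler = 1  # defined starting value for Adler-32
    md5 = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Each chunk updates both running checksums, so the file is read once.
        adler = zlib.adler32(chunk, adler)
        md5.update(chunk)
    return format(adler & 0xFFFFFFFF, "08x"), md5.hexdigest()


if __name__ == "__main__":
    adler_hex, md5_hex = adler32_and_md5(io.BytesIO(b"example payload"))
    print(adler_hex, md5_hex)
```

The Adler32 value could then be handed to the client while the MD5 value is compared against the stored one.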

It was also discovered that PnfsManager is unable to process level-2 metadata if there are three checksums. This is because, with a single checksum, values like:

"...;c=<type>:<chksum>;l=..."

were recorded. For two checksums, the additional c1 field is used:

"...;c=<type>:<chksum>;c1=<type>:<chksum>;l=..."

but for three checksums, a separator within the c1 field is needed. Unfortunately, this separator is a semi-colon:

"...;c=<type>:<chksum>;c1=<type>:<chksum>;<type>:<chksum>;l=..."

This breaks PnfsManager's ability to parse the metadata.
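The ambiguity can be illustrated with a naive parser that splits on semi-colons and then on the first "=". This is a sketch only and does not reproduce PnfsManager's actual parsing code. The function name `split_fields`, the type code 3 and the checksum values are all placeholders.

```python
def split_fields(flags):
    """Naively split "k=v;k=v" text; return (fields, stray_entries)."""
    fields = {}
    stray = []
    for entry in flags.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        if sep:
            fields[key] = value
        else:
            # No "=" present: the entry cannot be attributed to any field.
            stray.append(entry)
    return fields, stray


# Three checksums in the layout described above (placeholder values).
three = "c=1:0a1b2c3d;c1=2:aaaa;3:bbbb;l=1024"
fields, stray = split_fields(three)
# fields holds only the first c1 entry; "3:bbbb" is left over in stray.
```

Because the same character separates both fields and the entries inside c1, the third checksum is no longer attached to any key.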

The proposed changes:

  • only store ADLER32 checksums in the "c" field,
  • use a separator other than semi-colon (e.g., the pipe or comma symbol),
  • either fix the "c1" field format or add a new field (e.g., "d") rather than creating new c1 fields:

"...;c=<type>:<chksum>;d=<type>:<chksum>,<type>:<chksum>;l=..."
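A sketch of how the proposed layout could be parsed, assuming the additional checksums live in a "d" field and are separated by commas. The "d" field is only a proposal in these notes. The function name `parse_checksums`, the type code 3 and the checksum values are hypothetical.

```python
def parse_checksums(flags):
    """Return a list of (type_code, checksum) from the "c" and "d" fields."""
    checksums = []
    for entry in flags.split(";"):
        key, sep, value = entry.partition("=")
        if not sep or key not in ("c", "d"):
            continue
        # Inside a field, a comma separates entries; the semi-colon is
        # reserved for separating fields, so the two never collide.
        for item in value.split(","):
            type_code, colon, checksum = item.partition(":")
            if not colon:
                raise ValueError(f"malformed checksum entry: {item!r}")
            checksums.append((int(type_code), checksum))
    return checksums
```

Checksums written as hexadecimal strings cannot contain a comma or a pipe, so either separator would keep the split unambiguous.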

Paul: there's support for parsing c1 fields correctly in pnfsDump. This has been released, but not widely deployed.

A fire-alarm went off at Fermi

Problem with gsidcap doors in trunk

[item skipped as Fermi people weren't here]

Who's maintaining (/opt/d-cache/bin/) dcache?

[this point was a minor point mostly involving Gerd and Owen]

Gerd: there are no exclusive owners of any code in dCache. Anyone is welcome to fix the dcache script; just follow the usual patch-review process.

Owen: OK.

LIST command to dCache gridftp server (#3394)

[Gerd currently "owns" this ticket, so he reported on it.]

Gerd had two items to report on this:

  1. There is no standard for how the long format is displayed, only a de-facto standard from what is currently implemented. Currently there are two formats available:
    1. something following the POSIX/Unix "ls -l" output,
    2. a new format that is more compact.
  2. Recent ACL patches from Irina, David *, also touch the list output format.

Owen: the customer is only really interested in, for a given file:

  1. what operations are permitted to them: read, write, etc.,
  2. what the file's (creation) time is.

Gerd expressed a desire not to revisit the FTP LS formatting repeatedly; it is better to fix it once.

Action: Owen to discover how urgently this is needed: whether this is something needed in one week or one month or so.

Unless we hear that this is urgent, we should delay working on it until the ACL patches are released.

AOCB

None.

DTNM

Next week, same time.


Last modified by Paul @ Sun Mar 7 00:38:17 2021