wiki:developers-meeting-20070816
Last modified on 08/17/07 15:51:46

SRM 2.2 deployment meeting with CERN and the Tier I's

Participants: Mattias (NDGF), Doris (GridKa), Greig (GridPP), Flavia (CERN), Maarten (CERN), Jane (BNL), Iris (BNL). Developers: Timur, Tigran, Gerd, Alex, Martin, Patrick

You need to call the following number: +41-22-76-77000 and ask for the
phone conf organized by Flavia Donno with name "dCache deployment".

Minutes kindly provided by Flavia

Minutes of the "SRM v2 dCache status update" meeting
====================================================
16 August 2007, 15:30-17:30 CERN time

Attendees: dCache developers team (Patrick Fuhrmann, Timur Perelmutov, Tigran Mkrtchyan, Martin Radicke, Gerd Behrmann, Alex), Flavia Donno, Maarten Litmaath, Greig A. Cowan, Doris Ressmann, Mattias Wadenstein, Jane, Iris

The meeting started by listing the new open/blocking issues discovered in dCache and prioritizing the items to work on:

  1. srmMkDir returns SRM_FAILURE instead of SRM_DUPLICATION_ERROR when the directory to be created exists already (needed by lcg-utils/GFAL/FTS).
  1. srmRmDir returns SRM_FAILURE instead of SRM_DIRECTORY_NOT_EMPTY when the directory to be removed is not empty (needed by FTS).
  1. For a file stored in CUSTODIAL-NEARLINE, srmLs always reports FileLocality=NEARLINE. The current status of a file should be reported correctly in order to implement pre-stagers for the experiments. This bug won't be fixed soon. For the moment, clients should rely on the status of srmStatusOfBringOnlineRequest being SRM_SUCCESS as an indication that a file has been correctly staged on disk.
  1. In dCache 1.8.0-11, an administrative interface has been made available that allows a site admin to reserve space and specify a VO group/role together with the link group to be used for the space reservation. All sites will need to reconfigure their CUSTODIAL-NEARLINE space using this tool. The best way to proceed would be to remove existing files written in that space, release the currently allocated space and use the new administrative tool to reserve the needed space for LHCb and ATLAS. dCache 1.8.0-11 should be available for sites to install on Friday, August 17th, 2007. The release notes list the bug fixes and features available in this new release.
  1. SARA currently does not work for LHCb. Flavia tried to help and found that no disk link groups were defined for LHCb at this site. The developers confirmed that this might be the problem. The SARA site admins are unresponsive this week (probably on vacation). The developers can help SARA get ready for the LHCb SRM v2 exercise.
  1. srmLs seems to return incomplete results. Flavia will reproduce the reported errors (they occur systematically at IN2P3) and submit them to the developers. Flavia will also use the new dCache srmLs client to check whether its results are consistent with those reported by S2 and the WLCG DM clients.
  1. srmRm does not always work at IN2P3. It is not clear whether this is a configuration problem. Flavia will reproduce the reported errors and submit them to the developers.
  1. Using S2, it was observed that the status returned by srmStatusOfReserveSpaceRequest for dCache can change after reaching a final state. In particular, during continuous polling at NDGF the status changed from SRM_NO_FREE_SPACE (final state) to SRM_REQUEST_IN_PROGRESS (temporary state). The details are reported in an e-mail submitted to the GSSD list and in a dCache ticket (dCache returns no ticket ID when a ticket is submitted, and no answer has been provided).
  1. It was agreed that the other problems reported in the "Implementations problems" twiki page: https://twiki.cern.ch/twiki/bin/view/SRMDev/ImplementationsProblems can be solved later on.
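
The srmStatusOfReserveSpaceRequest observation above (a final state followed by a temporary one) is the kind of thing a test client can flag automatically. The following is only a minimal sketch, not part of S2 or dCache: the function name and the sample history are hypothetical, and it assumes that SRM_REQUEST_QUEUED and SRM_REQUEST_IN_PROGRESS are the only temporary states.

```python
# Assumption: these two codes are the only temporary (non-final) request states.
TEMPORARY_STATES = {"SRM_REQUEST_QUEUED", "SRM_REQUEST_IN_PROGRESS"}


def find_final_state_regressions(history):
    """Return (index, previous, current) for every status change that follows a final state."""
    regressions = []
    for i in range(1, len(history)):
        previous, current = history[i - 1], history[i]
        # A final state must never be followed by a different status.
        if previous not in TEMPORARY_STATES and current != previous:
            regressions.append((i, previous, current))
    return regressions


# Illustrative status sequence modelled on the NDGF observation (not real log data).
observed = [
    "SRM_REQUEST_QUEUED",
    "SRM_NO_FREE_SPACE",
    "SRM_REQUEST_IN_PROGRESS",
]
print(find_final_state_regressions(observed))
```

A polling client could append each returned status to such a history and report any non-empty result as a protocol violation.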

We asked the sites if they had problems to report to the developers.

FZK

  1. With previous versions of dCache, when a file was removed from PNFS a copy of it could be found in the directory trash/1. This is no longer the case with dCache 1.8 and the directory is no longer created automatically. - The suggestion from Patrick was to manually create the directory and see if the problem persists.


Quote from Doris:

I confirmed that creating the directory by hand causes the expected behaviour and the file
information is listed correctly! I only expect the information of the file "to identify
it on tape", not the file itself!!


Quote from Doris on a different issue:

Also I promised in the meeting to create separate pools for LHCb (for RAW and RDST) when updating
to the next patch level. However, I checked the existing LHCb files and at the moment I don't have a single
file for either of these categories. In which pools should files be located which are in the directory "generated or test"?
Do I have to suppress the possibility to write files not belonging to either RAW or RDST directories?

This kind of shows that I still have some difficulties getting away from the idea
of having directory tags and using space tokens instead.

  1. In dCache 1.8 there is no way to check if a file in some pool (with a given space token) really belongs there. - This info will be added to the billing DB.
  1. A discussion on the implementation of space reservation for T0D1 in dCache took place. The dCache developers will come back with clarification on the current behaviour of dCache in this case.

EDINBURGH

  1. Currently DN entries are added manually when using gPlazma. A configuration tool should be provided to do this automatically. - Owen Synge is working on this.

NDGF

  1. When does the ATLAS exercise start? - Sites will be tested and space reserved next week (week of the 20th of August). Experiment software integration tests are foreseen to start the week after. Real tests will probably begin the week after CHEP.

The most prominent outcome of this meeting was certainly that dCache has implemented space tokens differently from the WLCG understanding. For REPLICA/ONLINE:

When files are written into a space described by a space token, the space occupied by each file is subtracted from that space token. This is agreed and implemented. However, when files previously written into the space token are removed, the space is not returned to the token, so the available space of the token does not increase. This differs from the idea of the GSSD working group. They assume that the space is returned to the space token, so that once all files written into a space token have been removed, the token has its initial size again.
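
The difference between the two interpretations can be illustrated with a toy model. This is only a sketch for discussion, not dCache code: the class, the `return_on_remove` flag and the byte counts are hypothetical and merely mirror the two behaviours described above.

```python
class SpaceToken:
    """Toy model of a space reservation (hypothetical, not dCache code)."""

    def __init__(self, size, return_on_remove):
        self.size = size                          # initial reserved size in bytes
        self.free = size                          # space still available in the token
        self.return_on_remove = return_on_remove  # True models the GSSD expectation
        self.files = {}                           # file name -> size in bytes

    def write(self, name, nbytes):
        if name in self.files:
            raise ValueError(f"{name} already exists")
        if nbytes > self.free:
            raise ValueError("not enough free space in token")
        self.files[name] = nbytes
        self.free -= nbytes                       # both interpretations agree on this step

    def remove(self, name):
        nbytes = self.files.pop(name)
        if self.return_on_remove:
            self.free += nbytes                   # space is given back to the token


def free_after_write_and_remove(return_on_remove):
    """Write two files, remove both, and report the free space left in the token."""
    token = SpaceToken(size=100, return_on_remove=return_on_remove)
    token.write("a", 40)
    token.write("b", 30)
    token.remove("a")
    token.remove("b")
    return token.free


print("behaviour described for dCache:", free_after_write_and_remove(False))
print("behaviour expected by GSSD:", free_after_write_and_remove(True))
```

In this model the token ends with 30 bytes free under the behaviour described for dCache and with its full 100 bytes under the GSSD expectation, although it holds no files in either case.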

  • FZK
    • The SRM returns SRM_FAILURE instead of DUP if a directory is created which already exists. This blocks lcg-utils from working. It was agreed to change this in one of the next versions. The code is already in CVS.
    • SRM returns SRM_FAILURE instead of NOT_EMPTY if a directory is removed which is not empty. This problem was solved but seems to have reappeared. It will be fixed soon.
  • IN2P3: It is not possible to reserve space on the pools with those space tokens. A new RPM with the required functionality (dCache 1.8.0-11) will solve this.

  • 'ls' should provide the information on whether a file is online or not. We understand that this needs to be added, but it is currently difficult for us to change. For now we recommend waiting for the bringOnline to succeed.
  • Bizarre issues with 'ls' with lcg clients. Flavia will try at various sites, and with S2 clients and dCache clients.
  • Removal of files: files couldn't be removed. CRITICAL
  • --------- NON URGENT from here ----------------
  • DORIS: minor problem: /opt/pnfs/trash: trash/1 is missing. The consequence is that there is no information on when a file is removed. This is needed by the HSM backend to remove the files. It works after Doris manually created trash/1.
  • Mixed type of files in one pool.
  • SARA : might be on vacation.
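
The recommendation above to wait for the bringOnline to succeed, rather than to trust 'ls', amounts to a polling loop on the client side. The sketch below is only an illustration under assumptions: `get_status` is a hypothetical callable standing in for whatever client issues srmStatusOfBringOnlineRequest, and SRM_REQUEST_QUEUED and SRM_REQUEST_IN_PROGRESS are assumed to be the only temporary states.

```python
import time


def wait_until_online(get_status, timeout_s=60.0, interval_s=1.0, sleep=time.sleep):
    """Poll get_status() until it returns a final status and return that status.

    get_status is a hypothetical callable returning the request status as a string.
    """
    # Assumption: these are the only temporary (non-final) states.
    temporary = {"SRM_REQUEST_QUEUED", "SRM_REQUEST_IN_PROGRESS"}
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status not in temporary:
            return status  # SRM_SUCCESS means the file is staged; anything else is an error
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout_s} s")
        sleep(interval_s)


# Usage with a fake status source instead of a real SRM client.
replies = iter(["SRM_REQUEST_QUEUED", "SRM_REQUEST_IN_PROGRESS", "SRM_SUCCESS"])
result = wait_until_online(lambda: next(replies), sleep=lambda seconds: None)
print(result)
```

The caller should treat only SRM_SUCCESS as "file is on disk"; any other returned status is a final failure that needs handling.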

Last changed by patrick at Fri Feb 26 22:15:49 2021