Participants
Agenda
Postcards
Up to two minutes (uninterrupted) per person where they can answer two questions:
- What I did last week (since the last meeting),
- What I plan to do in the next week.
No questions until we get through everyone :)
Irina: Investigating the Cleaner issue in #5418 (could not be reproduced on a test system); testing the HSM cleaner to be ready for 1.9.6-3; managing RT tickets. Next week: add the main Cleaner commands to the dCache Book, Chapter 28; set up a multi-node dCache on 64-bit machines with SL5; reinvestigate the problem with 'ls dcap' reported in #5291.
Status of work for 1.9.5
A (quick?) review of activity needed for the 1.9.5 release
Timur: Private discussion with Gerd about start-up scripts. Will implement his suggestions to support other door implementations (e.g. the Kerberos FTP door); just adding a new service isn't sufficient.
Feb 18th is next update point for Fermi.
(Partial) reversal of the sync failure handling. Release -13 as soon as possible.
Gerd: Timur, can you get the script ready today? Yes.
A couple of security configuration options need to be updated and some other changes, so there may be some discussion needed.
Makes sense to have per-Domain JAVA_OPTS.
Gerd: this is planned for 1.9.7
The pool-list file should be usable for any service, and the domain list should be able to contain options.
Jon may be able to do one more upgrade of pools to latest dCache before going into production for a very long time. Interactive command to change the log level.
Gerd: It's on my todo list, but haven't had a chance to look into it yet.
Restarts of pools that crash shouldn't involve admin intervention. This should already be the case with the pools he is already rolling out.
Status of work for 1.9.6
A (quick?) review of activity needed for the 1.9.6 release
Old bug that needs to be fixed in Jetty.
Handful of patches in RB for SRM; they're ACK-ed by Tigran, but it would be useful if Timur could have a look.
The SRM fixes are dead-lock fixes. Do a release of 1.9.6 as soon as these are in.
Status of work for Trunk (a.k.a future 1.9.7)
A (quick?) review of activity needed for the 1.9.7 release
How is getting everything working with the message queue going? The only broken thing is the topo cell.
Switch to stop automatic restarts: suppress restart. Gerd: it's on my list.
Connection reset by peer.
Issues from yesterday's Tier-1 meeting
srm client interoperability
According to a message to the team from JPB, the srm client should change its behaviour to do one of the following:
- check the list of methods supported by the FTP server and not use a checksum if the server does not support it,
- or try with a checksum and, if the server rejects it, retry internally in the client without a checksum.
At least while waiting for a proper fix, there is a workaround with the option -send_cksm=false. I'm more worried by the problem with srm-setpermissions, which does not work with DPM, as already reported in Savannah: there is currently no workaround.
JPB
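The two behaviours requested in JPB's message could be combined roughly as in the sketch below. This is only an illustration in C, not the srm client's actual code: the status enum, the transfer callback and the stub server are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one transfer attempt (hypothetical, for illustration only). */
typedef enum { XFER_OK, XFER_CKSM_REJECTED, XFER_FAILED } xfer_status;

/* Hypothetical callback standing in for the client's real transfer code. */
typedef xfer_status (*transfer_fn)(bool send_cksm, void *ctx);

/*
 * First behaviour: if the server does not advertise checksum support,
 * do not send a checksum at all.
 * Second behaviour: otherwise try with a checksum and, if the server
 * rejects it, retry once internally without the checksum.
 */
xfer_status transfer_with_cksm_fallback(bool server_supports_cksm,
                                        transfer_fn do_transfer, void *ctx)
{
    assert(do_transfer != NULL);

    if (!server_supports_cksm)
        return do_transfer(false, ctx);

    xfer_status st = do_transfer(true, ctx);
    if (st == XFER_CKSM_REJECTED)
        st = do_transfer(false, ctx);   /* single retry, no checksum */
    return st;
}

/*
 * Stub server for trying the sketch out: it rejects any attempt that
 * carries a checksum and counts attempts in the int that ctx points to.
 */
xfer_status rejecting_server(bool send_cksm, void *ctx)
{
    int *attempts = ctx;
    (*attempts)++;
    return send_cksm ? XFER_CKSM_REJECTED : XFER_OK;
}
```

With the stub above, a server that claims checksum support but rejects it costs two attempts, while one that does not advertise support costs one.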
Timur: we can easily accommodate the request.
Some discussion that one should do a srm-getPermission first to find the group.
It would be nice to get this out "reasonably quickly": within a calendar month.
dccp and errors
- In many cases the dccp library hides system errors and translates them into its own errors, which makes it hard to understand the root cause of the problem. The situation is similar to catching Java exceptions that should be propagated.
For example, CMS had to change the line
dcap.c: dc_debug(DC_ERROR, "Accept failed.");
to
dcap.c: dc_debug(DC_ERROR, "Accept failed (err=%d - %s).", errno, strerror(errno));
in order to understand the problem.
Would it be possible to always print the errno and associated error string for system errors? This would of course require scanning and modifying all of the dcap code.
There is a library call, dc_perror(), which has access to the errno set by libc.
Is there a dccp command-line option to expose these failures? dccp should relay this information by calling dc_perror() and printing an error message.
Timur: to obtain a detailed example of when it fails.
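The kind of message CMS added by hand could be produced by one small helper, so that every system-error site reports errno the same way. The following is a minimal sketch, not dcap code: format_sys_error is a hypothetical name, and it deliberately does not call into the dcap library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<what> (err=<errno> - <error string>)." into buf.
 * saved_errno should be copied from errno immediately after the
 * failing system call, before any other libc call can overwrite it.
 * Returns what snprintf returns: the length the full message needs.
 */
int format_sys_error(char *buf, size_t len, const char *what, int saved_errno)
{
    assert(buf != NULL && what != NULL);
    return snprintf(buf, len, "%s (err=%d - %s).",
                    what, saved_errno, strerror(saved_errno));
}
```

A caller would save errno right after, say, a failed accept(), build the message with this helper, and hand the resulting string to whatever logging call is in use at that site.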
64-bit SL5 gsi dcap
I could not find a proper gsi dcap version that works with the globus included with the VDT that installs on 64-bit SL5. I also do not know which version of globus is needed to get gsi dcap working there. I need this for automatic dCache testing.
Timur and Owen to discuss this off-line.
Fermi upgrade
Upgrading public dCache to new PNFS. rpm
Outstanding RT Tickets
RT 5424: Setting Java options for dCache SRM
Note that this is about the SRM clients. Just honour JAVA_OPTIONS environment variable.
Perhaps have a separate SRMCLIENT_JAVA_OPTIONS but fall back, if it is not set, to JAVA_OPTIONS.
Timur: we'll think about it.
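The lookup order suggested in RT 5424 can be sketched as follows. The real srm client wrappers may well do this in a different language; the C below only illustrates the precedence, and first_set_env is a hypothetical helper. Treating an empty value as "not set" is an assumption of this sketch, not something decided in the meeting.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Return the value of `preferred` if it is set and non-empty,
 * otherwise the value of `fallback`, otherwise an empty string.
 */
const char *first_set_env(const char *preferred, const char *fallback)
{
    assert(preferred != NULL && fallback != NULL);

    const char *value = getenv(preferred);
    if (value == NULL || value[0] == '\0')
        value = getenv(fallback);
    return value != NULL ? value : "";
}
```

For the case discussed here the call would be first_set_env("SRMCLIENT_JAVA_OPTIONS", "JAVA_OPTIONS"), so a site-wide JAVA_OPTIONS still applies unless the client-specific variable overrides it.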
RT 5398: srm: Illegal State Transition : g illegal state transition from Done to Failed
This is triggered by the client calling cancel on a transfer that is done.
Tigran: these show up in S2 tests.
DTNM
Same time, next week.
