wiki:developers-meeting-20140702
Last modified on 07/02/14 16:47:55

[part of a series of meetings]

Participants

Al, Karsten, Christian, Paul, Dmitry

Agenda

[see box on the right-hand side]

Postcards

Up to two minutes (uninterrupted) per person where they can answer two questions:

  • What I did last week (since the last meeting),
  • What I plan to do in the next week.

No questions until we get through everyone :)

Karsten: Reviewing, some tickets, further setting up of the small files system.

Christian: Setting up the IPv6 hypervisor; still having some issues with infrastructure.

Paul: The WebDAV third-party copy patches are being readied and are now feature complete; they are in RB. When they're committed we will have the functionality. Now focusing on some other patches that should go into 2.10. Releases.

Dmitry: Preparing the 2.6 upgrade and looking at the OOM issue.

Al: ReplicaManager is pretty stable, but still needs more testing during hot replication. There are a couple of timeout issues with NFS. Updated documentation for ReplicaManager, flow charts; refactored some interfaces to be able to decouple functionality (and to enable smaller patches). Waiting for some stuff from Gerd.

  • Alarms patch issue... see below.

Plans for patch-releases

Should we make a new patch release?

Patches need to be committed today to go into the next releases, which will be out tomorrow.

2.10: Is there anything that keeps us from branching?

Christian: Bug-Reporting would be nice to have in.

Trunk activity

Progress with new features...

Alarms

Al: 1) The issue is that our checksum alarm filter did not match the log message. So the suggestion is that we mix alarms defined in the code with alarms defined in the alarms definition file. Alarms depend on exception messages. Third-party code is hard to react to -> We could use AspectJ or wrap the relevant classes in an "alarm aware" subclass.

2) Could we get rid of the XML database? -> Needs more investigation.
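To illustrate item 1, here is a minimal sketch (in Python for brevity, not dCache code) of why a message-based filter is fragile and how an "alarm aware" subclass could attach an explicit alarm type instead. All names (`CHECKSUM_FILTER`, `Alarm`, `ThirdPartyStore`, `AlarmAwareStore`, `classify`) are hypothetical and only stand in for the idea discussed above.

```python
import re

# Hypothetical filter, as it might be written in an alarms definition file.
CHECKSUM_FILTER = re.compile(r"Checksum mismatch")


class Alarm(Exception):
    """Exception carrying an explicit alarm type, independent of message wording."""

    def __init__(self, alarm_type, message):
        super().__init__(message)
        self.alarm_type = alarm_type


def classify(exc):
    """Prefer the type set in code; fall back to the message filter."""
    if isinstance(exc, Alarm):
        return exc.alarm_type
    if CHECKSUM_FILTER.search(str(exc)):
        return "CHECKSUM"
    return None


class ThirdPartyStore:
    """Stand-in for third-party code whose messages we do not control."""

    def verify(self, expected, actual):
        if expected != actual:
            raise IOError("checksum error: expected %s, got %s" % (expected, actual))


class AlarmAwareStore(ThirdPartyStore):
    """Wrapper subclass that converts the third-party failure into a typed alarm."""

    def verify(self, expected, actual):
        try:
            super().verify(expected, actual)
        except IOError as e:
            raise Alarm("CHECKSUM", str(e)) from e
```

With the plain class, the message "checksum error: ..." slips past the filter; with the wrapper, the alarm type no longer depends on the wording of the message.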

CRC Checksums

The script that writes the file does not get the checksum. -> We can pass it as a parameter.

Gerd's changes to the HSM could also offer a nice way to solve this.
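As a rough sketch of the "pass it as a parameter" idea: the caller computes the checksum and appends it to the script's argument list. The script name, the `put` verb and the `-checksum=` option below are assumptions made for illustration, not the actual interface; Adler-32 is used here only as an example algorithm, and `zlib.crc32` could be substituted in the same way.

```python
import zlib


def adler32_hex(path, chunk_size=1 << 20):
    """Compute the Adler-32 checksum of a file, reading it in chunks."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return "%08x" % (value & 0xFFFFFFFF)


def build_command(script, path):
    """Build the argument list for a (hypothetical) script that writes the file."""
    # The option name is an assumption; the point is only that the checksum
    # travels with the invocation instead of being unknown to the script.
    return [script, "put", path, "-checksum=adler32:" + adler32_hex(path)]
```

The resulting list could then be handed to `subprocess.run`, so that the script can verify what it wrote against the value it was given.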

Globus online

Dmitry: Cannot do bulk transfers.

Paul: Their test system had two problems with master; one is fixed now. The remaining one was that GO would immediately ask for a checksum after the transfer started; dCache would not reply to this and the transfer would ultimately fail. There is a patch in RB addressing this by delaying the answer to the checksum request until the transfer is finished.
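The idea of delaying the checksum answer can be sketched as follows. This is not the patch in RB, only an illustration under the assumption that the checksum becomes known when the transfer completes; the `Transfer` class and its methods are hypothetical.

```python
import threading


class Transfer:
    """Toy transfer whose checksum is only available once it has finished."""

    def __init__(self):
        self._done = threading.Event()
        self._checksum = None

    def finish(self, checksum):
        """Called by the mover when the last byte has been written."""
        self._checksum = checksum
        self._done.set()

    def checksum(self, timeout=None):
        """Block the checksum request until the transfer is done, then answer."""
        if not self._done.wait(timeout):
            raise TimeoutError("transfer did not finish in time")
        return self._checksum
```

A client asking for the checksum right after the transfer starts simply waits in `checksum()` instead of receiving no reply; the timeout keeps a stalled transfer from blocking the request forever.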

Dmitry: You can use GO from the CLI instead of the web interface.

Paul: At KIT they used a stock GO server

Will check

Documentation of ACL in dCache

Seems to be incomplete/wrong

Issues from [FIXME: Add link to yesterday's Tier-1 meeting]

Thursday:

Tuesday: Xavier, Marc

Nothing to report (mostly).

Marc: Upgraded to 2.6 with enstore and did not see any problems.

Xavier: About the Brazilian CA: a patch went in and complaints stopped... Also opened a ticket about various parameters in the admin interface.

Marisa's sticky bit ticket #8372

These situations need admin intervention. We could add functionality that lets dCache react to sudden hardware failures and save as much data as possible using cached copies on other pools.

Paul: It would be good to have a recipe ready for those cases.

Christian: Our documentation on this issue talks about flagging copies as sticky, which isn't honored anywhere.

Issues from EMI

New or noteworthy

Outstanding RT Tickets


Review of RB requests

Paul: 2 patches

DTNM

Proposed: same time, next week.