[part of a series of meetings]
Participants
Agenda
- accumulating DB connections
- New Book efforts
Postcards
Up to two minutes (uninterrupted) per person where they can answer two questions:
- What I did last week (since the last meeting),
- What I plan to do in the next week.
No questions until we get through everyone :), except from -p.
Gerd: debugging bring-online issues --> three tickets and one patch (triggered by GGUS). Need more input from Fermi people. This week debugging FTP stuff (triggered by an issue in production). Found several problems, including some that had been reported by Jon (
Thomas: continued working on the async GSI connector for Jetty. Got it working for the SRM door. Hopefully it will be in RB tomorrow.
Tigran: stuff .. nothing exciting; debugging a problem.
Jan: web-admin stuff. Some patches committed. Submitting new patches, including doc; cell-admin page also now in RB.
Paul: holiday + dodgy Debian problems.
Owen: sick for a week; catching up on EMI emails. Trying to get 1.9.5 out for SL4 (it's out for SL5). Updating host certificates for virtual machines.
Antje: holiday, EMI, test environment.
Patrick: preparing stuff for new team members. Finalising EMI data report.
Tanja: tickets; reading the NFS spec and talking to Tigran about it. Trying to understand .. something. Sketch of poster for Patrick for the EGI technical forum. Collecting plans for next Golden Release.
Plans for patch-releases
Should we make a new patch release?
Installed 1.9.5-10 on a test machine to check whether SRM fails in the same way as 1.9.5-HEAD. If it does, then we can release 1.9.5-HEAD as 1.9.5-21 and fix the problem(s) later on (maybe). If 1.9.5-10 doesn't show the symptoms then the problem is a regression and must be fixed.
As we've seen the problem with 1.9.9 and 1.9.5, we need to fix this before investigating releasing 1.9.7 and 1.9.8.
Terracotta
Merge request from Timur about the Terracotta scaling.
Timur committed it to trunk and wants to have it in all supported branches.
The change alters the code-path in SRM that is called even if terracotta is not enabled.
Merge into 1.9.9 branch and, if it passes test suite, we release.
Can we do more testing? There's the grid-jobs; install it on srm-devel (which is outside the firewall) and test with thousands of clients. Check memory while under testing.
Don't test with terracotta enabled, for now.
dCache configure
Owen intends to submit the new dCache-configure for 1.9.7 and later. Would like this to go into dCache for 1.9.10 and be backported to earlier releases (with the new configuration system).
Migration script
Patch should emit an error message if duplicate property assignments are detected.
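A minimal sketch of the duplicate check, assuming the configuration is a plain "key = value" properties file with "#" or "!" comment lines. The function names and the property names in the usage note are hypothetical and not taken from the actual migration script.

```python
from collections import defaultdict


def find_duplicate_properties(lines):
    """Return {key: [line numbers]} for every key assigned more than once."""
    seen = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        # Ignore anything that is not an assignment.
        if "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        seen[key].append(number)
    return {key: nums for key, nums in seen.items() if len(nums) > 1}


def report_duplicates(lines):
    """Build one error message per duplicated key, sorted by key."""
    duplicates = find_duplicate_properties(lines)
    return [
        f"ERROR: property '{key}' assigned more than once "
        f"(lines {', '.join(map(str, nums))})"
        for key, nums in sorted(duplicates.items())
    ]
```

For example, feeding it the lines "srmPort = 8443", "poolName=pool1", "srmPort=8444" (made-up properties) would produce a single error naming srmPort and lines 1 and 3.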
Trunk activity
Progress with new features...
Plans for the next Golden Release
- new gPlazma
- Configuration files
- new Pool
- Pool rebalancing
- "Thread-less" xrootd mover
- Better pool selection
- new Webadmin
- Asynchronous GSI connector for SRM
- SRM over SSL
- Netty based HTTP-Mover
- GLUE 2.0 support
- JMS by default
- UUIDs in all pools
- dCache running as non-root user by default
- dCache configure to be merged into dCache server.rpm
- SSH2 for the Admin Interface
- New billing web interface.
- Debian / FHS compliant packaging of dCache.
Rather than having these as wiki pages we can create a ticket for each item and use a "road map". Patrick wants a list of these items, but this can be auto-generated from trac.
Gerd wants to fix the resolution of well-known cells as this is still an issue (which affects "cd" command).
NDGF is also starting to look into deploying a stand-alone JMS instance (replicated over two machines). Should be ready in 2--3 weeks.
Doesn't seem to be anything essential .. apart from the dCache configure (as it would be a regression).
Tanja to add these items to trac and create a milestone.
For the new gPlazma, we need to have unique uids .. what is important is not gPlazma, but rather various features that this will enable.
gPlazma bullet-point should be split into sub-bullets; one for each feature we want to enable that depends on the new gPlazma.
Book stuff
Update of the dCache book to current configuration files and more detail/better instructions in some parts of it.
Use hudson to build and deploy the new versions of the book.
Chimera schema migration
Patch exists, but we have to decide how much we want to automate.
Hg
Completely move to Mercurial
Tigran wants to drop externals; the Mercurial repository is currently 7.5 GiB, most of which is external dependencies.
Plan to move to Maven before moving to Hg.
Web stuff
Redirect caching.
HTTP keep-alive.
Agenda item for Gerd's visit.
Issues from [FIXME: Add link to yesterday's Tier-1 meeting]
discussed.
Outstanding RT Tickets
[This is an auto-generated item. Don't add items here directly]
RT 5725: dcache cannot access ssh keys when running under a different user
Need to update the error messages; Paul will do this.
RT 5796: feature request: crc onflush
Plan to move checksum calculation to the mover; plan to close the data connection before calculating the checksum if the user hasn't supplied one. Subsequent reads will block until the checksum calculation completes.
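A rough sketch of the intended behaviour, not the mover code: the upload returns as soon as the data connection is closed, the checksum is computed in the background, and readers wait on an event until it is available. The class and method names are hypothetical, Adler-32 is used purely as an example algorithm, and verification of a client-supplied checksum is left out.

```python
import threading
import zlib


class StoredFile:
    """Hypothetical pool-side file entry whose checksum may be computed late."""

    def __init__(self, data, client_checksum=None):
        self._data = data
        self.checksum = client_checksum
        self._ready = threading.Event()
        if client_checksum is not None:
            # Client supplied a checksum: nothing to compute, reads may proceed.
            self._ready.set()

    def finish_upload(self):
        """Called once the data connection has been closed.

        Starts the background checksum calculation if one is still needed
        and returns the worker thread (or None if nothing was started).
        """
        if self._ready.is_set():
            return None
        worker = threading.Thread(target=self._compute_checksum)
        worker.start()
        return worker

    def _compute_checksum(self):
        self.checksum = zlib.adler32(self._data) & 0xFFFFFFFF
        self._ready.set()

    def read(self, timeout=None):
        """Block until the checksum is known, then return the data."""
        if not self._ready.wait(timeout):
            raise TimeoutError("checksum not yet available")
        return self._data
```

The point of the event is that the uploading client is released early, while any reader arriving before the checksum is done simply waits rather than seeing an unverified file.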
RT 5797: feature request: reload of certificates
Won't fix for 1.9.5; already fixed in 1.9.7 and later.
RT 5795: stuff
Points to an error in Globus client.
The door sends a 150 reply and we subsequently send a 426 error message.
The problem is that the client, after receiving the 150 reply, ignores the 426 error message.
Matt says dCache behaviour is perfectly acceptable.
As a work-around, the door can kill the connection if the door has sent a 426 and the client hasn't done anything (is idle).
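The work-around could be sketched roughly as follows, assuming the door keeps a small per-connection watchdog. The class name, the idle limit and the callback names are all hypothetical; only the 150 and 426 reply codes come from the discussion above.

```python
import time


class ControlConnectionWatchdog:
    """Hypothetical helper tracking one FTP control connection in the door."""

    def __init__(self, idle_limit, clock=time.monotonic):
        self._idle_limit = idle_limit  # seconds of silence tolerated after a 426
        self._clock = clock
        self._sent_426_at = None

    def on_reply_sent(self, code):
        # Only a 426 (transfer aborted) arms the watchdog; a 150 does not.
        if code == 426:
            self._sent_426_at = self._clock()

    def on_client_command(self):
        # Any client activity after the 426 means the client is not stuck.
        self._sent_426_at = None

    def should_kill(self):
        """True if a 426 was sent and the client has been idle too long since."""
        if self._sent_426_at is None:
            return False
        return self._clock() - self._sent_426_at >= self._idle_limit
```

The door would poll should_kill() periodically and drop the control connection when it returns true; the clock is injectable so the logic can be exercised without real waiting.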
TODO: implement a perf-marker-like message for all protocols from pool to door, instead of the door pinging the mover.
Review of RB requests
DTNM
Same time, next week.
