wiki:developers-meeting-20100818
Last modified on 08/24/10 12:11:21

[part of a series of meetings]

Participants

Agenda

  • accumulating DB connections
  • New Book efforts

Postcards

Up to two minutes (uninterrupted) per person where they can answer two questions:

  • What I did last week (since the last meeting),
  • What I plan to do in the next week.

No questions until we get through everyone :), except from -p.

Gerd: debugging bring-online issues --> three tickets and one patch (triggered by GGUS). Need more input from Fermi people. This week debugging FTP stuff (triggered by an issue in production). Found several problems, including some reported by Jon (

Thomas: continued working on the async GSI connector for Jetty. Got it working for the SRM door. Hopefully it will be in RB tomorrow.

Tigran: stuff .. nothing exciting; debugging a problem.

Jan: web-admin stuff. Some patches committed. Submitting new patches, including doc; cell-admin page also now in RB.

Paul: holiday + dodgy Debian problems.

Owen: sick for a week; catching up on EMI emails. Trying to get 1.9.5 out for SL4 (it's out for SL5). Updating host certificates for virtual machines.

Antje: holiday, EMI, test environment.

Patrick: preparing stuff for new team members. Finalising EMI data report.

Tanja: tickets; reading the NFS spec and talking to Tigran about it. Trying to understand .. something. Sketch of poster for Patrick for the EGI technical forum. Collecting plans for next Golden Release.

Plans for patch-releases

Should we make a new patch release?

Installed 1.9.5-10 on a test machine to check whether SRM fails in the same way as 1.9.5-HEAD. If it does, we can release 1.9.5-HEAD as 1.9.5-21 and fix the problem(s) later on (maybe). If 1.9.5-10 doesn't show the symptoms, then the problem is a regression and must be fixed.

As we've seen the problem with 1.9.9 and 1.9.5, we need to fix this before investigating releasing 1.9.7 and 1.9.8.

Terracotta

Merge request from Timur about the Terracotta scaling.

Timur committed it to trunk and wants to have it in all supported branches.

The change alters a code-path in SRM that is called even if Terracotta is not enabled.

Merge into 1.9.9 branch and, if it passes test suite, we release.

Can we do more testing? There's the grid-jobs; install it on srm-devel (which is outside the firewall) and test with thousands of clients. Check memory while under testing.

Don't test with Terracotta enabled, for now.

dCache configure

Owen intends to submit the new dCache-configure for 1.9.7 and later. Would like this to go into dCache for 1.9.10 and be backported to earlier releases (with the new configuration system).

Migration script

Patch should emit an error message if duplicate property assignments are detected.
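
The duplicate check could look roughly like the sketch below. It is only an illustration: it assumes a simple `key = value` file with `#` comments, and the function names `find_duplicate_assignments` and `report_duplicates` are hypothetical, not taken from the migration script.

```python
from collections import defaultdict


def find_duplicate_assignments(lines):
    """Return {key: [line numbers]} for keys assigned more than once.

    Assumes 'key = value' lines and '#' comments (an assumption about
    the file format, not something stated in the minutes).
    """
    seen = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        seen[key].append(number)
    return {key: numbers for key, numbers in seen.items() if len(numbers) > 1}


def report_duplicates(lines):
    """Build one error message per duplicated key, sorted by key."""
    duplicates = find_duplicate_assignments(lines)
    return [
        "ERROR: property '%s' assigned more than once (lines %s)"
        % (key, ", ".join(str(n) for n in numbers))
        for key, numbers in sorted(duplicates.items())
    ]
```

Reporting the line numbers of every assignment, rather than just the key, lets an admin see which of the conflicting values to keep.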

Trunk activity

Progress with new features...

Plans for the next Golden Release

  • new gPlazma
  • Configuration files
  • new Pool
  • Pool rebalancing
  • "Thread-less" xrootd mover
  • Better pool selection
  • new Webadmin
  • Asynchronous GSI connector for SRM
  • SRM over SSL
  • Netty based HTTP-Mover
  • GLUE 2.0 support
  • JMS by default
  • UUIDs in all pools
  • dCache running as non-root user by default
  • dCache configure to be merged into dCache server.rpm
  • SSH2 for the Admin Interface
  • New billing web interface.
  • Debian / FHS compliant packaging of dCache.

Rather than having these as wiki pages, we can create a ticket for each item and use a "road map". Patrick wants a list of these items, but this can be auto-generated from trac.

Gerd wants to fix the resolution of well-known cells as this is still an issue (which affects "cd" command).

NDGF is also starting to look into deploying a stand-alone JMS instance (replicated over two machines). Should be ready in 2-3 weeks.

Doesn't seem to be anything essential .. apart from the dCache configure (as it would be a regression).

Tanja to add these items to trac and create a milestone.

For the new gPlazma, we need to have unique uids .. what is important is not gPlazma, but rather various features that this will enable.

gPlazma bullet-point should be split into sub-bullets; one for each feature we want to enable that depends on the new gPlazma.

Book stuff

Update of the dCache book to current configuration files and more detail/better instructions in some parts of it.

Use hudson to build and deploy the new versions of the book.

Chimera schema migration

Patch exists, but we have to decide how much we want to automate.

Hg

Completely move to Mercurial

Tigran wants to drop externals; 7.5 GiB size currently for Mercurial repository, most of this is external dependencies.

Plan to move to Maven before moving to Hg.

Web stuff

Redirect caching.

HTTP keep-alive.

Agenda item for Gerd's visit.

Issues from [FIXME: Add link to yesterday's Tier-1 meeting]

discussed.

Outstanding RT Tickets

[This is an auto-generated item. Don't add items here directly]

RT 5725: dcache cannot access ssh keys when running under a different user

Need to update the error messages; Paul will do this.

RT 5796: feature request: crc onflush

Plan to move checksum to mover; plan to close data connection before checksum calculation if user hasn't supplied a checksum. Subsequent reads will block until checksum completes.
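
A minimal sketch of the "reads block until the checksum completes" idea, covering only the case where the user supplied no checksum. The class `PendingChecksumFile` and its methods are hypothetical, and Adler-32 is used purely for illustration; this is not the mover's actual code.

```python
import threading
import zlib


class PendingChecksumFile:
    """Hypothetical stand-in for a file just written to a pool."""

    def __init__(self, data):
        self._data = data
        self._ready = threading.Event()  # set once the checksum is known
        self.checksum = None

    def start_checksum(self):
        """Start the checksum in the background.

        In the plan discussed, this would happen after the data
        connection has already been closed.
        """
        worker = threading.Thread(target=self._compute)
        worker.start()
        return worker

    def _compute(self):
        # Adler-32 is an illustrative choice of algorithm.
        self.checksum = zlib.adler32(self._data)
        self._ready.set()

    def read(self, timeout=None):
        """Block until the checksum is available, then return the data."""
        if not self._ready.wait(timeout):
            raise TimeoutError("checksum still being calculated")
        return self._data
```

The writer gets its connection closed promptly, while a reader arriving early simply waits on the event instead of seeing a file without a checksum.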

RT 5797: feature request: reload of certificates

Won't fix for 1.9.5; it is already fixed in 1.9.7 and later.

RT 5795: stuff

Points to an error in Globus client.

The door sends a 150 reply and we subsequently send a 426 error message.

The problem is that the client, after receiving the 150 reply, ignores the 426 error message.

Matt says dCache behaviour is perfectly acceptable.

As a work-around, the door can kill the connection if the door has sent a 426 and the client hasn't done anything (is idle).
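
The work-around might be sketched as below. This is an assumption-laden illustration: `ControlConnection`, `idle_limit` and the injectable `clock` are hypothetical names, and the real door would need to hook this into its own connection handling.

```python
import time


class ControlConnection:
    """Hypothetical model of a door's FTP control connection."""

    def __init__(self, idle_limit, clock=time.monotonic):
        self._idle_limit = idle_limit   # seconds of client silence tolerated
        self._clock = clock             # injectable so the logic can be tested
        self._error_sent_at = None      # time the last 426 was sent, if any
        self.closed = False

    def send_reply(self, code):
        # Only a 426 arms the idle timer; other replies are ignored here.
        if code == 426:
            self._error_sent_at = self._clock()

    def on_client_command(self):
        # Any client activity means the client is not stuck.
        self._error_sent_at = None

    def check_idle(self):
        """Close the connection if the client has been silent since the 426."""
        if self.closed or self._error_sent_at is None:
            return self.closed
        if self._clock() - self._error_sent_at >= self._idle_limit:
            self.closed = True
        return self.closed
```

Calling `check_idle()` periodically would drop clients that never react to the 426, while clients that send another command are left alone.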

TODO: implement a perf-marker-like message for all protocols, sent from pool to door, instead of the door pinging the mover.

Review of RB requests

DTNM

Same time, next week.