wiki:developers-meeting-20110126
Last modified on 01/26/11 17:54:36

[part of a series of meetings]

Participants

Dmitry, Gerd, Patrick, Antje, Paul, Tigran, Karsten, Christian.

Agenda

[see box on the right-hand side]

Postcards

Up to two minutes (uninterrupted) per person where they can answer two questions:

  • What I did last week (since the last meeting),
  • What I plan to do in the next week.

No questions until we get through everyone :)

Christian: ill, not so; lot of problems with the test-bed, meetings

Gene: ..

Dmitry: mostly busy with CDF dCache operations -- file delivery .. PoolManager was modified to always check for the flag and not provision for the flag being missing. Support for default pools; interviews for dCache position .. putting together documentation for scalable SRM so people can review the work. Jon asked Catalin to backport scalable-SRM into his private branch and deploy at Fermi. Current bottleneck is pin-manager (if running with Chimera; with PNFS then PNFS is bottleneck). Current code-base suffers from growing message-queue during the testing period that it recovers from after.

Gerd: finishing touches on new pin-manager last week; deployed Monday. Discovered a bug straight away (in 1.9.11). Simple negation bug .. not removing sticky bits on the pools. Running to next day and found 3 minor issues: one of these was a negation-bug in the retry logic, causing the pinning to fail. Trouble with c3p0 connection throwing NPE, switched to dbcp instead. Race-condition when deleting pins: file was unpinned and deleted at the same time triggered by PnfsManager unpinning the file as it's being deleted. This has been fixed. DataNucleus 2nd-level caching (shared on node over all transactions) is currently switched off, but can be enabled through spring. There's a potential issue if one runs multiple pin-managers concurrently or if the db-back-end is modified outside pin-manager. Solutions are to switch off 2nd-level

Pin-manager talks to pool-manager emulating a door. Some cleaning up PoolManager as a result.

Patrick: mostly meetings. On Monday, took part in the Bari meeting of CMS. Introduced more standards. Brian B. is interested in using HTTP for the global namespace (alternative to cmsd). People were interested and asked questions.

Antje: working on The Book. Some tickets.

Paul: reviewing patch, working on

Tigran: NIS patch mapping plugin, pool manager stuff, Chimera using liquibase.

Karsten: plugin VO mapping plugin, new patch configuration of gPlazma2 .. cleaning up. Making a list of security related requirements for EMI JRA1.4 (security).

Plans for patch-releases

Should we make a new patch release?

1.9.10 is next to go out .. merges have gone in (two fixes to allow CMS working with GSI).

Once that's out, release a new 1.9.11.

Trunk activity

Progress with new features...

Pin-manager

Old pin-manager is not reliably removing pins during the end

60--70,000 pins. After an hour, this dropped to ~100 pins.

The old pin-manager retired the pins after the TURL times out (~24 hours for ATLAS).

Patrick: nearly convinced.

Tigran: service component, it's OK if we can hot-wire an always-OK answer.

For each transfer, we create a pin. This is a DB-transaction.
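To make the cost of "one pin per transfer" concrete, the idea can be sketched as one small database transaction per pin. This is only a minimal sketch using Python's sqlite3 module: the `pins` table, its columns, and the function names `open_pin_db`, `create_pin` and `expire_pins` are assumptions made for illustration, not the pin-manager's actual schema or code.

```python
import sqlite3
import time


def open_pin_db(path=":memory:"):
    """Open a database with a hypothetical pins table (illustrative schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pins ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " pnfsid TEXT NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    db.commit()
    return db


def create_pin(db, pnfsid, lifetime_s, now=None):
    """Record one pin in a single transaction and return its id."""
    now = time.time() if now is None else now
    with db:  # commits on success, rolls back if an exception is raised
        cur = db.execute(
            "INSERT INTO pins (pnfsid, expires_at) VALUES (?, ?)",
            (pnfsid, now + lifetime_s),
        )
    return cur.lastrowid


def expire_pins(db, now=None):
    """Delete pins whose lifetime has elapsed; return how many were removed."""
    now = time.time() if now is None else now
    with db:
        cur = db.execute("DELETE FROM pins WHERE expires_at <= ?", (now,))
    return cur.rowcount
```

In this sketch every transfer costs one committed INSERT, which is why the database pool and transaction overhead matter for throughput.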

Is there a risk that the new pin-manager performs worse than the old pin-manager?

Dmitry: interest in new pin-manager is mostly based on performance. We have a test stand that can generate a fair amount of load. Gerd has tested the pin-manager, so we can test it.

Gerd: the current patch is final, but it's not the final version of the pin-manager.

Patrick: we promised not to introduce new components.

Paul: maintainability is important, too.

Tigran: c3p0 problem understood?

Gerd: 2 years ago, someone had a similar NPE.

Tigran: Might be there's some query that isn't handled correctly by c3p0, but Chimera doesn't touch this part.

gPlazma2

examples

Dmitry: prefer /opt/d-cache/etc/layouts/examples (Patrick +1)

Gerd: fine to move them out; but hope that we can have a working 1-node installation.

/opt/d-cache/share/examples/layouts
/opt/d-cache/share/examples/dcache.kpwd.template

Migration module

Patrick: last year, Tigran (& Paul) promised to fix bugs. Ticket 6038: DOT is concerned that a pool's read-only flag is being ignored by the migration module. The request is actually about the pool-manager. The behaviour is deliberate: the flag stops further incoming transfers while still allowing the migration module to work.

The fix would be 1) the dest pool to deny transfers when set read-only, 2) the src pool to discover which pools are disabled.

Wishes to drain three pools into a poolgroup, not specifying the pools explicitly.
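The proposed behaviour can be sketched roughly as follows: the destination side refuses transfers when read-only, and the source side picks targets from a pool group while skipping pools that are read-only or disabled. This is a hedged illustration only; the `Pool` class and the functions `accepts_incoming` and `migration_targets` are hypothetical names and do not correspond to the migration module's real interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """Hypothetical pool description used only for this sketch."""
    name: str
    read_only: bool = False
    disabled: bool = False


def accepts_incoming(pool):
    """Destination-side check: refuse transfers when read-only or disabled."""
    return not (pool.read_only or pool.disabled)


def migration_targets(pool_group, sources):
    """Source-side selection: pools in the group that may receive files.

    Pools being drained (sources) are never chosen as their own targets.
    """
    source_names = set(sources)
    return [
        p.name
        for p in pool_group
        if p.name not in source_names and accepts_incoming(p)
    ]
```

With something like this, draining several pools into a pool group would not need the destination pools to be listed explicitly; the group membership and the flags decide.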

Estimate: the problem will be fixed within one week.

Issues from yesterday's Tier-1 meeting

No issues reported.

Issues from EMI

Ticket thing

EMI wants to have metrics on how quickly we fix bugs that enter EMI releases. We're making changes to RT to capture additional information.

The problem is that extracting a list of tickets fixed in an EMI release and then scanning the SVN commit logs doesn't work: the logs don't say which ticket(s) a given commit fixes.

Also, please put the commit metadata into RB description. This allows the reviewer to double-check the metadata and it also provides a link from RB to RT.
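If commit messages carried ticket references in a fixed form, linking commits to tickets could be automated. The sketch below assumes a hypothetical convention of a `Ticket:` line listing RT numbers; these notes do not specify the actual metadata format, so the line prefix and the function name `tickets_in_commit` are assumptions for illustration.

```python
import re

# Assumed convention (not stated in these notes): a line such as
#   Ticket: 1234, #5678
TICKET_LINE = re.compile(r"^[ \t]*Ticket:[ \t]*(.+)$", re.IGNORECASE | re.MULTILINE)
TICKET_ID = re.compile(r"\d+")


def tickets_in_commit(message):
    """Return the ticket numbers named on 'Ticket:' lines, in order, no repeats."""
    ids = []
    for line in TICKET_LINE.findall(message):
        for number in TICKET_ID.findall(line):
            ticket = int(number)
            if ticket not in ids:
                ids.append(ticket)
    return ids
```

Running such a function over the commit log of a release would give the ticket list directly, and a reviewer could apply the same check to the RB description.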

Tigran concerned that we're creating something that we later find out doesn't work.

Outstanding RT Tickets

[This is an auto-generated item. Don't add items here directly]

RT 5847: Migration remarks(problems)

Need to investigate and find out what went wrong.

Tigran remembers that, after p2p transfer, the source file was marked cached despite the transfer failing.

Close ticket (quietly)

Review of RB requests

Gift ideas

Swiss army cyber-knife Victory-nox.

DTNM

Same time, next week.