Last modified on 02/02/11 18:32:34

dCache Tier I meeting February 1, 2011

[part of a series of meetings]

Present

dCache.org (Patrick, Antje, Tanja, Tigran, Paul), Triumf (Simon), PIC (Gerard), GridKa (Doris)

Agenda

(see box on the other side)

Site reports

Triumf

Last week was good, nothing to report.

GridKa

So far, so good.

No serious problems.

Question about old SRM (on old instance, just upgraded to new hardware).

JDBC task failed, JDBC queue

Max number of tasks in queue

Number of outstanding database requests from SRM to the database. If it's full then we don't store state in the database.
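The behaviour described above (a bounded queue of pending database writes, where updates are not persisted once the queue is full) can be illustrated with a minimal sketch. This is not dCache's SRM code: the class `BoundedStateWriter`, the `max_tasks` parameter and the list standing in for the database are all hypothetical names chosen for illustration.

```python
import queue


class BoundedStateWriter:
    """Hypothetical stand-in for a bounded queue of pending database writes."""

    def __init__(self, max_tasks):
        # max_tasks plays the role of "max number of tasks in queue".
        self._pending = queue.Queue(maxsize=max_tasks)
        self.dropped = 0

    def submit(self, state_update):
        """Queue a state update; return False if the queue is already full."""
        try:
            self._pending.put_nowait(state_update)
            return True
        except queue.Full:
            # Queue full: this update is never written, only counted.
            self.dropped += 1
            return False

    def drain(self, store):
        """Append every queued update to `store` (a list standing in for the DB)."""
        written = 0
        while True:
            try:
                update = self._pending.get_nowait()
            except queue.Empty:
                return written
            store.append(update)
            written += 1


# Three updates arrive while the queue only has room for two.
writer = BoundedStateWriter(max_tasks=2)
results = [writer.submit(f"request-{i}") for i in range(3)]

database = []
written = writer.drain(database)
```

In this sketch the third update is rejected and never reaches the stand-in database, which mirrors the point made in the meeting: when the queue is full, state is simply not stored.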

Would it help to increase this number? No.

Running 1.9.5? Yes.

Before the upgrade, the dCache instance was also running 1.9.5, but an earlier release.

Some states are not stored in the database. Restarting SRM may result in

Running requests are always taken from memory.

DB is only used for history plots or

Too many SRM threads or too many Tomcat processes allowed.

Number of JDBC

Doris will investigate problem and open a ticket if the problem isn't understood.

PIC

We're still recovering data from misplaced DDN operations (just a guess). One of the pools is almost computer

The

Pool starts it

Serving 25% of the files. The others have to be copied to another pool. This will take us about 9 days.

200 TiB loose, we guess about 5% will be corrupt in the end. ATLAS says a lot of this is junk.

1,000 files.

Apart from this corruption issue, everything else is running fine.

DDN are requesting various log files.

The controllers provide little information about the corruption period.

Similar at DESY

DESY has observed effects that are consistent with DDN confusing LUNs between controllers.

Patrick to get Gerard and Martin G. in touch.

ZFS

The engineer working on the problem says that there is no way for ZFS to detect this problem: the data

Do you run RAID-0 on it? Striped pools (RAID-z). Yes.

This explains why ZFS is not able to recover.

IOError or checksumMismatch.

Support tickets for discussion

[Items are added here automagically]

DTNM

Same time, next week.