wiki:tier-one-meeting-20180821
Last modified on 08/21/18 16:04:48

dCache Tier I meeting August 21, 2018

[part of a series of meetings]

Present

dCache.org(), IN2P3(), Sara(), Triumf(), BNL(), NDGF(Jens), PIC(Elena), KIT(Xavier), Fermi(), CERN()

Agenda

(see box on the other side)

Site reports

PIC

Doing stress test on tape.

Aresh (ATLAS contact at PIC):

Many requests were cancelled, with FTS reporting the error "transfer cancelled because performance marker timeout exceeded".

70 seconds.

19 August -- test for the data carousel model: recall 200 TB of data from tape to disk.

Second test -- previous test was in July.

Seems to have found a bottleneck in dCache.

6-8 hours; 400 section keys. FTS looks for 6 minutes.

FTS gsi SRM pools.

Failed to abort transfer

Billing shows very different times -- "abort transfer" messages.

Log transfer file -- no such file or directory.

Currently in the middle of the test.

Started Sunday 12:00 CEST.

90,000 files requested; ~200 TiB of data.

Tuesday 15:00 CEST. 40,000 files transferred.

The last test, in July, had an average rate of 400 MiB/s.
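
As a rough cross-check of the figures above against the July rate, the average throughput of the current test can be estimated. This is only a sketch. It assumes that all files are the same size (200 TiB / 90,000), which the notes do not state, and it takes the calendar dates from the meeting date (Sunday 19 and Tuesday 21 August 2018). The function name is illustrative.

```python
from datetime import datetime

# Figures taken from the notes above.
FILES_REQUESTED = 90_000
FILES_DONE = 40_000
TOTAL_TIB = 200                        # approximate total volume requested
START = datetime(2018, 8, 19, 12, 0)   # Sunday 12:00 CEST
NOW = datetime(2018, 8, 21, 15, 0)     # Tuesday 15:00 CEST


def estimate_rate_mib_s(files_done, files_total, total_tib, start, now):
    """Estimate average throughput in MiB/s.

    Assumption (not from the notes): every file has the same size,
    so the transferred volume is proportional to the file count.
    """
    elapsed_s = (now - start).total_seconds()
    done_mib = total_tib * 1024 * 1024 * files_done / files_total
    return done_mib / elapsed_s


rate = estimate_rate_mib_s(FILES_DONE, FILES_REQUESTED, TOTAL_TIB, START, NOW)
elapsed_h = (NOW - START).total_seconds() / 3600
print(f"elapsed: {elapsed_h:.0f} h")
print(f"estimated average rate: {rate:.0f} MiB/s")
```

Under these assumptions the estimate is roughly 500 MiB/s over 51 hours. The real figure depends on the actual file-size distribution and should be taken from monitoring.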

Asked FTS to limit bulk read requests to 5k at a time.

Currently throttling at 10k at a time; never goes beyond 4k at a time.

No pending requests, but active requests never go beyond 4k.

Evaluate performance -- bandwidth

Open support ticket

xrootd bug

xrootd transfer bug: please report it.

20 GiB of ..

59,000,000 records for one day.

Different files, but the same file can have multiple errors (e.g., 600).

Some repetition.

NDGF

Things are pretty good -- some problems over the weekend.

On Monday morning there were 800 failed transfers (xrootd) -- a single CE in India was requesting the same file 800 times per second.

Banned that CE for now.

One head node had quite a full filesystem for two days, due to log rotation. They had 30,000,000 failed stage requests.

Rethink logging -- journald --> rsyslogd --> disk.

KIT

Things are running fine at KIT.

Recovery

Discussions -- offered files to CMS -- salvage 58,000 of 330,000.

Upgrade

Option HA for doors behind a round-robin. Managed to set that up.

Because the interfaces are ... no default

ctdb (Samba software) supports policy routing ... specify public addresses ... but not for IPv6.

3 options -- install as if the interface

Support tickets for discussion

[Items are added here automagically]

DTNM

Same time, next week.