
Version 13 (modified by bernardt, 9 years ago)


Troubleshooting Guide for dCache

SshStreamEngine : SSH_CMSG_AUTH_RSA : Key not found
Each entry lists, in order: dCache version, Error, Problem, Solution.

dCache version: all
Error: ERROR: role "chimera" already exists
Problem: Something went wrong while setting up the PostgreSQL database with the commands described in the installation documentation. Reissuing a command such as 'createuser -U postgres --no-superuser --no-createrole --createdb --pwprompt chimera' then fails with 'createuser: creation of new role failed: ERROR: role "chimera" already exists'.
Solution: Drop the database and the role with 'dropdb -U postgres chimera && dropuser -U postgres chimera'. Afterwards you can start over creating the database, the user, and so on.

dCache version: >1.9.12
Error: In the log files you will find: (666) file:/usr/share/dcache/services/pool.batch: line 100: (3) java.util.concurrent.ExecutionException: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'lock' defined in class path resource [org/dcache/pool/classic/pool.xml]: Invocation of init method failed; nested exception is java.io.FileNotFoundException: /pools/pool1/lock (Permission denied) from ac_create_$_2_3
Problem: The permissions on the pool directory are not set properly.
Solution: Changing the ownership of the pool directory to the dcache user solves the problem (chown dcache /pools/pool1/ in this case).

dCache version: all, at occurrence 1.9.12
Error: 26 Apr 2011 11:40:36 (SRM-xen-ep-emi-tb-se-3) [] org.globus.common.ChainedIOException: Authentication failed [Caused by: Failure unspecified at GSS-API level [Caused by: Bad certificate (The signature of 'C=CH,O=CERN,OU=GD,CN=Test user 100' certificate does not match its issuer)
Problem: Bad certificate (The signature of 'C=CH,O=CERN,OU=GD,CN=Test user 100' certificate does not match its issuer)
Solution: The problem was solved by putting the issuing CA's certificates in /etc/grid-security/certificates

dCache version: all, especially 1.9.5
Error: SRMClientV2 : srmPrepareToPut: try # 0 failed with error SRMClientV2 : ; nested exception is: java.io.EOFException Tue Apr 26 14:43:40 CEST 2011: java.io.EOFException Tue Apr 26 14:43:40 CEST 2011: stopping copier going to stop....
Problem: srmcp causes an EOF exception. The actual problem can lie in a CA's certificate revocation list (CRL) being out of date.
Solution: Check whether the CRL of the CA that issued the user certificate is out of date. Find out the issuing CA with 'openssl x509 -in ~/.globus/<usercert>.pem -noout -issuer', then grep for the relevant CRL on the storage element (SE). This will give you a name, something like 1cda0759.r0. This command can then be used to check whether the CRL is still valid: 'openssl crl -in 1cda0759.r0 -noout -text'

dCache version: at occurrence 1.9.12
Error: 11 May 2011 12:28:56 (alm) [] Auth (knownUsers) : Ssh knownUsers unavailable for request from User admin Host /131.169.252.35
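
For the CRL check in the srmcp EOFException entry above, a small script can decide whether a CRL has passed its next-update time. The following is only a sketch and is not part of dCache. It assumes its input is the single 'nextUpdate=...' line printed by 'openssl crl -in <file> -noout -nextupdate' (a shorter alternative to '-text'). The function names parse_next_update and crl_is_expired are hypothetical.

```python
from datetime import datetime, timezone


def parse_next_update(line):
    """Parse a line such as 'nextUpdate=May 26 14:43:40 2011 GMT'.

    Assumption: the line comes from 'openssl crl -noout -nextupdate',
    which reports the time in GMT.
    """
    key, _, value = line.strip().partition("=")
    if key != "nextUpdate" or not value.endswith(" GMT"):
        raise ValueError(f"unexpected openssl output: {line!r}")
    # openssl pads single-digit days with an extra space; collapse it.
    stamp = " ".join(value[: -len(" GMT")].split())
    parsed = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def crl_is_expired(line, now=None):
    """Return True if the CRL's nextUpdate time is at or before 'now'."""
    now = now or datetime.now(timezone.utc)
    return now >= parse_next_update(line)
```

To use it, pass the line printed by 'openssl crl -in 1cda0759.r0 -noout -nextupdate' to crl_is_expired. A True result suggests the CRL should be refreshed on the SE before retrying srmcp.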