Lychee system

Revision as of 01:19, 8 January 2009

ssh Hostbased Authentication

For the queue system to transfer data to and from the cluster nodes (mango*) smoothly, ssh host-based authentication must be set up correctly.

  • /etc/ssh/sshd_config on the servers (in practice every node and lychee) must contain the following lines:
  AllowUsers root *@mango* *@lychee*
  HostbasedAuthentication yes
  IgnoreUserKnownHosts yes
  • /etc/ssh/ssh_config on clients (mango* & lychee) must have:
  Host *
       HostbasedAuthentication yes
       EnableSSHKeysign yes
  • /etc/ssh/ssh_known_hosts2 stores the protocol 2 ssh public keys, which can be collected with:
  ssh-keyscan -vt rsa mango02 >> /etc/ssh/ssh_known_hosts2

Different entries can share the same key as long as the host machines use the same ssh_host_rsa_key key pair (recommended).

  • /etc/hosts.equiv lists every possible host name and address, one per line, for example:
   mango01
   192.168.0.3
   mango02
   192.168.0.4
   ....
   lychee
   lychee.md.smms.uq.edu.au
   192.168.1.249
   ...
  • Restart the sshd server; host-based authentication should then work.
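
Writing the name and address of every node into /etc/hosts.equiv by hand gets tedious on a larger cluster. The following is only a sketch of how the entries could be generated: the function name hosts_equiv_entries and the "name address" input format are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: read "name address" pairs from standard input and
# print them in the one-entry-per-line layout that /etc/hosts.equiv uses.
hosts_equiv_entries() {
    while read -r name address; do
        # Skip blank lines and comment lines in the input list.
        case "$name" in
            ''|'#'*) continue ;;
        esac
        printf '%s\n' "$name"
        # The address is optional; print it only when one was given.
        if [ -n "$address" ]; then
            printf '%s\n' "$address"
        fi
    done
}

# Example (review the output before appending it to /etc/hosts.equiv):
#   printf 'mango01 192.168.0.3\nmango02 192.168.0.4\n' | hosts_equiv_entries
```

The same list of node names could also drive a loop around the ssh-keyscan command shown above.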

see also:

http://www.snailbook.com/faq/trusted-host-howto.auto.html

https://www.cs.uwaterloo.ca/twiki/view/CF/SSHHostBasedAuthentication

http://docs.hp.com/en/5992-4213/ch04s06.html

Torque PBS qsub wrapper

A qsub wrapper is helpful when some rules or restrictions on jobs are difficult to add through qmgr.

To use a submit filter, add this line to /var/spool/PBS/torque.cfg:

   SUBMITFILTER /path/to/your/wrapper

The wrapper reads the job script line by line from STDIN, analyzes it, and writes the modified version to STDOUT. Useful information can also be displayed by writing to STDERR.
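
As an illustration of the STDIN/STDOUT/STDERR contract described above, here is a minimal sketch of such a wrapper. The rule it enforces (adding a default walltime request when the job script has none), the value of DEFAULT_WALLTIME, and the function name submit_filter are all assumptions made for the example, not rules used on lychee.

```shell
#!/bin/sh
# Hypothetical default; not a value taken from the original setup.
DEFAULT_WALLTIME="01:00:00"

# Read a job script on STDIN and write it to STDOUT, adding a
# "#PBS -l walltime=..." directive if the script does not request one.
submit_filter() {
    awk -v wt="$DEFAULT_WALLTIME" '
        { lines[NR] = $0 }                    # buffer the whole script
        /^#PBS.*walltime=/ { found = 1 }      # the job already asks for a walltime
        END {
            if (!found)
                print "submit filter: no walltime requested, using " wt > "/dev/stderr"
            for (i = 1; i <= NR; i++) {
                # Keep a "#!" line first; otherwise put the directive on top.
                if (!found && i == 1 && lines[i] !~ /^#!/)
                    print "#PBS -l walltime=" wt
                print lines[i]
                if (!found && i == 1 && lines[i] ~ /^#!/)
                    print "#PBS -l walltime=" wt
            }
        }'
}

# In an actual wrapper script the last line would simply be:
#   submit_filter
```

Buffering the script before printing lets the filter decide whether a directive is missing before it emits the first line, at the cost of holding the job script in memory.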

LDAP server gidNumber index for group name searching

Quoted from Martin's email:

 The fedora-ds install configuration builds indexes for most of the commonly searched attributes, but
 not for "gidNumber". The fedora-ds GUI console provides an "indexes" page, where this (and other 
 attributes) may be added. Following any changes, the DS must be stopped and a db2index command run to 
 recreate the indexes.

LDAP server open file descriptor problem

By default, fedora-ds can have only 1024 open file descriptors. These run out very quickly, and every client machine/node then hangs.

The number of open file descriptors is limited by the system. The current limits can be listed with the command below; ulimit -Hn prints the hard limit for open files.

            ulimit -a

To raise the limit when the LDAP server starts, add

            ulimit -n 8192

to the /etc/init.d/fedora-ds script.
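
A bare ulimit call fails quietly when the hard limit is lower than the requested value, so it can help to report the outcome. The following is a sketch only; the function name set_nofile_limit is hypothetical and the original init script is not known to contain anything like it.

```shell
#!/bin/sh
# Hypothetical helper for an init script: try to set the open file limit
# and say clearly whether it worked.
set_nofile_limit() {
    wanted=$1
    if ulimit -n "$wanted" 2>/dev/null; then
        echo "open file limit set to $(ulimit -n)"
    else
        # Usually the hard limit is lower than the requested value.
        echo "could not set open file limit to $wanted (hard limit: $(ulimit -Hn))" >&2
        return 1
    fi
}

# Example, placed before the server is started:
#   set_nofile_limit 8192
```

The limit applies to the shell that calls the function and to the processes it starts afterwards, so the call has to come before the line that launches the server.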

If the limit cannot be raised above 1024, check the following places:

  • /etc/security/limits.conf, add the following line:
          "*               -       nofile       8192"
  • /etc/sysctl.conf, make sure fs.file-max is larger than the limit specified in /etc/security/limits.conf:
      # ADDED FOR FDS
      net.ipv4.tcp_keepalive_time = 300
      net.ipv4.ip_local_port_range = 1024 65000
      fs.file-max = 64000
      # hostname/domainname
      kernel.hostname = lychee.md.smms.uq.edu.au
      kernel.domainname = md.smms.uq.edu.au
  • Change the option in /opt/fedora-ds/slapd-lychee/config/dse.ldif (this may be overwritten by fedora-ds during a server restart)
      nsslapd-maxdescriptors: 8192
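
To see whether the server is still approaching its limit after these changes, the descriptors it holds can be counted through /proc. This is a Linux-only sketch; the function names count_open_fds and fds_in_use_report and the 90% warning threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: count the open descriptors of a process and warn
# when the count gets close to a given limit.
count_open_fds() {
    # Each entry under /proc/<pid>/fd is one open descriptor.
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

fds_in_use_report() {
    pid=$1
    limit=$2
    used=$(count_open_fds "$pid")
    echo "pid $pid: $used of $limit descriptors in use"
    # Warn once 90% of the limit is reached (arbitrary threshold).
    if [ "$used" -ge $((limit * 9 / 10)) ]; then
        echo "warning: pid $pid is close to its descriptor limit" >&2
        return 1
    fi
}

# Example, run as root and assuming the server process is named ns-slapd:
#   fds_in_use_report "$(pidof ns-slapd)" 8192
```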