CIQUS Cluster

Location in the data center (CPD)

Berta's Cluster

Servers

Name    Service Tag  Model      IP                           Notes
dell    1BS314J      Dell 2950  eth0:192.168.0.254/24 192.168.1.252/24 eth1:172.16.247.180/24
dell1   D33414J      Dell 2950  eth0:192.168.0.101/24 eth1:
dell2   F33414J      Dell 2950  eth0:192.168.0.102/24 eth1:
dell3   233414J      Dell 2950  eth0:192.168.0.103/24 eth1:
dell4   H33414J      Dell 2950  eth0:192.168.0.104/24 eth1:
dell5   J33414J      Dell 2950  eth0:192.168.0.105/24 eth1:
dell6   733414J      Dell 2950  eth0:192.168.0.106/24 eth1:
dell7   633414J      Dell 2950  eth0:192.168.0.107/24 eth1:
dell8   C23414J      Dell 2950  eth0:192.168.0.108/24 eth1:
dell9   433414J      Dell 2950  eth0:192.168.0.109/24 eth1:
dell10  H23414J      Dell 1950  eth0:192.168.0.110/24 eth1:
dell11  JHB714J      Dell 1950  eth0:192.168.0.111/24 eth1:
dell12  B33414J      Dell 1950  eth0:192.168.0.112/24 eth1:
dell13  G23414J      Dell 1950  eth0:192.168.0.113/24 eth1:
dell14  243414J      Dell 1950  eth0:192.168.0.114/24 eth1:  Error: E171F PCIE Fatal Err B0 DE F0. The XFS filesystem on sda2 is broken. Is this what causes the filesystem problems?
dell15  F23414J      Dell 1950  eth0:192.168.0.115/24 eth1:
dell16  933414J      Dell 1950  eth0:192.168.0.116/24 eth1:  Wrong root password. Network configuration error.
dell17  B0NQX4J      Dell 1950  eth0:192.168.0.117/24 eth1:
dell18  D0NQX4J      Dell 1950  eth0:192.168.0.118/24 eth1:  Wrong root password. Network configuration error.
dell19  C0NQX4J      Dell 1950  eth0:192.168.0.119/24 eth1:
dell20  BCTQX4J      Dell 1950  eth0:192.168.0.120/24 eth1:
dell21  6RBGY4J      Dell 1950  eth0:192.168.0.121/24 eth1:  Wrong root password. Network configuration error.
dell22  3RBGY4J      Dell 1950  eth0:192.168.0.122/24 eth1:
dell23  4RBGY4J      Dell 1950  eth0:192.168.0.123/24 eth1:  Wrong root password. Network configuration error.
dell24  2RBGY4J      Dell 1950  eth0:192.168.0.124/24 eth1:
dell25  5RBGY4J      Dell 1950  eth0:192.168.0.125/24 eth1:
dell26  7RBGY4J      Dell 1950  eth0:192.168.0.126/24 eth1:
dell27  CL3W102      R420       eth0:192.168.0.127/24 eth1:  Error: PCIe training error. The board is replaced.
dell28  7M3W102      R420       eth0:192.168.0.128/24 eth1:

Master

The OS is Scientific Linux SL release 5.2 (Boron).
The external access IP is 172.16.247.180 on eth1 (at CIQUS it was 172.16.249.180). It has a SNAT IP defined, 193.144.81.138, through which 22/tcp (ssh) is reachable from anywhere.
The cluster's internal network is 192.168.0.0/24 on eth0.
The 192.168.1.252 address on eth0 is probably a leftover from another cluster. It is not used for anything, although resolv.conf references a DNS server on that network.
PARTITIONS:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.8G  4.2G  3.2G  57% /
/dev/mapper/VG0-home  296G  140G  141G  50% /home
tmpfs                 3.9G     0  3.9G   0% /dev/sh

IPTABLES:

#*nat
#:PREROUTING ACCEPT [181641:15178573]
#:POSTROUTING ACCEPT [45392:2667940]
#:OUTPUT ACCEPT [45392:2667940]
#-A POSTROUTING -s 192.168.0.0/255.255.255.0 -o eth0 -j SNAT --to-source 172.16.249.180
# the previous rule did not work, so I replaced it with the following one
#-A POSTROUTING -j MASQUERADE
 
#COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
#-A FORWARD -j RH-Firewall-1-INPUT
# Internal network
-A RH-Firewall-1-INPUT -s 192.168.0.0/24 -i eth0 -j ACCEPT
# BMC network
-A RH-Firewall-1-INPUT -s 192.168.1.0/24 -i eth0 -j ACCEPT
# qfqcpc06
-A RH-Firewall-1-INPUT -s 193.144.87.45 -p tcp --dport 22 -j ACCEPT
# gdrqcluster
-A RH-Firewall-1-INPUT -s 193.144.87.51 -p tcp --dport 22 -j ACCEPT
# gdrqcluster32
-A RH-Firewall-1-INPUT -s 193.144.87.53 -p tcp --dport 22 -j ACCEPT
# gdrqcluster64
-A RH-Firewall-1-INPUT -s 193.144.87.52 -p tcp --dport 22 -j ACCEPT
# Javier's home IP
-A RH-Firewall-1-INPUT -s 83.165.109.52 -p tcp --dport 22 -j ACCEPT
# qfqcpc05
-A RH-Firewall-1-INPUT -s 193.144.87.44 -p tcp --dport 22 -j ACCEPT
# pcsistema1.cesga.es
-A RH-Firewall-1-INPUT -s 193.144.44.144 -p tcp --dport 22 -j ACCEPT
#
# Protection against ssh attacks
#
# If the connection is new, its source is added to the SSH_LIST list.
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -m state --state NEW -m recent --set --name SSH_LIST --rsource
# Check the list of new connections, counting only the entries from the last 60 s.
# If a source has made 3 or more new connection attempts in that window, DROP.
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 3 --name SSH_LIST --rsource -j DROP
#
# If it is not an attack, accept ssh from outside
-A RH-Firewall-1-INPUT -p tcp --dport 22 -j ACCEPT
#
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# DEFAULT REJECT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

SELinux is disabled. TCP Wrappers is enabled:

/etc/hosts.allow
sshd: 172.16.64.75 , 172.16.243. , 172.16.247. , 172.16.249.
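
In hosts.allow, a pattern that ends in a dot matches every address starting with that prefix, and a match there grants access before hosts.deny is consulted. The following is a minimal sketch of that matching for the sshd line above; the function name is hypothetical and it only mirrors the four patterns listed, it does not read the real file.

```shell
#!/bin/sh
# Return 0 if the address matches one of the sshd patterns listed in
# /etc/hosts.allow above, 1 otherwise. A pattern ending in "." matches
# every address that starts with that prefix.
sshd_allowed_by_hosts_allow() {
    case "$1" in
        172.16.64.75)  return 0 ;;
        172.16.243.*)  return 0 ;;
        172.16.247.*)  return 0 ;;
        172.16.249.*)  return 0 ;;
        *)             return 1 ;;
    esac
}

# Example: an address in 172.16.247.0/24 matches, an outside one does not.
sshd_allowed_by_hosts_allow 172.16.247.12 && echo "172.16.247.12: listed"
sshd_allowed_by_hosts_allow 193.144.87.45 || echo "193.144.87.45: not listed"
```

An address that is not listed is not necessarily refused: it is only refused if it also appears in hosts.deny.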

hosts.deny is full of IPs because a service called denyhosts adds IP addresses to it dynamically to prevent attacks.

/etc/hosts
127.0.0.1	localhost	localhost.localdomain
# For the nodes we need this !!!!!!!!!!!!!!!!!!!
# uncomment it for the nodes and comment the above line
#127.0.0.1 	payne			localhost
192.168.0.1	max.cluster.org	max
192.168.0.6	payne5	
192.168.0.7	payne6	
192.168.0.8	payne7	
192.168.0.9	payne8	
192.168.0.10	payne9	
# for (( i=46; i<60; i++ )); do echo -e "192.168.0.$[i+1]\tpayne$i"; done
# Dell
# For the future:
 
# The new node to be installed
#192.168.0.254	nuevo	
 
# The Chemistry computers
#193.144.87.30	qfqcpc02	
#193.144.74.229	qfqcpc02.usc.es	qfqcpc02
#193.144.74.199	qfbef01.usc.es	qfbef01
 
# The servers that host Debian
193.146.38.146	toxo.com.uvigo.es	
130.206.1.5	ftp.rediris.es	
194.109.137.218	security.debian.org	
 
# The other cluster
193.144.87.51	cluster2	
 
# The following lines are desirable for IPv6 capable hosts
# (added automatically by netbase upgrade)
 
::1	ip6-localhost	ip6-loopback
fe00::0	ip6-localnet	
ff00::0	ip6-mcastprefix	
ff02::1	ip6-allnodes	
ff02::2	ip6-allrouters	
ff02::3	ip6-allhosts	
192.168.0.2	payne1	
192.168.0.3	payne2	
192.168.0.4	payne3	
192.168.0.5	payne4	
192.168.0.254	dell	
192.168.0.11	payne10	
192.168.0.12	payne11	
192.168.0.13	payne12	
192.168.0.14	payne13	
192.168.0.15	payne14	
192.168.0.16	payne15	
192.168.0.17	payne16	
192.168.0.18	payne17	
192.168.0.19	payne18	
192.168.0.20	payne19	
192.168.0.21	payne20	
192.168.0.22	payne21	
192.168.0.23	payne22	
192.168.0.24	payne23	
192.168.0.25	payne24	
192.168.0.26	payne25	
192.168.0.27	payne26	
192.168.0.28	payne27	
192.168.0.29	payne28	
192.168.0.30	payne29	
192.168.0.31	payne30	
192.168.0.32	payne31	
192.168.0.33	payne32	
192.168.0.34	payne33	
192.168.0.35	payne34	
192.168.0.36	payne35	
192.168.0.37	payne36	
192.168.0.38	payne37	
192.168.0.39	payne38	
192.168.0.40	payne39	
192.168.0.41	payne40	
192.168.0.42	payne41	
192.168.0.43	payne42	
192.168.0.44	payne43	
192.168.0.45	payne44	
192.168.0.46	payne45	
192.168.0.101	dell1	
192.168.0.102	dell2	
192.168.0.103	dell3	
192.168.0.104	dell4	
192.168.0.105	dell5	
192.168.0.106	dell6	
192.168.0.107	dell7	
192.168.0.108	dell8	
192.168.0.109	dell9	
192.168.0.110	dell10	
192.168.0.111	dell11	
192.168.0.112	dell12	
192.168.0.113	dell13	
192.168.0.114	dell14	
192.168.0.115	dell15	
192.168.0.116	dell16	
192.168.0.117	dell17	
192.168.0.118	dell18	
192.168.0.119	dell19	
192.168.0.120	dell20	
192.168.0.121	dell21	
192.168.0.122	dell22	
192.168.0.123	dell23	
192.168.0.124	dell24	
192.168.0.125	dell25	
192.168.0.126	dell26	
192.168.0.127	dell27	
192.168.0.128	dell28	
192.168.0.129	dell29	
192.168.0.130	dell30	
192.168.0.131	dell31	
192.168.0.132	dell32	
192.168.0.133	dell33	
192.168.0.134	dell34	
192.168.0.135	dell35	
192.168.0.136	dell36	
192.168.0.137	dell37	
192.168.0.138	dell38	
192.168.0.139	dell39	
192.168.0.140	dell40	
192.168.0.141	dell41	
192.168.0.142	dell42	
192.168.0.143	dell43	
192.168.0.144	dell44	
192.168.0.145	dell45	
192.168.0.146	dell46	
192.168.0.147	dell47	
192.168.0.148	dell48	
192.168.0.149	dell49
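
The dellN entries above follow a regular pattern (dellN has the address 192.168.0.(100+N)), in the same spirit as the loop left as a comment in the file for the payne nodes. A minimal sketch that prints such lines, assuming that pattern holds for any node added later; the function name is hypothetical:

```shell
#!/bin/sh
# Print /etc/hosts lines for dellN nodes, assuming the pattern seen above:
# dellN has the address 192.168.0.(100+N).
# $1 = first node number, $2 = last node number
gen_dell_hosts() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '192.168.0.%d\tdell%d\n' "$((100 + i))" "$i"
        i=$((i + 1))
    done
}

# Example: print the entries for dell1 to dell3.
gen_dell_hosts 1 3
```

The output is meant to be reviewed and pasted by hand, not appended to /etc/hosts blindly, since the file already contains these names.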

Users:

javier:x:1000:1000:javier,,,:/home/javier:/bin/bash
uscqfjcf:x:1001:1001:Jose Luis,,,:/home/uscqfjcf:/bin/bash
tbp:x:1002:1002:Thomas Bondo Pedersen,,,:/home/tbp:/bin/tcsh
berta:x:1003:1003:Berta Fernandez Rodriguez,,,:/home/berta:/bin/bash
domenico:x:1004:1004:Domenico,,,:/home/domenico:/bin/bash
snfhko:x:1005:100::/home/snfhko:/bin/bash
cristian:x:1006:1006:Cristian,,,:/home/cristian:/bin/bash
jonathan:x:1007:1007:Jonathan,,,:/home/jonathan:/bin/bash
alfredo:x:1008:1008:Alfredo Sanchez de Meras,,,:/home/alfredo:/bin/bash
ganglia:x:102:101:Ganglia Monitor:/var/lib/ganglia:/bin/false
siham:x:1009:1009:Siham Naima Derrar,1,,:/home/siham:/bin/bash
stefan:x:1010:1010:Stefan Bilan,,,:/home/stefan:/bin/bash
juanpablo:x:1011:1011:,,,:/home/juanpablo:/bin/tcsh
silvia:x:1012:1012:,,,:/home/silvia:/bin/bash
hubert:x:1013:1013:Hubert Cybulski,,,:/home/hubert:/bin/bash
angelika:x:1014:1014:Angelika,Quimica Cuantica,,:/home/angelika:/bin/bash

REPOS:

/etc/yum.repos.d/adobe.repo
name=Adobe Systems Incorporated
baseurl=http://linuxdownload.adobe.com/linux/i386/
--
/etc/yum.repos.d/atrpms.repo
name=ATrpms rpms
baseurl=http://ftp.scientificlinux.org/linux/extra/atrpms/sl5-$basearch/stable
--
/etc/yum.repos.d/atrpms.repo.rpmnew
name=ATrpms rpms
baseurl=http://ftp.scientificlinux.org/linux/extra/atrpms/sl5-$basearch/stable
--
/etc/yum.repos.d/dag.repo
name=DAG rpms
baseurl=http://ftp.scientificlinux.org/linux/extra/dag/redhat/el5/en/$basearch/dag/
--
/etc/yum.repos.d/dag.repo.rpmnew
name=DAG rpms
baseurl=http://ftp.scientificlinux.org/linux/extra/dag/redhat/el5/en/$basearch/dag/
--
/etc/yum.repos.d/epel.repo
name=Extra Packages for Enterprise Linux 5 - $basearch
baseurl=http://download.fedoraproject.org/pub/epel/5/$basearch
--
/etc/yum.repos.d/epel.repo
name=Extra Packages for Enterprise Linux 5 - $basearch - Debug
baseurl=http://download.fedoraproject.org/pub/epel/5/$basearch/debug
--
/etc/yum.repos.d/epel.repo
name=Extra Packages for Enterprise Linux 5 - $basearch - Source
baseurl=http://download.fedoraproject.org/pub/epel/5/SRPMS
--
/etc/yum.repos.d/epel-testing.repo
name=Extra Packages for Enterprise Linux 5 - Testing - $basearch 
baseurl=http://download.fedoraproject.org/pub/epel/testing/5/$basearch
--
/etc/yum.repos.d/epel-testing.repo
name=Extra Packages for Enterprise Linux 5 - Testing - $basearch - Debug
baseurl=http://download.fedoraproject.org/pub/epel/testing/5/$basearch/debug
--
/etc/yum.repos.d/epel-testing.repo
name=Extra Packages for Enterprise Linux 5 - Testing - $basearch - Source
baseurl=http://download.fedoraproject.org/pub/epel/testing/5/SRPMS
--
/etc/yum.repos.d/sl-contrib.repo
name=Scientific Linux 5 contrib area
baseurl=http://ftp.scientificlinux.org/linux/scientific/52/$basearch/contrib
--
/etc/yum.repos.d/sl-debuginfo.repo
name=Scientific Linux 5 debuginfo rpm's
baseurl=http://ftp.scientificlinux.org/linux/scientific/5x/archive/debuginfo
--
/etc/yum.repos.d/sl-fastbugs.repo
name=SL 5 fastbugs area
baseurl=http://ftp.scientificlinux.org/linux/scientific/52/$basearch/updates/fastbugs
--
/etc/yum.repos.d/sl.repo
name=SL 5 base
#baseurl=http://ftp.scientificlinux.org/linux/scientific/52/$basearch/SL
#		ftp://ftp.scientificlinux.org/linux/scientific/52/$basearch/SL
baseurl=http://linuxsoft.cern.ch/scientific/52/$basearch/SL
--
/etc/yum.repos.d/sl.repo.rpmnew
name=SL 5 base
baseurl=http://ftp.scientificlinux.org/linux/scientific/52/$basearch/SL
--
/etc/yum.repos.d/sl-security.repo
name=SL 5 security updates
baseurl=http://ftp.scientificlinux.org/linux/scientific/52/$basearch/updates/security
--
/etc/yum.repos.d/sl-srpms.repo
name=Scientific Linux 5 source rpm's (src.rpm)
baseurl=http://ftp.scientificlinux.org/linux/scientific/5x/SRPMS
--
/etc/yum.repos.d/sl-testing.repo
name=Scientific Linux 5 testing area
baseurl=http://ftp.scientificlinux.org/linux/scientific/5rolling/testing/$basearch

The Torque version is 2.0.0 and the SGE version is 6.0u7.

Services

DHCP

All node addresses on the 192.168.0.0/24 network are served via DHCP, filtered by MAC:

subnet 192.168.0.0 netmask 255.255.255.0 {
        # default gateway
        option routers 192.168.0.254;
        option subnet-mask 255.255.255.0;
 
        option domain-name "cluster.pri";
        option domain-name-servers 193.144.75.9, 192.168.0.254;
 
#       range dynamic-bootp 192.168.0.2 192.168.0.254;
        range 192.168.0.201 192.168.0.250;
        default-lease-time 600;
        max-lease-time 7200;
}

It is also configured for PXE:

# Needed for PXE (taken from the RHEL-3 sysadmin-guide)
allow booting;
allow bootp;
class "pxeclients" {
   match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
   next-server 192.168.0.254;
   filename "pxelinux.0";
}
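
The excerpts above do not show the per-node entries that bind each MAC to an address. A host declaration of the usual dhcpd.conf form could be generated with the following sketch. The function name and the MAC address are placeholders, and the actual entries in this cluster's dhcpd.conf may be written differently.

```shell
#!/bin/sh
# Print a dhcpd.conf host declaration binding a MAC address to a fixed IP.
# $1 = host name, $2 = MAC address, $3 = IP address
dhcp_host_entry() {
    printf 'host %s {\n' "$1"
    printf '    hardware ethernet %s;\n' "$2"
    printf '    fixed-address %s;\n' "$3"
    printf '}\n'
}

# Hypothetical node: the MAC below is a placeholder, not a real one.
dhcp_host_entry dell29 00:11:22:33:44:55 192.168.0.129
```

dhcpd has to be restarted after editing dhcpd.conf for a new entry to take effect.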

TFTP server

Required for PXE.

/etc/xinetd.d/tftp
service tftp
{
        socket_type             = dgram
        protocol                = udp
        wait                    = yes
        user                    = root
        server                  = /usr/sbin/in.tftpd
        server_args             = -s /tftpboot
        disable                 = no
        per_source              = 11
        cps                     = 100 2
        flags                   = IPv4
}

NFS

/etc/exports
/home  192.168.0.0/255.255.255.0(rw,no_root_squash)

/home holds the users' home directories, /home/cluster the software to be installed and /home/opt, apparently, the software that is already installed.

NIS

There is a NIS server running.

/etc/nsswitch.conf
passwd:     files nis
shadow:     files nis
group:      files nis
hosts:      files dns
bootparams: nisplus [NOTFOUND=return] files
 
ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files
 
netgroup:   files
 
publickey:  nisplus
 
automount:  files
aliases:    files nisplus
 
sudoers:  files ldap

/etc/yp.conf
ypserver 192.168.0.254

/etc/ypserv.conf
#
# ypserv.conf	In this file you can set certain options for the NIS server,
#		and you can deny or restrict access to certain maps based
#		on the originating host.
#
#		See ypserv.conf(5) for a description of the syntax.
#
 
# Some options for ypserv. This things are all not needed, if
# you have a Linux net.
 
# Should we do DNS lookups for hosts not found in the hosts table ?
# This option is ignored in the moment.
dns: no
 
# How many map file handles should be cached ?
files: 30
 
# Should we register ypserv with SLP ?
slp: no
# After how many seconds we should re-register ypserv with SLP ?
slp_timeout: 3600
 
# xfr requests are only allowed from ports < 1024
xfr_check_port: yes
 
# The following, when uncommented,  will give you shadow like passwords.
# Note that it will not work if you have slave NIS servers in your
# network that do not run the same server as you.
 
# Host                     : Domain  : Map              : Security 
#
# *                        : *       : passwd.byname    : port 
# *                        : *       : passwd.byuid     : port
 
# Not everybody should see the shadow passwords, not secure, since
# under MSDOG everbody is root and can access ports < 1024 !!!
*			   : *       : shadow.byname    : port
*			   : *       : passwd.adjunct.byname : port
 
# If you comment out the next rule, ypserv and rpc.ypxfrd will
# look for YP_SECURE and YP_AUTHDES in the maps. This will make
# the security check a little bit slower, but you only have to
# change the keys on the master server, not the configuration files
# on each NIS server.
# If you have maps with YP_SECURE or YP_AUTHDES, you should create
# a rule for them above, that's much faster.
# *                        : *       : *                : none

FTP

There is a vsftpd server running. It appears to serve /var/ftp/.

/etc/vsftpd/vsftpd.conf
anonymous_enable=YES
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
listen=YES
 
pam_service_name=vsftpd
userlist_enable=YES
tcp_wrappers=YES

ntp

/etc/ntp.conf
# Hosts on local network are less restricted.
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap
 
server hora.rediris.es
server 0.rhel.pool.ntp.org
server 1.rhel.pool.ntp.org
server 2.rhel.pool.ntp.org

httpd

It only seems to be there to run a CGI script in /var/www that has something to do with the PXE installation process (unconfirmed).

pbs and sge

Both are running.

root      3667  0.0  0.0 102512  5036 ?        Sl   Mar12   2:01 /opt/cluster/sge60/bin/lx24-amd64/sge_qmaster
root      3687  0.2  0.0  50032  4592 ?        Sl   Mar12   7:04 /opt/cluster/sge60/bin/lx24-amd64/sge_schedd
root      4086  0.0  0.0   6352   568 ?        Ss   Mar12   0:00 /usr/sbin/pbs_server
root      4099  0.0  0.0   6056   316 ?        Ss   Mar12   0:00 /usr/sbin/pbs_sched

SGE is in /home/opt/cluster/sge60/, so it is shared with the nodes. PBS is in /var/lib/torque/.

PGI workstation

A license server is started for PGI, which appears to be a compiler suite.

root      4268  0.0  0.0  16360  1428 ?        S    Mar12   0:00 /home/opt/pgi/linux86-64/13.4/bin/lmgrd -c /opt/pgi/license.dat -l /opt/pgi/flexlm.log
root      4270  0.0  0.0  50544  2856 ?        Ssl  Mar12   0:01 pgroupd -T localhost 11.11 3 -c /opt/pgi/license.dat --lmgrd_start 53201b25

Later additions

Ganglia has been installed on the whole cluster.

A script that allows powering the nodes off and on from the master: /usr/local/bin/gestionar_nodos

Nodes

All nodes run Scientific Linux SL release 5.3 (Boron).

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.8G  1.7G  5.8G  23% /
tmpfs                 7.9G     0  7.9G   0% /dev/shm
/dev/sda2             920G  270G  651G  30% /scratch
dell:/home            296G  140G  141G  50% /home

/scratch is formatted as XFS.

eth0 is configured via DHCP and eth1 has no configuration because it is not used.

There are no local users; NIS is used:

/etc/nsswitch.conf
passwd:     files nis
shadow:     files nis
group:      files nis
hosts:      files dns
bootparams: nisplus [NOTFOUND=return] files
 
ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files nis
 
netgroup:   files
 
publickey:  nisplus
 
automount:  files
aliases:    files nisplus

SGE is running:

root      3271  0.0  0.0   5308  1616 ?        S    Mar10   0:02 /opt/cluster/sge60/bin/lx24-amd64/sge_execd

Installing a node

Add an entry with the node's MAC address to dhcpd.conf.

Create a configuration file (by copying an existing one) in /tftpboot/pxelinux.cfg/ whose name is the node's IP address in hexadecimal.

Edit this line in that file so that it points to a valid kickstart configuration file:
append initrd=pxeboot/initrd.img ramdisk_size=16384 ksdevice=link ks=http://192.168.0.254/installations/SL5x/sl53.x86_64_ks.cfg
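
The hexadecimal file name for /tftpboot/pxelinux.cfg/ is the four octets of the IP address written as two uppercase hex digits each. A small sketch of the conversion; the function name is hypothetical:

```shell
#!/bin/sh
# Print the 8-digit uppercase hexadecimal form of a dotted-quad IPv4
# address, which is the hex-IP file name pxelinux searches for under
# pxelinux.cfg/.
ip_to_pxe_hex() (
    # Runs in a subshell so the IFS and globbing changes do not leak out.
    set -f
    IFS=.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

# Example: dell1 has 192.168.0.101 according to the server table.
ip_to_pxe_hex 192.168.0.101   # prints C0A80065
```

The file for a new node can then be created by copying the file of an existing node to the name this function prints.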

Installing SGE on a node

Javier's home directory has some scripts and configuration files that give hints, but everything is very outdated.

Symbolic links have to be created in /opt:

cd /opt/
ln -s /home/cluster/ cluster
ln -s /home/cluster/sge60/ sge60

Copy nsswitch.conf from another node.

Export the environment variables:

export SGE_ROOT=/home/cluster/sge60/
export PATH=$PATH:/home/cluster/sge60/bin/lx24-amd64 
export LD_LIBRARY_PATH=/home/cluster/sge60/lib/lx24-amd64

Add to /etc/services:

sge_qmaster     1434/tcp
sge_execd       1435/tcp
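
Since this step is repeated on every node, it may help to add the two lines only when they are missing. The following is a minimal sketch under that assumption; the helper name is hypothetical, and the demonstration writes to a scratch file, not to /etc/services.

```shell
#!/bin/sh
# Append "name port/proto" to a services file unless the name is already there.
# $1 = services file, $2 = service name, $3 = port/protocol
add_service() {
    if ! grep -q "^$2[[:space:]]" "$1"; then
        printf '%s\t%s\n' "$2" "$3" >> "$1"
    fi
}

# Demonstration on a scratch file; pass /etc/services (as root) for real use.
services_file=$(mktemp)
add_service "$services_file" sge_qmaster 1434/tcp
add_service "$services_file" sge_execd 1435/tcp
add_service "$services_file" sge_qmaster 1434/tcp   # second call is a no-op
cat "$services_file"
rm -f "$services_file"
```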

Install, accepting all the defaults.

cd /home/cluster/sge60
./install_execd

Add the host and modify the queues on the master:

qconf -ah dellxx
qstat -f
qconf -mq queue_name

Configure the node's characteristics in SGE:

qconf -me dellxx
 
complex_values        arch=64,s_vmem=24G,num_proc=24

Here s_vmem is a soft memory limit (equal to the host's physical memory) and num_proc is the total number of cores.
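
The two values can be read on the node itself before running qconf -me. The sketch below formats the line and, as a starting point only, suggests values from /proc. The function name is hypothetical, MemTotal is rounded up to whole GiB because the kernel reports slightly less than the installed memory, and the count from /proc/cpuinfo is of logical CPUs, which may exceed the number of physical cores when hyperthreading is enabled.

```shell
#!/bin/sh
# Format the complex_values line to paste into `qconf -me`.
# $1 = memory in GiB (used for s_vmem), $2 = number of cores (num_proc)
sge_complex_values() {
    printf 'complex_values        arch=64,s_vmem=%sG,num_proc=%s\n' "$1" "$2"
}

# Suggest values for the local host (Linux only, skipped elsewhere).
if [ -r /proc/meminfo ] && [ -r /proc/cpuinfo ]; then
    mem_gib=$(awk '/^MemTotal:/ { print int(($2 + 1048575) / 1048576) }' /proc/meminfo)
    cores=$(grep -c '^processor' /proc/cpuinfo)
    sge_complex_values "$mem_gib" "$cores"
fi
```

For the example above, sge_complex_values 24 24 prints the same line that is shown for the node.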