

Advanced Computational Laboratory

Front End (temporary)

  • External IP: 181.118.149.209
  • Internal IP: 10.5.99.159
  • External Name: gc1.javerianacali.edu.co
  • H/W : SunFire Z20v - AMD Opteron 250 / 4 GB RAM
  • OS : Ubuntu Server 14.04
  • eth-pujc : 10.5.99.159 (eth0)
  • eth-cluster : 192.168.222.1 (eth1)
Nodes

By default (H/W RAID)

  • SCSI (2,0,0) (sda) 2.0 TB
  • SCSI (2,1,0) (sdb) 1.6 TB

Partitions

  • sda
    • swap : 20 GB
    • /boot : 4 GB
    • / : 500 GB
    • /vol-user : (PVFS/OrangeFS)
  • sdb
    • /vol-scratch : (PVFS/OrangeFS)

Cluster network : 192.168.222.0/255.255.255.0

Nodes (temporary)

  • tmp-node1 : 192.168.222.2
  • tmp-node0 : 192.168.222.11 (dhcp) Sun Fire
Forward traffic
# Delete and flush. Default table is "filter". Others like "nat" must be explicitly stated.
iptables --flush            # Flush all the rules in filter and nat tables
iptables --table nat --flush
iptables --delete-chain     # Delete all user-defined chains in the filter table
iptables --table nat --delete-chain
# Set up IP FORWARDing and Masquerading
iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
iptables --append FORWARD --in-interface eth1 -j ACCEPT	 
echo 1 > /proc/sys/net/ipv4/ip_forward             # Enables packet forwarding by kernel
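
The forwarding and NAT setup above does not survive a reboot. A minimal sketch to persist it, assuming Ubuntu's /etc/sysctl.conf is read at boot and the iptables-persistent package is available:

# Persist kernel packet forwarding (sketch)
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
sysctl -p
# Persist the current iptables rules (sketch; assumes the iptables-persistent package)
apt-get install iptables-persistent
iptables-save > /etc/iptables/rules.v4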

Mac IP: 172.16.84.81

Cluster File System

Use ZFS and LUSTRE for the workers and front-end.

ZFS offers consistency and supports NFS (http://zfsonlinux.org/ and https://launchpad.net/~zfs-native/+archive/ubuntu/stable). The idea is to mount the users' home directories from the front-end on the workers.
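
A minimal sketch of that setup, assuming the /vol-user partition (here /dev/sda4) is handed to ZFS on the front-end and the workers reach it at 192.168.222.1; pool, dataset, and device names are assumptions:

# Front-end: create a pool on the user partition and export home over NFS (sketch)
zpool create vol-user /dev/sda4        # device name is an assumption
zfs create vol-user/home
zfs set sharenfs=on vol-user/home      # ZFS hands the export to the NFS server

# Worker: mount the exported home directories (front-end address is an assumption)
mount -t nfs 192.168.222.1:/vol-user/home /home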

LUSTRE will build a distributed file system out of the workers' storage (SAS HDDs). The Linux kernel has support for Lustre's kernel modules, but some of the tools still need to be built (check http://lustre.opensfs.org/ and http://zfsonlinux.org/lustre.html). A drawback of LUSTRE is that running storage servers and clients on the same machine is not recommended because of memory consumption problems.

Another option for a distributed file system among the workers is Ceph (http://ceph.com/). Ceph is already in the Ubuntu repository. It supports Hadoop (http://ceph.com/docs/master/cephfs/hadoop/) through a Hadoop Ceph plugin.
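
Since Ceph ships in the Ubuntu repository, a quick sketch of installing it and mounting CephFS on a worker; a configured cluster in /etc/ceph is required, and the monitor address and keyring path below are assumptions:

apt-get install ceph ceph-common      # stock Ubuntu packages
ceph -s                               # cluster health (needs /etc/ceph/ceph.conf)
# Kernel CephFS mount on a worker (monitor address and secret file are assumptions)
mount -t ceph 192.168.222.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret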

One more to check is Gluster (http://www.gluster.org/). There is an official PPA repository, and it looks like it works with NFS and ZFS.
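
A rough sketch of a replicated Gluster volume built from two workers' bricks; the volume name, brick paths, and the reuse of hnode1/hnode2 are assumptions:

# On hnode1: add a peer and create a 2-way replicated volume (sketch)
gluster peer probe hnode2
gluster volume create gv-scratch replica 2 hnode1:/vol-scratch/brick hnode2:/vol-scratch/brick
gluster volume start gv-scratch

# On a client: mount the volume
mount -t glusterfs hnode1:/gv-scratch /scratch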

Process Batch system

For the process batch system, use the Slurm resource manager (http://slurm.schedmd.com/). HTCondor could be used to submit jobs for grid-type computation.
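
A minimal slurm.conf sketch for the Hydra machines; the CPU and memory figures are placeholders rather than the real hardware, and the path follows Ubuntu's slurm-llnl package:

# /etc/slurm-llnl/slurm.conf (fragment, sketch -- CPU/memory values are placeholders)
ClusterName=hydra
ControlMachine=hfe
SelectType=select/cons_res
SelectTypeParameters=CR_Core
NodeName=hnode[1-4] CPUs=16 RealMemory=32000 State=UNKNOWN
PartitionName=main Nodes=hnode[1-4] Default=YES MaxTime=INFINITE State=UP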

System management and deployment

There are a few alternatives for system configuration and management.

To Do

  • Setup network definition
  • Setup ntp to synchronize clocks among the nodes
  • Configure slurm (http://slurm.schedmd.com/quickstart_admin.html) (Missing accounting to configure)
  • Install and configure lustre (superseded: replace lustre in favor of CEPH)
  • Install environment_module to select which libraries to use (see the usage sketch after this list)
  • Setup /sw to be shared between the nodes
  • Setup a git (or similar) repository to keep the nodes' configurations (mainly the /etc files)
  • (Re)install the nodes
  • Install and configure magpie to unify hpc and big data (https://github.com/LLNL/magpie)
  • Update Ubuntu 12.04 to 14.04
  • Admin tasks (clean users and directories)
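
For the environment_module item above, typical usage once Environment Modules is installed would be along these lines (the module names are examples only, not modules that exist on the cluster):

module avail                  # list the modulefiles found on the system
module load openmpi/1.8.4     # example name; depends on the modulefiles installed under /sw
module list                   # show what is currently loaded
module unload openmpi/1.8.4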

IP address

The IP range for each network depends on its use in the cluster. The IPs have the form 192.168.XXX.YYY, where XXX and YYY are defined as follows:

Network               XXX  YYY     Interface postfix
Communication 10Gbit  201  10-250  cn (Communication Network)
Communication 1Gbit   202  10-250  cn
File system Hydra     101  10-250  fsn (File system Network)
File system Lili      102  10-250  fsn
Administration        10   10-250  (YYY=1: lca-admin)

The front-end address (YYY) is 10.

The IP assignment is:

hostname   cn                  vlan  fsn0                vlan  eth0               vlan  Description
hfe        192.168.201.10/24   4     192.168.101.10/24   2     192.168.10.10/24   1     Hydra's front-end
hnode1     192.168.201.11/24   4     192.168.101.11/24   2     192.168.10.11/24   1     Hydra's node 1
hnode2     192.168.201.12/24   4     192.168.101.12/24   2     192.168.10.12/24   1     Hydra's node 2
hnode3     192.168.201.13/24   4     192.168.101.13/24   2     192.168.10.13/24   1     Hydra's node 3
hnode4     192.168.201.14/24   4     192.168.101.14/24   2     192.168.10.14/24   1     Hydra's node 4
lfe        192.168.202.10/24   5     192.168.102.10/24   3     192.168.10.20/24   1     Lili's front-end
lnode1     192.168.202.11/24   5     192.168.102.11/24   3     192.168.10.21/24   1     Lili's node 1
lnode2     192.168.202.12/24   5     192.168.102.12/24   3     192.168.10.22/24   1     Lili's node 2
lca-admin  -                   -     192.168.101.101/24  2     192.168.10.1/24    1     General administration node
                                     192.168.102.102/24  3
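
As an example, the administration address of hnode1 could be set statically in /etc/network/interfaces on Ubuntu 14.04 roughly as follows (the interface name and the absence of a gateway are assumptions):

# /etc/network/interfaces fragment on hnode1 (sketch)
auto eth0
iface eth0 inet static
    address 192.168.10.11
    netmask 255.255.255.0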

Hostname assignment (cluster):

# Hydra's front-end
192.168.10.10    hydra-adm hfe
192.168.201.10   hydra hfe-cn # Communication (MPI)
192.168.101.10   hfe-fsn

# Hydra's node 1
192.168.10.11    hnode1-adm
192.168.201.11   hnode1      # Communication (MPI)
192.168.101.11   hnode1-fsn

# Hydra's node 2
192.168.10.12    hnode2-adm
192.168.201.12   hnode2      # Communication (MPI)
192.168.101.12   hnode2-fsn

# Hydra's node 3
192.168.10.13    hnode3-adm
192.168.201.13   hnode3      # Communication (MPI)
192.168.101.13   hnode3-fsn

# Hydra's node 4
192.168.10.14    hnode4-adm
192.168.201.14   hnode4      # Communication (MPI)
192.168.101.14   hnode4-fsn
# Lili's front-end
192.168.10.20    lili lfe
192.168.202.10   lfe-cn
192.168.102.10   lfe-fsn

# Lili's node 1
192.168.10.21    lnode1-adm
192.168.202.11   lnode1      # Communication (MPI)
192.168.102.11   lnode1-fsn

# Lili's node 2
192.168.10.22    lnode2-adm
192.168.202.12   lnode2      # Communication (MPI)
192.168.102.12   lnode2-fsn
# General administration node
192.168.10.1     lca lca-admin gc1
192.168.101.101  hydra-fsn
192.168.102.102  lili-fsn

# Other machines connected to the cluster network
192.168.10.51    Dell switch 10G Ethernet
192.168.10.100   babbage
192.168.222.50   UPS (needs a new IP address)

Configurations

FAQ

  • If sinfo shows nodes in DOWN* state, check the clock synchronization between the nodes and hydra (run sudo ntpdate hydra on each worker to sync):
clush -b -w hnode[1-4] sudo ntpdate hydra
  • If sinfo shows a node in DOWN state (without *), the node can be resumed by running the following on hydra (replace hnode with the correct node name):
scontrol update NodeName=hnode State=RESUME
  • clush (ClusterShell) can be used to run a command on several nodes in parallel or to copy files to them; see the sketch below.
 
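A couple of typical clush invocations (the node list matches the Hydra nodes; the copied file is just an illustration):

clush -b -w hnode[1-4] uptime                                   # run a command everywhere, merge identical output
clush -w hnode[1-4] --copy /etc/ntp.conf --dest /etc/ntp.conf   # push a file to all nodes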