Members: Alice, Bill, Dawn, Ed, Hamp, Kathy G., Katherine H., Jim, Mark S., Martha, Robin, Sue Ellen, Steve, Terry, Tim S., Joe S., Tom S.
Attending: Alice, Bill, Dawn, Ed, Hamp, Kathy G., Katherine H., Martha, Steve, Tang, Terry, Tim S.
Chair: Tim T.
Recorder: Alice Howard
II. Ongoing Discussion Topics:
1. Cluster projects:
· “Condo” Cluster (aka cedar): Will operate as a 32-bit cluster for at least the first year. (Note: we need to advertise this as having 125 compute nodes, because we need to reserve 1 node for an AVAKI gateway.)
o Need a firm “go-live” date and as smooth and problem-free a rollout as possible. We need to make a very favorable impression if we want to encourage further participation in the condo-cluster model. Picked July 6 for the rollout date, even though we will still have some ongoing power and A/C issues.
o Have four researchers participating: Leo Zhigilei (lz2n) – 8 nodes w/ ETF $; Jeff Shabanowitz/Dina Bai (Chemistry) – 8 nodes; Phil Parrish/Rich Gregory – 4 nodes; Matt Neurock – 4 nodes.
o Outstanding issues:
§ Problems with insufficient power and A/C in the machine room. Facilities Management will be adding more but does not have a timeframe yet. Meanwhile, we are experimenting with how many nodes we can have up (~88-96), and will roll out to users even if all 125 nodes are not up.
§ Mitch Rosen wants to buy 10 nodes from this cluster – will go ahead with this and subtract the 10 from the general pool of nodes.
§ Bill thinks that the NPI link-checker software could be quite useful for our clusters, but it does cost some $$.
§ Bill thinks he can suppress the RSA warnings.
§ The problem with /state/partition1/ going missing was likely due to some configuration work that was going on – think it’s stable now.
o PBSPro -- priority queuing for paying participants and usage reporting.
· Teak Cluster:
o Right now, the only users on it are using Gaussian – if we get Gaussian for Linux, we might not need teak anymore. The goal is to maintain a “rolling” set of 3 Linux clusters at any given time.
· Oak Cluster:
o Hamp will see if it is possible to add the modules software, which would make it seamless for IMSL and Java users to move from aspen to oak and make it easier for users to access different versions of the Solaris compilers.
· Aspen & Birch clusters: Timeline for the OS upgrade and kernel upgrade to bring both clusters into concordance with cedar?
o Will do this upgrade after the new cluster is settled in (note: cedar is at 2.4) – aim for the first week in August and schedule 2 days of downtime.
2. /longtmp: additional disk space is installed.
· Hamp and Tim will collaborate on a script to automate checking usage and reminding users.
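As a starting point for the /longtmp usage script, a minimal Python sketch follows. It is only an illustration, not the script Hamp and Tim will write: the function names, the 50 GB threshold, and the wording of the reminder are all assumptions, and actually mailing the reminder is left out.

```python
import os
from collections import defaultdict


def usage_by_owner(root):
    """Return {uid: total bytes} for files under root (symlinks are not followed)."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            totals[info.st_uid] += info.st_size
    return dict(totals)


def over_limit(totals, limit_bytes):
    """Return the (uid, bytes) pairs above limit_bytes, largest first."""
    heavy = [(uid, used) for uid, used in totals.items() if used > limit_bytes]
    return sorted(heavy, key=lambda pair: pair[1], reverse=True)


def reminder_text(uid, used_bytes, limit_bytes, root="/longtmp"):
    """Compose a plain-text reminder; sending it is left to the caller."""
    used_gb = used_bytes / 1024**3
    limit_gb = limit_bytes / 1024**3
    return (
        f"User id {uid}: you are using {used_gb:.1f} GB in {root}, "
        f"above the {limit_gb:.1f} GB reminder threshold. "
        "Please remove files you no longer need."
    )


if __name__ == "__main__":
    LIMIT = 50 * 1024**3  # hypothetical threshold, not a value from the minutes
    for uid, used in over_limit(usage_by_owner("/longtmp"), LIMIT):
        print(reminder_text(uid, used, LIMIT))
```

Run from cron, something like this could print one reminder per heavy user; mapping the uid to a login name and mail address would be a site-specific addition.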
3. Linux license server: cluster of three, lm1., lm2., lm3.license.
· About half of the research software licenses have been migrated off of aix.license onto the Linux license server. Will continue to move the rest as they need to be updated. Having a problem with ESRI.
· Although we recommend using a VPN whenever possible, it would help users doing ssh tunneling if we could find a way to designate the master – Hamp will look into it.
4. blue.unix upgrade to new 64-bit nodes and a new version of the OS. ITC-Transitions CDP is tracking this as well. Timeline?
· holmes is being upgraded to AIX 5.3 today. Then we need to get people using it within the coming week to see what breaks. Won’t have our old load balancing program, but do have some alternatives.
· Going from 16 nodes to 3 nodes for blue.unix.
III. New Business:
5. Ganglia web pages: Need to add cedar.itc.
· It might be possible to separate the teak and oak clusters from
the general research computing cluster so we can see these nodes easily – Steve will look at it.
6. Should we increase the CPU limit on blue.unix since the new nodes will be much faster?
· Since the new nodes are faster and use of blue.unix has declined, one should get a lot more done in an hour of CPU time, so we decided to leave the CPU limit as is for the time being and see what happens; we could relax it later.
Next Scheduled Meeting of this group: Monday, August 29th
Next Scheduled Meeting of ITC ResComp Management team: Monday, July 25th