Minutes of the ITC-Research Computing Standing Committee
Meeting on June 26, 2002, 9:00 AM
Members: Alice, Bill,
Brian, Dawn, David, Ed, Hamp, Kathy, Jim, Mark S., Martha, Michael, Robin,
Sue Ellen, Steve, Terry, Tim S., Tom S.
Attending: Bill, Brian,
David, Ed, Hamp, Kathy, Jim, Martha, Robin, Steve, Terry, Tim S., Tom S.,
Bob Reynolds
Chair: Tim T.
Recorder: Alice
See "To Do" List generated from this meeting.
I. Review and Discussion of committee charge
Bob stressed his goal of improving
the academic support (research and instruction) provided by ITC
-- emphasizing the importance of communication and project-tracking within
ITC.
Tim T. reviewed the ground
rules for the group (open discussion, with the focus on issues, reasonable
to question and discuss any topic or decision, and agreement before "big"
changes). Everyone agreed to them.
There was discussion about
using a single e-mail list for communications between group members -- it
was decided to add csd-unix to itc-rescomp, and then to use itc-rescomp
for all e-mail communications for this group.
II. Items on Linux Cluster
*Hamp: related his concerns with Aspen's responsiveness/turnaround-time
for replacements for failed components -- we have ordered a spare
HP network switch card. It was
suggested that Hamp copy Alan on his emails to Aspen. Perhaps we could
ask Victoria if there is anything in the contract we could use to alleviate this issue. Tom S. has had varying experience with Aspen's responsiveness.
*Hamp: prompted discussion of how to manage temp space
both during a job and after it completes -- sounds like either PBS or PVFS could be
used.
*Hamp: there is a problem between the scheduler and
sendmail over domain names -- (the cluster will need to send job-related email only) --
it can be fixed -- not a big deal.
*Hamp: allow logins? or not? Current configuration allows interactive logins. Several methods were
discussed that could provide better control. For now, it was decided to just monitor use (this is both a technical
and a policy issue).
*Hamp: use our ssh or OpenSSH? NPACI Rocks expects
OpenSSH. Defer this choice for now.
*Hamp: Linux Cluster will need to be moved to the
unmanned machine room once the mainframe disk is out -- will need about 1 hour downtime
-- target July 10 for this move.
*All: About test users for the new cluster -- Tim's
group can do some testing (jobs, profiles, etc.) very soon -- it is possible to log in to
the head node now. Will get ready
for other friendly test users by the next ResComp group meeting. Might
be able to enlist some from Pearson's group, or John Hawley, or some SEAS faculty as friendly test users. Meanwhile users could use the ACHS cluster now -- but they would have to move
twice.
*TimT: has the test suite from Aspen been run? We have no test suite from Aspen.
*TimT/Ed: what about cluster management and monitoring
software? We have not received any Aspen Systems Cluster Management software -- yet
it was included in the contract and we have paid for it -- so we need to
take this up with them. Perhaps Systems
should look at it first before we decide we do not want it. If we
do decide that we do not want it, then we need to get something else (spare parts? spare node? a node for testing?)
from Aspen in place of this software. Could use other software, like Ganglia, to monitor the cluster; BigBrother functions more like rover.
*Ed prompted discussion about
the pros/cons of installing and using PBSPro vs. OpenPBS. Ultimately what
we really need to decide is whether we should have the same or different schedulers on all ITC-supported research
systems. Ed and Bill will work together to further investigate the advantages
of PBSPro over OpenPBS and Maui.
*IMSL and PGI compilers have
been ordered for the Linux Cluster.
III. Items from Linux Cluster workshop
*Will use the e-mail list
to discuss these.
IV. Other action items and issues
*How do we (technically, and
policy-wise) fit all the research support systems components (Orange and Teal clusters, UnixLab workstations, the SP nodes, the SMP box, the new Linux Cluster) into a coherent, cohesive set of research computing resources? For example, the SMP provides an easier migration path for current
SP users plus it can run "big memory" jobs -- what is our
long-term commitment to this system? not to grow it? It will be necessary for
the foreseeable future.
Who are we going to "push"
towards using the new Linux Cluster (which will be a more difficult transition)?
Decided to set a target date
of May 2003 for all SP users to have run a job somewhere else (SMP, Linux Cluster, etc.). And we do not need to turn off the SP -- can just freeze its config
and let it run -- use should naturally dwindle through time. We do need a "communications strategy" for research users.
*The ACAC has agreed to take
on the role of providing input and oversight for the allocation of the new research computing resources (SMP, Linux Cluster). They
would like ITC to make a list of "possible alternatives" with pros/cons that they could then consider.
Items could be those related to priorities and "fair shares",
etc. and should be described at a pretty high level. The ACAC will resume meeting in the Fall.
*Maple 8 should be installed
by July 8. Maple 7 will keep running.
*ESRI updates -- need a list
of all systems with ESRI.
*S-Plus and SAS licenses will
be updated/renewed in July -- CSS can do this.
Meeting adjourned at 10:30 AM. Next meeting: July 29, 2002, 10:30 AM.
