Minutes of the
ITC-Research Computing Standing Committee
Meeting on June 26, 2002, 9:00 AM

Members: Alice, Bill, Brian, Dawn, David, Ed, Hamp, Kathy, Jim, Mark S., Martha, Michael, Robin, Sue Ellen, Steve, Terry, Tim S., Tom S.

Attending: Bill, Brian, David, Ed, Hamp, Kathy, Jim, Martha, Robin, Steve, Terry, Tim S., Tom S., Bob Reynolds

Chair: Tim T.

Recorder: Alice

See "To Do" List generated from this meeting.

I.  Review and Discussion of committee charge

Bob stressed his goal of improving the academic support (research and instruction) provided by ITC -- emphasizing the importance of communication and project-tracking within ITC.

Tim T. reviewed the ground rules for the group (open discussion, with the focus on issues, reasonable to question and discuss any topic or decision, and agreement before "big" changes).  Everyone agreed to them.

 

There was discussion about using a single email list for communications among group members -- it was decided to add csd-unix to itc-rescomp, and then to use itc-rescomp for all email communications for this group.

II.  Items on Linux Cluster

*Hamp:  related his concerns with Aspen's responsiveness/turnaround time for replacements for failed components -- we have ordered a spare HP network switch card.  It was suggested that Hamp copy Alan on his emails to Aspen.  Perhaps we could ask Victoria if there is anything in the contract we could use to alleviate this issue.  Tom S. has had varying experience with Aspen's responsiveness.

*Hamp:  prompted discussion of how to manage temp space both during a job and after it completes -- sounds like either PBS or PVFS could be used.

*Hamp:  there is a problem between the scheduler and sendmail over domain names -- (the cluster will need to send job-related email only) -- it can be fixed -- not a big deal.

*Hamp:  allow logins? or not?  Current configuration allows interactive logins.  Several methods were discussed that could provide better control. For now, it was decided to just monitor use (this is both a technical and a policy issue).

*Hamp:  use our ssh or OpenSSH?  NPACI Rocks expects OpenSSH.  Defer this choice for now.

*Hamp:  Linux Cluster will need to be moved to the unmanned machine room once the mainframe disk is out -- will need about 1 hour downtime -- target July 10 for this move.

*All:  About test users for the new cluster -- Tim's group can do some testing (jobs, profiles, etc.) very soon -- it is possible to log in to the head node now.  Will get ready for other friendly test users by the next ResComp group meeting.  Might be able to enlist some from Pearson's group, or John Hawley, or some SEAS faculty as friendly test users.  Meanwhile, users could use the ACHS cluster now -- but they would have to move twice.

*TimT:  has the test suite from Aspen been run?  We have no test suite from Aspen.

*TimT/Ed:  what about cluster management and monitoring software?  We have not received any Aspen Systems Cluster Management software -- yet it was included in the contract and we have paid for it -- so we need to take this up with them.  Perhaps Systems should look at it first before we decide we do not want it.  If we do decide that we do not want it, then we need to get something else (spare parts? spare node? a node for testing?) from Aspen in place of this software. Could use other software, like Ganglia, to monitor the cluster; BigBrother functions more like rover.

*Ed prompted discussion about the pros/cons of installing and using PBSPro vs. OpenPBS.  Ultimately what we really need to decide is whether we should have the same or different schedulers on all ITC-supported research systems. Ed and Bill will work together to further investigate the advantages of PBSPro over OpenPBS and Maui.

*IMSL and PGI compilers have been ordered for the Linux Cluster.

III. Items from Linux Cluster workshop

*Will use the e-mail list to discuss these.

IV.  Other action items and issues

*How do we (technically, and policy-wise) fit all the research support systems components (Orange and Teal clusters, UnixLab workstations, the SP nodes, the SMP box, the new Linux Cluster) into a coherent, cohesive set of research computing resources?  For example, the SMP provides an easier migration path for current SP users plus it can run "big memory" jobs -- what is our long-term commitment to this system?  Not to grow it?  It will be necessary for the foreseeable future.

 

Who are we going to "push" towards using the new Linux Cluster (which will be a more difficult transition)?

 

Decided to set a target date of May 2003 for all SP users to have run a job somewhere else (SMP, Linux Cluster, etc.).  And we do not need to turn off the SP -- can just freeze its config and let it run -- use should naturally dwindle through time. We do need a "communications strategy" for research users.

 

*The ACAC has agreed to take on the role of providing input and oversight for the allocation of the new research computing resources (SMP, Linux Cluster).  They would like ITC to make a list of "possible alternatives" with pros/cons that they could then consider.  Items could be those related to priorities and "fair shares", etc. and should be described at a pretty high level.  The ACAC will resume meeting in the Fall.

 

*Maple 8 should be installed by July 8.  Maple 7 will keep running.

 

*ESRI updates -- need a list of all systems with ESRI.

 

*S-Plus and SAS licenses will be updated/renewed in July -- CSS can do this.

Meeting adjourned at 10:30 AM.  Next meeting: July 29, 2002, 10:30 AM.
