Notes for the ITC-Research Computing Standing Committee Meeting

June 8, 2009 at 10:45 AM
2015 Ivy Road, 1st Floor Conference Room

 

Members: Alice, Andrew, Hamp, Jim, Joe S., Mark, Mike, Robin, Terry, Tim S., Tom and members of Systems, Research Computing and ACHS as desired

Attending: Alice, Hamp, Jim, Mark, Mike, Robin, Terry, Tim S., Katherine Holcomb, Ed Hall, Bill Pemberton, Dale Castle


Chair: Tom, Convener: Katherine, Recorder: Alice

 

After a meeting with Andrew, Jim, and Hamp, Mike was asked to come and share the new framework/context that the existence of UVACSE brings to the provisioning of HPC services:  it is more 'experimental' with fewer constraints on the rate of change (i.e. less process, less testing beforehand) – although we will stick with the practice of giving users at least 48 hours' notice prior to changing a system.  This word now needs to get out to the users of the clusters – and it should come jointly from UVACSE, ITC, and VP/CIO.

 

Mike also reported that two NSF solicitations were discussed at the most recent CSAC lunch:  for the MRI grant program we are working on a proposal for a petascale research storage system; for the ARI grant program, Mike and Jim are talking about re-doing the broadband cabling and the wiring closets.

 

 

1. Systems report on whether there's been any progress on testing SGE.  If we don't have something soon, it would be best to find the money to renew PBS Pro one more year.  I'm finding that users are somewhat confused about the new queuing setup as it is.

 

2. Setting up of queues for node purchasers.  We don't want to just have a pool of purchased nodes -- each group should have a dedicated queue for exactly the number they bought.  Otherwise it's inevitable that some purchasers will "steal" time from others.  Have we got something set up so purchasers can start using their nodes?

 

3. Report on how much memory Cedar's and Dogwood's motherboards can support, and how much it would cost to upgrade them to (1) 4 GB and (2) 8 GB if that is possible.

 

4. Removing the dual functionality of the itc-rescomp email list.  Right now it is both for email about the committee business and for traffic among Systems and UVACSE about the clusters.  However, we have an increasing number of people who want to attend the rescomp meetings but don't want to see the systems-related traffic -- it's now up to 4 people whom I have to cc for this meeting reminder -- so I propose a new list to consist of unix-systems, uvacse, and probably Jim Jokl.  I am willing to be the list administrator.

 

5. Changing the default shell to bash.  ksh was fine in its day but bash is firmly embedded as "the" Linux shell, and a lot of things aren't even tested, or are barely tested, under ksh these days.

 

6. Dropping OpenMPI.  This would leave only MPICH2 but it seems to work well.  We put up OpenMPI originally because it was the first MPI-2 implementation, and also because some users were having severe problems with semlock/semget errors in MPICH1 (which we have already dropped), but MPICH2 does not do shmem the same way as MPICH1 and does not suffer from this problem.

 

7.  What about experimenting with Lustre?

 

 

Next Scheduled Meeting of ITC ResComp Standing Committee: July 13, 2009 in Carruthers Conference Room A
Please send suggestions, additions, corrections to: Tom or Alice