Notes for the
ITC-Research Computing Standing Committee Meeting
September 8, 2008
at 10:30 AM
2015 Ivy Road, 1st Floor Conference Room
Members: Alice, Andrew, Hamp, Jim, Joe S., Mark, Mike,
Robin, Terry, Tim S., Tom and members of Systems, Research Computing and ACHS
as desired
Attending:
Alice, Andrew, Hamp, Jim, Joe S., Mark, Mike, Terry, Tom, Katherine,
Kathy
Chair: Tom
Recorder: Alice
Rather than working from our "standard" agenda, this meeting
was devoted mainly to discussion of possible configurations for a new cluster.
The meeting started off with Mike giving an update about our
new approach to HPC:
- This computational science initiative (including a Center) has begun with a
'stone soup' approach with resources (financial, human, space, hardware,
etc.) contributed by the VP/CIO, ITC, the Library – and likely from
COFU funding. So for the next 3 years it will be supported by a
partnership – if successful, it will become largely self-supporting over time.
- In practical terms, ITC's resources for this initiative include Katherine and
Ed, and some hardware/software resources.
- A new faculty Policy and Allocation Board (PAB) is being established for the
allocation of HPC resources.
Our Research Computing committee will need to figure out how to
interface ITC's infrastructure with the strategic direction coming from
advisory group(s). For starters, the PAB would hold high-level discussions
about goals and policies – and we would implement them in terms of
configurations, operations, etc., with Andrew and Mike as a bridge.
- Andrew gave a recent example of working through priorities and the allocation of
resources in the initial startup of Tiger Teams and selecting the
proposals that they have been focusing on. The Board could choose to
change our utilization policies – for example, to establish a meta-queue
for multiple clusters with the objective of rationalizing resource
utilization across UVa.
What to do about a new cluster?
- This could be a replacement for Cedar if it has comparable or greater capacity.
- We have several faculty interested in a condo cluster purchase –
Michael Shirts (ChemE) has funds and needs to have a cluster in place by
January. Others (Matt Neurock/ChemE and Martin Wu/Biology) are also
interested.
- Potential configurations were discussed (e.g. Hamp proposed dual quad-core
with 32 GB of RAM).
- Do we need to stay with homogeneous clusters? This is not necessary
anymore – so we could have some variation in the cluster – but for now we
could stick with recommending a basic configuration and see what happens
with respect to demand later.
- The interconnect is an issue – do we need/want some nodes connected via
InfiniBand? (This does cost more, and comes in increments of 24 nodes per
the switch capacity.)
- Over time, we may want to move Schools/users away from allocating $$ to
clusters and towards allocations that they can use at UVa or elsewhere.
- Meanwhile, we can recommend Hamp's configuration to people and provide a
24-port InfiniBand switch.
- Andrew described another option of getting 'lightly used' hardware. As an
example, the CS department got hardware from a bank. However, it did take
more effort to understand how to use the hardware – and it also took lots
of paperwork. Hamp will talk to Scott in CS about how hard this was.
This could be worth looking into for future needs.
- However, we have users who need something NOW – and we do have some machine
room capacity – especially if this is a Cedar replacement.
- Action Plan:
- Talk to Scott in CS about their gift system;
- Get back to the folks who currently have funds and interest in a condo
cluster purchase. (Hamp will send Andrew the quote.)
Storage Plans and Issues:
- Jim gave an overview of ITC's current approach and policies.
- We are hearing from some users that they might not need backups, etc. –
they would like another, lower-cost model to choose for some of their
storage needs – one that offers availability and persistence, but not
necessarily backups.
- We are also hearing that some users would like the storage options
unbundled – for example with respect to amount of space, backups, and
replication – with different levels of service offered for different
needs, and with "what is the entitlement" also defined.
- ITC will have defined a general mid-level storage option by next semester.
- Meanwhile, if there are individual storage needs (e.g. Michael Shirts) we
could add storage, independent of a need for nodes – until we have a larger
strategy.
The PBS Pro quote from Altair:
- There was discussion about how many cores we need to cover at UVa – and
some discussion about whether we want to try to recover any costs –
and whether this is more of an infrastructure cost than a software-budget
cost. But, given our familiarity with PBS Pro, it seems worth paying for
this – and it was decided that ITC would pay $12,000 (to cover 3000 cores –
including support, maintenance, upgrades) for a one-year license.
Next Scheduled Meeting of ITC ResComp Standing Committee:
Monday, October 13 in ITC-Dynamics, 2015 Ivy Road, Room #102 (First Floor
Conference Room)!
Please send suggestions, additions, corrections to: Tom or Alice