[ale] Linux Cluster Server Room

Dow Hurst dhurst at kennesaw.edu
Mon Apr 19 21:31:47 EDT 2004


I understand your philosophy here, but I have a question.  What if the 
calculations are long and costly to restart?  Shouldn't I look at the value of 
spent computation that might have to be redone if I lose power?  The code I 
am most concerned about running on the cluster may or may not be 
checkpointable.  I think it might be, but I know my users, and they won't want 
power to be an issue when predicting when their jobs will finish. ;-)
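
To put a rough number on that "value of spent computation", here is a small
sketch.  Everything in it is an assumption for illustration: the function
name, the node count, the job length, and the outage rate are all made up,
and the model simply assumes an outage strikes at a uniformly random moment,
so on average half of the work since the last restart point is lost.

```python
def expected_lost_node_hours(nodes, job_hours, outages_per_year,
                             checkpoint_hours=None):
    """Rough expected node-hours of computation lost per year to outages.

    Assumes the cluster is always busy and an outage hits at a uniformly
    random time, so the average loss is half the interval between restart
    points (the whole job if it cannot checkpoint).
    """
    if checkpoint_hours is None:
        interval = job_hours                      # no checkpoints: restart from zero
    else:
        interval = min(checkpoint_hours, job_hours)
    return nodes * (interval / 2.0) * outages_per_year


# Hypothetical numbers: 32 nodes, 10-day jobs, 2 outages a year.
no_ckpt = expected_lost_node_hours(nodes=32, job_hours=240, outages_per_year=2)
with_ckpt = expected_lost_node_hours(nodes=32, job_hours=240, outages_per_year=2,
                                     checkpoint_hours=4)
print(no_ckpt, with_ckpt)
```

Multiplying the result by whatever a node-hour is worth to you gives a yearly
figure to hold up against the price of a UPS for the compute nodes.  If the
code turns out to be checkpointable, the gap between the two numbers is the
argument for covering only the master node and storage.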

Do Best UPSes perform better than Tripp Lite or APC?  I have experience with 
Tripp Lite, APC, and Liebert so far and have never used Best.  I like the 
toughness and quality of the APC and Liebert enclosures.  I like the quality of 
all three.  I like the performance and cost of APC and Tripp Lite.  Tripp Lite's 
low-end enclosures aren't as nice as APC's, but their high-end UPSes have nice 
rack enclosures.  Performance-wise, I haven't been able to tell a difference 
between the two.  As for heat production, APC seems to produce less overall.

What do you mean by getting the wrong power factor conversion?  Do you mean 
getting 120 V at 60 Hz vs 220 V at 60 Hz on the output outlets?

I appreciate all this advice!
Dow



Jeffrey B. Layton wrote:
> I'll give you my 2 cents about clusters and UPS's if you wish.
> 
> A good cluster configuration will treat each compute node as
> an appliance. You don't really care about it too much and it
> doesn't hold any data of any importance. What you care about
> is the master node and/or where the data is stored. These
> machines can have their own UPS or a single UPS to cover
> the machines (there may be more than one). Then take the cost
> savings (if you can) and put them into more nodes, or a better
> interconnect (if needed), or a large file system, or a better
> backup system, or .... well, you get the picture.
> 
> Thinking of only putting a UPS on the important parts of the
> cluster will save you money, time, and headaches. However,
> if you put a cluster in a server room you can have all power
> covered by a single huge UPS and probably a diesel backup
> generator as well. This goes back to the purpose of a server
> room - to support independent servers, not clusters. While this
> is nice and good, it is somewhat wasteful. If you could have
> a combination of UPS/Diesel backed power and just regular
> conditioned power, that would be more economical. However,
> the budgets for clusters (computing) and the budget for facilities
> are never really seen as related by management. Even though
> they come out of the same overall pot within the company (or
> university), management has a tendency to compartmentalize
> things for easy managing (and the definite lack of brain power
> on the part of most managers). Try arguing that you really
> don't need the giant UPS/Diesel combo and you will get IT
> managers screaming all sorts of things about you. Sigh.
> 
> Of course, these comments depend on your cluster configuration.
> If you are running a global filesystem across all of the nodes,
> so that each node has part of the filesystem, then you might
> want to think about a good UPS for all of the nodes (try
> restoring a 20 TB global filesystem from backup after a
> power outage).
> 
> Good Luck!
> 
> Jeff
> 
>> What type of UPS system are you using? Do most install a large UPS 
>> system for the entire server room? If so, how much will this cost?
>>
>> Thanks,
>> Chris
>>
>> -----Original Message-----
>> From: Dow Hurst [mailto:dhurst at kennesaw.edu]
>> Sent: Monday, April 12, 2004 11:20 AM
>> To: ale
>> Subject: Re: [ale] Linux Cluster Server Room
>>
>>
>> Thanks Jonathan!  That is exactly the kind of ballpark I needed!  I don't
>> need the vendors right now as we are still kicking around ideas.  If anyone
>> would throw some specs or ideas out there, I'd appreciate it.  Here is a
>> quick question:  Is planning for double your planned load a good rule?  I
>> would think that would be a good idea.  How about backup cooling if the
>> main unit dies?  The fire safe is one I had not thought of.
>> Dow
>>
>>
>> Jonathan Glass (IBB) wrote:
>>  
>>
>>> How big are the Opteron nodes?  Are they 1,2,4U?  How big are the power
>>> supplies?  What is the maximum draw you expect?  Convert that number to
>>> figure out how much heat dissipation you'll need to handle.
>>>
>>> I have a 3-ton A/C unit in my roughly 14-15' x 14-15' server room, and the 24-33
>>> node cluster I just spec'd out from IBM (1U, Dual Opterons) was rated at
>>> a max heat dissipation (is this the right word?) of 18,000 BTU. 
>>> According to my A/C guy, the 3-ton unit can handle a max of 36,000 BTU,
>>> so I'm well inside my limits.  Getting the 3-ton unit installed in the
>>> drop-down ceiling, including installing new chilled water lines, was
>>> around $20K.
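
For anyone following the conversion described above, here is a minimal
sketch of the arithmetic.  It relies on two standard factors: 1 watt is about
3.412 BTU/hr, and 1 ton of cooling is 12,000 BTU/hr (which is where the
36,000 BTU figure for a 3-ton unit comes from).  The function names and the
10 kW rack draw are hypothetical, chosen only for illustration, and the
headroom factor of 2 mirrors the "double your planned load" question rather
than any established rule.

```python
BTU_PER_HOUR_PER_WATT = 3.412    # 1 W is about 3.412 BTU/hr
BTU_PER_HOUR_PER_TON = 12_000    # 1 ton of cooling = 12,000 BTU/hr


def heat_load_btu_per_hour(watts):
    """Heat given off by equipment drawing `watts` of electrical power."""
    return watts * BTU_PER_HOUR_PER_WATT


def ac_capacity_btu_per_hour(tons):
    """Nominal cooling capacity of an A/C unit rated in tons."""
    return tons * BTU_PER_HOUR_PER_TON


def cooling_tons_needed(watts, headroom=1.0):
    """Tons of cooling for a given draw, scaled by an optional safety factor."""
    return heat_load_btu_per_hour(watts) * headroom / BTU_PER_HOUR_PER_TON


# The 3-ton unit from the message above.
print(ac_capacity_btu_per_hour(3))

# Hypothetical rack drawing 10 kW at maximum, with and without 2x headroom.
print(round(cooling_tons_needed(10_000), 2))
print(round(cooling_tons_needed(10_000, headroom=2.0), 2))
```

This counts only the equipment's own draw.  Lighting, people, and heat coming
through the walls are left out and would have to be added by whoever sizes
the actual unit.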
>>>
>>> I do have sprinkler fire protection, but that room is set to release its
>>> water supply independent of the other rooms. Also, supposedly, the fire
>>> sprinkler heads (whatever they're called) withstand considerably more
>>> heat than normal ones.  So, the reasoning goes, if it gets hot enough
>>> for those to go off, I have bigger problems than just water.  Thus, I
>>> have a fire safe nearby (in the same bldg...yeah, yeah, I know; off-site
>>> storage!) that holds my tapes, and will shortly hold a hardware
>>> inventory and admin password list on all my servers.
>>>
>>> If you want my list of vendors, send me an email off-list, or call my
>>> office, and I'll see if I can track down the DPOs for you.
>>>
>>> Thanks
>>>
>>> Jonathan Glass
>>>
>>> On Fri, 2004-04-09 at 17:35, Dow Hurst wrote:
>>>
>>>   
>>>
>>>> If I needed to take an existing space 400 square feet w/8' ceiling, 
>>>> 20'x20'x8', and add A/C and fire protection for a server room, what 
>>>> kind of cost would be incurred?  Sounds like an algebra problem from 
>>>> high school, doesn't it?  Let's say a full 84" rack of 4-CPU Opteron 
>>>> nodes and supporting hardware were in the room.  Does anyone have 
>>>> any ballpark figures they could throw out there?  Any links I could 
>>>> be pointed to?
>>>> Thanks a bunch,
>>>> Dow
>>>>
>>>>
>>>> PS.  I'd like some other type of fire protection than sprinkler 
>>>> heads. ;-)

-- 
__________________________________________________________
Dow Hurst                  Office: 770-499-3428            *
Systems Support Specialist    Fax: 770-423-6744            *
1000 Chastain Rd. Bldg. 12                                 *
Chemistry Department SC428  Email:   dhurst at kennesaw.edu   *
Kennesaw State University         Dow.Hurst at mindspring.com *
Kennesaw, GA 30144                                         *
************************************************************
This message (including any attachments) contains          *
confidential information intended for a specific individual*
and purpose, and is protected by law.  If you are not the  *
intended recipient, you should delete this message and are *
hereby notified that any disclosure, copying, distribution *
of this message, or the taking of any action based on it,  *
is strictly prohibited.                                    *
************************************************************


