[ale] shared research server help

DJ-Pfulio DJPfulio at jdpfu.com
Thu Oct 5 08:26:31 EDT 2017


I use Task Spooler to manage batch workloads, but I don't know how
to force other users to use it.
https://www.linux.com/news/queuing-tasks-batch-execution-task-spooler
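
On the RAM cap asked about in the quoted message below, here is a rough
Python sketch of one related approach, assuming Linux. It caps the address
space of a single launched process with RLIMIT_AS through the standard
resource module. This is a per-process limit, not a total across all of a
user's processes, so it only partly addresses the question. The names
run_capped and half_of_ram are made up for this illustration.

```python
import os
import resource
import subprocess
import sys


def half_of_ram():
    """Return 50% of physical RAM in bytes (the figure floated in the thread)."""
    total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return total // 2


def run_capped(cmd, max_bytes, **kwargs):
    """Run cmd with its virtual address space capped at max_bytes.

    Assumption: Linux, where RLIMIT_AS is enforced. The limit applies to
    each process separately; child processes inherit it, but their usage
    is not summed.
    """
    def set_limit():
        # Executed in the child between fork and exec.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=set_limit, **kwargs)


if __name__ == "__main__":
    # Hypothetical usage: run the given command under a 50% cap.
    result = run_capped(sys.argv[1:], half_of_ram())
    sys.exit(result.returncode)
```

For example, run_capped(["python3", "train.py"], half_of_ram()) would start
a hypothetical train.py whose allocations begin to fail once its address
space passes half of physical RAM, instead of pushing the whole machine
into swap.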


On 10/05/2017 07:52 AM, Jim Kinney wrote:
> Back to the original issue:
> 
> A tool like torque or slurm is really your best solution to intensive
> shared resources. It prevents 2 big jobs from eating the same machine
> and can also encourage users to code better to manage resources better
> so they can run more jobs.
> 
> I have the same problem. One heavy GPU machine (4 Tesla P100s) has only
> 64G of ram. A student tried to load 200+G of data into ram.
> 
> A few crashes later he can run 2 jobs at once, each eating only 30G of ram
> and one P100.
> 
> On October 4, 2017 6:32:32 PM EDT, Todor Fassl <fassl.tod at gmail.com> wrote:
> 
>     I manage a group of research servers for grad students at a university. 
>     The grad students use these machines to do the research for their Ph.D.
>     theses. The problem is that they pretty regularly kill off each other's
>     programs by using up all the ram. Most of the machines have 256G of ram.
>     One kid uses 200G and another 100G and one or the other, often both,
>     die. Sometimes they bring the machines down by hogging the cpu or using
>     up all the ram. Well, the machines never crash but they might as well be 
>     down.
> 
>     We really, really don't want to force them to use a scheduling system 
>     like slurm. They are just learning and they might run the same piece of
>     code 20 times in an hour.
> 
>     Is there a way to set a limit on the amount of ram all of a user's 
>     processes can use? If so, we were thinking of setting it at 50% of the 
>     on-board ram. Then it would take 3 students together to trash a machine. 
>     It might still happen but it would be a lot more infrequent.
> 
>     Any other suggestions? Anything at all? Just keep in mind that we really 
>     want to keep it easy for the students to play around.
> 
> 
> -- 
> Sent from my Android device with K-9 Mail. All tyopes are thumb related
> and reflect authenticity.
> 
> 


-- 
Got Linux? Used on smartphones, tablets, desktop computers, media
centers, and servers by kids, Moms, Dads, grandparents and IT
professionals.

