[ale] Ram, sigstop, swap, pids, etc

Jim Kinney jim.kinney at gmail.com
Mon Feb 8 14:13:55 EST 2021


Scenario: user Bob has a process running that may run for hours or days or weeks on a shared-use system.

Mary has a job that must run NOW, or within minutes of her requesting it, on the same machine, and it may run for hours to days.

Both jobs are memory hogs and can't run on the same host without swapping it to death. Both jobs will eat most if not all CPU cores.

In short, Bob's and Mary's jobs can't really share the machine. And Mary outranks Bob and gets priority.

I want to send Bob's job a SIGSTOP and let Mary's job run to completion. Then send a SIGCONT and Bob is back running.
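
A minimal sketch of that stop/continue step, assuming the admin knows the PID of Bob's job and runs this as root or as Bob. The function names pause and resume are made up for illustration. A job that forks workers would need the same signals sent to every process or to the process group (os.killpg), which this does not cover.

```python
import os
import signal


def pause(pid: int) -> None:
    """Suspend a process. SIGSTOP cannot be caught or ignored."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Let a stopped process carry on from where it was suspended."""
    os.kill(pid, signal.SIGCONT)
```

Usage would be pause(bob_pid) before Mary's job starts and resume(bob_pid) once it exits, where bob_pid is a placeholder for the real PID.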

Will the kernel move Bob's process from RAM to swap and back if it sits in STOP for a while (hours to days)? It's unknown how long after Mary's job starts it will eat all the RAM.
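
One way to watch what actually happens is to read the VmRSS and VmSwap fields from /proc/<pid>/status while Bob's job is stopped. This is only an observation sketch for Linux; the helper names parse_memory and memory_of are hypothetical, and it does not answer the question by itself.

```python
from pathlib import Path


def parse_memory(status_text: str) -> dict:
    """Extract VmRSS and VmSwap (both in kB) from /proc/<pid>/status text."""
    usage = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmRSS", "VmSwap"):
            # Value looks like "    2048 kB"; keep only the number.
            usage[key] = int(value.split()[0])
    return usage


def memory_of(pid: int) -> dict:
    """Read resident and swapped-out memory for a live process (Linux only)."""
    return parse_memory(Path(f"/proc/{pid}/status").read_text())
```

Calling memory_of(bob_pid) every few minutes after Mary's job starts would show whether resident memory shrinks and swap grows for the stopped process.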

Poor admin's checkpointing, without users changing their code. I'm looking at the freezer cgroup to move Bob's job into. That has an effect similar to a SIGSTOP, with memory being marked as swappable.
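
A rough sketch of the freezer idea, assuming a cgroup v2 hierarchy where a cgroup directory exposes cgroup.procs and cgroup.freeze. The directory name used in the usage note is hypothetical, and root is needed on a real system. On cgroup v1 the freezer controller uses a freezer.state file with FROZEN and THAWED instead.

```python
from pathlib import Path


def freeze(cgroup_dir: Path, pid: int) -> None:
    """Move a process into the cgroup, then freeze everything in it."""
    cgroup_dir.mkdir(parents=True, exist_ok=True)
    (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")
    (cgroup_dir / "cgroup.freeze").write_text("1\n")


def thaw(cgroup_dir: Path) -> None:
    """Unfreeze the cgroup so its processes run again."""
    (cgroup_dir / "cgroup.freeze").write_text("0\n")
```

For example, freeze(Path("/sys/fs/cgroup/bob_parked"), bob_pid) before Mary's job and thaw(Path("/sys/fs/cgroup/bob_parked")) afterwards, with both the cgroup name and bob_pid as placeholders. Child processes forked later stay in the same cgroup, so they are frozen along with the parent.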

If the box dies, they all lose out. Not a concern in this.

-- 
Computers amplify human error
Super computers are really cool