<p dir="ltr"><a href="http://unixbhaskar.blogspot.com/2014/02/how-to-fix-read-only-root-file-system.html?m=1">http://unixbhaskar.blogspot.com/2014/02/how-to-fix-read-only-root-file-system.html?m=1</a></p>
<p dir="ltr">Not a solution, but something to look at: fstab.</p>
<div class="gmail_quote">On Mar 28, 2016 11:57 AM, "Todor Fassl" &lt;<a href="mailto:fassl.tod@gmail.com">fassl.tod@gmail.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I have a mysterious problem with workstations in a shared-use environment. There are 2 labs in different buildings, one with 6 workstations and one with 8. These workstations are used by a group of about 30 grad student TAs. All are running Ubuntu 15.10. Authentication is via LDAP and home directories are mounted via NFS. Every day, 2 or 3 of the machines go down. The earliest symptom I can find is that the root filesystem is remounted read-only. Soon they stop responding to ssh and snmp and they are essentially locked up. They still respond to pings though.<br>
<br>
I've caught the machines in the period where the root system is read-only but I can still ssh to them. I've found that I cannot nfs mount home directories on our file server. I can mount nfs shares on other servers. And I can mount the same home directories if I go to another workstation. Restarting nfs on the file server has no effect.<br>
<br>
When I try to mount a home directory on an affected machine, the mount just hangs. I ran it with strace and it just showed it was waiting -- for what, I'm not sure and I don't have a screen cap available at the moment. I put a packet sniffer on the server and it showed it received a single packet from the client and that's it.<br>
<br>
There is nothing in the logs on the client. In fact, they simply stop at some point in the process. At first I attributed this to the root filesystem being read-only, but the logs still stop even after I moved /var to a separate file system. At some point it just stops writing records to the syslog, but I don't know if that happens before or after the root filesystem is remounted read-only.<br>
<br>
Many of the TAs also have identical workstations in their offices. None of those machines seem to have this problem. The TAs do tend to walk away from the workstations without logging out. But I wrote a script to kill off their sessions and it didn't help. I had it send me an email whenever it killed somebody's session, and the failures don't seem to be correlated with that. In other words, sometimes a machine goes down even if everyone who has used it has remembered to log out.<br>
<br>
I'm pretty desperate. Any ideas?<br>
<br>
</blockquote></div>