[ale] klogd and syslogd hanging

Dow Hurst Dow.Hurst at mindspring.com
Tue Jun 14 02:26:55 EDT 2005


I believe that commenting out this line in syslog.conf:

#kern.warning;*.err;authpriv.none        /dev/tty10

has fixed the problem.  Now only klogd is printing to tty10, as set by 
its init script.  Thanks for the help Doug!
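
For anyone checking their own config, here is a minimal sketch in Python of
how to list the active rules that write to a given target.  The function
name rules_writing_to and the sample text are made up for illustration (only
the tty10 rule comes from this thread), and it assumes the simple
"selector, whitespace, action" layout of a sysklogd-style config.

```python
def rules_writing_to(conf_text, target="/dev/tty10"):
    """Return (line_number, selector) pairs for active rules whose action is target."""
    hits = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out rule
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # not a "selector action" pair
        selector, action = parts
        # A leading "-" on a file action only disables syncing; ignore it here.
        if action.strip().lstrip("-") == target:
            hits.append((number, selector))
    return hits


# Hypothetical sample: the third rule is an assumption, not from the real file.
sample = """\
kern.warning;*.err;authpriv.none        /dev/tty10
#kern.warning;*.err;authpriv.none        /dev/tty10
*.*;mail.none;news.none                 -/var/log/messages
"""

print(rules_writing_to(sample))
# [(1, 'kern.warning;*.err;authpriv.none')]
```

Only the uncommented rule is reported, so an empty list after the edit would
mean syslogd is no longer configured to write to tty10.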
Dow



Doug McNash wrote:

>On Wed, 8 Jun 2005, Dow_Hurst wrote:
>
>Looking at the kernel doc devices.txt I see now that tty10 is a "virtual 
>console", whatever that is.  I didn't pick that up reading your original 
>message so it can't be related to physical serial port problems.  
>Nevertheless it does go thru the serial port driver (major number 4) and 
>the pty code also uses tty_io.c but thru another driver entry point (I 
>think, it's been a while since I did tty driver work.)  I believe a virtual
>console is kinda like a pty with one endpoint instead of two (read and 
>write ends.)
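
The major/minor numbers can be checked from user space.  A minimal sketch in
Python, assuming the standard Linux allocation from devices.txt (major 4:
minor 0 is the current virtual console, 1-63 are virtual consoles, 64 and up
are serial ports; majors 2 and 3 are BSD pty masters and slaves; 136-143 are
Unix98 pty slaves).  The names classify_tty and classify_path are
hypothetical helpers, not part of any existing tool.

```python
import os
import stat


def classify_tty(major, minor):
    """Map a character device's major/minor pair to a rough terminal type."""
    if major == 4:
        if minor == 0:
            return "current virtual console"
        if minor <= 63:
            return "virtual console"
        return "serial port"
    if major == 2:
        return "BSD pty master"
    if major == 3:
        return "BSD pty slave"
    if 136 <= major <= 143:
        return "Unix98 pty slave"
    return "other"


def classify_path(path):
    """Stat a device node and classify it by its device numbers."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        return "not a character device"
    return classify_tty(os.major(st.st_rdev), os.minor(st.st_rdev))


print(classify_tty(4, 10))   # virtual console
print(classify_tty(4, 64))   # serial port

# Only meaningful on a machine that actually has the device node.
if os.path.exists("/dev/tty10"):
    print(classify_path("/dev/tty10"))
```

By this numbering /dev/tty10 (major 4, minor 10) comes out as a virtual
console rather than a pty or a real serial port.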
>
>None of the above helps with your problem...
>
>The section of code that prints your kernel message is the part of the 
>code that makes sure everyone that has the device open has closed it and 
>that all the output has drained and so on.  The problem appears to be that 
>it logs its message to the very same device, so it gets trapped in a 
>positive feedback loop.  printk will write to the console independently of 
>syslog.conf.  (The author of tty_io.c should NOT be writing to the 
>console at this point in the code IMHO but what else can they do?)
>
>There appears to be some control of printk in sysctl and thru 
>/proc/sys/kernel/printk, /proc/sys/kernel/printk_ratelimit, and 
>/proc/sys/kernel/printk_ratelimit_burst.  You might explore changing 
>those to not log warnings or put on more restrictive limits.  I have run 
>out of time at the moment or I would explore it further.
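
As a starting point for exploring those settings, here is a minimal sketch
in Python that interprets the contents of /proc/sys/kernel/printk.  It
assumes the usual four-field layout (console loglevel, default message
loglevel, minimum console loglevel, default console loglevel) and that the
release_dev message is logged at warning priority (4), as the suggestion
above implies.  The names PrintkLevels, parse_printk and reaches_console,
and the sample values, are made up for illustration.

```python
from collections import namedtuple

PrintkLevels = namedtuple(
    "PrintkLevels",
    "console default_message minimum_console default_console",
)

KERN_WARNING = 4  # syslog priority number for warnings


def parse_printk(text):
    """Parse the whitespace-separated integers from /proc/sys/kernel/printk."""
    fields = text.split()[:4]
    return PrintkLevels(*(int(field) for field in fields))


def reaches_console(levels, priority):
    """A message is printed on the console if its priority number is
    lower (more urgent) than the console loglevel."""
    return priority < levels.console


# Hypothetical contents; on a real system read the file instead:
#   with open("/proc/sys/kernel/printk") as f: text = f.read()
sample = "7\t4\t1\t7\n"
levels = parse_printk(sample)

print(levels.console)                                         # 7
print(reaches_console(levels, KERN_WARNING))                  # True
print(reaches_console(parse_printk("4 4 1 7"), KERN_WARNING)) # False
```

With these sample values, dropping the first field to 4 would keep
warnings off the console while still letting errors and worse through.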
>
>Good Luck.
>
>
>  
>
>>Doug, thanks for the ideas!  I played with stty a bit and then tried sync and init 0.  I was able to get the machine shut down and restarted with syslogd not configured to output to /dev/tty10.  To my surprise, I was still getting messages on tty10!  I searched thru the startup scripts and found that in boot.klog there is a section that uses /usr/sbin/klogconsole to direct the kernel to send printk messages to /dev/tty10.  So, I'll watch the machine and see if syslog hangs, klog hangs, or nothing.
>>
>>How do I find out if it is a pty or a tty?  I had thought that a pty was a virtual terminal and a tty was a real serial port.  What is confusing is that the output is seen on virtual terminal 10 by switching with Alt-F10.  So does that mean that it is a pty assigned to a virtual terminal?  Is /dev/tty10 actually treated as a pty by the kernel since the hardware for a real serial port isn't there in this machine?  How do you track these assignments?  Can you find them in /proc?
>>
>>Thanks,
>>Dow
>>
>>    
>>
>
>  
>
>>>release_dev: tty10: read/write wait queue active!
>>>
>>>Does anyone know what this means?  The syslogd daemon is running on the CPU and messages are no longer being logged to /var/log/messages.  I've tried cycling syslogd with /etc/init.d/syslog restart/stop/start.  I've signaled syslogd and klogd with different signals such as HUP, TERM, and KILL.  I've only truly recovered what seems to be normal operation with a reboot.  Two issues come to mind:
>>>
>>>1.  syslogd can choke with old compatibility libs that don't format the messages to syslogd correctly.
>>>
>>>2.  Is the tty10 not released due to its current state when syslogd is killed and that is why I am having to reboot to regain control?  Is there a way to first flush tty10 and regain control of it directly?
>>>
>>>
>>>I can try running syslogd in debug mode.  This whole issue cropped up in mid April.  I just noticed the CPU was under load (this is a lightly loaded production fileserver).  I used top and saw syslogd and klogd showing a significant percentage of the CPU so I started investigating.  Ended up frustrated and having to reboot.  Now I've had a second occurrence so after more investigation on my own, I ended up rebooting again.  I saw a third occurrence just 1 day later.  I don't want to have to reboot.  I would like to fix this!
>>>
>>>Now that I've got a real error message to deal with, I am going to do some Googling.  
>>>Thanks for your thoughts,
>>>Dow
>>>
>>>
>>>No sig.
>>>_______________________________________________
>>>Ale mailing list
>>>Ale at ale.org
>>>http://www.ale.org/mailman/listinfo/ale
>>>
>>>      
>>>


