<div dir="ltr">One of the fun parts of temp monitoring is when the sensors must be calibrated. Most chips "know" the scale factors, but some are off a bit, so the driver makes the correction. On a Linux system, you can feed a bunch of scale-factor params to lm_sensors at startup. Tyan used to provide the lm_sensors data they had tested for best accuracy on their boards. Not sure if other makers do or not.<br>
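<br>
To make the scale-factor idea concrete, here is a minimal sketch of a linear correction (scale, then offset) applied to a raw reading. The helper name apply_calibration and the numbers used with it are hypothetical and not taken from any real board. The libsensors configuration format has a "compute" statement for this kind of adjustment; see the sensors.conf man page for the exact syntax.<br>
<pre>
```shell
# Hypothetical helper: apply a linear correction to a raw sensor reading.
# Usage: apply_calibration RAW SCALE OFFSET
# The scale and offset are illustrative; real values come from the board maker
# or from comparing against a trusted thermometer.
apply_calibration() {
    raw="$1"
    scale="$2"
    offset="$3"
    # awk does the floating-point math: corrected = raw * scale + offset
    awk -v r="$raw" -v s="$scale" -v o="$offset" 'BEGIN { printf "%.1f\n", r * s + o }'
}
```
</pre>
For example, apply_calibration 70 1.05 -2 prints 71.5.<br>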
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Apr 21, 2013 at 12:38 AM, Ron Frazier (ALE) <span dir="ltr">&lt;<a href="mailto:atllinuxenthinfo@techstarship.com" target="_blank">atllinuxenthinfo@techstarship.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br>
<br>
The topic of monitoring temperatures in a PC comes up here periodically. As I mentioned in other threads, I've been working with graphics cards on a Mint installation for cryptocurrency computations. As you may know from my previous posts, I've always wanted to keep an eye on the status of my systems. In the process of working with this project, I've discovered a number of interesting pieces of information that I thought I'd share.<br>
<br>
Take a look at this image:<br>
<br>
<a href="https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png" target="_blank">https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png</a><br>
<br>
This shows a part of my screen on my Mint system. Note my Gnome panel at the top with a temperature monitor on it. This is the hardware monitor widget that is available in Gnome. However, when I installed the ATI / AMD graphics drivers, the sensor system was no longer able to monitor the cpu. After a bit of googling, I was directed to lm-sensors. Many of you are already aware of that. I tried this command.<br>
<br>
--> sudo apt-get install lm-sensors<br>
<br>
I found that it was already installed.<br>
<br>
I then found and issued these two commands to reinitialize the system.<br>
<br>
--> sudo sensors-detect<br>
<br>
I accepted the defaults here then told it to save the changes.<br>
<br>
--> sudo service module-init-tools start<br>
<br>
I think that allowed the changes to take effect without a reboot.<br>
<br>
This allowed the sensor system to work again, and my panel widgets to read both the cpu temperature and the hard drive temperatures as shown in the image.<br>
<br>
You can use this command to read the sensors once in a terminal window.<br>
<br>
--> sensors<br>
<br>
This command will read the sensors every few seconds and display the results continuously.<br>
<br>
--> watch sensors<br>
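<br>
For logging or alerting, the raw output mode (sensors -u) is easier to parse than the default display. The following is a minimal sketch, assuming the raw output contains lines shaped like "temp1_input: 45.000". The helper name max_temp is hypothetical.<br>
<pre>
```shell
# Hypothetical helper: read `sensors -u` output on stdin and print the highest
# tempN_input value found. Assumes lines of the form "  temp1_input: 45.000".
# Prints 0 when no temperature line is present.
max_temp() {
    awk '/temp[0-9]+_input:/ { if ($2 + 0 > max) max = $2 + 0 } END { print max + 0 }'
}
```
</pre>
It would be used as: sensors -u | max_temp<br>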
<br>
I searched for a while to find a utility to read the gpu temperatures. I found nothing for a while. Then I discovered that it's built into the ATI / AMD driver. I don't know how to do this with nvidia cards.<br>
<br>
The following command will read the clock speed and load on the first gpu.<br>
<br>
--> aticonfig --adapter=0 --od-getclocks<br>
<br>
The following command will read and display the results continuously.<br>
<br>
--> watch aticonfig --adapter=0 --od-getclocks<br>
<br>
The following command will read the temperature of the first gpu.<br>
<br>
--> aticonfig --adapter=0 --odgt<br>
<br>
The following command will read and display the results continuously.<br>
<br>
--> watch aticonfig --adapter=0 --odgt<br>
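<br>
The temperature reading can also be pulled out of the aticonfig output and compared against a limit. This is only a sketch: it assumes the output includes a line shaped like "Sensor 0: Temperature - 73.00 C", and the helper names gpu_temp and temp_ok are hypothetical.<br>
<pre>
```shell
# Hypothetical helper: read aticonfig --odgt output on stdin and print the
# first temperature value. Assumes a line shaped like
#   "Sensor 0: Temperature - 73.00 C"
gpu_temp() {
    awk '/Temperature/ { print $(NF - 1); exit }'
}

# Hypothetical helper: succeed (exit status 0) when the temperature in $1 is
# at or below the limit in $2, and fail (exit status 1) when it is above.
temp_ok() {
    awk -v t="$1" -v lim="$2" 'BEGIN { if (t + 0 > lim + 0) exit 1; exit 0 }'
}
```
</pre>
It would be used as: temp_ok "$(aticonfig --adapter=0 --odgt | gpu_temp)" 80 || echo "GPU 0 is above the limit". The limit of 80 is an arbitrary placeholder, not a recommended value.<br>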
<br>
Once I found this out, I modified my mining program to add a temperature status window for each gpu so I could keep an eye on the temperature. This script file shows how I did it.<br>
<br>
<a href="https://dl.dropboxusercontent.com/u/9879631/start-miners" target="_blank">https://dl.dropboxusercontent.com/u/9879631/start-miners</a><br>
<br>
If you look at these images, you'll see something very interesting that I discovered. The first one is the same as the one mentioned above, including the temperature readings of the GPUs on my Mint machine. The second is an image of the temperature readings of the GPUs on my Windows machine.<br>
<br>
<a href="https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png" target="_blank">https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png</a><br>
<a href="https://dl.dropboxusercontent.com/u/9879631/sensors-sample2.png" target="_blank">https://dl.dropboxusercontent.com/u/9879631/sensors-sample2.png</a><br>
<br>
All the GPUs are being run at close to 100% load, and the cases of both computers are well ventilated with multiple fans.<br>
<br>
Look at the Miner 1 temperature window in image 1. This is an MSI 7850 gpu running in the Mint machine. It's running at 73 deg C.<br>
<br>
Now, look at the right hand window in image 2. This is an IDENTICAL MSI 7850 gpu running in the Windows machine. It's running at 62 deg C.<br>
<br>
Like I said, they're identical cards running in almost identical conditions. So why is one running 11 degrees hotter than the other?<br>
<br>
This was puzzling me for a while but I think I've figured it out.<br>
<br>
In the Linux machine, the MSI card is the TOP one in the chassis. That means its intake fan is right next to the 2nd gpu, with only about 1/8" of space between. So, its air flow is very restricted. That's the card that's running hotter.<br>
<br>
In the Windows machine, the MSI card is the SECOND one in the chassis. It has several inches of air gap to the next object. It's the one that is running cooler.<br>
<br>
Now look at each image and compare the readings for each card within the same computer.<br>
<br>
In image 1, the Mint machine, Miner 1, the top card, is at 73 deg C. Miner 2, the bottom card, is at 57 deg C.<br>
<br>
In image 2, the Windows machine, the left window is an Asus 7850 card, and is the top card. It's at 75 deg C. The right window, the MSI card, is in the bottom slot. It's running at 62 deg C.<br>
<br>
So, in one case, the top card is running 16 degrees hotter. In the other case, the top card is running 13 degrees hotter.<br>
<br>
Based on this, I am convinced that any gpu or other card with its own fan on the side will run substantially hotter than its baseline temperature if it's next to another card.<br>
<br>
I'm not quite sure what to do about it. I think 75 deg C is OK, but not great. For what it's worth, I think my AMD CPUs are rated at about 67 deg C. Apparently, the GPUs have more tolerance. You can see in image 2 that the fans on the GPUs in the Windows system are only running at about 40% of their max, assuming that GPU-Z is reading them right. So, maybe the card is not too unhappy. But, it may mean the card would be pushed over its thermal limits much faster if a case fan fails, or if the room ambient temperature rises too much.<br>
<br>
Anyway, I found this fascinating. I guess I'll just have to keep a close eye on any PCI-E cards with fans which are jammed up against other cards.<br>
<br>
PS I think I was monitoring the wrong temperature for CPU on my desktop machine for years. The MSI motherboards have a 2 digit LED display on the board which shows POST codes and then temperature once the machine is running. I was monitoring the sensor that matched that reading. When I ran the AMD Overdrive utility, it came up with a different, lower, number for CPU temperature, so I started monitoring that instead. I still don't know exactly which temperature the motherboard display is monitoring.<br>
<br>
PPS I took some of the text in this email from the Linux machine to the Windows machine to write the email. When I tried to open it up in Notepad, I just got one long line of text with no breaks, since Windows uses different line endings (CRLF rather than LF). However, I found out that I could open it in Wordpad and it worked OK. Then, I could copy it into this email.<br>
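<br>
Another way around the Notepad problem is to convert the line endings before moving the file. A minimal sketch, using a hypothetical helper named to_crlf that reads text on stdin:<br>
<pre>
```shell
# Hypothetical helper: convert Unix (LF) line endings on stdin to Windows
# (CRLF) line endings. Any carriage return already at the end of a line is
# removed first, so running it twice does not double them up.
to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```
</pre>
It would be used as: cat notes.txt | to_crlf > notes-windows.txt, where both file names are placeholders. The unix2dos utility does the same job where it is installed.<br>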
<br>
Let me know what your experiences have been monitoring and controlling temperature.<br>
<br>
Hope this is helpful.<br>
<br>
Sincerely,<br>
<br>
Ron<br>
<br>
<br>
-- <br>
<br>
(PS - If you email me and don't get a quick response, you might want to<br>
call on the phone. I get about 300 emails per day from alternate energy<br>
mailing lists and such. I don't always see new email messages very quickly.)<br>
<br>
Ron Frazier<br>
<a href="tel:770-205-9422" value="+17702059422" target="_blank">770-205-9422</a> (O) Leave a message.<br>
linuxdude AT <a href="http://techstarship.com" target="_blank">techstarship.com</a><br>
Litecoin: LZzAJu9rZEWzALxDhAHnWLRvybVAVgwTh3<br>
Bitcoin: 15s3aLVsxm8EuQvT8gUDw3RWqvuY9hPGUU<br>
<br>
_______________________________________________<br>
Ale mailing list<br>
<a href="mailto:Ale@ale.org" target="_blank">Ale@ale.org</a><br>
<a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
See JOBS, ANNOUNCE and SCHOOLS lists at<br>
<a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
</blockquote></div><br><br clear="all"><br>-- <br>James P. Kinney III<br><br>Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.<br>
- Speech 11/23/1900 Mark Twain<br><br><a href="http://electjimkinney.org" target="_blank">http://electjimkinney.org</a><br><a href="http://heretothereideas.blogspot.com/" target="_blank">http://heretothereideas.blogspot.com/</a><br>
</div>