[ale] fascinating data on temperature, including ATI / AMD Radeon gpu
Ron Frazier (ALE)
atllinuxenthinfo at techstarship.com
Sun Apr 21 00:38:17 EDT 2013
Hi all,
The topic of monitoring temperatures in a PC comes up here
periodically. As I mentioned in other threads, I've been working with
graphics cards on a Mint installation for cryptocurrency computations.
As you may know from my previous posts, I've always wanted to keep an
eye on the status of my systems. In the process of working with this
project, I've discovered a number of interesting pieces of information
that I thought I'd share.
Take a look at this image:
https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png
This shows a part of my screen on my Mint system. Note my Gnome panel
at the top with a temperature monitor on it. This is the hardware
monitor widget that is available in Gnome. However, when I installed
the ATI / AMD graphics drivers, the sensor system was no longer able to
monitor the cpu. After a bit of googling, I was directed to
lm-sensors. Many of you are already aware of that. I tried this command.
--> sudo apt-get install lm-sensors
I found that it was already installed.
I then found and issued these two commands to reinitialize the system.
--> sudo sensors-detect
I accepted the defaults here then told it to save the changes.
--> sudo service module-init-tools start
I think that allowed the changes to take effect without a reboot.
This allowed the sensor system to work again, and my panel widgets to
read both the cpu temperature and the hard drive temperatures as shown
in the image.
You can use this command to read the sensors once in a terminal window.
--> sensors
This command will re-read the sensors every 2 seconds (the default
interval for watch) and display the results continuously.
--> watch sensors
I searched for a while for a utility to read the gpu temperatures and
found nothing. Then I discovered that it's built into the
ATI / AMD driver. I don't know how to do this with nvidia cards.
The following command will read the clock speed and load on the first gpu.
--> aticonfig --adapter=0 --od-getclocks
The following command will read and display the results continuously.
--> watch aticonfig --adapter=0 --od-getclocks
The following command will read the temperature of the first gpu.
--> aticonfig --adapter=0 --odgt
The following command will read and display the results continuously.
--> watch aticonfig --adapter=0 --odgt
Once I found this out, I modified my mining program to add a temperature
status window for each gpu so I could keep an eye on the temperature.
This script file shows how I did it.
https://dl.dropboxusercontent.com/u/9879631/start-miners
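Separately from that script, here is a minimal sketch of pulling just
the number out of the --odgt output, which is handy for logging. It
assumes the output contains a line like "Sensor 0: Temperature - 73.00 C";
the function name gpu_temp is made up for illustration and is not part
of the driver.

```shell
# Extract the numeric temperature from aticonfig --odgt style text on stdin.
# Assumption: the reading is on a line such as
#   "Sensor 0: Temperature - 73.00 C"
# so the value is the second-to-last whitespace-separated field.
gpu_temp() {
    awk '/Temperature/ { print $(NF-1); exit }'
}

# Hypothetical usage on a machine with the ATI / AMD driver installed:
#   aticonfig --adapter=0 --odgt | gpu_temp
```

If your driver version prints the line differently, the awk pattern
would need adjusting.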
I also discovered something very interesting, which you can see in these
two images. The first one is the same as the one mentioned above,
including the temperature readings of the GPUs on my Mint machine. The
second is an image of the temperature readings of the GPUs on my
Windows machine.
https://dl.dropboxusercontent.com/u/9879631/sensors-sample1.png
https://dl.dropboxusercontent.com/u/9879631/sensors-sample2.png
All the gpus are being run at close to 100% load, and the cases of both
computers are well ventilated with multiple fans.
Look at the Miner 1 temperature window in image 1. This is an MSI 7850
gpu running in the Mint machine. It's running at 73 deg C.
Now, look at the right hand window in image 2. This is an IDENTICAL MSI
7850 gpu running in the Windows machine. It's running at 62 deg C.
Like I said, they're identical cards running in almost identical
conditions. So why is one running 11 degrees hotter than the other?
This was puzzling me for a while but I think I've figured it out.
In the Linux machine, the MSI card is the TOP one in the chassis.
That means its intake fan is right next to the 2nd gpu, with only about
1/8" of space between. So, its air flow is very restricted. That's
the card that's running hotter.
In the Windows machine, the MSI card is the SECOND one in the chassis.
It has several inches of air gap to the next object. It's the one that
is running cooler.
Now look at each image and compare the readings for each card within the
same computer.
In image 1, the Mint machine, Miner 1, the top card, is at 73 deg C.
Miner 2, the bottom card, is at 57 deg C.
In image 2, the Windows machine, the left window is an Asus 7850 card,
and is the top card. It's at 75 deg C. The right window, the MSI card,
is in the bottom slot. It's running at 62 deg C.
So, in one case, the top card is running 16 degrees hotter. In the
other case, the top card is running 13 degrees hotter.
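To make the arithmetic explicit, here is a small sketch that recomputes
the differences from the readings quoted above. The variable names and
the delta helper are just for illustration.

```shell
# Temperatures in deg C as read from the two screenshots.
mint_top=73; mint_bottom=57   # Mint: MSI 7850 (top), second card (bottom)
win_top=75;  win_bottom=62    # Windows: Asus 7850 (top), MSI 7850 (bottom)

# Difference between two whole-number temperatures.
delta() { echo $(( $1 - $2 )); }

echo "Mint, top vs bottom:       $(delta "$mint_top" "$mint_bottom") deg C"
echo "Windows, top vs bottom:    $(delta "$win_top" "$win_bottom") deg C"
echo "MSI (top) vs MSI (bottom): $(delta "$mint_top" "$win_bottom") deg C"
```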
Based on this, I am convinced that any gpu or other card with its own
fan on the side will run substantially hotter than its baseline
temperature if it's next to another card.
I'm not quite sure what to do about it. I think 75 deg C is OK, but not
great. For what it's worth, I think my AMD cpus are rated at about 67
deg C. Apparently, the gpus have more tolerance. You can see in image
2 that the fans on the gpus in the Windows system are only running at
about 40% of their max, assuming that GPU-Z is reading them right. So,
maybe the card is not too unhappy. But, it may mean the card would be
pushed over its thermal limits much faster if a case fan fails, or if
the room ambient temperature rises too much.
Anyway, I found this fascinating. I guess I'll just have to keep a
close eye on any PCI-E cards with fans which are jammed up against other
cards.
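One way to keep that close eye without staring at a window would be a
simple threshold check. This is only a sketch: check_temp is a made-up
name, and the limit you pass in is your own choice, not a figure from
AMD or the card makers.

```shell
# Print WARN when a temperature (deg C) is at or above a limit, else OK.
# awk does the comparison because the shell cannot compare decimals itself.
check_temp() {
    awk -v t="$1" -v limit="$2" \
        'BEGIN { if (t + 0 >= limit + 0) print "WARN"; else print "OK" }'
}

# Hypothetical usage (80 is an arbitrary example limit, and the awk
# pattern assumes a line like "Sensor 0: Temperature - 73.00 C"):
#   t=$(aticonfig --adapter=0 --odgt | awk '/Temperature/ { print $(NF-1); exit }')
#   check_temp "$t" 80
```

Run from a loop or from cron, something like this could send a
notification when a card creeps up toward its limit.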
PS I think I was monitoring the wrong temperature for CPU on my desktop
machine for years. The MSI motherboards have a 2 digit led display on
the board which monitors post codes and then temperature once the
machine is running. I was monitoring the sensor that matched that
reading. When I ran the AMD Overdrive utility, it came up with a
different, lower, number for CPU temperature, so I started monitoring
that instead. I'm now not sure exactly which temperature the
motherboard display is monitoring.
PPS I carried some of the text in this email from the Linux machine over
to the Windows machine to write the email. When I tried to open it in
Notepad, I just got one long line of text with no breaks, since Windows
ends lines with CR+LF and Linux uses LF alone. However, I found out that
I could open it in Wordpad and it worked OK. Then, I could copy it into
this email.
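For what it's worth, the conversion could also be done on the Linux side
before moving the file. Here is a minimal sketch, assuming awk is
available; lf_to_crlf is a name made up for illustration.

```shell
# Convert Unix (LF) line endings on stdin to Windows (CR+LF) on stdout.
# Any CR already at the end of a line is stripped first, so running the
# text through twice does not double up the CRs.
lf_to_crlf() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Hypothetical usage (file names are examples only):
#   lf_to_crlf < notes.txt > notes-windows.txt
```

The unix2dos utility does the same job if it is installed.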
Let me know what your experiences have been monitoring and controlling
temperature.
Hope this is helpful.
Sincerely,
Ron
--
(PS - If you email me and don't get a quick response, you might want to
call on the phone. I get about 300 emails per day from alternate energy
mailing lists and such. I don't always see new email messages very quickly.)
Ron Frazier
770-205-9422 (O) Leave a message.
linuxdude AT techstarship.com
Litecoin: LZzAJu9rZEWzALxDhAHnWLRvybVAVgwTh3
Bitcoin: 15s3aLVsxm8EuQvT8gUDw3RWqvuY9hPGUU