> coretemp and Core i7-980X
Posted by prox, from Charlotte, on May 09, 2010 at 20:01 local (server) time

As I mentioned recently, I'm now running an Intel Core i7-980X processor (6x physical cores, each capable of executing 2x threads via Intel HT, abstracted by the OS as 12x CPUs) in my workstation.  It's running great, but I've been lacking the ability to view CPU temperature readings from the operating system (Debian GNU/Linux x86_64 with a slightly modified kernel 2.6.32).

None of the modules for lm-sensors picked up anything on my DX58SO board that would report temperature.  The ACPI thermal zone module didn't pick up a thermal zone, and the coretemp module in the Linux kernel whined that it didn't support the CPU model:

[462200.646777] coretemp: Unknown CPU model 2c
[462200.646780] coretemp: Unknown CPU model 2c
[462200.646782] coretemp: Unknown CPU model 2c
[462200.646783] coretemp: Unknown CPU model 2c
[462200.646785] coretemp: Unknown CPU model 2c
[462200.646786] coretemp: Unknown CPU model 2c
[462200.646787] coretemp: Unknown CPU model 2c
[462200.646789] coretemp: Unknown CPU model 2c
[462200.646790] coretemp: Unknown CPU model 2c
[462200.646792] coretemp: Unknown CPU model 2c
[462200.646793] coretemp: Unknown CPU model 2c
[462200.646795] coretemp: Unknown CPU model 2c

Yes, it apparently went through every virtual CPU (or should we call it a thread?  I don't even know anymore!).
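
For context, the "2c" in that message is the CPU's model number in hexadecimal.  Below is a minimal sketch of how a family/model pair is derived from the CPUID leaf 1 signature, following Intel's documented encoding.  The struct and function names are made up for illustration, and 0x000206C2 is only assumed here to be a representative Gulftown signature.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of any kernel API. */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID leaf 1 into family/model/stepping. */
static struct cpu_sig decode_signature(uint32_t eax)
{
    struct cpu_sig s;
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;
    unsigned ext_model   = (eax >> 16) & 0xf;

    s.stepping = eax & 0xf;

    /* The extended family is only added when the base family is 0xf. */
    s.family = base_family;
    if (base_family == 0xf)
        s.family += ext_family;

    /* The extended model supplies the high nibble for families 6 and 0xf. */
    s.model = base_model;
    if (base_family == 0x6 || base_family == 0xf)
        s.model |= ext_model << 4;

    return s;
}
```

Under that assumption, decode_signature(0x000206C2) gives family 6 and model 0x2c, which is the value the module complained about.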

I took a guess that the interface didn't change too much between Gulftown and the original Bloomfield line of processors, so I took a peek at coretemp.c, and added in the new model code:

--- coretemp-orig.c 2010-05-09 19:43:06.000000000 -0400
+++ coretemp.c 2010-05-09 19:43:12.000000000 -0400
@@ -455,12 +455,12 @@
          /* check if family 6, models 0xe (Pentium M DC),
            0xf (Core 2 DC 65nm), 0x16 (Core 2 SC 65nm),
            0x17 (Penryn 45nm), 0x1a (Nehalem), 0x1c (Atom),
-           0x1e (Lynnfield) */
+           0x1e (Lynnfield), 0x2c (Gulftown) */
          if ((c->cpuid_level < 0) || (c->x86 != 0x6) ||
              !((c->x86_model == 0xe) || (c->x86_model == 0xf) ||
               (c->x86_model == 0x16) || (c->x86_model == 0x17) ||
               (c->x86_model == 0x1a) || (c->x86_model == 0x1c) ||
-              (c->x86_model == 0x1e))) {
+              (c->x86_model == 0x1e) || (c->x86_model == 0x2c))) {

               /* supported CPU not found, but report the unknown
                  family 6 CPU */
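
The chain of comparisons in that check grows with every new model.  The same test can be written as a table lookup.  This is only a sketch with hypothetical names (coretemp_models, model_supported), not how the kernel source is organized, and it leaves out the cpuid_level check:

```c
#include <stdbool.h>
#include <stddef.h>

/* Family 6 models accepted by the patched check, including 0x2c (Gulftown). */
static const unsigned coretemp_models[] = {
    0x0e, 0x0f, 0x16, 0x17, 0x1a, 0x1c, 0x1e, 0x2c
};

/* Return true if the family/model pair is in the table above. */
static bool model_supported(unsigned family, unsigned model)
{
    size_t count = sizeof(coretemp_models) / sizeof(coretemp_models[0]);

    if (family != 0x6)
        return false;

    for (size_t i = 0; i < count; i++) {
        if (coretemp_models[i] == model)
            return true;
    }
    return false;
}
```

With a table like this, supporting another model would be a one-entry change rather than another clause in the condition.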

I did a make-kpkg kernel-image to rebuild the kernel image the Debian way, and installed the new kernel package (I didn't bother rebooting, since only the module changed).  The module seemed to load fine, although I got some warnings in the kernel log buffer after it was loaded:

[463606.992361] coretemp coretemp.0: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992417] coretemp coretemp.1: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992449] coretemp coretemp.2: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992483] coretemp coretemp.3: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992516] coretemp coretemp.4: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992554] coretemp coretemp.5: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992592] coretemp coretemp.6: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992629] coretemp coretemp.7: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992665] coretemp coretemp.8: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992698] coretemp coretemp.9: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992734] coretemp coretemp.10: Unable to access MSR 0xEE, for Tjmax, left at default
[463606.992768] coretemp coretemp.11: Unable to access MSR 0xEE, for Tjmax, left at default

After refining my Google search terms a few times to avoid clothing stores, I learned that Tjmax apparently stands for the maximum junction temperature - essentially the highest die temperature the processor is rated for, at which it starts throttling to protect itself.  I figured that wouldn't affect current temperature readings, and looked at some of the values:

(destiny:19:49)% sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0:      +78.0°C  (high = +80.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +73.0°C  (high = +80.0°C, crit = +100.0°C)

[...]

coretemp-isa-000a
Adapter: ISA adapter
Core 10:     +70.0°C  (high = +80.0°C, crit = +100.0°C)

coretemp-isa-000b
Adapter: ISA adapter
Core 11:     +70.0°C  (high = +80.0°C, crit = +100.0°C)

Yikes!  12 values?  Apparently coretemp iterates through every CPU (as we saw in the original module loading error messages), regardless of whether it's a virtual CPU or not.  So, we apparently have 12 temperature values.  They all seem to vary by a degree or so, too.  And yes, I was doing some H.264 encoding at the time I took the snapshot above; normally the temperature is around 32°C.
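
One caveat on the Tjmax warning: the sensor does not report an absolute temperature.  The IA32_THERM_STATUS MSR holds a "degrees below Tjmax" counter in bits 22:16, and the reading is Tjmax minus that counter, so a wrong default Tjmax would shift every reading by the same offset.  Below is a minimal sketch of that arithmetic.  It assumes a default Tjmax of 100°C (which would match the crit value shown above), and the function name is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert the low 32 bits of IA32_THERM_STATUS into millidegrees Celsius.
 * Returns false when the "reading valid" bit (bit 31) is clear.
 * tjmax_mc is the assumed Tjmax in millidegrees, e.g. 100000 for 100 C.
 */
static bool therm_status_to_mc(uint32_t eax, int tjmax_mc, int *temp_mc)
{
    if (!(eax & 0x80000000u))
        return false;

    /* Bits 22:16 count whole degrees below Tjmax. */
    unsigned below = (eax >> 16) & 0x7f;

    *temp_mc = tjmax_mc - (int)below * 1000;
    return true;
}
```

With Tjmax taken as 100000 and a counter value of 22, this yields 78000, the same figure as the +78.0°C shown for Core 0.  If the real Tjmax were different, all 12 readings would be off by that difference.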

There's apparently a small discussion about this oddity, and how to correctly report CPU core temperatures on multicore processors with HT.  Right now, I'm still scratching my head about the fact that the virtual CPUs are all reporting slightly different temperature values.  I would have expected odd numbered CPUs to report NULL, or something.  Or maybe even just the same value as the previous CPU.
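
One way to collapse the readings down to one per physical core would be to group logical CPUs by core ID, which Linux exposes per CPU in /sys/devices/system/cpu/cpuN/topology/core_id.  Below is a minimal sketch of the grouping step only, with the core IDs passed in as an array.  The function name is hypothetical, and it assumes a single socket, since core IDs are only unique within a package.

```c
#include <stddef.h>

/*
 * core_ids[i] is the core ID of logical CPU i.  For every distinct core ID,
 * store the index of the first logical CPU that has it in out[], and return
 * how many distinct cores were found.  out[] must have room for n entries.
 */
static size_t first_cpu_per_core(const int *core_ids, size_t n, size_t *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        /* Has an earlier logical CPU already claimed this core ID? */
        for (size_t j = 0; j < i; j++) {
            if (core_ids[j] == core_ids[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[count++] = i;
    }
    return count;
}
```

Which logical CPU numbers end up as siblings depends on how the system enumerates them, so they are not necessarily the odd-numbered ones.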

Regardless, I'm now graphing all the threads / virtual CPUs!

Comment by Aimon on September 11, 2010 at 07:39 local (server) time

Hi, I have the same issue. I would like to recompile with your patch, but the coretemp.c from the rpm you reference does not have that code in it. I grepped it and the lm_sensors code: no 0x1e or Nehalem anywhere. Where did you get that coretemp.c?

Regards.

Comment by Mark Kamichoff [Website] on September 11, 2010 at 15:07 local (server) time

Hi Aimon, I recompiled from the Debian kernel source (kernel-source-2.6.32-2.6.32-9).  I think there's a newer version of it, now, though:

http://packages.debian.org/squeeze/linux-source-2.6.32

Hope that helps!

- Mark
