I haven't blogged in a while, so here are some updates.
I recently moved dax, which is the FreeBSD-based host that powers this web server, to a VM and away from Internap's Agile service. dax used to be a dedicated server hosted by Voxel dot net and provided IPv6 connectivity to the rest of my network since September of 2005. However, things started going downhill when Internap acquired them in 2012. Technical support turned into a horror story and they recently indicated to me (in an IPv6 support ticket from hell) that the Agile "legacy" services were going to be done away with later in 2016 (not sure if this is true).
I migrated all of my sites off the IPv6 PA /56 I had from Internap and onto my own PI /44. I moved dax to a VM on excalibur, a dedicated server hosted by Choopa, LLC. I'm now Internap-free and running everything off of AS395460.
Anyway, excalibur's got an Intel Xeon E3-1240 v2 CPU with 4 cores and should run at 3.40 GHz. Normally, on servers of mine that run VMs, I use cpufreq-set(1) to put all cores on the performance governor, which is supposed to make every core run at its maximum clock rate. However, with the E3 Xeon CPUs, this doesn't work. The clock frequency of each core oscillates based on load, which causes a considerable latency penalty that is fairly visible in packet forwarding (it causes jitter). Here's a snapshot:
(excalibur:2:12:EDT)% cat /proc/cpuinfo|grep -i MHz
cpu MHz : 1714.742
cpu MHz : 2652.000
cpu MHz : 3563.492
cpu MHz : 3799.898
cpu MHz : 3722.070
cpu MHz : 2126.195
cpu MHz : 3800.164
cpu MHz : 2313.062
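To put a number on the oscillation, a small helper can reduce that output to the lowest and highest reported clock. This is only a sketch: the function name mhz_spread and the optional file argument are my own, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print how many "cpu MHz" lines a cpuinfo-style file
# contains, plus the lowest and highest value among them.
# The file argument is an assumption for convenience; default is /proc/cpuinfo.
mhz_spread() {
    awk -F: '
        tolower($1) ~ /^cpu mhz/ {
            v = $2 + 0                      # numeric value after the colon
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END {
            if (n > 0) printf "cpus=%d min=%.3f max=%.3f\n", n, min, max
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Run against the snapshot above, this would report cpus=8 min=1714.742 max=3800.164, a spread of more than 2 GHz under the performance governor.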
Searching on the web indicated I would need to completely disable the C-states in the BIOS, something I wasn't really willing to do and that didn't feel like the correct solution. I then came across another post that indicated I should write "0" to /dev/cpu_dma_latency but keep the file open. So, I did this:
(excalibur:2:12:EDT)# cat > /dev/cpu_dma_latency
0
.. and didn't hit ^D. Sure enough:
(excalibur:2:12:EDT)% cat /proc/cpuinfo|grep -i MHz
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
cpu MHz : 3599.882
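The trick depends on the file staying open: as far as I can tell, the request only lasts as long as the file descriptor does. Rather than parking a cat in a terminal, the same thing can be sketched as a wrapper that holds the device open on fd 3 for the lifetime of some command. The function name with_dma_latency_zero is made up for illustration, and it writes the same "0" that I typed by hand above.

```shell
#!/bin/sh
# Hypothetical wrapper: hold the given device open on fd 3, write "0" to it,
# and run a command while the descriptor stays open. The descriptor is closed
# (and the request dropped) when the command returns.
# Usage sketch: with_dma_latency_zero /dev/cpu_dma_latency sleep 600
with_dma_latency_zero() {
    dev="$1"
    shift
    {
        printf '0\n' >&3    # same text as typing "0" into cat
        "$@"                # fd 3 remains open while this runs
    } 3> "$dev"
}
```

This needs root on the real device, and it has the same side effect described in the update below.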
Seriously? There's a whole nice structure in /sys/devices/system/cpu/cpufreq/* that Linux has used for over a decade to control CPU frequency scaling, and we now have a one-off /dev character device that controls such things on a modern CPU?
Well, considering the stupidity of systemd that's supposedly accepted by most Linux distributions now, I guess I shouldn't be surprised that this type of hacky interface exists. At least I've got a way to emulate the behavior of the performance governor without mucking with BIOS settings. It turns out this also works on the Xeon E3-1245 v3 I have at home in vega, too.
Update: This is a bad idea. This causes all CPUs to run very hot even when utilization is low, which confuses me.
Update: The right solution to this problem is to just disable the Intel P-states driver by passing intel_pstate=disable on the kernel command line. The acpi-cpufreq driver is used instead and operates the way it should. See comments for more information.
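A quick way to confirm the change took effect after a reboot is to check both the kernel command line and the driver the kernel reports. The following is a rough sketch; both function names are hypothetical, and the default sysfs path is the one from my kernel (the hierarchy differs a little between kernel versions).

```shell
#!/bin/sh
# Hypothetical check: succeed if intel_pstate=disable appears as a separate
# argument in a cmdline-style file (default: /proc/cmdline).
pstate_disabled_on_cmdline() {
    tr -s ' \t' '\n' < "${1:-/proc/cmdline}" | grep -qxF 'intel_pstate=disable'
}

# Hypothetical check: print the scaling driver in use for the first policy.
# The default path is an assumption taken from the listing in the comments.
active_scaling_driver() {
    cat "${1:-/sys/devices/system/cpu/cpufreq/policy0/scaling_driver}"
}
```

On a box where the fix is in place, I would expect the first check to succeed and the second to print acpi-cpufreq rather than intel_pstate.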
With "cat > /dev/cpu_dma_latency" I can also reproduce the heating issue, but the manual governor setting seems to work without a hiccup.
Unfortunately, that does not work on the E3 Xeons, which is why I started looking around to begin with. As mentioned above, I run all cores using the "performance" governor, which still results in clock speed oscillation.
Which kernel version are you using? My cpu is very similar, also a Xeon E3 as I've posted before.
root@xxx:~# uname -a
Linux xxx.xxx.se 4.1.27-ovpn-grsec #2 SMP Sat Jul 2 03:02:29 CEST 2016 x86_64 GNU/Linux
What if cpufreq-set(1) misbehaves? Did you try the fully manual method?
4.6.0-1-amd64. cpufreq-set(1) just writes "performance" to scaling_governor, from what I can tell. Doing it manually makes no difference (I tried). The hierarchy is a little different between kernel versions, but:
(excalibur:14:26:EDT)% ls -la /sys/devices/**/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:34 /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy2/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy3/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy4/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy5/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy6/scaling_governor
-rw-r--r-- 1 root root 4096 Sep 18 21:21 /sys/devices/system/cpu/cpufreq/policy7/scaling_governor
(excalibur:14:26:EDT)% cat /sys/devices/**/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
(excalibur:14:26:EDT)% grep MHz /proc/cpuinfo
cpu MHz : 1600.125
cpu MHz : 3800.031
cpu MHz : 3800.828
cpu MHz : 3094.664
cpu MHz : 3717.554
cpu MHz : 1770.656
cpu MHz : 3783.429
cpu MHz : 1963.367
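For reference, the "fully manual method" amounts to something like the loop below. It is a sketch only: set_governor is a made-up name, the optional second argument exists just so it can be pointed at a different directory, and the default path assumes the policyN layout shown in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: write a governor name into every policy's
# scaling_governor file under the given cpufreq directory.
# Usage sketch (as root): set_governor performance
set_governor() {
    gov="$1"
    root="${2:-/sys/devices/system/cpu/cpufreq}"
    for f in "$root"/policy*/scaling_governor; do
        [ -w "$f" ] || continue     # skip missing or read-only entries
        printf '%s\n' "$gov" > "$f"
    done
}
```

As the output above shows, the governor files do read back "performance" afterwards; the clocks still wander while intel_pstate is the active driver.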
Hey Mark,
Try to change the scaling driver
echo acpi-cpufreq >/sys/devices/system/cpu/cpufreq/X/scaling_driver
I think intel_pstate is causing the issue.
You might be on to something. I can't change the driver at runtime, though:
(excalibur:12:51:EDT)# pwd
/sys/devices/system/cpu/cpufreq/policy0
(excalibur:12:51:EDT)# cat scaling_driver
intel_pstate
(excalibur:12:52:EDT)# echo acpi_cpufreq > scaling_driver
zsh: permission denied: scaling_driver
(excalibur:12:52:EDT)#
I think I may need to disable it via the kernel command-line at boot with:
intel_pstate=disable
I'll give this a try during the next reboot, thanks!
Alright, if that doesn't work, try appending this to Grub's cmdline:
(/etc/default/grub in case of Debian-derivatives)
GRUB_CMDLINE_LINUX_DEFAULT="cpufreq_driver=acpi-cpufreq"
This worked like a champ on one of my boxes. With intel_pstate disabled, the kernel chooses acpi-cpufreq automatically.
(vega:11:08:PDT)% cat /sys/devices/**/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
(vega:11:08:PDT)% grep MHz /proc/cpuinfo
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
cpu MHz : 3401.000
(vega:11:08:PDT)% sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +62.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +56.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +62.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +58.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +55.0°C (high = +80.0°C, crit = +100.0°C)
(vega:11:09:PDT)% cpufreq-info|grep driver
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
driver: acpi-cpufreq
(vega:11:09:PDT)%
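To make intel_pstate=disable stick across reboots on a Debian-style box, the parameter has to land in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. Editing the file by hand is fine; the sketch below does the same thing with sed. The function name add_kernel_param is hypothetical, it assumes the variable is set on a single double-quoted line, and it assumes the parameter contains no "/" characters.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to the double-quoted
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file, unless it is
# already present. Default file is /etc/default/grub (Debian derivatives).
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0                    # already there, nothing to do
    fi
    new_file="$file.new"
    # Insert " <param>" just before the closing quote of the value.
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file" > "$new_file" &&
        mv "$new_file" "$file"
}
```

After something like add_kernel_param intel_pstate=disable, run update-grub and reboot for the change to take effect.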
Thanks, Tommy!
Glad to hear! ;)
Hey Mark, how about this?
echo performance > /sys/devices/system/cpu/cpuX/cpufreq/scaling_governor
where X is the id of the cpu core.
CPU is: Intel(R) Xeon(R) CPU E3-1241 v3