Intel Core i7: Nehalem Arrives
Power, Heat, Overclocking, Conclusions
POWER CONSUMPTION AND HEAT DISSIPATION
Intel rates the Core i7 920/940/965 at a thermal design power (TDP) of 130 Watt and a maximum case temperature of 55C. The QX9770 is rated at 132 Watt and 55.5C. Going by these values, power consumption and heat dissipation should not differ much. On the other hand, Core i7 processes more instructions per clock cycle than Penryn, and better performance normally comes at the expense of higher power consumption. However, we found that Core i7 does quite well in this regard. The wattage monitoring unit we used reads power consumption at the wall plug. The values shown in the “System Power” chart therefore represent total system consumption, including video card, chipset, RAM, and peripherals. When idling, the i7 920 system consumes 132 Watt. The i7 965 system consumes 137 Watt, 4.5% more than the QX9770 system. To simulate maximum processor load we ran 8 instances of Prime95. Under this load the i7 965 system consumes 241 Watt, again around 4% more than the QX9770 system under the same conditions. The temperature of the four i7 965 cores was between 54C and 58C while Prime95 was running. These are very acceptable values for both wattage and heat dissipation, considering that Core i7 runs most applications much faster than Penryn.
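The percentages quoted above allow a rough back-calculation of the QX9770 system figures, which the text does not state directly. The following Python sketch does that arithmetic. The function name is illustrative, and the results are derived estimates, not measurements.

```python
def baseline_from_increase(measured_watt, percent_more):
    """Back-calculate a baseline from a reading that is percent_more % higher."""
    return measured_watt / (1 + percent_more / 100)

# Readings for the i7 965 system quoted in the text (measured at the wall plug).
i7_965_idle_watt = 137
i7_965_load_watt = 241

# Estimated QX9770 system figures, derived from the quoted 4.5% and "around 4%".
qx9770_idle_watt = baseline_from_increase(i7_965_idle_watt, 4.5)
qx9770_load_watt = baseline_from_increase(i7_965_load_watt, 4.0)

print(f"Estimated QX9770 idle: {qx9770_idle_watt:.0f} Watt")
print(f"Estimated QX9770 load: {qx9770_load_watt:.0f} Watt")
print(f"Load difference: {i7_965_load_watt - qx9770_load_watt:.0f} Watt")
```

Because "around 4%" is itself a rounded figure, the load estimate should be treated as approximate.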
OVERCLOCKING “CORE I7”
All “Core i7” platform busses and components are driven from a single 133.33 MHz base clock. The resulting component speed values are generated by applying a multiplier value to this base clock. There are four multipliers on i7 motherboards which are used to set the system speed:
1. CPU speed
Multiplied by the system base clock (default 133.33 MHz), this multiplier gives the CPU frequency. Four such multipliers are used to define different speeds depending on the number of active CPU cores.
2. Memory speed
Multiplied by the system base clock, this multiplier gives the memory frequency. For example, a memory multiplier of 10 and the base clock of 133.33 MHz result in a memory frequency of 1333 MHz.
3. QPI speed
This multiplier selects the transfer rate of data between the CPU and the IOH.
4. Uncore speed
This multiplier applies to the non-CPU-core parts of the processor. Its limit is set by the memory multiplier.
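The relationship between base clock and multipliers described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic. The function name is made up for this example, and the CPU multiplier of 20 is the Core i7 920 value mentioned later in the article.

```python
BASE_CLOCK_MHZ = 133.33  # default base clock quoted in the text

def derived_clock_mhz(multiplier, base_clock_mhz=BASE_CLOCK_MHZ):
    """Return the clock that results from applying a multiplier to the base clock."""
    return multiplier * base_clock_mhz

# The memory multiplier of 10 is the example from the text.
# The CPU multiplier of 20 is the Core i7 920 value mentioned later in the article.
print(f"Memory: {derived_clock_mhz(10):.0f} MHz")
print(f"CPU:    {derived_clock_mhz(20):.0f} MHz")
```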
As with previous generation Intel CPUs, there are three basic methods of enhancing CPU performance; only the terms have changed. The first method increases the CPU multiplier to raise the processor frequency. The second increases the base clock frequency. Previously this was done by increasing the FSB frequency. With “Core i7” you raise the base clock, and the QPI bus frequency rises with it. The third method combines both and adds a significant level of complexity to the process. As with previous generation Intel CPUs, only Extreme Edition processors allow the multiplier to be raised. That leaves a regular user with the method of increasing the base clock, and with it the QPI bus frequency, in order to enhance processor/platform performance above default. The one difference with Core i7, however, is that everybody already gets up to 266 MHz of additional frequency by default through the automated overclocking of “Turbo Mode.”
We now describe how to overclock a Core i7 processor with Intel’s DX58SO motherboard. In the Processor tuning page of BIOS setup, perform the following steps:
- Increase the Dynamic CPU Voltage Offset.
- Program the TDC (current) and TDP (power) limits for Turbo Mode so that the processor does not throttle at peak performance conditions.
- Program the highest CPU Ratio used by Turbo Mode for each of the 1, 2, 3, and 4 Core Ratio Limits.
In the following BIOS screenshot each of the multiplier selections has been set to 29. This runs the CPU at 29 x 133.33 MHz = 3.87 GHz no matter how many CPU cores are active.
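The effect of the four Core Ratio Limits can be modelled as a simple lookup from the number of active cores to a multiplier. The Python sketch below assumes that model. The dictionary and function names are hypothetical and do not correspond to any BIOS or Intel interface.

```python
BASE_CLOCK_MHZ = 133.33  # default base clock

# Core Ratio Limits: highest multiplier Turbo Mode may use for 1, 2, 3, or 4
# active cores. All four are set to 29, as in the configuration described above.
core_ratio_limits = {1: 29, 2: 29, 3: 29, 4: 29}

def turbo_frequency_mhz(active_cores, limits=core_ratio_limits,
                        base_clock_mhz=BASE_CLOCK_MHZ):
    """Return the highest CPU frequency for the given number of active cores."""
    return limits[active_cores] * base_clock_mhz

for cores in sorted(core_ratio_limits):
    print(f"{cores} active core(s): {turbo_frequency_mhz(cores) / 1000:.2f} GHz")
```

With identical limits every line reports the same frequency. Entering different values per core count would let lightly threaded loads clock higher than fully loaded ones.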
The frequency of the CPU can also be set by increasing the base clock and leaving the multipliers constant. From the Performance setup page, set the ‘Host Clock Frequency Override’ setting to ‘Manual’. This allows the Host Clock Frequency (MHz) to be increased. The following BIOS screenshot shows the setting for a “Core i7 965”. The default multiplier of 25x and a Host Clock Frequency of 155 MHz result in a processor frequency of 3.88 GHz.
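Because every component clock is derived from the same host clock, raising it speeds up more than the CPU. The Python sketch below illustrates this side effect. The memory multiplier of 10 is an assumption borrowed from the earlier memory example, not a setting reported for this test system, and the function name is illustrative.

```python
DEFAULT_HOST_CLOCK_MHZ = 133.33

def clocks_for_host_clock(host_clock_mhz, cpu_multiplier, memory_multiplier):
    """Return the CPU and memory clocks (MHz) produced by a given host clock."""
    return {
        "cpu": cpu_multiplier * host_clock_mhz,
        "memory": memory_multiplier * host_clock_mhz,
    }

# CPU multiplier 25 and host clock 155 MHz are the values from the text.
# The memory multiplier of 10 is an assumption taken from the memory example.
stock = clocks_for_host_clock(DEFAULT_HOST_CLOCK_MHZ, 25, 10)
overclocked = clocks_for_host_clock(155, 25, 10)

for name in stock:
    change = (overclocked[name] / stock[name] - 1) * 100
    print(f"{name}: {stock[name]:.0f} -> {overclocked[name]:.0f} MHz ({change:+.1f}%)")
```

Both clocks rise by the same percentage, which is why memory and other components may need their own multipliers lowered when the host clock goes up.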
Penryn’s excellent overclocking potential is credited to Intel’s high yielding 45nm production technology with low current leakage and the use of advanced materials, as explained in our Penryn review. Core i7 is produced with the same technology and should therefore overclock no worse than Penryn. At least that is what our results tell us. The Core i7 965 is an Extreme Edition, which means we could change the processor multiplier. This keeps all other system components stress free when overclocking. After the usual procedure of trying various BIOS settings we settled at 32 x 133.3 MHz = 4266 MHz. This is with air-cooling and a core voltage of 1.55 V.
The result with the much more affordable Core i7 920 is probably of greater interest to you. As this processor is “multiplier locked”, it can only be overclocked by increasing the QPI frequency. With the help of additional voltage the 920 would run at 20 x 185.5 MHz QPI = 3710 MHz. However, because the i7 920 continued to run in “Turbo Mode”, the actual frequency was 22 x 185.5 MHz = 4081 MHz. The i7 920 would run Prime95 for 1 hour with these settings. 4081 MHz is 53% above default and quite an amazing result. We want to hold back on far-reaching assumptions here, though, because both processors are engineering samples. They might have been “hand-picked” by Intel for their high frequency headroom. It remains to be seen how retail processors overclock, but Nehalem looks promising.
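The Core i7 920 figures can be checked with a few lines of Python. This is a plain arithmetic sketch. The constant names are illustrative, and the default frequency is taken as 20 x 133.33 MHz.

```python
DEFAULT_BASE_CLOCK_MHZ = 133.33
RAISED_BASE_CLOCK_MHZ = 185.5   # overclocked value from the text
STOCK_MULTIPLIER = 20           # locked multiplier of the i7 920
TURBO_MULTIPLIER = 22           # multiplier observed with Turbo Mode active

default_mhz = STOCK_MULTIPLIER * DEFAULT_BASE_CLOCK_MHZ
overclocked_mhz = STOCK_MULTIPLIER * RAISED_BASE_CLOCK_MHZ
overclocked_turbo_mhz = TURBO_MULTIPLIER * RAISED_BASE_CLOCK_MHZ
gain_percent = (overclocked_turbo_mhz / default_mhz - 1) * 100

print(f"Default:             {default_mhz:.0f} MHz")
print(f"Overclocked:         {overclocked_mhz:.0f} MHz")
print(f"Overclocked + Turbo: {overclocked_turbo_mhz:.0f} MHz")
print(f"Gain over default:   {gain_percent:.0f}%")
```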
SUMMARY AND FINAL THOUGHTS
Nehalem offers the following improvements over the existing “Penryn” platform:
- various enhancements to the microarchitecture, such as new instructions
- a new, far more efficient system architecture
- simultaneous multithreading (SMT)
- built-in overclocking by “Turbo Mode”
We guess that the biggest impact comes from the vastly improved system architecture, namely the now on-die memory controller and the very fast QPI interconnect. Memory bandwidth has almost doubled and latency has been reduced by almost 50%. Our application test results show a massive clock-for-clock improvement for Core i7 over the previous generation Penryn. For applications that can run heavily multithreaded, like WinRAR, POV-Ray or Cinebench, but also in calculation oriented software, we observed gains between 30% and 70%. The smallest gain in 16 tests was 14%. Applications benefit to very different extents from the improvements and changes mentioned. SMT, for example, does not always enhance performance. For gaming in particular we found it unhelpful and kept it disabled. It is therefore difficult to give an “average” for the performance gain with Core i7, but we would put it conservatively at 20% for applications.
When it comes to gaming, Core i7 fell a bit short of our expectations. Nehalem shines with highly multithreaded applications, but that does not pay off in games. Its relatively small L2 cache does not help gaming either. Clock for clock you will get, give or take, the same performance from a Core i7 processor as from a previous generation Penryn processor. We expect Core i7 to gain ground, though, with newer games that are more heavily multithreaded than current ones.
The Core i7 overclocking potential seems to be at least as good as that of the highly praised Penryn core. Our Core i7 920, air-cooled but with additional voltage, ran at 4080 MHz instead of its default 2660 MHz. And if you are not interested in the pains of BIOS modifications, failed system boots, system crashes, and file corruption caused by unsuccessful overclocking attempts, just enjoy the built-in “Turbo Mode”. You get up to 266 MHz on top of the rated processor frequency for free and with no hassle. That is 10% more for a Core i7 920. Not bad. You would assume that a vastly better performing processor like the Core i7 would consume considerably more power. Considering an average performance gain of 25%, we found the small increase in power consumption of around 10 Watt (4%) under load very acceptable.
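To put the last two figures in relation, the Python sketch below computes the implied change in performance per Watt. It is a rough estimate at system level, since power was measured at the wall plug, and the function name is illustrative.

```python
def relative_perf_per_watt(performance_gain_percent, power_increase_percent):
    """Return performance per Watt relative to the baseline system (1.0 = equal)."""
    performance_ratio = 1 + performance_gain_percent / 100
    power_ratio = 1 + power_increase_percent / 100
    return performance_ratio / power_ratio

# 25% performance gain and 4% higher load power are the figures from the text.
ratio = relative_perf_per_watt(25, 4)
print(f"Performance per Watt vs. Penryn system: {(ratio - 1) * 100:+.0f}%")

# Turbo Mode headroom on a Core i7 920: 266 MHz on top of 2660 MHz.
print(f"Turbo Mode headroom: {266 / 2660 * 100:.0f}%")
```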
What is next? In 2009 Intel will gradually introduce more Nehalem desktop variants, including a 2-core entry level processor, as well as notebook/netbook and server processors. And because the Nehalem platform is easy to expand, we will even get an 8-core processor capable of running 16 threads. For its QX9775 dual processor “Skulltrail” platform Intel had to use a workstation chipset, because the existing desktop chipsets did not allow running 2 processors on the same PCB. This is possible with the new design, and we will get a Nehalem based dual socket platform in 2009 as well. And the clock continues to tick. Intel has already committed itself to the next “tick”: the next processor generation, “Westmere”, will be based on the Nehalem microarchitecture but will be produced with 32nm technology.
All original content copyright James Rolfe.
All rights reserved. No reproduction allowed without written permission.