Overclockers Australia!
Intel Penryn: Fast, cool, and “greenish”
Date: 20th November 2007
Author: Chainbolt
Editor: James "Agg" Rolfe
Manufacturer: Intel


Media Encoding, Image Processing, Desktop, Folding

Media Encoding:
Microsoft “Windows Media Encoder 9” is a multithreaded application and therefore makes good use of a quadcore processor. We used the encoder to convert a 9 MB AVI video clip into a 39 MB WMV file. As with most encoding applications, WME 9 is not (yet) utilizing SSE4 instructions. Nonetheless the QX9650 is almost 10% faster than the QX6850 and the difference between the QX9650 and the slowest dual core processor is more than 100%!

“Main Concept Reference 1.0” does not yet take advantage of SSE4 either. We transcoded a 40 MB AVI file to H.264 using one of the built-in encoding profiles. The results are similar to what we obtained with MS Media Encoder: the QX9650 is around 10% faster than the QX6850 and does the job roughly twice as fast as the two dual core processors.

Version 6.7 of the popular DivX codec can already be configured to use SSE4, so we expected an even more substantial gain with this codec. We used “Virtual Dub” as the front-end for DivX to encode a 70 MB MPEG2 source file, first with SSE2 only and then again with SSE4 enabled for the experimental full-search option. On an equal footing, that is, when both processors use only SSE2, the difference between the QX9650 and the QX6850 is 11%. However, when we encoded the same source file again with SSE4 enabled, the difference between the QX9650 and the QX6850 grew to 43%.

The last test in this round is transcoding a 112 MB WAV file to MP3 with the multithreaded version of LAME. Again we see the QX9650 coming out first. It encoded WAV into MP3 12% faster than its predecessor.

[Charts: Windows Media Encoder 9 - Main Concept Reference 1.0 - DivX]

To sum it up: the QX9650 is encoding media files at least 10% faster than the previous QX6850 on a clock-to-clock basis due to its enlarged L2 cache and enhanced micro-architecture. If SSE4 instructions are used, it seems a gain of 40% or more is a realistic expectation.

Image Processing:
How is the QX9650 at rendering and editing 3D images? We ran four popular applications to find out. Rendering and other image processing tasks are easy to break down into parallel threads and are therefore ideal tasks for a multi-core processor.

“Cinebench R10” comes with a built-in benchmark. A predefined picture is rendered either single-threaded or with as many threads as cores are available. The QX9650 is achieving a 10% better score than the QX6850 in both the single and the multiprocessor test.

“Panorama Factory” lets you stitch together multiple images to create a wide aspect panorama. We asked the application to stitch four 3MB 1920x1080 photos together. The result is a glorious Tokyo neighborhood panorama. The QX9650 processes the four pictures in 30 seconds, the QX6850 in 33 seconds. The process took much longer with the dual-core CPUs: 45 and 48 seconds respectively.
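
The Panorama Factory timings make a handy worked example of how a "percent faster" figure can be read. The sketch below is only an illustration, not the method used for the charts in this review: the function names are made up for this example, and the two dual-core entries carry placeholder labels because the text does not say which CPU took 45 seconds and which took 48.

```python
# Stitching times in seconds, as quoted in the text.
# "dual-core A/B" are placeholder labels, not names from the review.
TIMES_S = {
    "QX9650": 30.0,
    "QX6850": 33.0,
    "dual-core A": 45.0,
    "dual-core B": 48.0,
}


def time_saved_pct(baseline_s, new_s):
    """Share of the baseline run time that is saved, in percent."""
    return (baseline_s - new_s) / baseline_s * 100.0


def speedup_pct(baseline_s, new_s):
    """Extra work done per second compared with the baseline, in percent."""
    return (baseline_s / new_s - 1.0) * 100.0


if __name__ == "__main__":
    fastest = TIMES_S["QX9650"]
    for name, seconds in TIMES_S.items():
        if name == "QX9650":
            continue
        print(f"QX9650 vs {name}: "
              f"{time_saved_pct(seconds, fastest):.1f}% less time, "
              f"{speedup_pct(seconds, fastest):.1f}% more throughput")
```

For 33 versus 30 seconds the two readings give about 9% less time or 10% more throughput, so it is worth stating which one a percentage refers to.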

There isn’t much to explain about Adobe’s “Photoshop”. It’s probably the most popular image editing software around. We applied four filters to a 75MB picture. The QX9650 is around 7% faster than the QX6850 in this test, and the X6600 comes out last.

The “Persistence of Vision Raytracer”, or POV-Ray, is a ray tracing program for creating 3D graphics. We used the version 3.7 beta, which comes with native multithreading. We rendered the 1024x768 demo picture “chess2.pov” with 1 thread, 2 threads and 3 threads. The results are further testimony to quadcore processing power: the quadcore CPUs render the POV-Ray demo picture almost twice as fast as the dual core CPUs. The QX9650 is 18 seconds or exactly 15% faster than the QX6850.
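
The POV-Ray gap is given as "18 seconds or exactly 15%", but the render times themselves are not stated. As a rough sketch, the times can be backed out from those two numbers; the result depends on whether the 15% is read as time saved or as a speed increase, and the text does not say which. The function name and both readings are assumptions made for this illustration, so the output is an estimate and not a measured result.

```python
def times_from_gap(gap_s, pct, reading):
    """Return (faster_s, slower_s) implied by a time gap and a percentage.

    reading="time_saved": the gap is pct% of the slower CPU's time.
    reading="speedup":    slower / faster - 1 equals pct%.
    """
    if reading == "time_saved":
        slower = gap_s * 100.0 / pct
        return slower - gap_s, slower
    if reading == "speedup":
        faster = gap_s * 100.0 / pct
        return faster, faster + gap_s
    raise ValueError(f"unknown reading: {reading!r}")


if __name__ == "__main__":
    # 18 seconds and 15% are the figures quoted in the text.
    for reading in ("time_saved", "speedup"):
        faster, slower = times_from_gap(18.0, 15.0, reading)
        print(f"{reading}: QX9650 ~{faster:.0f} s, QX6850 ~{slower:.0f} s")
```

Under these assumptions the QX9650 would need roughly 102 or 120 seconds for the scene, depending on the reading.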

If we believe our test results, 3D image processing is another area where Penryn will excel. The user can expect clock-to-clock performance improvements of up to 15%.

[Charts: Cinebench R10 - POV-Ray - Panorama Factory]

Desktop Tasks:
Finally, we threw a couple of desktop tasks at the QX9650. The popular WinRAR 3.71 file compression utility runs multithreaded and uses a lot of processing power. Among the quadcore CPUs, the QX9650 is once more far ahead thanks to the larger L2 cache and other enhancements. It compressed a 90MB file in 46 seconds - that’s 6 seconds or 15% faster than the QX6850.

The difference when virus-checking a 100 GB disk drive with around 90,000 objects with AVG 7.5 was just 5%, though. The QX9650 converted a 36-page PowerPoint file into a PDF document with Adobe’s “PDFMaker 7.0” in 60.3 seconds, around 8% faster than the QX6850. “SpyBot - Search and Destroy” scans for, removes and blocks spyware, adware, trojans, keyloggers and other unpleasant stuff. The QX9650 scanned an almost full 100GB disk drive in 50.3 seconds, around 6% faster than the QX6850.

[Charts: WinRAR - PowerPoint to PDF - Spybot S&D - AVG]

Folding@Home:
Folding@Home is a distributed computing project. It studies protein folding, misfolding, aggregation and related diseases. Worldwide more than 400,000 PC users are participating in this project. Many of them are banded together in competing teams. OCAU (Team 24) is currently the world’s #2 team. In case you haven’t joined us, please consider it.

The calculation of a F@H work unit puts maximum stress on the CPU. In addition to the fine goals of the project, F@H is therefore also an excellent tool for testing system stability and efficiency. It is possible to run as many F@H instances as there are cores available, so a quadcore processor is highly efficient for folding at home. We let the QX9650 “fold” a “Gromacs” work unit and compared the result with the time it took the QX6850 to finish the same work unit. For a realistic workload scenario we let the other 3 cores crunch on work units as well. The QX9650 finished the protein “p2147_lambda _m2_expl_99p” in 17 hours and 11 minutes; the QX6850 took 18 hours and 2 minutes. The difference of around 5% is quite substantial, considering that the gain could multiply with the number of cores folding. However, this result cannot be taken as a general benchmark for folding efficiency, as the various F@H computing cores make use of a CPU in different ways. The Gromacs core, for example, uses SSE instructions, something “older” computing cores cannot do.
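
To make the "around 5%" figure concrete, here is a small sketch that redoes the arithmetic from the two completion times quoted in the text. The helper name and the four-instance projection are assumptions for illustration only: the review timed one work unit per CPU, and the projection simply supposes that every core saves the same amount of time.

```python
def to_minutes(hours, minutes):
    """Convert an hours-and-minutes duration to whole minutes."""
    return hours * 60 + minutes


qx9650_min = to_minutes(17, 11)  # QX9650 time quoted in the text
qx6850_min = to_minutes(18, 2)   # QX6850 time quoted in the text

# Time saved per work unit, in minutes and as a share of the QX6850 time.
saved_min = qx6850_min - qx9650_min
saved_pct = saved_min / qx6850_min * 100.0

# Assumption: four F@H instances (one per core) each save the same time.
INSTANCES = 4
saved_total_min = saved_min * INSTANCES

if __name__ == "__main__":
    print(f"Saved per work unit: {saved_min} min ({saved_pct:.1f}%)")
    print(f"Saved across {INSTANCES} instances: {saved_total_min} min")
```

This works out to 51 minutes, or a little under 5%, per work unit.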






All original content copyright James Rolfe.
All rights reserved. No reproduction allowed without written permission.