1.INTRODUCTION
The Intel Core i7 quad-core processor delivers 8-threaded performance. The Intel Core i7 processor also
offers unrivaled performance for immersive 3-D games: over 40 percent faster than previous
Intel high-performance processors on both the 3DMark Vantage CPU physics and AI tests,
popular industry computer benchmarks that measure gaming performance. The Extreme Edition
uses 8 threads to run games with advanced artificial intelligence and physics that make them act
and feel real. The Intel Core i7 processors and the Intel X58 Express Chipset-based Intel® Desktop
Board DX58SO Extreme Series are available immediately from several computer manufacturers
online and in retail stores, as well as a boxed retail product through channel online sales. The Core i7
processor is the first member of the Intel Nehalem microarchitecture family; server and mobile
product versions will enter production later. Each Core i7 processor features an 8 MB Level 3
cache and three channels of DDR3 1066 memory, delivering the best memory performance of any
desktop platform. Intel's top performance processor, the Intel Core i7 Extreme Edition, also
removes overspeed protection, allowing Intel's knowledgeable customers and hobbyists to further
increase the chip's speed.
www.seminarcollections.com
Seminar Report– Nov ‘10 -2- Core i7
2.MOORE’S LAW
Moore's law describes a long-term trend in the history of computing hardware. Since the
invention of the integrated circuit in 1958, the number of transistors that can be placed
inexpensively on an integrated circuit has increased exponentially, doubling approximately every
two years. The trend was first observed by Intel co-founder Gordon E. Moore in a 1965 paper. It
has continued for almost half a century and in 2005 was not expected to stop for another decade
at least.
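The doubling rule above can be sketched as a simple projection. This is only an illustration of the exponential trend; the 1971 starting point (the Intel 4004's roughly 2,300 transistors) is a well-known figure, but the function itself is an idealization, not a fit to real product data.

```python
# Projected transistor count under Moore's law: doubling every two years.
# Illustrative only; real counts vary by product and process.

def transistors(year, base_year=1971, base_count=2300):
    """Project transistor count assuming one doubling every two years."""
    return base_count * 2 ** ((year - base_year) / 2)

# About 18.5 doublings between 1971 and 2008 put the projection in the
# hundreds of millions -- the right order of magnitude for Nehalem-era chips.
print(round(transistors(2008)))
```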
Almost every measure of the capabilities of digital electronic devices is strongly linked to
Moore's law: processing speed, memory capacity, sensors and even the number and size of pixels
in digital cameras. All of these are improving at (roughly) exponential rates as well. This has
dramatically increased the usefulness of digital electronics in nearly every segment of the world
economy. Moore's law describes this driving force of technological and social change in the late
20th and early 21st centuries.
On 13 April 2005, Gordon Moore stated in an interview that the law cannot be sustained
indefinitely: "It can't continue forever. The nature of exponentials is that you push them out and
eventually disaster happens." He also noted that transistors would eventually reach the limits of
miniaturization at atomic levels:
In terms of size [of transistors] you can see that we're approaching the size of atoms which is a
fundamental barrier, but it'll be two or three generations before we get that far—but that's as far
out as we've ever been able to see. We have another 10 to 20 years before we reach a
fundamental limit. By then they'll be able to make bigger chips and have transistor budgets in the
billions.
3.NEHALEM ARCHITECTURE
a) Nehalem Architecture
Initial Nehalem processors use the same 45 nm manufacturing process as Penryn. A working
system with two Nehalem processors was shown at Intel Developer Forum Fall 2007, and a
large number of Nehalem systems were shown at Computex in June 2008.
The architecture is named after the Nehalem River in Northwest Oregon, which
is in turn named after the Nehalem Native American nation in Oregon. The code
name itself had been seen at the end of several roadmaps starting in 2000; at that stage it was
supposed to be the latest evolution of the NetBurst architecture. Since the abandonment of
NetBurst, the codename has been recycled and now refers to a completely different project.
4.FEATURES
a) Essential Features
No. of Cores: 4
Bus/Core Ratio: 20
No. of QPI Links: 1
No. of Memory Channels: 3
Package Specifications
b) Advanced Technologies
c) Quick Path
QuickPath allows processors to take shortcuts when they ask other processors for information.
Imagine a quad-core microprocessor with processors A, B, C and D. There are links between
each processor. In older architectures, if processor A needed information from D, it would send a
request. D would then send a request to processors B and C to make sure D had the most recent
instance of that data. B and C would send the results to D, which would then be able to send
information back to A. Each round of messages is called a hop -- this example had four hops.
QuickPath skips one of these steps. Processor A would send its initial request -- called a "snoop"
-- to B, C and D, with D designated as the respondent. Processors B and C would send data to D.
D would then send the result to A. This method skips one round of messages, so there are only
three hops. That seems like a small improvement, but over billions of calculations it makes a big
difference. In addition, if one of the other processors already has the information A requests, it
can send the data directly to A, reducing the hops to two. QuickPath also packs information into
more compact payloads.
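The hop counts in the snoop example above can be tallied in a toy model. This is just bookkeeping for the A/B/C/D scenario described in the text, not a model of the actual QPI coherency protocol; the function names and arguments are invented for illustration.

```python
# Toy hop counts for the cache-coherency example above (cores A-D).

def old_protocol(requester, owner, cores):
    """Requester asks the owner; the owner verifies with every other core."""
    hops = 1        # requester -> owner: initial request
    hops += 1       # owner -> other cores: "do you have a newer copy?"
    hops += 1       # other cores -> owner: replies
    hops += 1       # owner -> requester: final answer
    return hops

def quickpath(requester, owner, cores, data_elsewhere=False):
    """Requester snoops everyone at once; the designated respondent answers."""
    if data_elsewhere:
        return 2    # snoop broadcast + direct reply from whichever core holds it
    hops = 1        # snoop broadcast: requester -> all other cores
    hops += 1       # peers -> designated respondent
    hops += 1       # respondent -> requester
    return hops

cores = ["A", "B", "C", "D"]
print(old_protocol("A", "D", cores))                     # 4 hops
print(quickpath("A", "D", cores))                        # 3 hops
print(quickpath("A", "D", cores, data_elsewhere=True))   # 2 hops
```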
d) Branch Prediction and Loop Stream Detection
In a microprocessor, everything runs on clock cycles, which measure how long a microprocessor
takes to execute an instruction. The faster the clock speed, the more instructions the
microprocessor can handle per second.
One way microprocessors like the Core i7 try to increase efficiency is to predict future
instructions based on old instructions. It's called branch prediction. When branch prediction
works, the microprocessor completes instructions more efficiently. But if a prediction turns out
to be inaccurate, the microprocessor has to compensate. This can mean wasted clock cycles,
which translates into slower performance.
Nehalem has two branch target buffers (BTB). These buffers load instructions for the processors
in anticipation of what the processors will need next. Assuming the prediction is correct, the
processor doesn't need to call up information from the computer's memory. Nehalem's two
buffers allow it to load more instructions, decreasing the lag time in the event one set turns out to
be incorrect.
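The cost of mispredictions can be seen in a minimal predictor. The sketch below is the textbook two-bit saturating-counter scheme, shown only to illustrate why wrong guesses waste work; it is not Nehalem's actual (far more sophisticated) predictor.

```python
# A two-bit saturating-counter branch predictor: it takes two wrong
# outcomes in a row to flip the prediction, which tolerates occasional
# deviations in a mostly-taken branch.

class TwoBitPredictor:
    def __init__(self):
        self.state = 0  # 0,1 = predict not-taken; 2,3 = predict taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
history = [True, True, True, False, True, True]  # a mostly-taken branch
mispredictions = 0
for taken in history:
    if p.predict() != taken:
        mispredictions += 1   # each miss costs wasted pipeline work
    p.update(taken)
print(mispredictions)   # 3
```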
Another efficiency improvement involves software loops. A loop is a string of instructions that
the software repeats as it executes. It may come in regular intervals or intermittently. With loops,
branch prediction becomes unnecessary -- one instance of a particular loop should execute the
same way as every other. Intel designed Nehalem chips to recognize loops and handle them
differently than other instructions.
Microprocessors without loop stream detection tend to have a hardware pipeline that begins with
branch predictors, then moves to hardware designed to retrieve -- or fetch -- instructions, decode
the instructions and execute them. Loop stream detection can identify repeated instructions,
bypassing some of this process.
Intel used loop stream detection in its Penryn microprocessors. Penryn's loop stream detection
hardware sits between the fetch and decode components of older microprocessors. When the
Penryn chip's detector discovers a loop, the microprocessor can shut down the branch prediction
and fetch components. This makes the pipeline shorter. But Nehalem goes a step farther.
Nehalem's loop stream detector is at the end of the pipeline. When it sees a loop, the
microprocessor can shut down everything except the loop stream detector, which sends out the
appropriate instructions to a buffer.
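The idea of recognizing a repeating instruction sequence can be sketched in software. This toy detector only checks whether the stream ends in an immediate repeat of its own tail, standing in for the hardware that replays a loop body from a buffer instead of re-fetching and re-decoding it; the instruction names are invented.

```python
# Toy loop-stream detection: find the shortest tail of the instruction
# stream that is an immediate repeat of the sequence just before it.

def detect_loop(stream, max_len=8):
    """Return the loop body if the stream ends in an immediate repeat."""
    for n in range(1, min(max_len, len(stream) // 2) + 1):
        if stream[-n:] == stream[-2 * n:-n]:
            return stream[-n:]   # this body could replay from a buffer
    return None

stream = ["load", "add", "store", "dec", "jnz",
          "load", "add", "store", "dec", "jnz"]
print(detect_loop(stream))   # ['load', 'add', 'store', 'dec', 'jnz']
```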
The improvements to branch prediction and loop stream detection are all part of Intel's "tock"
strategy. The transistors in Nehalem chips are the same size as Penryn's, but Nehalem's design
makes more efficient use of the hardware.
e) Simultaneous Multithreading and Turbo Boost
Nehalem's architecture allows each processor to handle two threads simultaneously. That means
an eight-core Nehalem microprocessor can process 16 threads at the same time. This gives the
Nehalem microprocessor the ability to process complex instructions more efficiently. According
to Intel, the multithreading capability is more efficient than adding more processing cores to a
microprocessor. Nehalem microprocessors should be able to meet the demands of sophisticated
software like video editing programs or high-end video games.
Another benefit to multithreading is that the processor can handle multiple applications at the
same time. This lets you work on complex programs while running other applications like virus
scanners in the background. With older processors, these activities could cause a computer to
slow down or even crash.
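At the software level, the many-threads-at-once picture looks like the sketch below: a pool of worker threads taking several tasks at the same time, which the operating system then schedules across the logical processors the chip exposes. The workload function is a made-up stand-in, and Python threads only illustrate the scheduling idea, not hardware SMT itself.

```python
# Several tasks handled concurrently by a pool of worker threads.

from concurrent.futures import ThreadPoolExecutor

def checksum(data):
    """Stand-in workload: a trivial rolling checksum over a byte string."""
    total = 0
    for b in data:
        total = (total * 31 + b) % 65521
    return total

chunks = [bytes(range(i, i + 64)) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(checksum, chunks))   # order matches inputs
print(len(results))   # 4
```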
Nehalem's turbo boost feature is similar to an old hacking trick called overclocking. To overclock
a microprocessor is to increase its processing frequency beyond the normal parameters of the
chip. But overclocking isn't always a good idea -- it can cause chips to overheat. The turbo boost
feature is dynamic -- it makes the Nehalem microprocessor work harder as the workload
increases, provided the chip is within its operating parameters. As workload decreases, the
microprocessor can work at its normal clock frequency. Because the microchip has a monitoring
system, you don't have to worry about the chip overheating or working beyond its capacity. And
when you aren't placing heavy demands on your processor, the chip conserves power.
f) CPU Cache
i) Cache
A CPU cache is a cache used by the central processing unit of a computer to reduce the
average time to access memory. The cache is a smaller, faster memory which stores copies of
the data from the most frequently used main memory locations. As long as most memory
accesses are cached memory locations, the average latency of memory accesses will be closer
to the cache latency than to the latency of main memory.
When the processor needs to read from or write to a location in main memory, it first checks
whether a copy of that data is in the cache. If so, the processor immediately reads from or
writes to the cache, which is much faster than reading from or writing to main memory.
Each location in memory holds a datum (a cache line), which in different designs
ranges in size from 8 to 512 bytes. The size of the cache line is usually larger than the size of
the usual access requested by a CPU instruction, which ranges from 1 to 16 bytes. Each
memory location also has an index, a unique number used to refer to that location; the index
for a location in main memory is called an address. Each location in the cache has a tag that
contains the index of the datum in main memory that has been cached. In a CPU's data cache
these entries are called cache lines or cache blocks.
Most modern desktop and server CPUs have at least three independent caches: an
instruction cache to speed up executable instruction fetch, a data cache to speed up data fetch
and store, and a translation lookaside buffer used to speed up virtual-to-physical address
translation for both executable instructions and data.
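The tag/index bookkeeping described above can be made concrete with a toy direct-mapped cache. The 64-byte line and 256-set geometry is chosen for illustration and is not Nehalem's actual organization.

```python
# How a cache splits an address into tag, index, and offset, and how a
# hit is decided: the stored tag at that index must match the address's tag.

LINE_BYTES = 64    # cache-line size (illustrative)
NUM_SETS = 256     # number of cache locations (illustrative)

def split_address(addr):
    offset = addr % LINE_BYTES                     # byte within the line
    index = (addr // LINE_BYTES) % NUM_SETS        # which cache location
    tag = addr // (LINE_BYTES * NUM_SETS)          # identifies the memory line
    return tag, index, offset

cache = {}   # index -> (tag, line data)

def access(addr):
    """Return True on a hit, False on a miss (filling the line)."""
    tag, index, _ = split_address(addr)
    if cache.get(index, (None,))[0] == tag:
        return True
    cache[index] = (tag, f"line@{addr - addr % LINE_BYTES}")
    return False

print(access(0x1234))   # False: cold miss
print(access(0x1238))   # True: falls in the same 64-byte line
```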
In the case of a cache miss, most caches allocate a new entry, which comprises the tag just
missed and a copy of the data from memory. The reference can then be applied to the new
entry just as in the case of a hit. Misses are comparatively slow because they require the data
to be transferred from main memory. This transfer incurs a delay since main memory is much
slower than cache memory, and also incurs the overhead for recording the new data in the
cache before it is delivered to the processor.
In order to make room for the new entry on a cache miss, the cache generally has to evict
one of the existing entries. The heuristic that it uses to choose the entry to evict is called the
replacement policy. The fundamental problem with any replacement policy is that it must
predict which existing cache entry is least likely to be used in the future. Predicting the future
is difficult, especially for hardware caches that use simple rules amenable to implementation
in circuitry, so there are a variety of replacement policies to choose from and no perfect way
to decide among them. One popular replacement policy, LRU, replaces the least recently
used entry.
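LRU, the policy named above, can be sketched in a few lines. A real cache implements this per set in hardware with approximate schemes; the sketch below just shows the policy itself on a deliberately tiny two-entry cache.

```python
# LRU replacement: when the cache is full, evict the entry that was
# touched longest ago. An OrderedDict keeps entries in recency order.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.entries = OrderedDict()   # key -> value, oldest first

    def access(self, key, value):
        """Return True on a hit; on a miss, insert and evict if full."""
        if key in self.entries:
            self.entries.move_to_end(key)     # now the most recently used
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[key] = value
        return False

c = LRUCache()
c.access("A", 1); c.access("B", 2)
c.access("A", 1)          # hit: A becomes most recently used
c.access("C", 3)          # miss: evicts B, the least recently used
print(list(c.entries))    # ['A', 'C']
```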
When data is written to the cache, it must at some point be written to main memory as
well. The timing of this write is controlled by what is known as the write policy. In a write-
through cache, every write to the cache causes a write to main memory. Alternatively, in a
write-back or copy-back cache, writes are not immediately mirrored to memory. Instead, the
cache tracks which locations have been written over (these locations are marked dirty). The
data in these locations are written back to main memory when that data is evicted from the
cache. For this reason, a miss in a write-back cache will often require two memory accesses
to service: one to first write the dirty location to memory and then another to read the new
location from memory.
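The difference between the two write policies shows up directly in memory-write counts. The single-line "caches" below are deliberately minimal stand-ins, only enough to count writes under each policy.

```python
# Counting main-memory writes: write-through writes memory on every
# store; write-back only flushes a dirty line when it is evicted.

class WriteThrough:
    def __init__(self):
        self.mem_writes = 0
    def write(self, addr, value):
        self.mem_writes += 1          # every cache write also hits memory

class WriteBack:
    def __init__(self):
        self.line = None              # (addr, value, dirty)
        self.mem_writes = 0
    def write(self, addr, value):
        if self.line and self.line[0] != addr and self.line[2]:
            self.mem_writes += 1      # eviction: flush the dirty line first
        self.line = (addr, value, True)

wt, wb = WriteThrough(), WriteBack()
for addr, v in [(0x10, 1), (0x10, 2), (0x10, 3), (0x20, 4)]:
    wt.write(addr, v); wb.write(addr, v)
print(wt.mem_writes, wb.mem_writes)   # 4 1
```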
5.TECHNOLOGY
a) 45nm Technology
Intel® 45nm high-k metal gate silicon technology underpins the next-generation Intel® Core™
microarchitecture. With roughly twice the density of Intel® 65nm technology, Intel's 45nm
process packs about double the number of transistors into the same silicon space—more than
400 million transistors for dual-core processors and more than 800 million for quad-core.
Intel® 45nm technology enables great performance leaps, an up to 50-percent larger L2 cache,
and new levels of energy efficiency.
Intel has had the world's first working 45nm processors in-house since early January 2007—
the first of fifteen 45nm processor products in development. With one of the biggest
advancements in fundamental transistor design in 40 years, Intel 45nm high-k silicon
technology delivers more than a 20 percent improvement in transistor switching speed
and reduces transistor gate leakage more than tenfold.
This transistor breakthrough allows Intel to continue delivering record-breaking PC,
laptop, and server processor speeds well into the future. It also ensures that Moore's Law—the
high-tech industry axiom that transistor counts double about every two years, delivering more
performance and functionality at decreasing cost—thrives well into the next decade.
b) Hyper Threading
i) Performance
The advantages of hyper-threading include improved support for multi-threaded code, the ability
to run multiple threads simultaneously, and improved reaction and response time. According
to Intel, the first implementation used only 5% more die area than the comparable non-
hyperthreaded processor, yet performance was 15–30% better.
Intel claims up to a 30% speed improvement compared with an otherwise identical, non-
simultaneous multithreading Pentium 4. Intel also claims significant performance improvements
with a hyper-threading-enabled Pentium 4 processor in some artificial intelligence algorithms.
The performance improvement seen is very application-dependent, however, and some programs
actually slow down slightly when Hyper Threading Technology is turned on. This is due to the
replay system of the Pentium 4 tying up valuable execution resources, thereby starving the other
thread. (The Pentium 4 Prescott core gained a replay queue, which reduces execution time
needed for the replay system, but this is not enough to completely overcome the performance
hit.) However, any performance degradation is unique to the Pentium 4 (due to various
architectural nuances), and is not characteristic of simultaneous multithreading in general
ii) Details
Hyper-threading works by duplicating certain sections of the processor—those that store the
architectural state—but not duplicating the main execution resources. This allows a hyper-
threading processor to appear as two "logical" processors to the host operating system, allowing
the operating system to schedule two threads or processes simultaneously. When execution
resources would not be used by the current task in a processor without hyper-threading, and
especially when the processor is stalled, a hyper-threading equipped processor can use those
execution resources to execute another scheduled task. (The processor may stall due to a cache
miss, branch misprediction, or data dependency.)
This technology is transparent to operating systems and programs. All that is required to take
advantage of hyper-threading is symmetric multiprocessing (SMP) support in the operating
system, as the logical processors appear as standard separate processors.
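The stall-filling idea above can be shown with a toy cycle count: when one instruction stream is stalled, the other stream's instructions issue instead. The streams and stall patterns are invented, and one-op-per-cycle issue is a drastic simplification of a real superscalar core.

```python
# Toy SMT cycle count: each thread is a list of 'op' / 'stall' slots.
# Each cycle, the first ready (non-stalled) thread issues one op; a
# stalled thread's wait overlaps with the other thread's work.

def run(threads):
    cursors = [0] * len(threads)
    cycles = 0
    while any(c < len(t) for c, t in zip(cursors, threads)):
        cycles += 1
        for i, t in enumerate(threads):
            if cursors[i] < len(t):
                cursors[i] += 1
                if t[cursors[i] - 1] == "op":
                    break   # this cycle issued real work; stop scanning
    return cycles

a = ["op", "stall", "stall", "op"]   # stalls, e.g. on cache misses
b = ["op", "op", "op", "op"]
print(run([a]) + run([b]))   # 8 cycles run one after the other
print(run([a, b]))           # 6 cycles interleaved on one SMT core
```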
iii) Security
In May 2005 Colin Percival presented a paper, Cache Missing for Fun and Profit, demonstrating
that a malicious thread operating with limited privileges can monitor the execution of another
thread through their influence on a shared data cache, allowing for the theft of cryptographic
keys. Note that while the attack described in the paper was demonstrated on an Intel Pentium 4
processor with Hyper-Threading, the same techniques could theoretically apply to any system
where caches are shared between two or more mutually untrusted execution threads; see also
side-channel attack.
Older Netburst Pentium 4 based CPUs use hyper-threading, but Intel's processors based on the
Core microarchitecture do not. However, Intel is using the feature in the newer Atom and Core i7
processors.
Intel reintroduced hyper-threading with the Nehalem (Core i7) in November 2008. Nehalem
contains four cores and runs eight threads.
The Intel Atom is an in-order single-core processor with hyper-threading, for low power
mobile PCs and low-price desktop PCs.
c) SpeedStep
Running a processor at high clock speeds allows for better performance. However, when the
same processor is run at a lower frequency (speed), it generates less heat and consumes less
power. In many cases, the core voltage can also be reduced, further reducing power
consumption and heat generation. This can conserve battery power in notebooks, extend
processor life, and reduce noise generated by variable-speed fans. By using SpeedStep, users
can select the balance of power conservation and performance that best suits them, or even
change the clock speed dynamically as the processor burden changes.
Under older Microsoft Windows operating systems, including Windows 2000 and previous
versions, a special driver and dashboard application were needed to access the SpeedStep
feature. Intel's website specifically states that such drivers must come from the computer
manufacturer; there are no generic drivers supplied by Intel which will enable SpeedStep for
older Windows versions if one cannot obtain a manufacturer's driver.
Under Microsoft Windows XP, SpeedStep support is built into the power management
console under the control panel. In Windows XP a user can regulate the processor's speed
indirectly by changing power schemes. The "Home/Office Desk" scheme disables SpeedStep,
the "Portable/Laptop" scheme enables it, and "Max Battery" uses SpeedStep
to slow the processor to minimal power levels as the battery weakens. The SpeedStep
settings for power schemes, either built-in or custom, cannot be modified from the control
panel's GUI, but can be modified using the POWERCFG.EXE command-line utility.
FEATURES
Price (in US$ for 1,000-unit quantities):
  AMD Phenom II X4 920 (2.8 GHz): $195; 940 (3.0 GHz): $225
  Intel Core i7 920 (2.66 GHz): $284; 940 (2.93 GHz): $562; 965 Extreme (3.2 GHz): $999
6.INTEL VS AMD
It’s no secret that Intel has dominated our performance tests over the past year. First, its Core 2
Duos at 45 nm gave enthusiasts a great platform for aggressive, yet relatively safe overclocking.
The company’s Core 2 Quads cost quite a bit more, but they managed to deliver smoking speeds
in the applications optimized for multi-threaded execution.
The recent Core i7 launch further cemented Intel’s position as the performance champion. Its
Core i7 965 Extreme, clocked at 3.2 GHz, demonstrated gains straight across the board versus its
outgoing flagship, the Core 2 Extreme QX9770. And the Core i7 920, Intel’s sub-$300 entry-
level model running at 2.66 GHz, seems to have little trouble reaching up to 4 GHz on air
cooling.
There was once a time when Intel didn’t handle its technology shifts as smoothly. As recently as
the Pentium 4 Prescott core, Intel struggled to maintain an advantage against AMD’s Athlon 64.
But now, with the marketing of its "tick-tock" approach to rolling out lithography advancements
and micro-architecture tweaks, things have certainly turned around. How is AMD expected to
compete?
Up until now, AMD has relied on the loosely-translated term "value" to keep in the game. On its
own, the Phenom X4-series is a moderate performer. AMD knows this, and has priced the chip
more competitively than Intel’s quad-core offerings to attract attention. However, the Phenom
hasn’t had to exist alone in an ecosystem backed by third-party vendors. It’s instead
complemented by AMD’s own chipsets, mainly the 790GX and 790FX. Of course, those
platforms extend comprehensive CrossFire support for its own graphics cards, which have been
capturing hearts since mid-2008. Combined, AMD’s processors, chipsets, and GPUs have fared
better than any one of those components would have alone. Thus, we’d consider the company’s
efforts to emphasize its Spider platform—the cumulative result of all three puzzle pieces—a
success.
In light of a new competitive challenge—Intel’s Core i7—AMD is revamping its Spider platform
with a new processor and the addition of software able to tie all of the hardware together. As you
no doubt already know from reading Bert’s story, this latest effort is called Dragon.
But we’re not here to rehash the details of Phenom II. Rather, in light of significant
enhancements to the CPU architecture’s overclocking capabilities (and indeed, confirmation
from AMD that all of the "magic" that went into its ACC [Advanced Clock Calibration]
technology is now baked into Phenom II), we’re eager to compare the value of AMD’s fastest 45
nm chip to Intel’s entry-level Core i7 920—the one most enthusiasts would be likely to eye as an
overclocking contender.
7.FUTURE SCOPES
The next step for Intel is another "tick" development. That means reducing transistors down to 32
nanometers wide. Producing one microprocessor with transistors that size is an amazing achievement. But
what's even more daunting is finding a way to mass produce millions of chips with transistors that small
in an efficient, reliable and cost-effective way.
The codename for the next Intel chip is Westmere. Westmere will use the same
microarchitecture as Nehalem but will have the 32-nanometer transistors. That means Westmere
will be more powerful than Nehalem. But that doesn't mean Westmere's architecture will make
the most sense for a microprocessor with transistors that small.
Westmere will also bring new capabilities:
• New AES instructions (AES-NI) will allow the processor to perform hardware-accelerated
encryption, not only resulting in faster execution but also protecting against software-targeted
attacks.
• Integrated graphics, released at the same time as the processor.
• Improved virtualization latency.
• A new virtualization capability, "VMX Unrestricted mode support", which allows 16-bit guests
to run (real mode and big real mode).
Without a dramatic change to the way Intel designs transistors, there's a danger that Moore's Law
will finally become moot.
Still, engineers tend to think of ways around problems that seem completely insurmountable.
Even if transistors can't get any smaller after one or two more generations, it won't be the end of
electronics. It just might mean we advance a little more slowly than we're accustomed to.
9.CONCLUSION
Developing a microprocessor takes years. While Intel unveiled Nehalem in 2008, the project was
more than five years old at the time. That means even as people wait for an announced microchip
to make its way into various electronic devices and computers, manufacturers like Intel are
working on the next step in microprocessor evolution. They have to, if they want to keep up with
Moore's Law.
10.REFERENCES
• www.tigerdirect.com
• www.howstuffworks.com
• www.tomshardware.com
• www.intel.com
• www.wikipedia.org