
Meita Dian Hapsari - 1104114135

Superscalar Microarchitecture
Instruction Fetch & Branch Prediction
The fetch phase must fetch multiple instructions per cycle from the cache to keep a
steady feed of instructions going to the other stages. The number of instructions fetched
per cycle should match or exceed the peak instruction decode & execution rate (to
compensate for cache misses and cycles in which the maximum number of instructions cannot
be fetched). For conditional branches, the fetch mechanism must be redirected to fetch
instructions from the branch target.
Four steps in processing a conditional branch instruction:
1. Recognizing that the instruction is a conditional branch
2. Determining the branch outcome (taken or not taken)
3. Computing the branch target
4. Transferring control by redirecting instruction fetch
Dynamic prediction outperforms static prediction, even though it is more complex.
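Step 2 above (determining the outcome) is where dynamic prediction helps. A minimal sketch of one common dynamic scheme, a 2-bit saturating-counter predictor (the table size, indexing, and branch pattern below are illustrative assumptions, not from the notes):

```python
# Sketch of a 2-bit saturating-counter branch predictor (illustrative).
# Each branch PC indexes a small table of counters 0..3:
# values 0-1 predict "not taken", values 2-3 predict "taken".

class TwoBitPredictor:
    def __init__(self, entries=1024):
        self.entries = entries
        self.table = [1] * entries   # start weakly "not taken"

    def _index(self, pc):
        return pc % self.entries

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2   # True = taken

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)

# A loop branch that is taken 9 times and then falls through, repeated
# three times, mispredicts only on warm-up and on the loop exits:
p = TwoBitPredictor()
pattern = [True] * 9 + [False]
mispredicts = 0
for _ in range(3):
    for taken in pattern:
        if p.predict(0x400) != taken:
            mispredicts += 1
        p.update(0x400, taken)
print(mispredicts)   # → 4
```

The 2-bit hysteresis is what keeps the single not-taken loop exit from flipping the prediction for the next loop entry, which a 1-bit predictor would do.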
Multicore Architecture
- A multicore design is one in which a single physical processor contains the core logic
of more than one processor
- Goal: enable a system to run more tasks simultaneously, achieving greater
overall performance
Hyper-threading or multicore?
- Early PCs were capable of doing only a single task at a time
- Intel's implementation of multithreading is called Hyper-Threading


Multicore Processor
- Each core has its own execution pipeline
- There is no hard limit on the number of cores that can be placed in a single chip
- Two cores run at slower speeds and lower temperatures
- But the combined throughput is better than that of a single processor
The fundamental relationship between frequency and power can be used to multiply the
number of cores from 2 to 4, 8, or even higher
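The frequency/power trade-off above can be sketched numerically. Dynamic power scales roughly as C·V²·f, and voltage can drop along with frequency; the specific numbers below (80% frequency and voltage) are illustrative assumptions, not measurements:

```python
# Illustrative back-of-envelope: dynamic power ~ C * V^2 * f, and in the
# classic DVFS regime voltage can be lowered along with frequency.

def dynamic_power(c, v, f):
    return c * v**2 * f

C = 1.0   # effective switched capacitance (arbitrary units)

# One core at full frequency and voltage:
single = dynamic_power(C, v=1.0, f=1.0)

# Two cores, each at 80% frequency and 80% voltage:
dual = 2 * dynamic_power(C, v=0.8, f=0.8)

print(single, dual)        # dual ≈ 1.02x the power of the single core...
print(2 * 0.8 / 1.0)       # ...for ~1.6x the peak throughput
```

This is why two slower, cooler cores can beat one fast core in combined throughput at roughly the same power budget.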

Grid Computing
Grid computing is the collection of computer resources from multiple locations to reach a
common goal. The grid can be thought of as a distributed system with non-interactive
workloads that involve a large number of files. Grid computing is distinguished from
conventional high-performance computing systems, such as cluster computing, in that in a
grid each node can be set to perform a different task or application.
Advantages of Grid Computing
Core networking technology now advances at a much faster rate than
microprocessor speeds
Exploiting underutilized resources
Parallel CPU capacity
Virtual resources and virtual organizations for collaboration
Access to additional resources
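The first two advantages (fast networks, underutilized resources) can be illustrated with a toy scheduler that farms independent, non-interactive work units out to whatever nodes are available. The node names, round-robin placement, and squared-number "job" are assumptions made up for the sketch; a real grid would use middleware such as the Globus Toolkit:

```python
# Toy grid-style work distribution: independent jobs assigned round-robin
# to a pool of "nodes" (here simulated with threads).
from concurrent.futures import ThreadPoolExecutor

NODES = ["node-a", "node-b", "node-c"]   # hypothetical idle resources

def run_on_node(node, task_id):
    # Stand-in for shipping a job to a remote machine and collecting output.
    return (node, task_id, task_id ** 2)

with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
    futures = [pool.submit(run_on_node, NODES[i % len(NODES)], i)
               for i in range(6)]
    results = [f.result() for f in futures]

for node, tid, out in results:
    print(node, tid, out)
```

Because the jobs do not interact, adding nodes scales throughput almost linearly, which is exactly the workload shape the grid definition above describes.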
Types of Resources
Computation
Storage
Communications
Software and licenses
Special equipment, capacities, architectures, and policies
Security
Access policy - What is shared? Who is allowed to share? When can sharing occur?
Authentication - How do you identify a user or resource?
Authorization - How do you determine whether a given operation is consistent
with the rules?
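The three questions map directly onto a policy check. A toy illustration (the policy table, tokens, and users are invented for the example; real grids use mechanisms such as X.509 certificates for authentication):

```python
# Toy model of the three security questions above.
# Access policy: what is shared, by whom, for which operations.
POLICY = {
    "cpu-cluster": {"alice": {"submit", "query"}, "bob": {"query"}},
}

def authenticate(token):
    # Stand-in for real authentication (e.g. certificate verification).
    return {"tok-alice": "alice", "tok-bob": "bob"}.get(token)

def authorized(user, resource, operation):
    # Is this operation consistent with the rules for this user/resource?
    return operation in POLICY.get(resource, {}).get(user, set())

user = authenticate("tok-bob")
print(authorized(user, "cpu-cluster", "submit"))   # bob may only query
```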

Grid Architecture
Fabric layer: Provides the resources to which shared access is mediated by Grid
protocols.
Connectivity layer: Defines the core communication and authentication protocols
required for grid-specific network functions.
Resource layer: Defines protocols, APIs, and SDKs for secure negotiation, initiation,
monitoring, control, accounting, and payment of sharing operations on individual
resources.
Collective layer: Contains protocols and services that capture interactions among a
collection of resources.
Application layer: User applications that operate within a VO (virtual organization)
environment.

GRID            INTERNET
Application
Collective
Resource
Connectivity    Transport
Fabric          Internet

Key Components
1. Portal or user interface
2. Security
3. Broker (MDS)
4. Scheduler
5. Data management (GASS)
6. Job and resource management (GRAM)

Globus Toolkit 4
An open-source toolkit for building grid computing systems, provided by the Globus
Alliance.
Example applications built with GT4:
- Southern California Earthquake Center (visualizing earthquake simulation data)
- Computational scientists at Brown University (simulating the flow of blood)
- etc.

Cell Processor
Overview
- A chip with one PowerPC hyper-threaded core called the PPE and 8 specialized cores called SPEs
- The challenge solved by the Cell was putting all these cores together
Cell Processor Components
- PowerPC Processor Element (PPE), made of 2 main units:
1. The Power Processor Unit (PPU)
2. The Power Processor Storage Subsystem (PPSS)
The PPU is hyper-threaded, supports 2 simultaneous threads, and consists of:
o A full set of 64-bit PowerPC registers
o 32 128-bit vector multimedia registers
o A 32 KB L1 instruction cache
o A 32 KB L1 data cache
The PPSS handles all memory requests from the PPE and requests made to the PPE by other
processors or I/O devices, and consists of:
o A unified 512 KB L2 instruction & data cache
o Various queues
o A bus interface unit that handles bus arbitration and pacing on the Element
Interconnect Bus (EIB)

- Synergistic Processor Element (SPE)
o Each Cell has 8 SPEs
o They are 128-bit RISC processors
o Each consists of 2 main units:
1. The Synergistic Processor Unit (SPU)
2. The Memory Flow Controller (MFC)
- The SPU deals with control & execution
- The SPU implements a SIMD instruction set specific to the Cell
- Each SPU is independent & has its own program counter
- Instructions are fetched into its local store
- Data are loaded from and stored to the local store
- The Memory Flow Controller is the interface between the SPU and the rest of the Cell chip
- The MFC interfaces the SPU with the EIB
- In addition to a whole set of MMIO registers, it contains a DMA controller
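The SPU/MFC split described above can be modeled with a toy sketch: the SPU computes only on data already in its small local store, and the MFC stages data in and out of main memory via DMA. The sizes, class names, and doubling "computation" below are assumptions made up for illustration (a real SPE has a 256 KB local store and far richer DMA semantics):

```python
# Toy model of the SPE memory model: SPU touches only its local store;
# the MFC moves blocks between main memory and the local store via DMA.

MAIN_MEMORY = list(range(32))    # stand-in for system memory
LOCAL_STORE_SIZE = 8             # real SPEs have a 256 KB local store

class MFC:
    """Moves blocks between main memory and an SPE's local store."""
    def dma_get(self, addr, size):
        assert size <= LOCAL_STORE_SIZE
        return MAIN_MEMORY[addr:addr + size]

    def dma_put(self, addr, block):
        MAIN_MEMORY[addr:addr + len(block)] = block

class SPU:
    """Computes only on data already resident in the local store."""
    def __init__(self, mfc):
        self.mfc = mfc
        self.local_store = []

    def process(self, addr, size):
        self.local_store = self.mfc.dma_get(addr, size)        # stage in
        self.local_store = [x * 2 for x in self.local_store]   # SIMD-ish work
        self.mfc.dma_put(addr, self.local_store)               # stage out

spu = SPU(MFC())
spu.process(0, 8)
print(MAIN_MEMORY[:8])   # → [0, 2, 4, 6, 8, 10, 12, 14]
```

The point of the design is that the SPU never stalls on main memory directly; software explicitly overlaps MFC DMA transfers with computation on data already in the local store.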
