Electronic component: PC7447A

5387A-HIREL-04/04
Features
3000 Dhrystone 2.1 MIPS at 1.3 GHz
Selectable Bus Clock (30 CPU Bus Dividers up to 28x)
Selectable MPx/60x Interface Voltage (1.8V, 2.5V)
P_D: Typically 18W at 1.33 GHz at V_DD = 1.3V; 8.0W at 1 GHz at V_DD = 1.1V; Full Operating Conditions
Nap, Doze and Sleep Power Saving Modes
Superscalar (Four Instructions Fetched Per Clock Cycle)
4 GB Direct Addressing Range
Virtual Memory: 4 Petabytes (2^52)
64-bit Data and 36-bit Address Bus Interface
Integrated L1: 32 KB Instruction and 32 KB Data Cache
Integrated L2: 512 KB
11 Independent Execution Units and 3 Register Files
Write-back and Write-through Operations
f_INT Max = 1.33 GHz (1.42 GHz to be Confirmed)
f_BUS Max = 133 MHz/166 MHz
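The addressing and clock figures in the list above follow from simple arithmetic. A minimal sketch in Python, assuming byte addressing, binary prefixes (GiB, PiB), 32-bit effective addresses for the 4 GB range, and a core clock equal to the bus clock times the selected multiplier. The helper names are illustrative and do not come from the datasheet.

```python
# Illustrative arithmetic only; function names are hypothetical.

GIB = 1 << 30  # binary gigabyte
PIB = 1 << 50  # binary petabyte


def address_space_bytes(bits: int) -> int:
    """Bytes addressable with `bits` address bits (byte addressing assumed)."""
    return 1 << bits


def core_multiplier(core_mhz: float, bus_mhz: float) -> float:
    """Core-to-bus clock ratio implied by a core/bus frequency pair."""
    return core_mhz / bus_mhz


effective_gib = address_space_bytes(32) // GIB  # 32-bit effective addresses
physical_gib = address_space_bytes(36) // GIB   # 36-bit address bus
virtual_pib = address_space_bytes(52) // PIB    # 2^52 virtual address space

print(effective_gib, "GiB direct addressing range")
print(physical_gib, "GiB reachable over the 36-bit address bus")
print(virtual_pib, "PiB of virtual memory")
print(core_multiplier(1330, 133), "x multiplier for 1.33 GHz on a 133 MHz bus")
```

Under these assumptions, a 1.33 GHz core on a 133 MHz bus corresponds to a 10x ratio, which is within the stated range of dividers up to 28x.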
Description
The PC7447A host processor is a high-performance, low-power, 32-bit implementation
of the PowerPC Reduced Instruction Set Computer (RISC) architecture combined with a
full 128-bit implementation of Motorola's AltiVec(TM) technology.
This microprocessor is ideal for leading-edge embedded computing and signal
processing applications. The PC7447A features 512 KB of on-chip L2 cache. The
PC7447A microprocessor has no backside L3 cache, allowing for a smaller package
designed as a pin-for-pin replacement for the PC7447 microprocessor. This device
benefits from a silicon-on-insulator (SOI) CMOS process technology, engineered to
help deliver tremendous power savings without sacrificing speed. A low-power version
of the PC7447A microprocessor is also available.
Figure 1 shows a block diagram of the PC7447A. The core is a high-performance
superscalar design supporting a double-precision floating-point unit and a SIMD
multimedia unit. The memory storage subsystem supports the MPX bus protocol and a
subset of the 60x bus protocol to the main memory and other system resources.
Note that the PC7447A is a footprint-compatible, drop-in replacement in a PC7447
application if the core power supply is 1.3V.
Screening
Full Military Temperature Range (T_j = -55°C, +125°C)
Industrial Temperature Range (T_j = -40°C, +110°C)
GH suffix
HITCE 360 Ceramic Ball Grid Array (TBC)
PowerPC 7447A RISC Microprocessor
PC7447A
Preliminary
Rev. 5387A-HIREL-04/04
Block Diagram
Figure 1. PC7447A Microprocessor Block Diagram
[Figure 1 is a block diagram; only its text labels survived extraction. Recoverable labels, grouped by block:]
Additional Features: Time Base Counter/Decrementer, Clock Multiplier, JTAG/COP Interface, Thermal/Power Management, Performance Monitor, Dynamic Frequency Switching (DFS), Temperature Diode
Instruction Unit: Fetcher, Branch Processing Unit (BTIC 128-Entry, BHT 2048-Entry, CTR, LR), Instruction Queue (12-Word), Dispatch Unit
Issue Queues: VR Issue (4-Entry/2-Issue), GPR Issue (6-Entry/3-Issue), FPR Issue (2-Entry/1-Issue)
Execution Units: Vector Permute Unit, Vector Integer Unit 1, Vector Integer Unit 2, Vector FPU, Integer Unit 1 (3), Integer Unit 2, Floating-Point Unit, Load/Store Unit (EA Calculation, Vector Touch Queue, Vector Touch Engine, Finished Stores, Completed Stores, Load Miss), with reservation stations ahead of the execution units
Register Files: VR File, GPR File, and FPR File, each with 16 Rename Buffers
Completion Unit: Completion Queue (16-Entry), completes up to three instructions per clock
MMUs and L1 Caches: Instruction MMU (SRs (Shadow), IBAT Array, 128-Entry ITLB), Data MMU (SRs (Original), DBAT Array, 128-Entry DTLB), 32-Kbyte I Cache and 32-Kbyte D Cache with Tags
Memory Subsystem: L1 Service Queues (L1 Load Queue (LLQ), L1 Store Queue (LSQ)), L1 Load Miss (5), Instruction Fetch (2), Cacheable Store Request (1), L2 Prefetch (3), L1 Castouts (4), L1 Push, Snoop Push/Interventions, L2 Store Queue (L2SQ), Bus Accumulator, 512-Kbyte Unified L2 Cache Controller (Line Tags, Block 0 (32-Byte) Status, Block 1 (32-Byte) Status)
System Bus Interface: Load Queue (11), Castout Queue (9) / Push Queue (10), Bus Store Queue, 36-bit Address Bus, 64-bit Data Bus
Notes: The castout queue and push queue share resources, for a combined total of 10 entries.
The castout queue itself is limited to 9 entries, ensuring 1 entry will be available for a push.
General Parameters
Table 1 provides a summary of the general parameters of the PC7447A.
Features
This section summarizes features of the PC7447A implementation of the PowerPC
architecture.
Major features of the PC7447A are as follows:
High-performance, superscalar microprocessor
Up to four instructions can be fetched from the instruction cache at a time
Up to 12 instructions can be in the instruction queue (IQ)
Up to 16 instructions can be at some stage of execution simultaneously
Single-cycle execution for most instructions
One instruction per clock cycle throughput for most instructions
Seven-stage pipeline control
Eleven independent execution units and three register files
Branch processing unit (BPU) features static and dynamic branch prediction
128-entry (32-set, four-way set-associative) branch target instruction cache
(BTIC), a cache of branch instructions that have been encountered in
branch/loop code sequences. If a target instruction is in the BTIC, it is
fetched into the instruction queue a cycle sooner than it can be made
available from the instruction cache. Typically, a fetch that hits the BTIC
provides the first four instructions in the target stream.
2048-entry branch history table (BHT) with two bits per entry for four levels of
prediction: not taken, strongly not taken, taken, and strongly taken
Up to three outstanding speculative branches
Branch instructions that do not update the count register (CTR) or link
register (LR) are often removed from the instruction stream
Eight-entry link register stack to predict the target address of Branch
Conditional to Link Register (BCLR) instructions
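The two-bit, four-level branch history scheme described above is commonly modeled as a table of saturating counters. The following Python sketch illustrates that general technique. The counter encoding, the index function, and the initial state are assumptions made for illustration and are not specified in this datasheet.

```python
# Sketch of a branch history table built from 2-bit saturating counters.
# Encoding, indexing and initial state are assumptions, not datasheet facts.

BHT_ENTRIES = 2048  # entry count taken from the feature list

# Assumed encoding of the four prediction levels.
STATE_NAMES = ("strongly not taken", "not taken", "taken", "strongly taken")


class BranchHistoryTable:
    def __init__(self, entries: int = BHT_ENTRIES, initial: int = 1):
        # Every counter starts in the assumed "not taken" state.
        self.counters = [initial] * entries

    def _index(self, branch_addr: int) -> int:
        # Assumed index: drop the two low bits (4-byte instructions),
        # then wrap into the table.
        return (branch_addr >> 2) % len(self.counters)

    def predict(self, branch_addr: int) -> bool:
        """Return True when the branch is predicted taken."""
        return self.counters[self._index(branch_addr)] >= 2

    def update(self, branch_addr: int, taken: bool) -> None:
        """Move the counter one step toward the observed outcome, saturating."""
        i = self._index(branch_addr)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def state(self, branch_addr: int) -> str:
        return STATE_NAMES[self.counters[self._index(branch_addr)]]


bht = BranchHistoryTable()
loop_branch = 0x1000  # hypothetical branch address
for outcome in (True, True, True, False):  # three loop iterations, then exit
    bht.update(loop_branch, outcome)
print(bht.state(loop_branch))
```

In this model a loop branch that is taken three times and then falls through once ends in the "taken" state, so a single loop exit does not flip the next prediction.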
Table 1. Device Parameters
Parameter            Description
Technology           0.13 µm CMOS, nine-layer metal
Die size             7.3 mm × 9.32 mm
Transistor count     48.6 million
Logic design         Fully-static
Packages             Surface mount 360 ceramic ball grid array (HITCE)
Core power supply    1.3V ± 50 mV DC nominal
I/O power supply     1.8V ± 5% DC, or 2.5V ± 5% DC
Four integer units (IUs) that share 32 GPRs for integer operands
Three identical IUs (IU1a, IU1b, and IU1c) can execute all integer
instructions except multiply, divide, and move to/from special-purpose
register instructions.
IU2 executes miscellaneous instructions including the CR logical operations,
integer multiplication and division instructions, and move to/from special-
purpose register instructions.
Five-stage FPU and a 32-entry FPR file
Fully IEEE 754-1985-compliant FPU for both single- and double-precision
operations
Supports non-IEEE mode for time-critical operations
Hardware support for denormalized numbers
Thirty-two 64-bit FPRs for single- or double-precision operands
Four vector units and 32-entry vector register file (VRs)
Vector permute unit (VPU)
Vector integer unit 1 (VIU1) handles short-latency AltiVec integer
instructions, such as vector add instructions (for example, vaddsbs, vaddshs,
and vaddsws).
Vector integer unit 2 (VIU2) handles longer-latency AltiVec integer
instructions, such as vector multiply add instructions (for example,
vmhaddshs, vmhraddshs, and vmladduhm).
Vector floating-point unit (VFPU)
Three-stage load/store unit (LSU)
Supports integer, floating-point, and vector instruction load/store traffic
Four-entry vector touch queue (VTQ) supports all four architected
AltiVec data stream operations
Three-cycle GPR and AltiVec load latency (byte, half word, word, vector) with
one-cycle throughput
Four-cycle FPR load latency (single, double) with one-cycle throughput
No additional delay for misaligned access within double-word boundary
Dedicated adder calculates effective addresses (EAs)
Supports store gathering
Performs alignment, normalization, and precision conversion for floating-
point data
Executes cache control and TLB instructions
Performs alignment, zero padding, and sign extension for integer data
Supports hits under misses (multiple outstanding misses)
Supports both big- and little-endian modes, including misaligned little-endian
accesses
Three issue queues, FIQ, VIQ, and GIQ, can accept as many as one, two, and three
instructions, respectively, in a cycle. Instruction dispatch requires the following:
Instructions can only be dispatched from the three lowest IQ entries: IQ0,
IQ1, and IQ2.
A maximum of three instructions can be dispatched to the issue queues per
clock cycle.
Space must be available in the CQ for an instruction to dispatch (this
includes instructions that are assigned a space in the CQ but not in an issue
queue).
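The dispatch rules above can be sketched as a single-cycle model in Python. This is an illustration under stated assumptions, not the processor's actual logic. It assumes dispatch is strictly in order (a blocked instruction stalls the ones behind it), it ignores issue-queue occupancy, and it counts instructions that need only a CQ entry against the three-per-cycle limit. The function name and the use of `None` for such instructions are hypothetical.

```python
from collections import deque

MAX_DISPATCH_PER_CYCLE = 3                         # from the dispatch rules
ACCEPT_PER_CYCLE = {"FIQ": 1, "VIQ": 2, "GIQ": 3}  # per-queue acceptance limits


def dispatch_cycle(iq: deque, cq_free: int) -> list:
    """Dispatch from the head of the instruction queue for one cycle.

    `iq` holds the target issue queue of each instruction ("FIQ", "VIQ",
    "GIQ"), or None for an instruction that only needs a CQ entry.
    `cq_free` is the number of free completion-queue entries.
    Returns the dispatched targets; `iq` is modified in place.
    """
    accepted = {name: 0 for name in ACCEPT_PER_CYCLE}
    dispatched = []
    # In-order dispatch plus the limit of three means only IQ0, IQ1 and IQ2
    # are ever considered within one cycle.
    while iq and len(dispatched) < MAX_DISPATCH_PER_CYCLE and cq_free > 0:
        target = iq[0]
        if target is not None:
            if accepted[target] >= ACCEPT_PER_CYCLE[target]:
                break  # assumed: a blocked instruction stalls younger ones
            accepted[target] += 1
        dispatched.append(iq.popleft())
        cq_free -= 1
    return dispatched


iq = deque(["GIQ", "FIQ", "FIQ", "GIQ"])
print(dispatch_cycle(iq, cq_free=16))  # second FIQ instruction must wait
print(list(iq))
```

In the example the second floating-point instruction stays in the queue because the FIQ accepts only one instruction per cycle, and under the in-order assumption the integer instruction behind it waits as well.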
Rename buffers
16 GPR rename buffers
16 FPR rename buffers
16 VR rename buffers
Dispatch unit
Decode/dispatch stage fully decodes each instruction
Completion unit
The completion unit retires an instruction from the 16-entry completion
queue (CQ) when all instructions ahead of it have been completed, the
instruction has finished execution, and no exceptions are pending
Guarantees sequential programming model (precise exception model)
Monitors all dispatched instructions and retires them in order
Tracks unresolved branches and flushes instructions after a mispredicted
branch
Retires as many as three instructions per clock cycle
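The in-order retirement behavior of the completion unit can be illustrated with a small Python model. It is a sketch under assumptions: the entry fields and function name are hypothetical, and exception handling is reduced to stopping retirement at the offending entry.

```python
from collections import deque
from dataclasses import dataclass

CQ_ENTRIES = 16           # completion queue depth from the feature list
MAX_RETIRE_PER_CYCLE = 3  # retirement width from the feature list


@dataclass
class CQEntry:
    tag: str                 # hypothetical label for the instruction
    finished: bool = False   # execution has finished
    exception: bool = False  # an exception is pending for this entry


def retire_cycle(cq: deque, max_retire: int = MAX_RETIRE_PER_CYCLE) -> list:
    """Retire instructions from the head of the queue for one cycle.

    Retirement is strictly in program order: it stops at the first entry
    that has not finished or has a pending exception.
    """
    retired = []
    while cq and len(retired) < max_retire:
        head = cq[0]
        if not head.finished or head.exception:
            break
        retired.append(cq.popleft().tag)
    return retired


cq = deque([
    CQEntry("a", finished=True),
    CQEntry("b", finished=True),
    CQEntry("c"),                 # still executing
    CQEntry("d", finished=True),  # finished out of order, must wait for "c"
])
print(retire_cycle(cq))  # "a" and "b" retire; "d" waits behind "c"
cq[0].finished = True
print(retire_cycle(cq))  # now "c" and "d" retire
```

Keeping finished instructions such as "d" in the queue until everything ahead of them has retired is what preserves the sequential programming model mentioned in the list.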
Separate on-chip L1 instruction and data caches (Harvard Architecture)
32-Kbyte, eight-way set-associative instruction and data caches
Pseudo least-recently-used (PLRU) replacement algorithm
32-byte (eight-word) L1 cache block
Physically indexed/physical tags
Cache write-back or write-through operation programmable on a per-page or
per-block basis
Instruction cache can provide four instructions per clock cycle; data cache
can provide four words per clock cycle
Caches can be disabled in software
Caches can be locked in software
MESI data cache coherency maintained in hardware
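The L1 cache parameters above determine how a physical address splits into tag, set index, and block offset. The following Python sketch works through that arithmetic, assuming power-of-two sizes and the conventional layout (offset in the low bits, index above it, tag in the remaining bits). The function names and the sample address are illustrative.

```python
def cache_geometry(size_bytes: int, ways: int, block_bytes: int):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes all three parameters are powers of two.
    """
    sets = size_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_physical_address(addr: int, offset_bits: int, index_bits: int):
    """Split a physical address into (tag, set index, block offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# L1 caches: 32 Kbytes, eight-way set-associative, 32-byte blocks.
sets, offset_bits, index_bits = cache_geometry(32 * 1024, 8, 32)
tag_bits = 36 - offset_bits - index_bits  # 36-bit physical address
print(sets, offset_bits, index_bits, tag_bits)

# Hypothetical physical address used only as an example.
tag, index, offset = split_physical_address(0x12345678, offset_bits, index_bits)
print(hex(tag), index, offset)
```

With these parameters the index and offset together cover 12 address bits, which equals the 4-Kbyte page offset of the PowerPC architecture, so the set index can be taken from address bits that translation does not change.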