Laboratorio de
Tecnologías de Información
Arithmetic and Logic Unit
First Part
Arquitectura de Computadoras
Arturo Díaz Pérez
Centro de Investigación y de Estudios Avanzados del IPN
Laboratorio de Tecnologías de Información
[email protected]
Arquitectura de Computadoras
ALU1- 1
Typical Operations
♦ Data Movement
■ Load (from memory)
■ Store (to memory)
■ memory-to-memory move
■ register-to-register move
■ input (from I/O device)
■ output (to I/O device)
■ push, pop (to/from stack)
♦ Arithmetic
■ integer (binary + decimal) or FP
■ Add, Subtract, Multiply, Divide
♦ Shift
■ shift left/right, rotate left/right
♦ Logical
■ not, and, or, set, clear
Operands for ALU instructions
♦ ALU instructions combine operands (e.g. ADD)
♦ Number of explicit operands
■ Two - destination equals one source
■ Three - orthogonal
MIPS Addressing Modes/Instruction Formats
• All instructions are 32 bits wide
• Addressing modes and the instruction fields they use:
Register (direct): op | rs | rt | rd; the operand is in a register
Immediate: op | rs | rt | immed; the operand is the immed field
Base+index: op | rs | rt | immed; the memory address is register + immed
PC-relative: op | rs | rt | immed; the memory address is PC + immed
• Register Indirect?
MIPS: Register State
♦ 32 integer registers
■ $0 is hardwired to 0
■ $31 is the return address register
■ software convention for other registers
♦ 32 single-precision FP registers or 16 double-precision FP registers
♦ PC and other special registers
MIPS I Operation Overview
♦ Arithmetic/Logical:
■ Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU
■ AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI
■ SLL, SRL, SRA, SLLV, SRLV, SRAV
♦ Memory Access:
■ LB, LBU, LH, LHU, LW, LWL, LWR
■ SB, SH, SW, SWL, SWR
MIPS arithmetic instructions
Instruction          Example           Meaning                         Comments
add                  add $1,$2,$3      $1 = $2 + $3                    3 operands; exception possible
subtract             sub $1,$2,$3      $1 = $2 - $3                    3 operands; exception possible
add immediate        addi $1,$2,100    $1 = $2 + 100                   + constant; exception possible
add unsigned         addu $1,$2,$3     $1 = $2 + $3                    3 operands; no exceptions
subtract unsigned    subu $1,$2,$3     $1 = $2 - $3                    3 operands; no exceptions
add imm. unsign.     addiu $1,$2,100   $1 = $2 + 100                   + constant; no exceptions
multiply             mult $2,$3        Hi, Lo = $2 x $3                64-bit signed product
multiply unsigned    multu $2,$3       Hi, Lo = $2 x $3                64-bit unsigned product
divide               div $2,$3         Lo = $2 ÷ $3, Hi = $2 mod $3    Lo = quotient, Hi = remainder
divide unsigned      divu $2,$3        Lo = $2 ÷ $3, Hi = $2 mod $3    Unsigned quotient & remainder
Move from Hi         mfhi $1           $1 = Hi                         Used to get copy of Hi
Move from Lo         mflo $1           $1 = Lo                         Used to get copy of Lo
Which add for address arithmetic? Which add for integers?
MIPS logical instructions
Instruction                   Example          Meaning            Comment
and                           and $1,$2,$3     $1 = $2 & $3       3 reg. operands; Logical AND
or                            or $1,$2,$3      $1 = $2 | $3       3 reg. operands; Logical OR
xor                           xor $1,$2,$3     $1 = $2 ⊕ $3       3 reg. operands; Logical XOR
nor                           nor $1,$2,$3     $1 = ~($2 | $3)    3 reg. operands; Logical NOR
and immediate                 andi $1,$2,10    $1 = $2 & 10       Logical AND reg, constant
or immediate                  ori $1,$2,10     $1 = $2 | 10       Logical OR reg, constant
xor immediate                 xori $1,$2,10    $1 = $2 ⊕ 10       Logical XOR reg, constant
shift left logical            sll $1,$2,10     $1 = $2 << 10      Shift left by constant
shift right logical           srl $1,$2,10     $1 = $2 >> 10      Shift right by constant
shift right arithm.           sra $1,$2,10     $1 = $2 >> 10      Shift right (sign extend)
shift left logical (var.)     sllv $1,$2,$3    $1 = $2 << $3      Shift left by variable
shift right logical (var.)    srlv $1,$2,$3    $1 = $2 >> $3      Shift right by variable
shift right arithm. (var.)    srav $1,$2,$3    $1 = $2 >> $3      Shift right arith. by variable
Details of the MIPS instruction set
♦ Register zero always has the value zero (even if you try to write it)
♦ Branch/jump and link put the return address, PC+4 or PC+8, into the link register (R31) (depends on logical vs physical architecture)
♦ All instructions change all 32 bits of the destination register (including lui, lb, lh), and all read all 32 bits of their sources (add, sub, and, or, …)
♦ Immediate operands of arithmetic and logical instructions are extended as follows:
■ logical immediate operands are zero extended to 32 bits
■ arithmetic immediate operands are sign extended to 32 bits (including addiu)
♦ The data loaded by the byte and halfword load instructions is extended as follows:
■ lbu, lhu are zero extended
■ lb, lh are sign extended
♦ Overflow can occur in these arithmetic and logical instructions:
■ add, sub, addi
■ it cannot occur in addu, subu, addiu, and, or, xor, nor, shifts, mult, multu, div, divu
MIPS: Instruction Set Format
■ load/store architecture with 3 explicit operands (ALU ops)
■ fixed 32-bit instructions
■ 3 instruction formats
» R-Type
» I-Type
» J-Type
■ 6 instruction set groups:
» load/store - data movement operations
» computational - arithmetic, logical, and shift operations
» jump/branch - including call and returns
» coprocessor - FP instructions
» coprocessor0 - memory management and exception handling
» special - accessing special registers, system calls, breakpoint instructions, etc.
R2000/3000 Instruction Formats
♦ R-type (register)
e.g. add $8, $17, $18    # $8 = $17 + $18
Bits     31-26    25-21   20-16   15-11   10-6    5-0
Field    OpCode   rs      rt      rd      shamt   funct
Value    0        17      18      8       0       32
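As a rough sketch of how the six R-type fields pack into one 32-bit word, the following Python helper assembles the add example above. The function name `encode_r_type` is hypothetical and is used only for illustration.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the R-type fields into a single 32-bit instruction word."""
    return ((opcode & 0x3F) << 26      # bits 31-26
            | (rs & 0x1F) << 21        # bits 25-21
            | (rt & 0x1F) << 16        # bits 20-16
            | (rd & 0x1F) << 11        # bits 15-11
            | (shamt & 0x1F) << 6      # bits 10-6
            | (funct & 0x3F))          # bits 5-0


# add $8, $17, $18  ->  rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32
word = encode_r_type(rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:#010x}")  # prints 0x02324020
```

Each field is masked to its width before shifting, so an out-of-range value cannot spill into a neighbouring field.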
R2000/3000 Instruction Formats
• I-type (immediate)
e.g. addi $8, $17, -44      # $8 = $17 - 44
     lw   $8, -44($17)      # $8 = M[$17 - 44]
     beq  $17, $8, label    # if ($8 == $17) go to label
Bits     31-26    25-21   20-16   15-0
Field    OpCode   rs      rt      immediate
Value    “op”     17      8       -44
R2000/3000 Instruction Formats
• J-type (jump)
e.g. jal label    # call label; $31 = $pc + 8
Bits     31-26    25-0
Field    OpCode   target
Value    3        target
5 Steps of DLX Datapath
[Figure: DLX datapath in five stages: Instruction Fetch; Instruction Decode/Register Fetch; Execute/Address Calculation; Memory Access; Write Back. Labelled elements: PC, NPC, instruction memory, IR, registers (A, B), sign extend (16 to 32 bits), ALU and ALU output, "Zero?" test, adders (including PC + 4), multiplexers, data memory, SMD, LMD.]
Useful Circuits for Interconnection
♦ Four common and useful MSI circuits are:
■ Decoder
■ Demultiplexer
■ Encoder
■ Multiplexer
♦ Block-level outlines of MSI circuits:
■ decoder: code in, entity out
■ encoder: entity in, code out
■ mux: data inputs plus select lines, one output
■ demux: one data input plus select lines, several outputs
Decoders
♦ Codes are frequently used to represent entities.
♦ These codes can be identified (or decoded) using a decoder. Given a code, identify the entity.
♦ Convert binary information from n input lines to (max. of) 2^n output lines.
♦ Known as n-to-m-line decoder, or simply n:m or n×m decoder (m ≤ 2^n).
♦ May be used to generate 2^n (or fewer) minterms of n input variables.
Decoders
♦ Example: if codes 00, 01, 10, 11 are used to identify four light bulbs, we may use a 2-bit decoder:
[Figure: a 2×4 decoder with the 2-bit code on inputs X, Y and outputs F0, F1, F2, F3 driving Bulb 0 to Bulb 3.]
♦ This is a 2×4 decoder which selects an output line based on the 2-bit code supplied.
♦ Truth table:
X Y | F0 F1 F2 F3
0 0 | 1  0  0  0
0 1 | 0  1  0  0
1 0 | 0  0  1  0
1 1 | 0  0  0  1
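The truth table above can be sketched in Python as four minterms of X and Y. The function name `decoder_2x4` is an assumption made for this illustration, not part of the original slides.

```python
def decoder_2x4(x, y):
    """Return (F0, F1, F2, F3) for the 2-bit code XY; exactly one output is 1."""
    nx, ny = 1 - x, 1 - y          # complemented inputs X', Y'
    return (nx & ny,               # F0 = X'.Y'
            nx & y,                # F1 = X'.Y
            x & ny,                # F2 = X.Y'
            x & y)                 # F3 = X.Y


for x in (0, 1):
    for y in (0, 1):
        print(x, y, decoder_2x4(x, y))
```

Each output is one minterm of the inputs, which is why a decoder can be used to generate minterms.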
Encoder
♦ Encoding is the converse of decoding.
♦ Given a set of input lines, where one has been selected, provide a code corresponding to that line.
♦ Contains 2^n (or fewer) input lines and n output lines.
♦ Implemented with OR gates.
♦ An example:
[Figure: a 4-to-2 encoder; inputs F0, F1, F2, F3 are selected via switches, and outputs D1, D0 form the 2-bit code.]
Encoder
♦ Truth table:
F0 F1 F2 F3 | D1 D0
1  0  0  0  | 0  0
0  1  0  0  | 0  1
0  0  1  0  | 1  0
0  0  0  1  | 1  1
0  0  0  0  | X  X
0  0  1  1  | X  X
0  1  0  1  | X  X
0  1  1  0  | X  X
0  1  1  1  | X  X
1  0  0  1  | X  X
1  0  1  0  | X  X
1  0  1  1  | X  X
1  1  0  0  | X  X
1  1  0  1  | X  X
1  1  1  0  | X  X
1  1  1  1  | X  X
Encoder
♦ With the help of a K-map (and don’t-care conditions), we can obtain:
D0 = F1 + F3
D1 = F2 + F3
♦ which correspond to the circuit:
[Figure: simple 4-to-2 encoder with inputs F0, F1, F2, F3 and outputs D0, D1.]
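A minimal sketch of the two encoder equations in Python follows. The function name `encoder_4to2` is hypothetical, and the code assumes exactly one input line is selected, which is the case the don't-care rows of the truth table leave unspecified otherwise.

```python
def encoder_4to2(f0, f1, f2, f3):
    """Return (D1, D0) for a one-hot input F0..F3, using D0 = F1 + F3, D1 = F2 + F3."""
    d0 = f1 | f3
    d1 = f2 | f3
    return d1, d0


# One-hot inputs, F0 first, map to the codes 00, 01, 10, 11.
for line in range(4):
    one_hot = [1 if i == line else 0 for i in range(4)]
    print(one_hot, encoder_4to2(*one_hot))
```

Note that F0 does not appear in either equation: code 00 is produced whenever F1, F2 and F3 are all 0.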
Demultiplexer
♦ Given an input line and a set of selection lines, the demultiplexer will direct data from input to a selected output line.
♦ An example of a 1-to-4 demultiplexer:
[Figure: demux with data input D, select lines S1 S0, and outputs Y0 to Y3.]
Outputs:
Y0 = D.S1'.S0'
Y1 = D.S1'.S0
Y2 = D.S1.S0'
Y3 = D.S1.S0
S1 S0 | Y0 Y1 Y2 Y3
0  0  | D  0  0  0
0  1  | 0  D  0  0
1  0  | 0  0  D  0
1  1  | 0  0  0  D
Demultiplexer
♦ The demultiplexer is actually identical to a decoder with enable, as illustrated below:
[Figure: 2x4 decoder with select inputs S1, S0 and the data line D connected to the enable input E.]
Y0 = D.S1'.S0'
Y1 = D.S1'.S0
Y2 = D.S1.S0'
Y3 = D.S1.S0
Exercise: Provide the truth table for the above demultiplexer.
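The equivalence claimed above can be checked with a short Python sketch that compares the demultiplexer equations against a decoder whose outputs are gated by an enable input. Both function names are hypothetical.

```python
def demux_1to4(d, s1, s0):
    """1-to-4 demultiplexer written directly from the output equations."""
    n1, n0 = 1 - s1, 1 - s0
    return (d & n1 & n0, d & n1 & s0, d & s1 & n0, d & s1 & s0)


def decoder_with_enable(s1, s0, enable):
    """2x4 decoder: the selected output equals the enable input, the rest are 0."""
    selected = (s1 << 1) | s0
    return tuple(enable if i == selected else 0 for i in range(4))


# Feeding the data line D into the enable input gives the same outputs.
same = all(demux_1to4(d, s1, s0) == decoder_with_enable(s1, s0, d)
           for d in (0, 1) for s1 in (0, 1) for s0 in (0, 1))
print(same)  # prints True
```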
Multiplexer
♦ A multiplexer is a device which has
(i) a number of input lines
(ii) a number of selection lines
(iii) one output line
♦ It steers one of 2^n inputs to a single output line, using n selection lines. Also known as a data selector.
[Figure: 2^n:1 multiplexer with 2^n inputs, n select lines and one output.]
Multiplexer
♦ Truth table for a 4-to-1 multiplexer:
I0 I1 I2 I3 | S1 S0 | Y
d0 d1 d2 d3 | 0  0  | d0
d0 d1 d2 d3 | 0  1  | d1
d0 d1 d2 d3 | 1  0  | d2
d0 d1 d2 d3 | 1  1  | d3
♦ Equivalent compact form:
S1 S0 | Y
0  0  | I0
0  1  | I1
1  0  | I2
1  1  | I3
[Figure: 4:1 MUX with inputs I0 to I3 on ports 0 to 3, select lines S1 S0, and output Y.]
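As an illustration of the 4-to-1 truth table, here is a minimal Python sketch in sum-of-products form. The function name `mux_4to1` is hypothetical, and single-bit inputs are assumed.

```python
def mux_4to1(i0, i1, i2, i3, s1, s0):
    """Return the input selected by (S1, S0): Y = I0.S1'.S0' + I1.S1'.S0 + I2.S1.S0' + I3.S1.S0."""
    n1, n0 = 1 - s1, 1 - s0
    return (i0 & n1 & n0) | (i1 & n1 & s0) | (i2 & s1 & n0) | (i3 & s1 & s0)


# With inputs (I0, I1, I2, I3) = (0, 1, 1, 0), the output follows the select code.
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, mux_4to1(0, 1, 1, 0, s1, s0))
```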
Binary Representation
b_31 b_30 b_29 b_28 ... b_3 b_2 b_1 b_0
b_31 × 2^31 + b_30 × 2^30 + b_29 × 2^29 + b_28 × 2^28 + ... + b_2 × 2^2 + b_1 × 2^1 + b_0 × 2^0
0000 0000 0000 0000 0000 0000 0000 0000_2 = 0_10
0000 0000 0000 0000 0000 0000 0000 0001_2 = 1_10
0000 0000 0000 0000 0000 0000 0000 0010_2 = 2_10
0000 0000 0000 0000 0000 0000 0000 1011_2 = 11_10
Signed Numbers
♦ Sign+Magnitude
♦ For n-bit numbers, the most significant bit is reserved for the sign; the remaining bits hold the magnitude.
0000 0000 0000 0000 0000 0000 0000 1011_2 = 11_10
1000 0000 0000 0000 0000 0000 0000 1011_2 = -11_10
Signed Numbers
♦ For n-bit numbers, the negation of B in two’s complement is 2^n - B (this is one of the alternative ways of negating a two’s-complement number).
-B = (2^n - B)
  0000 0000 0000 0000 0000 0000 0000 1011_2 = 11_10
  1111 1111 1111 1111 1111 1111 1111 0100_2   (each bit complemented)
+                                       1_2
  1111 1111 1111 1111 1111 1111 1111 0101_2 = -11_10
Signed Numbers
♦ For n-bit numbers, the negation of B in two’s complement is 2^n - B (this is one of the alternative ways of negating a two’s-complement number).
-B = (2^n - B)
  1111 1111 1111 1111 1111 1111 1111 0101_2 = -11_10
  0000 0000 0000 0000 0000 0000 0000 1010_2   (each bit complemented)
+                                       1_2
  0000 0000 0000 0000 0000 0000 0000 1011_2 = 11_10
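The two negation examples can be reproduced with a small Python sketch. The function name `negate` is hypothetical; values are treated as n-bit patterns held in non-negative integers.

```python
def negate(b, n=32):
    """Two's complement negation of an n-bit pattern: complement every bit, then add 1."""
    mask = (1 << n) - 1
    return ((b ^ mask) + 1) & mask


minus_eleven = negate(11)
print(f"{minus_eleven:032b}")   # prints 11111111111111111111111111110101
print(negate(minus_eleven))     # prints 11

# Complement-and-add-one agrees with the 2^n - B definition (taken modulo 2^n).
print(negate(11) == (2**32 - 11))  # prints True
```

The final mask matters only for B = 0, where the added 1 carries out of the n-bit word.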
Signed Number Systems
♦ Here are all the 4-bit numbers in the different systems.
♦ Positive numbers are the same in all three representations.
♦ Signed magnitude and one’s complement have two ways of representing 0. This makes things more complicated.
♦ Two’s complement has asymmetric ranges; there is one more negative number than positive number. Here, you can represent -8 but not +8.
♦ However, two’s complement is preferred because it has only one 0, and its addition algorithm is the simplest.
Decimal   S.M.   1’s comp.   2’s comp.
 7        0111   0111        0111
 6        0110   0110        0110
 5        0101   0101        0101
 4        0100   0100        0100
 3        0011   0011        0011
 2        0010   0010        0010
 1        0001   0001        0001
 0        0000   0000        0000
-0        1000   1111        n/a
-1        1001   1110        1111
-2        1010   1101        1110
-3        1011   1100        1101
-4        1100   1011        1100
-5        1101   1010        1011
-6        1110   1001        1010
-7        1111   1000        1001
-8        n/a    n/a         1000
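To make the table concrete, the following Python sketch decodes one 4-bit pattern under each of the three systems. The function name `decode_signed` is hypothetical; negative zero is reported as 0 because Python integers have no -0.

```python
def decode_signed(bits, n=4):
    """Return (sign-magnitude, one's complement, two's complement) values of an n-bit pattern."""
    sign = bits >> (n - 1)                       # most significant bit
    magnitude = bits & ((1 << (n - 1)) - 1)      # remaining n-1 bits
    sign_magnitude = -magnitude if sign else magnitude
    ones = bits - ((1 << n) - 1) if sign else bits
    twos = bits - (1 << n) if sign else bits
    return sign_magnitude, ones, twos


print(decode_signed(0b1011))  # prints (-3, -4, -5)
print(decode_signed(0b1000))  # prints (0, -7, -8)
```

The pattern 1011 appears in the -3, -4 and -5 rows of the table, one per column, which matches the three values returned.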
Sign extension
♦ In everyday life, decimal numbers are assumed to have an infinite number of 0s in front of them. This helps in “lining up” numbers.
♦ To subtract 3 from 231, for instance, you can imagine:
    231
  - 003
    228
♦ You need to be careful in extending signed binary numbers, because the leftmost bit is the sign and not part of the magnitude.
♦ If you just add 0s in front, you might accidentally change a negative number into a positive one!
♦ For example, going from 4-bit to 8-bit numbers:
■ 0101 (+5) should become 0000 0101 (+5).
■ But 1100 (-4) should become 1111 1100 (-4).
♦ The proper way to extend a signed binary number is to replicate the sign bit, so the sign is preserved.
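Sign-bit replication can be sketched in a few lines of Python. The function name `sign_extend` and its parameter names are assumptions made for this example.

```python
def sign_extend(value, from_bits, to_bits):
    """Extend a from_bits-wide two's complement pattern to to_bits by replicating the sign bit."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Set every bit from position from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


print(f"{sign_extend(0b0101, 4, 8):08b}")  # prints 00000101
print(f"{sign_extend(0b1100, 4, 8):08b}")  # prints 11111100
```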
Two’s complement addition
♦ Negating a two’s complement number takes a bit of work, but addition is much easier than with the other two systems.
♦ To find A + B, you just have to:
■ Do unsigned addition on A and B, including their sign bits.
■ Ignore any carry out.
♦ For example, to find 0111 + 1100, or (+7) + (-4):
■ First add 0111 + 1100 as unsigned numbers:
    0111
  + 1100
   10011
■ Discard the carry out (1).
■ The answer is 0011 (+3).
Another two’s complement example
♦ Let’s try adding two negative numbers: 1101 + 1110, or (-3) + (-2) in decimal.
♦ Adding the numbers gives 11011:
    1101
  + 1110
   11011
♦ Dropping the carry out (1) leaves us with the answer, 1011 (-5).
Why does this work?
♦ For n-bit numbers, the negation of B in two’s complement is 2^n - B (this is one of the alternative ways of negating a two’s-complement number).
A - B = A + (-B)
      = A + (2^n - B)
      = (A - B) + 2^n
♦ If A ≥ B, then (A - B) is a non-negative number, and 2^n represents a carry out of 1. Discarding this carry out is equivalent to subtracting 2^n, which leaves us with the desired result (A - B).
♦ If A < B, then (A - B) is a negative number and we have 2^n - (B - A). This corresponds to the desired result, -(B - A), in two’s complement form.
Signed overflow
♦ With two’s complement and a 4-bit adder, for example, the largest representable decimal number is +7, and the smallest is -8.
♦ What if you try to compute 4 + 5, or (-4) + (-5)?
    0100  (+4)          1100  (-4)
  + 0101  (+5)        + 1011  (-5)
   01001  (-7)         10111  (+7)
♦ We cannot just include the carry out to produce a five-digit result, as for unsigned addition. If we did, (-4) + (-5) would result in +23!
♦ Also, unlike the case with unsigned numbers, the carry out cannot be used to detect overflow.
■ In the example on the left, the carry out is 0 but there is overflow.
■ Conversely, there are situations where the carry out is 1 but there is no overflow.
Detecting signed overflow
♦ The easiest way to detect signed overflow is to look at all the sign bits.
    0100  (+4)          1100  (-4)
  + 0101  (+5)        + 1011  (-5)
   01001  (-7)         10111  (+7)
♦ Overflow occurs only in the two situations above:
■ If you add two positive numbers and get a negative result.
■ If you add two negative numbers and get a positive result.
♦ Overflow cannot occur if you add a positive number to a negative number. Do you see why?
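The sign-bit rule can be written as a short Python sketch: overflow is flagged when both operands share a sign and the result's sign differs. The function name `add_signed` is hypothetical, and operands are given as n-bit patterns.

```python
def add_signed(a, b, n=4):
    """Add two n-bit two's complement patterns; return (n-bit result, overflow flag)."""
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    total = (a + b) & mask                      # unsigned add, carry out discarded
    # Overflow: operands have equal sign bits, and the result's sign bit differs from a's.
    overflow = bool(~(a ^ b) & (a ^ total) & sign_bit)
    return total, overflow


print(add_signed(0b0100, 0b0101))  # (+4) + (+5): prints (9, True), i.e. 1001 with overflow
print(add_signed(0b1100, 0b1011))  # (-4) + (-5): prints (7, True), i.e. 0111 with overflow
print(add_signed(0b0111, 0b1100))  # (+7) + (-4): prints (3, False), i.e. 0011
```

When the operands have different signs, `a ^ b` has its sign bit set, so the first factor is 0 and overflow is never reported, in line with the last bullet above.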
Refined Requirements
(1) Functional Specification
inputs: 2 x 32-bit operands A, B, 4-bit mode
outputs: 32-bit result S, 1-bit carry, 1-bit overflow
operations: add, addu, sub, subu, and, or, xor, nor, slt, sltu
(2) Block Diagram
(powerview symbol, VHDL entity)
[Figure: ALU block with 32-bit inputs A and B, 4-bit mode input m, 32-bit output S, and 1-bit outputs c (carry) and ovf (overflow).]
Gate-level Design: Half Adder
♦ Design procedure:
1) State Problem
Example: Build a Half Adder to add two bits
2) Determine and label the inputs & outputs of circuit.
Example: Two inputs and two outputs labeled, as follows:
[Figure: Half Adder block computing (X + Y), with inputs X, Y and outputs S, C.]
3) Draw truth table.
X Y | C S
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0
Gate-level Design: Half Adder
4) Obtain simplified Boolean function.
Example: C = X.Y
S = X'.Y + X.Y' = X⊕Y
X Y | C S
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0
5) Draw logic diagram.
[Figure: Half Adder logic diagram with inputs X, Y and outputs S, C.]
Gate-level Design: Full Adder
♦ Half-adder adds up only two bits.
♦ To add two binary numbers, we need to add 3 bits (including the carry).
♦ Example:
    1 1 1 0   carry
    0 1 0 1   X
  + 0 1 1 1   Y
    1 1 0 0   S
♦ Need Full Adder (so called as it can be made from two half-adders).
[Figure: Full Adder block computing (X + Y + Z), with inputs X, Y, Z and outputs S, C.]
Gate-level Design: Full Adder
♦ Truth table:
X Y Z | C S
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1
Note:
Z - carry in (to the current position)
C - carry out (to the next position)
♦ Using K-map, simplified SOP form:
C = X.Y + X.Z + Y.Z
S = X'.Y'.Z + X'.Y.Z' + X.Y'.Z' + X.Y.Z
[Figure: K-maps for C and S, with X on the rows and YZ (00, 01, 11, 10) on the columns.]
Gate-level Design: Full Adder
♦ Alternative formulae using algebraic manipulation:
C = X.Y + X.Z + Y.Z
  = X.Y + (X + Y).Z
  = X.Y + ((X⊕Y) + X.Y).Z
  = X.Y + (X⊕Y).Z + X.Y.Z
  = X.Y + (X⊕Y).Z
S = X'.Y'.Z + X'.Y.Z' + X.Y'.Z' + X.Y.Z
  = X'.(Y'.Z + Y.Z') + X.(Y'.Z' + Y.Z)
  = X'.(Y⊕Z) + X.(Y⊕Z)'
  = X⊕(Y⊕Z) or (X⊕Y)⊕Z
Gate-level Design: Full Adder
♦ Circuit for above formulae:
C = X.Y + (X⊕Y).Z
S = (X⊕Y)⊕Z
[Figure: gate-level full adder with inputs X, Y, Z, intermediate signals (X⊕Y) and (X.Y), and outputs S, C.]
Full Adder made from two Half-Adders (+ OR gate).
Gate-level (SSI) Design: Full Adder
♦ Circuit for above formulae:
C = X.Y + (X⊕Y).Z
S = (X⊕Y)⊕Z
[Figure: block diagram. The first Half Adder takes X, Y and produces Sum (X⊕Y) and Carry (X.Y); the second Half Adder takes (X⊕Y) and Z and produces Sum S and a Carry; the two Carry signals are ORed to give C.]
Full Adder made from two Half-Adders (+ OR gate).
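The construction from two half adders and an OR gate can be sketched in Python and checked against ordinary arithmetic. The function names are hypothetical; each returns a (sum, carry) pair of single bits.

```python
def half_adder(x, y):
    """Return (S, C) with S = X xor Y and C = X.Y."""
    return x ^ y, x & y


def full_adder(x, y, z):
    """Full adder built from two half adders plus an OR gate."""
    partial_sum, carry1 = half_adder(x, y)        # (X xor Y), X.Y
    s, carry2 = half_adder(partial_sum, z)        # S, (X xor Y).Z
    return s, carry1 | carry2                     # C = X.Y + (X xor Y).Z


# Every row of the truth table satisfies X + Y + Z = 2*C + S.
ok = all(x + y + z == 2 * full_adder(x, y, z)[1] + full_adder(x, y, z)[0]
         for x in (0, 1) for y in (0, 1) for z in (0, 1))
print(ok)  # prints True
```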
4-bit Parallel Adder
♦ Consider a circuit to add two 4-bit numbers together and a carry-in, to produce a 5-bit result:
[Figure: black-box view of 4-bit parallel adder with inputs X4 X3 X2 X1, Y4 Y3 Y2 Y1 and carry-in C1, and outputs C5 and S4 S3 S2 S1.]
♦ 5-bit result is sufficient because the largest result is:
(1111)_2 + (1111)_2 + (1)_2 = (11111)_2
4-bit Parallel Adder
♦ Truth table for 9 inputs is very big, i.e. 2^9 = 512 entries:
X4X3X2X1   Y4Y3Y2Y1   C1 | C5   S4S3S2S1
0 0 0 0    0 0 0 0    0  | 0    0 0 0 0
0 0 0 0    0 0 0 0    1  | 0    0 0 0 1
0 0 0 0    0 0 0 1    0  | 0    0 0 0 1
...        ...        ...| ...  ...
0 1 0 1    1 1 0 1    1  | 1    0 0 1 1
...        ...        ...| ...  ...
1 1 1 1    1 1 1 1    1  | 1    1 1 1 1
♦ Simplification is very complicated.
4-bit Parallel Adder
♦ Alternative design possible.
♦ Addition formula for each pair of bits (with carry in),
C_{i+1} S_i = X_i + Y_i + C_i
has the same function as a full adder.
C_{i+1} = X_i.Y_i + (X_i ⊕ Y_i).C_i
S_i = X_i ⊕ Y_i ⊕ C_i
4-bit Parallel Adder
♦ Cascading 4 full adders via their carries, we get:
[Figure: four full adders (FA) in a chain. Inputs X1..X4, Y1..Y4 and carry-in C1; outputs S1..S4 and carry-out C5; internal carries C2, C3, C4 link adjacent stages.]
Parallel Adders
♦ Note that the carry is propagated by cascading the carry from one full adder to the next.
♦ Called Parallel Adder because inputs are presented simultaneously (in parallel). Also called Ripple-Carry Adder.
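A minimal Python sketch of the cascade follows, assuming bit lists ordered least significant bit first (so index 0 corresponds to X1, Y1, S1). The function names are hypothetical.

```python
def full_adder(x, y, c):
    """Return (S_i, C_{i+1}) using S = X xor Y xor C and C_out = X.Y + (X xor Y).C."""
    return x ^ y ^ c, (x & y) | ((x ^ y) & c)


def ripple_carry_add(xs, ys, carry_in=0):
    """Add two equal-length bit lists (least significant bit first); return (sum bits, carry out)."""
    sums = []
    carry = carry_in
    for x, y in zip(xs, ys):
        s, carry = full_adder(x, y, carry)   # the carry ripples into the next stage
        sums.append(s)
    return sums, carry


# 0101 + 1101 + carry-in 1 = 1 0011 (carry out 1, sum 0011), written here LSB first.
print(ripple_carry_add([1, 0, 1, 0], [1, 0, 1, 1], 1))  # prints ([1, 1, 0, 0], 1)
```

Each loop iteration needs the carry from the previous one, which is the software analogue of the carry rippling through the chain of full adders.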
16-bit Parallel Adder
♦ Larger parallel adders can be built from smaller ones.
♦ Example: a 16-bit parallel adder can be constructed from four 4-bit parallel adders:
[Figure: a 16-bit parallel adder. Four 4-bit parallel adders take X4..X1/Y4..Y1, X8..X5/Y8..Y5, X12..X9/Y12..Y9 and X16..X13/Y16..Y13 and produce S4..S1, S8..S5, S12..S9 and S16..S13. They are linked by carries C5, C9 and C13, with carry-in C1 and carry-out C17.]
But What about Performance?
♦ Critical path of an n-bit ripple-carry adder is n*CP (CP being the critical path of one 1-bit stage)
[Figure: four 1-bit ALUs in a chain. Stage i takes Ai, Bi and CarryIni and produces Resulti and CarryOuti; CarryOut0 feeds CarryIn1, CarryOut1 feeds CarryIn2, and CarryOut2 feeds CarryIn3.]
Design Trick: Throw hardware at it
Calculation of Circuit Delays
♦ In general, given a logic gate with delay t:
[Figure: logic gate whose inputs are stable at t1, t2, ..., tn and whose output is stable at max(t1, t2, ..., tn) + t.]
If inputs are stable at times t1, t2, ..., tn, respectively, then the earliest time at which the output will be stable is:
max(t1, t2, ..., tn) + t
♦ To calculate the delays of all outputs of a combinational circuit, repeat the above rule for all gates.
Calculation of Circuit Delays
♦ As a simple example, consider the full adder circuit where all inputs are available at time 0. (Assume each gate has delay t.)
[Figure: full adder annotated with delays. X, Y and Z are stable at 0; X⊕Y and X.Y are stable at max(0,0)+t = t; S is stable at max(t,0)+t = 2t; (X⊕Y).Z is stable at 2t; C is stable at max(t,2t)+t = 3t.]
where outputs S and C experience delays of 2t and 3t, respectively.
Calculation of Circuit Delays
♦ More complex example: 4-bit parallel adder.
[Figure: four cascaded full adders (FA) with inputs X1..X4, Y1..Y4 and carry-in C1 all stable at time 0, internal carries C2, C3, C4, and outputs S1..S4 and C5.]
Calculation of Circuit Delays
♦ Analyse the delay for the repeated block:
[Figure: Full Adder with inputs X_i, Y_i and C_i and outputs S_i and C_{i+1}.]
where X_i, Y_i are stable at 0t, while C_i is assumed to be stable at mt.
♦ Performing the delay calculation gives:
X_i ⊕ Y_i and X_i.Y_i: max(0,0)+t = t
S_i: max(t,mt)+t
(X_i ⊕ Y_i).C_i: max(t,mt)+t
C_{i+1}: max(t,mt)+2t
Calculation of Circuit Delays
♦ Calculating:
When i=1, m=0: S_1 = 2t and C_2 = 3t.
When i=2, m=3: S_2 = 4t and C_3 = 5t.
When i=3, m=5: S_3 = 6t and C_4 = 7t.
When i=4, m=7: S_4 = 8t and C_5 = 9t.
♦ In general, an n-bit ripple-carry parallel adder will experience:
S_n = ((n-1)*2+2)t
C_{n+1} = ((n-1)*2+3)t
as their delay times.
♦ Propagation delay of ripple-carry parallel adders is proportional to the number of bits it handles.
♦ Maximum Delay: ((n-1)*2+3)t
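The stage-by-stage delay calculation can be reproduced with a short Python sketch that applies the max(...) + t rule to each gate of the full adder. The function name `ripple_delays` is hypothetical, and the sketch assumes every gate has the same delay t and all X, Y inputs and C_1 are stable at time 0.

```python
def ripple_delays(n, t=1):
    """Return [(S_i delay, C_{i+1} delay)] for stages i = 1..n of a ripple-carry adder."""
    carry_ready = 0                  # C_1 is stable at time 0
    delays = []
    for _ in range(n):
        xor_ready = max(0, 0) + t                        # X_i xor Y_i
        and_ready = max(0, 0) + t                        # X_i.Y_i
        sum_ready = max(xor_ready, carry_ready) + t      # S_i
        prop_ready = max(xor_ready, carry_ready) + t     # (X_i xor Y_i).C_i
        carry_ready = max(and_ready, prop_ready) + t     # C_{i+1}
        delays.append((sum_ready, carry_ready))
    return delays


print(ripple_delays(4))  # prints [(2, 3), (4, 5), (6, 7), (8, 9)]
```

With t = 1 the last pair is (8, 9), which agrees with ((n-1)*2+2)t and ((n-1)*2+3)t for n = 4.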
Faster Circuits
♦ Three ways of improving the speed of these circuits:
■ (i) Use better technology (e.g. ECL faster than TTL gates), BUT
(a) faster technology is more expensive, needs more power, and allows a lower level of integration.
(b) physical limits (e.g. speed of light, size of atom).
■ (ii) Flatten gate-level designs into two-level circuits (use sum-of-products/product-of-sums), BUT
(a) complicated designs for large circuits.
(b) product/sum terms need MANY inputs!
■ (iii) Use clever look-ahead techniques, BUT there are additional costs (hopefully reasonable).
Look-Ahead Carry Adder
♦ Consider the full adder:
[Figure: full adder with inputs X_i, Y_i, C_i, outputs S_i, C_{i+1}, and intermediate signals P_i, G_i.]
where intermediate signals are labelled as P_i, G_i, and defined as:
P_i = X_i ⊕ Y_i
G_i = X_i.Y_i
♦ The outputs, C_{i+1}, S_i, in terms of P_i, G_i, C_i, are:
S_i = P_i ⊕ C_i             …(1)
C_{i+1} = G_i + P_i.C_i     …(2)
♦ If you look at equation (2),
G_i = X_i.Y_i is a carry generate signal
P_i = X_i ⊕ Y_i is a carry propagate signal
Look-Ahead Carry Adder
♦ For a 4-bit ripple-carry adder, the equations to obtain the four carry signals are:
C_{i+1} = G_i + P_i.C_i
C_{i+2} = G_{i+1} + P_{i+1}.C_{i+1}
C_{i+3} = G_{i+2} + P_{i+2}.C_{i+2}
C_{i+4} = G_{i+3} + P_{i+3}.C_{i+3}
♦ These formulae are deeply nested, as shown here for C_{i+2}:
[Figure: 4-level circuit for C_{i+2} = G_{i+1} + P_{i+1}.C_{i+1}, in which C_{i+1} is first produced from C_i, P_i and G_i.]
Look-Ahead Carry Adder
♦ Nested formulae/gates cause ripple-carry propagation delay.
♦ Can reduce delay by expanding and flattening the formula for carries. For example, C_{i+2}:
C_{i+2} = G_{i+1} + P_{i+1}.C_{i+1}
        = G_{i+1} + P_{i+1}.(G_i + P_i.C_i)
        = G_{i+1} + P_{i+1}.G_i + P_{i+1}.P_i.C_i
♦ New faster circuit for C_{i+2}:
[Figure: two-level circuit producing C_{i+2} from G_{i+1}, P_{i+1}.G_i and P_{i+1}.P_i.C_i.]
Look-Ahead Carry Adder
♦ Other carry signals can also be similarly flattened.
C_{i+3} = G_{i+2} + P_{i+2}.C_{i+2}
        = G_{i+2} + P_{i+2}.(G_{i+1} + P_{i+1}.G_i + P_{i+1}.P_i.C_i)
        = G_{i+2} + P_{i+2}.G_{i+1} + P_{i+2}.P_{i+1}.G_i + P_{i+2}.P_{i+1}.P_i.C_i
C_{i+4} = G_{i+3} + P_{i+3}.C_{i+3}
        = G_{i+3} + P_{i+3}.(G_{i+2} + P_{i+2}.G_{i+1} + P_{i+2}.P_{i+1}.G_i + P_{i+2}.P_{i+1}.P_i.C_i)
        = G_{i+3} + P_{i+3}.G_{i+2} + P_{i+3}.P_{i+2}.G_{i+1} + P_{i+3}.P_{i+2}.P_{i+1}.G_i + P_{i+3}.P_{i+2}.P_{i+1}.P_i.C_i
♦ Notice that the formulae get longer with higher carries.
♦ Also, all carries are two-level “sum-of-products” expressions, in terms of the generate signals, Gs, the propagate signals, Ps, and the first carry-in, C_i.
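The flattened sum-of-products carries can be sketched in Python as follows. The function name `lookahead_carries` is hypothetical; bit lists are ordered least significant bit first, and the first carry-in is called `c0` here. Each carry is computed only from the G and P signals and the first carry-in, never from another carry.

```python
def lookahead_carries(xs, ys, c0):
    """Return [C_0, C_1, ..., C_n], each carry as a flattened sum of products."""
    g = [x & y for x, y in zip(xs, ys)]   # generate signals G_i = X_i.Y_i
    p = [x ^ y for x, y in zip(xs, ys)]   # propagate signals P_i = X_i xor Y_i
    carries = [c0]
    for i in range(len(xs)):
        carry = g[i]                      # first term: G_i
        prefix = p[i]                     # running product P_i.P_{i-1}...
        for j in range(i - 1, -1, -1):
            carry |= prefix & g[j]        # term P_i...P_{j+1}.G_j
            prefix &= p[j]
        carry |= prefix & c0              # last term: P_i...P_0.C_0
        carries.append(carry)
    return carries


# X = 0101, Y = 1101, first carry-in 1 (bit lists are LSB first).
print(lookahead_carries([1, 0, 1, 0], [1, 0, 1, 1], 1))  # prints [1, 1, 0, 1, 1]
```

In hardware all product terms of one carry are evaluated at the same time; the inner loop here only enumerates them.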
Look-Ahead Carry Adder
♦ We employ the lookahead formula in this lookahead-carry adder circuit:
[Figure: lookahead-carry adder circuit.]
Look-Ahead Carry Adder
♦ The 74182 IC chip allows a faster lookahead adder to be built.
♦ Maximum propagation delay is 4t (t to get the generate & propagate signals, 2t to get the carries and t for the sum signals), where t is the average gate delay.
Making a subtraction circuit
♦ We could build a subtraction circuit directly, similar to the way we made unsigned adders.
♦ However, by using two’s complement we can convert any subtraction problem into an addition problem. Algebraically,
A - B = A + (-B)
♦ So to subtract B from A, we can instead add the negation of B to A.
♦ This way we can re-use the unsigned adder hardware.
Why does this work?
♦ For n-bit numbers, the negation of B in two’s complement is 2^n - B (this is one of the alternative ways of negating a two’s-complement number).
A - B = A + (-B)
      = A + (2^n - B)
      = (A - B) + 2^n
♦ If A ≥ B, then (A - B) is a non-negative number, and 2^n represents a carry out of 1. Discarding this carry out is equivalent to subtracting 2^n, which leaves us with the desired result (A - B).
♦ If A < B, then (A - B) is a negative number and we have 2^n - (B - A). This corresponds to the desired result, -(B - A), in two’s complement form.
A two’s complement subtraction circuit
♦ To find A - B with an adder, we’ll need to:
■ Complement each bit of B.
■ Set the adder’s carry in to 1.
♦ The net result is A + B’ + 1, where B’ + 1 is the two’s complement negation of B.
♦ A3, B3 and S3 here are actually sign bits.
Small differences
♦ The only differences between the adder and subtractor circuits are:
■ The subtractor has to negate B3 B2 B1 B0.
■ The subtractor sets the initial carry in to 1, instead of 0.
♦ It’s not too hard to make one circuit that does both addition and subtraction.
An adder-subtractor circuit
♦ XOR gates let us selectively complement the B input.
X ⊕ 0 = X
X ⊕ 1 = X’
♦ When Sub = 0, the XOR gates output B3 B2 B1 B0 and the carry in is 0. The adder output will be A + B + 0, or just A + B.
♦ When Sub = 1, the XOR gates output B3’ B2’ B1’ B0’ and the carry in is 1. Thus, the adder output will be a two’s complement subtraction, A - B.
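The behaviour of the combined circuit can be sketched in Python: the Sub signal both XORs every bit of B and supplies the carry in. The function name `add_sub` is hypothetical, and operands are n-bit patterns.

```python
def add_sub(a, b, sub, n=4):
    """Return (n-bit result, carry out) of A + B when sub = 0, or A - B when sub = 1."""
    mask = (1 << n) - 1
    b_gated = b ^ (mask if sub else 0)    # XOR gates: B unchanged, or every bit complemented
    total = a + b_gated + sub             # Sub also drives the adder's carry in
    return total & mask, (total >> n) & 1


print(add_sub(0b0111, 0b0100, 0))  # 7 + 4: prints (11, 0), i.e. 1011 (signed overflow)
print(add_sub(0b0111, 0b0100, 1))  # 7 - 4: prints (3, 1), i.e. 0011
print(add_sub(0b0011, 0b0101, 1))  # 3 - 5: prints (14, 0), i.e. 1110 = -2
```

Sharing one adder this way costs only one XOR gate per bit of B.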
Subtraction summary
♦ A good representation for negative numbers makes subtraction hardware much easier to design.
■ Two’s complement is used most often (although signed magnitude shows up sometimes, such as in floating-point systems).
■ Using two’s complement, we can build a subtractor with minor changes to the adder from last week.
■ We can also make a single circuit which can both add and subtract.
♦ Overflow is still a problem, but signed overflow is very different from unsigned overflow.
♦ Sign extension is needed to properly “lengthen” negative numbers.
♦ We will use most of the ideas we’ve seen so far to build an ALU, an important part of a processor.
Homework 4
♦ Computer Organization and Design: The Hardware/Software Interface. Third Edition. David A. Patterson and John L. Hennessy. Morgan Kaufmann Publishers. USA. 2005.
♦ Solve the following exercises:
♦ Chapter 1.
■ Exercises: 1.47, 1.48, 1.50, 1.51, 1.52, 1.53, 1.54
♦ Chapter 2.
■ Exercises: 2.6, 2.29, 2.30, 2.31, 2.32, 2.33, 2.37, 2.49, 2.51
♦ Send a pdf file
Due date: October 6th, 2008.