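9.1 Barrel Shifter
The circuit structure of the barrel shifter is shown in the figure below. The circuit consists of three individual barrel shifters (stages). The first stage, on the left, has only one '0' connected to a multiplexer (bottom left); the second stage has two '0' inputs and the third has four. For wider vectors, the barrel shifter is built the same way, doubling the number of inserted '0's at each additional stage.
---------------------Barrel shifter------------------------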
LIBRARY ieee;
USE ieee.std_logic_1164.all;
-----------------------------------------------------------
ENTITY barrel IS
PORT(
inp : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
shift : IN STD_LOGIC_VECTOR(2 DOWNTO 0); --number of positions to shift, as a binary number
outp : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
);
END barrel;
----------------------------------------------------------
ARCHITECTURE behavior OF barrel IS
BEGIN
PROCESS(inp,shift)
VARIABLE temp1 : STD_LOGIC_VECTOR(7 DOWNTO 0);
VARIABLE temp2 : STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
----------------------1st shifter-----------------
IF (shift(0) = '0') THEN
temp1 := inp;
ELSE
temp1(0) := '0';
FOR i IN 1 TO inp'HIGH LOOP
temp1(i) := inp(i-1);
END LOOP;
END IF;
----------------------2nd shifter-----------------
IF (shift(1) = '0') THEN
temp2 := temp1;
ELSE
FOR i IN 0 TO 1 LOOP
temp2(i) := '0';
END LOOP;
FOR i IN 2 TO inp'HIGH LOOP
temp2(i) := temp1(i-2);
END LOOP;
END IF;
----------------------3rd shifter----------------
IF (shift(2) = '0') THEN
outp <= temp2;
ELSE
FOR i IN 0 TO 3 LOOP
outp(i) <= '0';
END LOOP;
FOR i IN 4 TO inp'HIGH LOOP
outp(i) <= temp2(i-4);
END LOOP;
END IF;
END PROCESS;
END behavior;
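As a cross-check for simulation results, the staged structure can be mirrored in a small software reference model. This is only a sketch: the function name `barrel_shift` and the `width` parameter are hypothetical and not part of the VHDL design. Stage k shifts by 2**k positions when bit k of the shift amount is set, filling the vacated low bits with '0'.

```python
def barrel_shift(inp: int, shift: int, width: int = 8) -> int:
    """Logical left shift built from log2(width) conditional stages."""
    mask = (1 << width) - 1
    value = inp & mask
    stage = 0
    while (1 << stage) < width:
        if (shift >> stage) & 1:
            # Stage k shifts by 2**k and fills the vacated low bits with '0'.
            value = (value << (1 << stage)) & mask
        stage += 1
    return value


if __name__ == "__main__":
    # inp = "00001011", shift = "101" (5 positions)
    print(format(barrel_shift(0b00001011, 0b101), "08b"))  # prints 01100000
```

For an 8-bit input the loop runs three times, matching the three shifter stages of the VHDL process; the outputs can be compared against `outp` in a testbench.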
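9.2 Ripple-Carry and Carry-Lookahead Adders
Ripple-carry adder
- Advantage: requires few hardware resources
- Disadvantage: each computation has a long delay (reason: the output of every full adder depends on the carry produced by the previous stage)
- 1-bit full-adder unit (FAU): a and b are the input bits, cin is the carry-in bit, s is the sum bit, and cout is the carry-out bit. s is '1' when an odd number of inputs are '1', and cout is '1' when two or more inputs are '1'. The logic expressions are:
s=a XOR b XOR cin --a XOR b is 1 when a and b differ, 0 otherwise
cout=(a AND b)OR(a AND cin)OR(b AND cin)
The figure below shows a 4-bit unsigned ripple-carry adder that uses one FAU per bit, together with the truth table of the 1-bit full adder.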
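The truth table of the 1-bit full adder can be regenerated directly from the two logic expressions. The sketch below does that in software; `full_adder` and `truth_table` are hypothetical helper names chosen for illustration.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (s, cout) for one full-adder unit."""
    s = a ^ b ^ cin                          # '1' for an odd number of '1' inputs
    cout = (a & b) | (a & cin) | (b & cin)   # '1' when two or more inputs are '1'
    return s, cout


def truth_table() -> list[tuple[int, int, int, int, int]]:
    """All eight rows as (a, b, cin, s, cout)."""
    return [(a, b, cin, *full_adder(a, b, cin))
            for a, b, cin in product((0, 1), repeat=3)]


if __name__ == "__main__":
    print("a b cin | s cout")
    for a, b, cin, s, cout in truth_table():
        print(f"{a} {b}  {cin}  | {s}  {cout}")
```

Each row satisfies a + b + cin = s + 2*cout, which is a quick sanity check on both expressions.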
---------------------4-bit unsigned ripple-carry adder------------------------
LIBRARY ieee;
USE ieee.std_logic_1164.all;
-----------------------------------------------------------
ENTITY adder_cripple IS
GENERIC(
n : INTEGER :=4
);
PORT(
a : IN STD_LOGIC_VECTOR(n-1 DOWNTO 0);
b : IN STD_LOGIC_VECTOR(n-1 DOWNTO 0);
cin : IN STD_LOGIC;
s : OUT STD_LOGIC_VECTOR(n-1 DOWNTO 0);
cout : OUT STD_LOGIC
);
END adder_cripple;
----------------------------------------------------------
ARCHITECTURE adder OF adder_cripple IS
SIGNAL c : STD_LOGIC_VECTOR(n DOWNTO 0);
BEGIN
c(0) <= cin;
G1:FOR i IN 0 TO n-1 GENERATE
s(i) <= a(i) XOR b(i) XOR c(i);
c(i+1) <= (a(i) AND b(i)) OR
(a(i) AND c(i)) OR
(b(i) AND c(i));
END GENERATE;
cout <= c(n);
END adder;
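- Note: an adder can also be implemented directly with the predefined "+" operator, in which case synthesis tools generally build the target circuit as a ripple-carry adder. If a different architecture is wanted, the code must describe it explicitly.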
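To see that the bit-by-bit structure and the "+" operator describe the same function, the carry chain can be modelled in software and compared with ordinary integer addition. This is a sketch under assumed names (`ripple_carry_add` is hypothetical); it follows the s(i) and c(i+1) expressions of the full adder, one bit per iteration.

```python
def ripple_carry_add(a: int, b: int, cin: int = 0, n: int = 4) -> tuple[int, int]:
    """Add two n-bit unsigned values one bit at a time; return (s, cout)."""
    carry = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i                   # s(i)
        carry = (ai & bi) | (ai & carry) | (bi & carry)   # c(i+1)
    return total, carry


if __name__ == "__main__":
    s, cout = ripple_carry_add(0b1011, 0b0110)
    print(format(s, "04b"), cout)  # 11 + 6 = 17, prints: 0001 1
```

Each iteration needs the carry from the previous one, which is the software counterpart of the long carry delay noted above.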
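Carry-lookahead adder
- Advantage: faster operation
- Disadvantage: greater hardware complexity
- Principle of operation: from the truth table of the 1-bit full adder we have: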
①\[{S_i} = \overline {{A_i}} {B_i}\overline {{C_i}} +{A_i}\overline {{B_i}} \cdot \overline {{C_i}} + \overline {{A_i}} \cdot \overline {{B_i}} {C_i} + {A_i}{B_i}{C_i} = \left( {{A_i} \oplus{B_i}} \right)\overline {{C_i}} + \overline {\left( {{A_i} \oplus {B_i}}\right)} {C_i} = {A_i} \oplus {B_i} \oplus {C_i}\]
②\[{C_{i + 1}} = {A_i}{B_i}\overline {{C_i}} + {A_i}{B_i}{C_i} + \overline {{A_i}} {B_i}{C_i} + {A_i}\overline {{B_i}} {C_i} = {A_i}{B_i} + \left( {{A_i} \oplus {B_i}} \right){C_i}\]
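Now define the carry-generate signal \[{G_i} = {A_i} \cdot {B_i}\] and the carry-propagate signal \[{P_i} = {A_i} \oplus {B_i}\]; then \[{C_{i + 1}} = {G_i} + {P_i} \cdot {C_i}\].
In addition, the circuit structure of a 4-bit carry-lookahead adder is shown in the figure below:
Now consider two input vectors a=a(n-1)a(n-2)...a(1)a(0) and b=b(n-1)b(n-2)...b(1)b(0). The corresponding generate vector is g=g(n-1)g(n-2)...g(1)g(0) and the corresponding propagate vector is p=p(n-1)p(n-2)...p(1)p(0), where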
g(j) = a(j) AND b(j)
p(j) = a(j) XOR b(j)
Now let the carry vector be c=c(n)c(n-1)c(n-2)...c(1)c(0), with \[{C_{i + 1}} = {C_i} \cdot {P_i} + {G_i}\], which gives:
c(0) = cin
c(1) = c(0)p(0)+g(0)
c(2) = c(1)p(1)+g(1) = (c(0)p(0)+g(0))p(1)+g(1) = c(0)p(0)p(1)+g(0)p(1)+g(1)
c(3) = c(2)p(2)+g(2) = (c(0)p(0)p(1)+g(0)p(1)+g(1))p(2)+g(2) = c(0)p(0)p(1)p(2)+g(0)p(1)p(2)+g(1)p(2)+g(2)
c(4) = c(3)p(3)+g(3) = (c(0)p(0)p(1)p(2)+g(0)p(1)p(2)+g(1)p(2)+g(2))p(3)+g(3)
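= c(0)p(0)p(1)p(2)p(3)+g(0)p(1)p(2)p(3)+g(1)p(2)p(3)+g(2)p(3)+g(3), and so on.
Note: none of the fully expanded expressions above depends on a previously computed carry; each is a function of cin, p, and g only. This is the fundamental reason a carry-lookahead adder is fast. On the other hand, this approach increases hardware complexity, so it is practical only for adders with few bits (such as 4 bits); wider carry-lookahead adders are built by combining such small blocks.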
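The expanded carry expressions follow a regular pattern: c(i) is the OR of c(0)p(0)...p(i-1) and, for each k below i, the term g(k)p(k+1)...p(i-1). The sketch below builds every carry from that pattern and compares it with the recursive form c(i+1) = c(i)p(i) + g(i). The function names are hypothetical and the model is only a software check, not a hardware description.

```python
def and_all(bits: list[int]) -> int:
    """AND of a list of bits; an empty list yields 1."""
    return int(all(bits))


def lookahead_carries(a: int, b: int, cin: int, n: int = 4) -> list[int]:
    """Carries c(0)..c(n), each computed from cin, p and g only."""
    g = [(a >> j) & (b >> j) & 1 for j in range(n)]      # generate
    p = [((a >> j) ^ (b >> j)) & 1 for j in range(n)]    # propagate
    carries = [cin]
    for i in range(1, n + 1):
        terms = [cin & and_all(p[:i])]                               # c(0)p(0)...p(i-1)
        terms += [g[k] & and_all(p[k + 1:i]) for k in range(i)]      # g(k)p(k+1)...p(i-1)
        carries.append(int(any(terms)))
    return carries


def ripple_carries(a: int, b: int, cin: int, n: int = 4) -> list[int]:
    """Carries c(0)..c(n) from the recursion c(i+1) = g(i) + p(i)c(i)."""
    carries = [cin]
    for j in range(n):
        gj = (a >> j) & (b >> j) & 1
        pj = ((a >> j) ^ (b >> j)) & 1
        carries.append(gj | (pj & carries[-1]))
    return carries


if __name__ == "__main__":
    print(lookahead_carries(0b1011, 0b0110, 0))  # prints [0, 0, 1, 1, 1]
```

The number of terms for c(i) grows with i, which illustrates why the flat form is usually limited to a few bits.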
4-bit carry-lookahead adder:
-----------------------------------------------------------
LIBRARY ieee;
USE ieee.std_logic_1164.all;
-----------------------------------------------------------
ENTITY CLA_Adder IS
PORT( a: IN STD_LOGIC_VECTOR(3 DOWNTO 0);
b: IN STD_LOGIC_VECTOR(3 DOWNTO 0);
cin: IN STD_LOGIC;
s: OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
cout: OUT STD_LOGIC
);
END CLA_Adder;
-----------------------------------------------------------
ARCHITECTURE CLA_Adder OF CLA_Adder IS
SIGNAL c: STD_LOGIC_VECTOR(4 DOWNTO 0);
SIGNAL p: STD_LOGIC_VECTOR(3 DOWNTO 0);
SIGNAL g: STD_LOGIC_VECTOR(3 DOWNTO 0);
BEGIN
---------------------PGU:------------------------------
G1: FOR i IN 0 TO 3 GENERATE
p(i) <= a(i) XOR b(i); --XOR
g(i) <= a(i) AND b(i);
s(i) <= p(i) XOR c(i);
END GENERATE;
---------------------CLAU:-----------------------------
c(0) <= cin;
c(1) <= (cin AND p(0)) OR
g(0);
c(2) <= (cin AND p(0) AND p(1)) OR
(g(0) AND p(1)) OR
g(1);
c(3) <= (cin AND p(0) AND p(1) AND p(2)) OR
(g(0) AND p(1) AND p(2)) OR
(g(1) AND p(2)) OR
g(2);
c(4) <= (cin AND p(0) AND p(1) AND p(2) AND p(3)) OR
(g(0) AND p(1) AND p(2) AND p(3)) OR
(g(1) AND p(2) AND p(3)) OR
(g(2) AND p(3)) OR
g(3);
cout <= c(4);
END CLA_Adder;
-----------------------------------------------------------
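One simple way to assemble a wider adder from 4-bit carry-lookahead blocks is to chain them, feeding the carry-out of the low block into the carry-in of the high block. The sketch below models that arrangement in software. It is an assumption chosen for illustration: `cla4` and `add8` are hypothetical names, and a second-level lookahead unit across blocks is another option that is not shown here.

```python
def cla4(a: int, b: int, cin: int) -> tuple[int, int]:
    """One 4-bit block using the flat p/g carry equations; returns (s, cout)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    c = [0] * 5
    c[0] = cin
    c[1] = (cin & p[0]) | g[0]
    c[2] = (cin & p[0] & p[1]) | (g[0] & p[1]) | g[1]
    c[3] = ((cin & p[0] & p[1] & p[2]) | (g[0] & p[1] & p[2])
            | (g[1] & p[2]) | g[2])
    c[4] = ((cin & p[0] & p[1] & p[2] & p[3]) | (g[0] & p[1] & p[2] & p[3])
            | (g[1] & p[2] & p[3]) | (g[2] & p[3]) | g[3])
    s = sum((p[i] ^ c[i]) << i for i in range(4))
    return s, c[4]


def add8(a: int, b: int, cin: int = 0) -> tuple[int, int]:
    """8-bit adder from two 4-bit blocks; the carry ripples between blocks."""
    lo, c4 = cla4(a & 0xF, b & 0xF, cin)
    hi, c8 = cla4((a >> 4) & 0xF, (b >> 4) & 0xF, c4)
    return (hi << 4) | lo, c8


if __name__ == "__main__":
    print(add8(200, 100))  # 300 = 256 + 44, prints (44, 1)
```

Inside each block the carries are computed in parallel; only the single carry between the two blocks is sequential.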
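9.3 Fixed-Point Division
Note: in synthesizable code, the predefined "/" operator is generally limited to division by powers of two (2^n), which is in essence just a shift operation.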
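The equivalence between division by 2^k and a right shift is easy to confirm for unsigned values. The helper below is a hypothetical illustration (`div_pow2` is not a library function) and assumes a non-negative dividend.

```python
def div_pow2(a: int, k: int) -> int:
    """Divide a non-negative integer by 2**k by shifting right k positions."""
    return a >> k


if __name__ == "__main__":
    # "1011" (11) divided by 2 drops the lowest bit: "0101" (5)
    print(format(div_pow2(0b1011, 1), "04b"))  # prints 0101
```

The bits shifted out on the right are the remainder of the division.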
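Division algorithm
Suppose we want to compute y=a/b, where a>b and a is an n-bit binary number. Take a="1011" and b="0011" (zeros may or may not be padded on the left here); the division yields the quotient y="0011" and the remainder "0010". The algorithm proceeds as shown in the table below:
Index i | a-related input a_inp(i) | Comparison | b-related input b_inp(i) | Quotient bit y(i) | Operation |
3 | 1011 | < | 0011000 | 0 | none |
2 | 1011 | < | 0001100 | 0 | none |
1 | 1011 | > | 0000110 | 1 | a_inp(0)=a_inp(1)-b_inp(1) |
0 | 0101 | > | 0000011 | 1 | rem=a_inp(0)-b_inp(0) |
 | 0010 (rem) | | | | |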
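For the division:
① i ranges over (n-1 downto 0);
② b_inp(i) is b shifted left by i bits, with zeros filled in at the low end;
③ for the a-related input, a_inp(n-1) = a, and
\[a\_inp(i-1) = \begin{cases} a\_inp(i) - b\_inp(i), & a\_inp(i) \ge b\_inp(i) \\ a\_inp(i), & \text{otherwise} \end{cases}\],
where the remainder rem=a_inp(-1) follows the same equation;
④ the quotient bits are
\[y(i) = \begin{cases} 1, & a\_inp(i) \ge b\_inp(i) \\ 0, & \text{otherwise} \end{cases}\].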
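The steps of this shift-and-subtract scheme can be traced in software. The sketch below compares the running value a_inp with b shifted left by i bits, sets the quotient bit when a_inp is at least as large, and subtracts in that case. `divide_trace` is a hypothetical name and the printed layout is only an approximation of the table.

```python
def divide_trace(a: int, b: int, n: int = 4):
    """Return (quotient, remainder, rows); rows hold (i, a_inp, b_inp, y_i)."""
    rows = []
    a_inp = a
    quotient = 0
    for i in range(n - 1, -1, -1):
        b_inp = b << i                       # b shifted left by i bits
        bit = 1 if a_inp >= b_inp else 0     # quotient bit y(i)
        rows.append((i, a_inp, b_inp, bit))
        if bit:
            a_inp -= b_inp                   # next a_inp; the last one is the remainder
            quotient |= 1 << i
    return quotient, a_inp, rows


if __name__ == "__main__":
    q, rem, rows = divide_trace(0b1011, 0b0011)
    for i, a_inp, b_inp, bit in rows:
        print(i, format(a_inp, "04b"), format(b_inp, "07b"), bit)
    print("y =", format(q, "04b"), "rem =", format(rem, "04b"))
```

For a="1011" and b="0011" the four rows reproduce the a_inp, b_inp, and y(i) columns of the worked example, ending with y="0011" and remainder "0010".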
VHDL divider:
-----------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
-----------------------------------------------------------
entity divider is
generic(n: integer := 4);
port(
a: in integer range 0 to 15;
b: in integer range 0 to 15;
y: out std_logic_vector( n-1 downto 0 );
rest: out integer range 0 to 15;
err: out std_logic
);
end divider;
-----------------------------------------------------------
architecture rtl of divider is
begin
process (a,b) --statements inside a "process" execute sequentially
variable temp1: integer range 0 to 15;
variable temp2: integer range 0 to 15;
begin
--------------Error and initialization-------------
temp1 := a;
temp2 := b;
if (b = 0) then
err <= '1';
else
err <= '0';
end if;
-------------------------y:------------------------
for i in n-1 downto 0 loop
if (temp1 >= temp2*2**i) then
y(i) <= '1';
temp1 := temp1 - temp2*2**i; --subtract the shifted divisor from the running remainder
else
y(i) <= '0';
end if;
end loop;
----------------------Remainder--------------------
rest <= temp1;
end process;
end rtl;
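A behavioural model is a convenient way to check the expected port values, including the divide-by-zero case. The sketch below assumes that the remainder update subtracts the shifted divisor from the running value, and that y and rest are still driven when b = 0, as they would be in a process that only sets the error flag. `divider_model` is a hypothetical name.

```python
def divider_model(a: int, b: int, n: int = 4) -> tuple[int, int, int]:
    """Return (y, rest, err) for unsigned n-bit operands."""
    temp1, temp2 = a, b
    err = 1 if b == 0 else 0          # flag division by zero
    y = 0
    for i in range(n - 1, -1, -1):
        if temp1 >= temp2 * 2**i:
            y |= 1 << i
            temp1 -= temp2 * 2**i     # assumed update: subtract the shifted divisor
    return y, temp1, err


if __name__ == "__main__":
    print(divider_model(11, 3))  # prints (3, 2, 0)
    print(divider_model(11, 0))  # prints (15, 11, 1)
```

With b = 0 every comparison succeeds, so the model returns all quotient bits set and the dividend as the remainder; err = 1 marks that result as invalid.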