Radix4div

From Hamsterworks Wiki!

Jump to: navigation, search

This FPGA Project was started November 2011. It is still in formative stage, but has been abandoned as a pipelined radix-2 solution was more efficient.

I'm going to implement a generic Radix4 division package in VHDL.

Contents

Algorithm

Unlike simple binary division Radix 4 division generates two bits of the result for each operation, decreasing latency. To decide what bits to add to the quotient it compares against the divisor, 2*divisor and 3*divisor.

It can either be implemented as feedback (so the same gates are used for each bit pair) or as a pipeline.

See Binary division for information, and C code for it.

Interface

  • Dividend - Unsigned integer
  • Divisor - Unsigned integer
  • Clock signal - Unsigned integer

Constant (generic) information also needed:

  • Width of Dividend
  • Width of Divisor
  • Width of Quotient

Outputs:

  • Quotient

Top-level VDHL entity

entity radix4division is
  Generic (divisor_width  : natural := 4;
           dividend_width : natural;
           quotient_width : natural;
  );
  Port ( Clk      : in  STD_LOGIC;
         dividend : in  STD_LOGIC_VECTOR(dividend_width-1 downto 0);
         divisor  : in  STD_LOGIC_VECTOR(divisor_width-1  downto 0);
         quotient : out STD_LOGIC_VECTOR(quotient_width-1 downto 0);
  );
end radix4division;

Each step

Inputs:

  • Existing quotient
  • Current dividend
  • Divisor
  • Clock signal
  • Bit offset of bits to generate

Constant (generic) information also needed:

  • Width of Divisor
  • Width of Quotient
  • Bit offset of bits to generate

Outputs:

  • updated quotient
  • updated dividend
  • Divisor

It is possible to reduce the length of comparisons - rather than comparing n2-2 bits only 4 bits need subtraction - the lowwer order bits can just be ORed together.

Equations for first step of a/b:

  • Comparing against divisor
    • x1 <= (0 & a(n-1 ... n-2)))
    • y1 <= (OR(b(n-1 ... 2)) & b(1 ... 0))
    • out1 <= (x-y)(1 ... 0) & a(n-3 ... 0)
  • Comparing against 2*divisor
    • x2 <= (0 & a(n-1))
    • y2 <= (OR(b(n-1 ... 1)) & b(0))
    • out2 <= (x-y)(0) & a(n-2 ... 0)
  • Comparing against 3*divisor
    • sum <= (0 & b(1 ... 0)) + (0 & b(0) & 0)
    • x3 <= (0 & a(n-1) & a(n-2))
    • y3 <= (OR(d(n-1 ... 1),sum(2)) & sum(1 ... 0))
    • out3 <= (x-y)(1 ... 0) & a(n-3 ... 0)

which 'out' is passed to the next stage is selected based on which of out1, out2 or out3 were positive (if any).

If you have n bits, you need 'n/2' stages, and if you number stages from (n/2-1) (dealing to the high bits) to 0 (dealing to bits 1 and 0) the equations become more generic:

For stage i (need to verify!):

  • Comparing against divisor
    • x1 <= (0 & a(n-1 ... 2i) )
    • y1 <= (OR(b(n-1 ... n-2i)) & b(n-2i ... 0))
    • out1 <= (x-y)(n/2-i-1 ... 0) & a(2i-1 ... 0)
  • Comparing against 2*divisor
    • x2 <= (0 & a(n-1 ... 2i+1) )
    • y2 <= (OR(b(n-1 ... n-2i+1)) & b(0))
    • out2 <= (x-y)(n/2-2i-2..0) & a(2i-1 ... 0)
  • Comparing against 3*divisor
    • sum <= (0 & b(n-2i-1 .. 0)) + (0 & b(n-2i-2 ... 0) & 0)
    • x3 <= (0 & a(n-1 ... 2i))
    • y3 <= (OR(d(n-1 ... n-2i),sum(2)) & sum(1 ... 0))
    • out3 <= (x-y)(n/2-i-1 ... 0) & a(2i-1 ... 0)

Step VHDL module specification

Personal tools