# Radix4div

### From Hamsterworks Wiki!

This FPGA Project was started November 2011. It is still in formative stage, but has been abandoned as a pipelined radix-2 solution was more efficient.

I'm going to implement a generic Radix4 division package in VHDL.

## Contents |

## Algorithm

Unlike simple binary division Radix 4 division generates two bits of the result for each operation, decreasing latency. To decide what bits to add to the quotient it compares against the divisor, 2*divisor and 3*divisor.

It can either be implemented as feedback (so the same gates are used for each bit pair) or as a pipeline.

See Binary division for information, and C code for it.

## Interface

- Dividend - Unsigned integer
- Divisor - Unsigned integer
- Clock signal - Unsigned integer

Constant (generic) information also needed:

- Width of Dividend
- Width of Divisor
- Width of Quotient

Outputs:

- Quotient

## Top-level VDHL entity

entity radix4division is Generic (divisor_width : natural := 4; dividend_width : natural; quotient_width : natural; ); Port ( Clk : in STD_LOGIC; dividend : in STD_LOGIC_VECTOR(dividend_width-1 downto 0); divisor : in STD_LOGIC_VECTOR(divisor_width-1 downto 0); quotient : out STD_LOGIC_VECTOR(quotient_width-1 downto 0); ); end radix4division;

## Each step

Inputs:

- Existing quotient
- Current dividend
- Divisor
- Clock signal
- Bit offset of bits to generate

Constant (generic) information also needed:

- Width of Divisor
- Width of Quotient
- Bit offset of bits to generate

Outputs:

- updated quotient
- updated dividend
- Divisor

It is possible to reduce the length of comparisons - rather than comparing n2-2 bits only 4 bits need subtraction - the lowwer order bits can just be ORed together.

Equations for first step of a/b:

- Comparing against divisor
- x1 <= (0 & a(n-1 ... n-2)))
- y1 <= (OR(b(n-1 ... 2)) & b(1 ... 0))
- out1 <= (x-y)(1 ... 0) & a(n-3 ... 0)

- Comparing against 2*divisor
- x2 <= (0 & a(n-1))
- y2 <= (OR(b(n-1 ... 1)) & b(0))
- out2 <= (x-y)(0) & a(n-2 ... 0)

- Comparing against 3*divisor
- sum <= (0 & b(1 ... 0)) + (0 & b(0) & 0)
- x3 <= (0 & a(n-1) & a(n-2))
- y3 <= (OR(d(n-1 ... 1),sum(2)) & sum(1 ... 0))
- out3 <= (x-y)(1 ... 0) & a(n-3 ... 0)

which 'out' is passed to the next stage is selected based on which of out1, out2 or out3 were positive (if any).

If you have n bits, you need 'n/2' stages, and if you number stages from (n/2-1) (dealing to the high bits) to 0 (dealing to bits 1 and 0) the equations become more generic:

For stage i (need to verify!):

- Comparing against divisor
- x1 <= (0 & a(n-1 ... 2i) )
- y1 <= (OR(b(n-1 ... n-2i)) & b(n-2i ... 0))
- out1 <= (x-y)(n/2-i-1 ... 0) & a(2i-1 ... 0)

- Comparing against 2*divisor
- x2 <= (0 & a(n-1 ... 2i+1) )
- y2 <= (OR(b(n-1 ... n-2i+1)) & b(0))
- out2 <= (x-y)(n/2-2i-2..0) & a(2i-1 ... 0)

- Comparing against 3*divisor
- sum <= (0 & b(n-2i-1 .. 0)) + (0 & b(n-2i-2 ... 0) & 0)
- x3 <= (0 & a(n-1 ... 2i))
- y3 <= (OR(d(n-1 ... n-2i),sum(2)) & sum(1 ... 0))
- out3 <= (x-y)(n/2-i-1 ... 0) & a(2i-1 ... 0)