This FPGA Project was started in June 2018

I've been wondering about how Bose–Chaudhuri–Hocquenghem codes work. I know that the are used in the HDMI data packets to correct errors, but original HDMI projects ignored them - I had no idea how they actually work.

The description on Wikipedia ( https://en.wikipedia.org/wiki/BCH_code ) is impenetrable to me, and most of the information on the web is about more advanced applications, designed to allow multi-bit errors to be corrected.

I want to add ECC of the data packets to my HDMI projects, so tried to figure it all out.

## So what is really going on?

You can either look at the code used as a cyclic redundancy check, based around an 8-bit Linear Feedback Shift Register. This doesn't give any insight into how you can efficiently use them to correct errors.

A better way to look at the is this:

Each of the 24 data bits cause a different data bit to be XORed into the final parity.

```Checksum
7      0 23       data bits       0
10000011 10000000 00000000 00000000
11000010 01000000 00000000 00000000
01100001 00100000 00000000 00000000
10110011 00010000 00000000 00000000
11011010 00001000 00000000 00000000
01101101 00000100 00000000 00000000
10110101 00000010 00000000 00000000
11011001 00000001 00000000 00000000
11101111 00000000 10000000 00000000
11110100 00000000 01000000 00000000
01111010 00000000 00100000 00000000
00111101 00000000 00010000 00000000
10011101 00000000 00001000 00000000
11001101 00000000 00000100 00000000
11100101 00000000 00000010 00000000
11110001 00000000 00000001 00000000
11111011 00000000 00000000 10000000
11111110 00000000 00000000 01000000
01111111 00000000 00000000 00100000
10111100 00000000 00000000 00010000
01011110 00000000 00000000 00001000
00101111 00000000 00000000 00000100
10010100 00000000 00000000 00000010
01001010 00000000 00000000 00000001

```

In the literature, the parity of a signal bit is called a "syndrome". If that bit is changed, the 'syndrome' is what you will see when you compare (XOR) the received and computed parity values.

The BCH code is constructed so that the parity of the parity bits are sort of their own parity. If these bits get corrupted, the checksum process will point directly at them. So the syndrome for parity bit 0 is 00000001, for bit 1 is 00000010, and so on.

The solving process for these simple codes is simple:

1. Recalculate the parity bits of the data
2. XOR the parity from the received data and the freshly computed parity
3. If the result of the XOR is zero, then you have no error
4. You can either use a lookup table to see what causes that syndrome, or calculate which bit has that syndrome.
5. If it isn't one of the syndromes then it is a multi-bit error

## Why can't BCH(32,24) fix multi-bit errors?

Simply put, there isn't enough parity - there is over 480 different double-bit errors, but only 256 different possible error patterns so the error can not be conclusively identified.

How does the math work for this?

If two bit flips occur then then error is the two syndromes XORed together. So if bits 0 and 1 were flipped to, then the error will be 10010100 XOR 01001010 = 11011110. However, that is the same error you get if bit 6 and parity bit 5 is flipped as 11111110 XOR 00100000 = 11011110. That is the reason why BCH(32,24) can detect multi-bit errors, but can only correct a single bit error.

Some BCH codes with more parity bits can detect and correct multi-bit errors - it can be solved efficiently by using math to 'factor' the error pattern into individual syndromes.

## An example of an impractical multi-bit BCH code

Say only want to transmit one bit - a somewhat important bit - and it is a very noisy channel, one with a 20% chance of error on each bit.

Add 6 parity bits to the single data bit, and then you have two codewords - either 0000000 or 1111111. On reception of a garbled codeword you can correct up to 3 errors, and have better than a 99% chance of correctly receiving the data.

## Source

### bch_correction_32bit_frame.vhd

```-------------------------------------------------------------------------------
-- bch_correction_32bit_frame.vhd
--
-- Author: Mike Field <hamster@snap.net.nz>
--
-- Perform the BHC(32,24) ECC code used in the HDMI data islands. The
-- 32 bits of data is streamed in, with start_of_frame asserted during
-- the first data bit
--
-- The corrected data is streamed out, with any single-bit errors
-- corrected. o_valid frame is asserted when the resulting data packet
-- no pass validation (i.e. does not have multiple bit errors)
--
-- Approximate simulation waveforms:
--
-- Inputs:
--   i_data_bit       :  --ddddddddddddddddddddddddpppppppp------------------------------------------------------------------
--   i_start_of_frame :  --10000000000000000000000000000000------------------------------------------------------------------
-- Outputs:
--   o_data_bit       :  ----------------------------------------------------------------ddddddddddddddddddddddddpppppppp----
--   o_valid_frame    :  ----------------------------------------------------------------10000000000000000000000000000000----
--   o_start_of_frame :  ----------------------------------------------------------------V0000000000000000000000000000000----
--
--    d = data bit, p = parity bit, V = generated 'frame valid' bit.
--
-- If no "valid frame" signal is required, then the latency can be reduced
-- by shortening (or even removing) the "corrected_(data|start)_shift registers
--
-- Resoruces (Xilinx Spartan 6):
--   Flipflops   : 32
--   LUTS        : 24
--
-------------------------------------------------------------------------------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity bch_correction_32bit_frame is
Port ( clk              : in   STD_LOGIC;
i_data_bit       : in   STD_LOGIC;
i_start_of_frame : in   STD_LOGIC;
o_data_bit       : out  STD_LOGIC;
o_valid_frame    : out  STD_LOGIC;
o_start_of_frame : out  STD_LOGIC);
end bch_correction_32bit_frame;

architecture Behavioral of bch_correction_32bit_frame is
signal error_shift_reg           : std_logic_vector( 7 downto 0)  := (others => '0');
signal parity_shift_reg          : std_logic_vector( 7 downto 0)  := (others => '0');
signal original_data_shift_reg   : std_logic_vector(32 downto 0)  := (others => '0');
signal start_shift_reg           : std_logic_vector(32 downto 0)  := (others => '0');
signal corrected_data_shift_reg  : std_logic_vector(30 downto 0)  := (others => '0');
signal corrected_start_shift_reg : std_logic_vector(30 downto 0)  := (others => '0');
begin

process(clk)
begin
if rising_edge(clk) then
---------------------------------------------------------
-- Output the data - it is only balid when the 'error'
-- reg is either zero, or the 31st bit will be corrected
---------------------------------------------------------
if error_shift_reg = x"00" or error_shift_reg = x"4A" then
o_valid_frame    <= corrected_start_shift_reg(0);
else
o_valid_frame    <= '0';
end if;
o_data_bit       <= corrected_data_shift_reg(0);
o_start_of_frame <= corrected_start_shift_reg(0);

---------------------------------------------------------
-- Where the error correction actually happens
---------------------------------------------------------
if error_shift_reg = x"4A" then
corrected_data_shift_reg <= (not original_data_shift_reg(0)) & corrected_data_shift_reg(corrected_data_shift_reg'high downto 1);
error_shift_reg <= (others => '0');
else
corrected_data_shift_reg <= original_data_shift_reg(0) & corrected_data_shift_reg(corrected_data_shift_reg'high downto 1);

---------------------------------------------------------
-- Reversing the encoding LFSR one bit
---------------------------------------------------------
if error_shift_reg(0) = '1' then
error_shift_reg <= ("0" & error_shift_reg(7 downto 1)) XOR x"83";
else
error_shift_reg <= "0" & error_shift_reg(7 downto 1);
end if;
end if;
corrected_start_shift_reg <= start_shift_reg(0) & corrected_start_shift_reg(corrected_start_shift_reg'high downto 1);

-----------------------------------------------------------------------
-- Have we seen enough data that we can calculate the parity error?
-----------------------------------------------------------------------
if start_shift_reg(start_shift_reg'high-31) = '1' then
error_shift_reg <= parity_shift_reg xor original_data_shift_reg(original_data_shift_reg'high downto original_data_shift_reg'high-7);
end if;

-----------------------------------------------------------------------
-- Calculating the parity of the incoming data, but use bits that are
-- seven bits old
-----------------------------------------------------------------------
if start_shift_reg(start_shift_reg'high-7) = '1' then
-- Startng a new parity calculation
if original_data_shift_reg(original_data_shift_reg'high-7) = '1' then
parity_shift_reg <= x"83";
else
parity_shift_reg <= x"00";
end if;
else
-- Adding the next bit into the current parity calculations
if (original_data_shift_reg(original_data_shift_reg'high-7) xor parity_shift_reg(0))  = '1' then
parity_shift_reg <= ("0" & parity_shift_reg(7 downto 1)) XOR x"83";
else
parity_shift_reg <= ("0" & parity_shift_reg(7 downto 1));
end if;
end if;
-----------------------------------------------------------------------
-- Moving data through the shift registers
-----------------------------------------------------------------------
original_data_shift_reg  <= i_data_bit       & original_data_shift_reg(original_data_shift_reg 'high downto 1);
start_shift_reg          <= i_start_of_frame & start_shift_reg(start_shift_reg 'high downto 1);
end if;
end process;
end Behavioral;

```

### bch_correction_64bit_frame.vhd

I also need to code the ECC process for the 64-bit data frames, that are received two bits at a time.