Simple SDRAM Controller

From Hamsterworks Wiki!


This FPGA Project was completed in October 2013.

I'm setting out to design a simple, functional SDRAM controller for the Papilio Pro and Logi-Pi FPGA boards.

Here's the Logi-Pi, sandwiched between an Arduino LCD shield and a Raspberry Pi:

Logi-pi.jpg

SDRAM has a completely different interface to that of async DRAM. A clock is provided from the memory controller to synchronise transfers. This added synchronisation allows for higher bandwidth transfers, and allows commands to be sent while transfers are still in progress.


The interface

COMPONENT SDRAM_Controller
PORT(
   clk : IN std_logic;
   clk_mem : IN std_logic;-- not needed at the moment
   reset : IN std_logic;
      
   -- Interface to issue commands
   cmd_ready       : OUT std_logic;
   cmd_enable      : IN std_logic;
   cmd_wr          : IN std_logic;
   cmd_address     : IN std_logic_vector(22 downto 0);
   cmd_byte_enable : IN std_logic_vector(3 downto 0);
   cmd_data_in     : IN std_logic_vector(31 downto 0);    
      
   -- Data being read back from SDRAM
   data_out        : OUT std_logic_vector(31 downto 0);
   data_out_ready  : OUT std_logic;

   -- SDRAM signals
   SDRAM_CLK  : OUT   std_logic;
   SDRAM_CKE  : OUT   std_logic;
   SDRAM_CS   : OUT   std_logic;
   SDRAM_RAS  : OUT   std_logic;
   SDRAM_CAS  : OUT   std_logic;
   SDRAM_WE   : OUT   std_logic;
   SDRAM_DQM  : OUT   std_logic_vector(1 downto 0);
   SDRAM_ADDR : OUT   std_logic_vector(12 downto 0);
   SDRAM_BA   : OUT   std_logic_vector(1 downto 0);
   SDRAM_DATA : INOUT std_logic_vector(15 downto 0)     
   );
END COMPONENT;

Using the SDRAM controller is pretty simple. Any time that cmd_ready is asserted by the memory controller and cmd_enable is asserted by your logic, the address, data, byte enable flags and write flag will be registered by the controller, and the command will be carried out as soon as possible. Should you want to write to memory, that is all that you have to do.

When performing reads, the values of the cmd_data_in and cmd_byte_enable inputs are ignored. As soon as the data has been read it will be presented on data_out, and data_out_ready will be asserted for one cycle. This may happen many cycles after the command is issued (depending on when a refresh cycle is due), but at some point during the command's execution cmd_ready will be asserted so you can queue up the next transaction.
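The handshake described above can be sketched from the user's side as follows. This is only an illustration and not part of the project files: the entity name sdram_client_sketch, its states, the test address and the test data are all made up for the example. It writes one word, reads it back, and raises done when the read data arrives.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical client of the controller (not part of the project files):
-- writes one 32-bit word, reads it back, then raises 'done'.
entity sdram_client_sketch is
   port ( clk             : in  std_logic;
          -- wire these to the SDRAM_Controller ports of the same name
          cmd_ready       : in  std_logic;
          cmd_enable      : out std_logic;
          cmd_wr          : out std_logic;
          cmd_address     : out std_logic_vector(22 downto 0);
          cmd_byte_enable : out std_logic_vector( 3 downto 0);
          cmd_data_in     : out std_logic_vector(31 downto 0);
          data_out        : in  std_logic_vector(31 downto 0);
          data_out_ready  : in  std_logic;
          -- results of the sketch
          read_value      : out std_logic_vector(31 downto 0) := (others => '0');
          done            : out std_logic := '0');
end sdram_client_sketch;

architecture Behavioral of sdram_client_sketch is
   -- Arbitrary values chosen for the example
   constant TEST_ADDRESS : std_logic_vector(22 downto 0) := (others => '0');
   constant TEST_DATA    : std_logic_vector(31 downto 0) := x"DEADBEEF";

   type client_state is (c_write, c_read, c_wait_data, c_done);
   signal state  : client_state := c_write;
   signal enable : std_logic := '0';
   signal wr     : std_logic := '1';
begin
   cmd_enable      <= enable;
   cmd_wr          <= wr;
   cmd_address     <= TEST_ADDRESS;
   cmd_byte_enable <= "1111";     -- write all four bytes (ignored for reads)
   cmd_data_in     <= TEST_DATA;  -- ignored for reads

client_proc: process(clk)
   begin
      if rising_edge(clk) then
         case state is
            when c_write =>
               -- Present the write and hold it until a clock edge where
               -- cmd_enable and cmd_ready are both '1' (command accepted)
               enable <= '1';
               if enable = '1' and cmd_ready = '1' then
                  wr    <= '0';   -- the next command is a read of the same address
                  state <= c_read;
               end if;

            when c_read =>
               if enable = '1' and cmd_ready = '1' then
                  enable <= '0';  -- read accepted, nothing more to issue
                  state  <= c_wait_data;
               end if;

            when c_wait_data =>
               -- data_out is only valid while data_out_ready is '1'
               if data_out_ready = '1' then
                  read_value <= data_out;
                  done       <= '1';
                  state      <= c_done;
               end if;

            when c_done =>
               null;
         end case;
      end if;
   end process;
end Behavioral;
```

The client keeps cmd_enable high between the write and the read, so the read is picked up as soon as the controller raises cmd_ready again.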

Different boards and different clock speeds require tweaking so that the SDRAM data lines (SDRAM_DATA) are latched at the appropriate time. I am testing this on the Logi-Pi board by ValentF(x), where as luck has it the same clock that manages the control signals can also be used to register the incoming data. Where that is not the case a separate capture clock is needed, which is what clk_mem is supposed to be used for. I've included, but commented out, a few different examples of latching on the rising or falling edge of the clock.

Here is my approximate timing budget:

  • speed of the I/O buffer for the SDRAM_CLK signal - 2ns approx,
  • PCB propagation delay - 0.7ns per inch - 0.5ns approx
  • SDRAM clock to data valid time - 7ns
  • propagation time of the DQ signals back - 0.5ns
  • and finally the speed of the input buffers - 2ns?

So there is maybe 15ns between the FPGA clock tick that triggers the presentation of the data on the SDRAM's DQ signals and the valid data arriving back in the FPGA from the SDRAM. At 100MHz this moves the data firmly into the next clock cycle. At 50MHz the data will be ready towards the end of the current clock cycle.
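Adding up the itemised estimates gives a slightly smaller figure than 15ns, so the 15ns presumably includes some margin (that reading is an assumption, it is not stated in the budget). Comparing the sum with the clock period shows why 100MHz and 50MHz behave differently:

```latex
\begin{align*}
t_{\text{round trip}} &\approx \underbrace{2}_{\text{clock out}} + \underbrace{0.5}_{\text{PCB}} + \underbrace{7}_{\text{clock to data}} + \underbrace{0.5}_{\text{PCB}} + \underbrace{2}_{\text{input buffer}} = 12\ \text{ns} \\
T_{100\,\text{MHz}} &= 10\ \text{ns} < 12\ \text{ns} \quad\Rightarrow\quad \text{data arrives in the next cycle} \\
T_{50\,\text{MHz}} &= 20\ \text{ns} > 12\ \text{ns} \quad\Rightarrow\quad \text{data arrives before the current cycle ends}
\end{align*}
```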

Performance

The design passes timing at 100MHz, and most likely could run faster. It is also small, using around 66 Spartan 6 slices.

This is a really, really simple controller. It opens and closes the SDRAM row for every transaction, so each transaction takes 16 or 17 cycles. With each transaction reading or writing 4 bytes the performance is about 24MB/s - about 1/8th of what it could be.
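The figures can be checked with a little arithmetic, assuming a 100MHz clock, refresh cycles ignored, and a peak rate of one 16-bit word per clock:

```latex
\begin{align*}
\text{writes: } & \frac{100 \times 10^{6}\ \text{cycles/s}}{16\ \text{cycles}} \times 4\ \text{bytes} = 25.0\ \text{MB/s} \\
\text{reads: }  & \frac{100 \times 10^{6}\ \text{cycles/s}}{17\ \text{cycles}} \times 4\ \text{bytes} \approx 23.5\ \text{MB/s} \\
\text{peak: }   & 100 \times 10^{6}\ \text{words/s} \times 2\ \text{bytes} = 200\ \text{MB/s}, \qquad \frac{25}{200} = \frac{1}{8}
\end{align*}
```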

Extending the data interface to 64 bits could boost throughput to about 45MB/s with only minimal changes to the logic. It would however add two more cycles of latency on the read transaction.

It should also be relatively simple to add back-to-back writes and back-to-back reads to the state machine, and this will enable performance of up to 200MB/s when a stream of reads or writes to the same SDRAM row is issued.

Another possible performance improvement could be to have an "idle with row open" state, which should allow reads and writes to the same row to proceed quickly, and then only precharge the row when a refresh cycle or change of row is required. This might boost performance to about 100MB/s depending on workload - a workload that reads from two different rows might actually see higher latency if it waits for the read to complete before issuing the next read.

Timing margin

On the Papilio Pro I've done some testing to assess the timing margin.

MHz   Cycle (ns)   Clock to capture (ns)   Papilio Pro
72    13.89        20.83                   FAILED
76    13.16        19.73                   FAILED
80    12.50        18.75                   OK
84    11.91        17.86                   OK
88    11.36        17.05                   OK
92    10.87        16.30                   OK
96    10.42        15.62                   OK
100   10.00        15.00                   OK
104   9.61         14.42                   OK
108   9.26         13.89                   OK
112   8.93         13.39                   OK
116   8.62         12.93                   OK
120   8.33         12.50                   OK
124   8.06         12.10                   OK
128   7.81         11.72                   OK
133   7.50         11.25                   FAILED

Project files

Version 0.1 - minimal controller

Here are zip files of the whole ISE project:

File:Logipi-sdram v2.zip

File:Papilio pro sdram.zip

The Papilio Pro version has different column/row widths (to account for the different SDRAM part), and runs at 96MHz rather than 100MHz (due to the 32MHz system clock). It also still needs a bit of customization to halve the number of refresh cycles.

Version 0.3 - A more advanced controller

Version 0.3 is much improved. The following changes have been made:

  1. There are now four generics to configure the SDRAM controller for different parts
    • sdram_address_width - total number of bits in the SDRAM's address (e.g. row bits + bank bits + column bits)
    • sdram_column_bits - number of bits in the SDRAM's column address
    • sdram_startup_cycles - how long (in memory cycles) the startup process should take. This is usually the equivalent of 101us.
    • cycles_per_refresh - how often (in memory cycles) a refresh is required
  2. Now can perform back-to-back transactions in the same row.

File:Papilio pro sdram v0.3.zip

The performance is now improved:

  • interleaved single word reads and writes now get around 80MB/s
  • blocks of writes get about 124MB/s
  • blocks of reads are about 186MB/s

The downside is that the size has grown to about 80 slices.

Version 0.4 - A little bit better

This is pretty much optimal for a memory controller that doesn't hold multiple banks open at once, or reorder or merge writes.

It is optimised to close rows when it becomes idle, to lower the maximum latency for the next transaction. This might not be the best strategy for your use-case.

File:Papilio pro sdram v0.4.zip

Additional performance improvements:

  • blocks of reads and writes now both get about 186MB/s
  • Better scheduling of refresh cycles has halved refresh overhead by taking advantage of row/bank switches
  • Squeezed back to 62 slices or so, depending on use

The majority of the controller can now be configured for different SDRAM devices using the following four generic parameters:

   generic (
     sdram_address_width : natural;
     sdram_column_bits   : natural;
     sdram_startup_cycles: natural;
     cycles_per_refresh  : natural
   );
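As a rough illustration of how the four generics might be derived for a given part and clock, here is a small helper package. It is a sketch and not part of the project files: the package name, the helper constants and the example geometry (13 row bits, 2 bank bits, 9 column bits, 8192 refreshes per 64ms, 100MHz) are assumptions. The "-1" on cycles_per_refresh mirrors refresh_max in the simple controller; whether the later versions expect it is also an assumption.

```vhdl
-- Hypothetical helper package (not part of the project files).
-- Example geometry: 13 row bits, 2 bank bits and 9 column bits,
-- refreshed 8192 times every 64ms, clocked at 100MHz.
package sdram_generics_sketch is
   constant CLK_MHZ           : natural := 100;
   constant ROW_BITS          : natural := 13;
   constant BANK_BITS         : natural := 2;
   constant COLUMN_BITS       : natural := 9;
   constant REFRESH_COMMANDS  : natural := 8192;
   constant REFRESH_WINDOW_US : natural := 64000;  -- 64ms
   constant STARTUP_US        : natural := 101;    -- just over the 100us minimum

   -- row bits + bank bits + column bits = 24
   constant sdram_address_width  : natural := ROW_BITS + BANK_BITS + COLUMN_BITS;
   constant sdram_column_bits    : natural := COLUMN_BITS;
   -- 101us at 100MHz = 10100 cycles
   constant sdram_startup_cycles : natural := STARTUP_US * CLK_MHZ;
   -- 6,400,000 cycles / 8192 = 781 (rounded down), less one = 780
   constant cycles_per_refresh   : natural := (REFRESH_WINDOW_US * CLK_MHZ) / REFRESH_COMMANDS - 1;
end package sdram_generics_sketch;
```

Working in microseconds and MHz keeps every intermediate value well inside the range of a VHDL integer.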

Version 0.5 - Enhanced debugging

I've been doing a lot of debugging to identify PCB issues. The actual controller code is much the same as v0.4, but the tester/debugging code is much improved.

File:Logipi sdram v0.5.zip

File:Logipi sdram v0.5.RA2.zip

File:PapPro sdram v0.5.zip

Version 0.6 - Bugfix

There were issues with back-to-back reads in version 0.5 (and possibly earlier). Please do not use those versions!

File:Logipi sdram v0.6.zip

File:Logipi sdram v0.6.RA2.zip

File:PapPro sdram v0.6.zip


Verilog version 0.1

This is a conversion of the v0.6 code to Verilog, for use on an Open Source Hardware Project. It has passed the same verification tests as the VHDL version, but as it is my first ever bit of Verilog it may be very ungainly. Please don't laugh too hard!

File:Verilog Memory controller v0.1.zip

SDRAM_controller source

Here is the source of the simplest controller. Performance is pretty average, taking 22 seconds to test the 8MB part on the Papilio Pro. The performance is much improved in the v0.4 controller, which takes around 4 seconds per test.

 
----------------------------------------------------------------------------------
-- Engineer: Mike Field <hamster@snap.net.nz>
-- 
-- Create Date:    14:09:12 09/15/2013 
-- Module Name:    SDRAM_Controller - Behavioral 
-- Description:    Simple SDRAM controller for a Micron 48LC16M16A2-7E
--                 or Micron 48LC4M16A2-7E @ 100MHz      
-- Revision: 
-- Revision 0.02 - Removed second clock signal that isn't needed.
-- Additional Comments: 
--
-- Performance is about
-- Writes 16 cycles = 6,250,000 writes/sec = 25.0MB/s (excluding refresh)
-- Reads  17 cycles = 5,882,352 reads/sec  = 23.5MB/s (excluding refresh)
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
library UNISIM;
use UNISIM.VComponents.all;
use IEEE.NUMERIC_STD.ALL;


entity SDRAM_Controller is
    Port ( clk           : in  STD_LOGIC;
           reset         : in  STD_LOGIC;
           
           -- Interface to issue reads or write data
           cmd_ready         : out STD_LOGIC;                     -- '1' when a new command will be acted on
           cmd_enable        : in  STD_LOGIC;                     -- Set to '1' to issue new command (only acted on when cmd_ready = '1')
           cmd_wr            : in  STD_LOGIC;                     -- Is this a write?
           cmd_address       : in  STD_LOGIC_VECTOR(22 downto 0); -- address to read/write
           cmd_byte_enable   : in  STD_LOGIC_VECTOR(3 downto 0);  -- byte masks for the write command
           cmd_data_in       : in  STD_LOGIC_VECTOR(31 downto 0); -- data for the write command
           
           data_out          : out STD_LOGIC_VECTOR(31 downto 0); -- word read from SDRAM
           data_out_ready    : out STD_LOGIC;                     -- is new data ready?
           
           -- SDRAM signals
           SDRAM_CLK     : out   STD_LOGIC;
           SDRAM_CKE     : out   STD_LOGIC;
           SDRAM_CS      : out   STD_LOGIC;
           SDRAM_RAS     : out   STD_LOGIC;
           SDRAM_CAS     : out   STD_LOGIC;
           SDRAM_WE      : out   STD_LOGIC;
           SDRAM_DQM     : out   STD_LOGIC_VECTOR( 1 downto 0);
           SDRAM_ADDR    : out   STD_LOGIC_VECTOR(12 downto 0);
           SDRAM_BA      : out   STD_LOGIC_VECTOR( 1 downto 0);
           SDRAM_DATA    : inout STD_LOGIC_VECTOR(15 downto 0));
end SDRAM_Controller;

architecture Behavioral of SDRAM_Controller is
   -- From page 37 of MT48LC16M16A2 datasheet
   -- Name (Function)       CS# RAS# CAS# WE# DQM  Addr    Data
   -- COMMAND INHIBIT (NOP)  H   X    X    X   X     X       X
   -- NO OPERATION (NOP)     L   H    H    H   X     X       X
   -- ACTIVE                 L   L    H    H   X  Bank/row   X
   -- READ                   L   H    L    H  L/H Bank/col   X
   -- WRITE                  L   H    L    L  L/H Bank/col Valid
   -- BURST TERMINATE        L   H    H    L   X     X     Active
   -- PRECHARGE              L   L    H    L   X   Code      X
   -- AUTO REFRESH           L   L    L    H   X     X       X 
   -- LOAD MODE REGISTER     L   L    L    L   X  Op-code    X 
   -- Write enable           X   X    X    X   L     X     Active
   -- Write inhibit          X   X    X    X   H     X     High-Z

   -- Here are the commands mapped to constants   
   constant CMD_UNSELECTED    : std_logic_vector(3 downto 0) := "1000";
   constant CMD_NOP           : std_logic_vector(3 downto 0) := "0111";
   constant CMD_ACTIVE        : std_logic_vector(3 downto 0) := "0011";
   constant CMD_READ          : std_logic_vector(3 downto 0) := "0101";
   constant CMD_WRITE         : std_logic_vector(3 downto 0) := "0100";
   constant CMD_TERMINATE     : std_logic_vector(3 downto 0) := "0110";
   constant CMD_PRECHARGE     : std_logic_vector(3 downto 0) := "0010";
   constant CMD_REFRESH       : std_logic_vector(3 downto 0) := "0001";
   constant CMD_LOAD_MODE_REG : std_logic_vector(3 downto 0) := "0000";

   constant MODE_REG          : std_logic_vector(12 downto 0) := 
    -- Reserved, wr burst, OpMode, CAS Latency (2), Burst Type, Burst Length (2)
         "000" &   "0"  &  "00"  &    "010"      &     "0"    &   "001";

   signal iob_command     : std_logic_vector( 3 downto 0) := CMD_NOP;
   signal iob_address     : std_logic_vector(12 downto 0) := (others => '0');
   signal iob_data        : std_logic_vector(15 downto 0) := (others => '0');
   signal iob_dqm         : std_logic_vector( 1 downto 0) := (others => '0');
   signal iob_cke         : std_logic := '0';
   signal iob_bank        : std_logic_vector( 1 downto 0) := (others => '0');
   
   attribute IOB: string;
   attribute IOB of iob_command: signal is "true";
   attribute IOB of iob_address: signal is "true";
   attribute IOB of iob_dqm    : signal is "true";
   attribute IOB of iob_cke    : signal is "true";
   attribute IOB of iob_bank   : signal is "true";
   attribute IOB of iob_data   : signal is "true";
   
   signal captured_data      : std_logic_vector(15 downto 0) := (others => '0');
   signal captured_data_last : std_logic_vector(15 downto 0) := (others => '0');
   signal sdram_din          : std_logic_vector(15 downto 0);
   attribute IOB of captured_data : signal is "true";
   
   type fsm_state is (s_startup,
                      s_idle_in_9,   
                      s_idle_in_8,   
                      s_idle_in_7,   
                      s_idle_in_6,   
                      s_idle_in_5, s_idle_in_4,   s_idle_in_3, s_idle_in_2, s_idle_in_1,
                      s_idle,
                      s_open_in_2, s_open_in_1,
                      s_write_1, s_write_2, s_write_3,
                      s_read_1,  s_read_2,  s_read_3,  s_read_4,  
                      s_precharge
                      );

   signal state              : fsm_state := s_startup;
   signal startup_wait_count : unsigned(15 downto 0) := to_unsigned(10100,16);  -- 10100
   
   signal refresh_count   : unsigned(9 downto 0) := (others => '0');
   signal pending_refresh : std_logic := '0';
   constant refresh_max   : unsigned(9 downto 0) := to_unsigned(3200000/8192-1,10);  -- one refresh every 390 cycles: 8192 refreshes every 32ms @ 100MHz (twice as often as the 64ms requirement)
   
   signal addr_row         : std_logic_vector(12 downto 0);
   signal addr_col         : std_logic_vector(12 downto 0);
   signal addr_bank        : std_logic_vector( 1 downto 0);
   
   -- signals to hold the requested transaction
   signal save_wr          : std_logic := '0';
   signal save_row         : std_logic_vector(12 downto 0);
   signal save_bank        : std_logic_vector( 1 downto 0);
   signal save_col         : std_logic_vector(12 downto 0);
   signal save_d_in        : std_logic_vector(31 downto 0);
   signal save_byte_enable : std_logic_vector( 3 downto 0);
   
   signal iob_dq_hiz     : std_logic := '1';

   -- signals for when to read the data off of the bus
   signal data_ready_delay : std_logic_vector( 4 downto 0);
   
   signal ready_for_new   : std_logic := '0';
   signal got_transaction : std_logic := '0';
begin
   -- tell the outside world when we can accept a new transaction;
   cmd_ready <= ready_for_new;

   ----------------------------------------------------------------------------
   -- Separate the address into row / bank / column
   -- for the x16 part, columns are addr(8:0).
   -- for 32 bit (2 word bursts), the lowest bit will be controlled by the FSM
   ----------------------------------------------------------------------------
   -- for the logi-pi
   ----------------------------------------------------------------------------
   -- addr_row  <= cmd_address(22 downto 10);  
   -- addr_bank <= cmd_address( 9 downto  8);
   -- addr_col  <= cmd_address( 7 downto  0) & '0';   -- This changes for the x4, x8 or x16 parts.   
   ----------------------------------------------------------------------------
   --- for the papilio pro
   ----------------------------------------------------------------------------
   addr_row  <= cmd_address(21 downto  9);  
   addr_bank <= cmd_address( 8 downto  7);
   addr_col  <= "00000" & cmd_address( 6 downto  0) & '0';   -- This changes for the x4, x8 or x16 parts.   

   -----------------------------------------------------------
   -- Forward the SDRAM clock to the SDRAM chip - 180 degrees 
   -- out of phase with the control signals (ensuring setup and hold)
   -----------------------------------------------------------
 sdram_clk_forward : ODDR2
   generic map(DDR_ALIGNMENT => "NONE", INIT => '0', SRTYPE => "SYNC")
   port map (Q => sdram_clk, C0 => clk, C1 => not clk, CE => '1', R => '0', S => '0', D0 => '0', D1 => '1');

   -----------------------------------------------
   --!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
   --!! Ensure that all outputs are registered. !!
   --!! Check the pinout report to be sure      !!
   --!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
   -----------------------------------------------
   sdram_cke  <= iob_cke;
   sdram_CS   <= iob_command(3);
   sdram_RAS  <= iob_command(2);
   sdram_CAS  <= iob_command(1);
   sdram_WE   <= iob_command(0);
   sdram_dqm  <= iob_dqm;
   sdram_ba   <= iob_bank;
   sdram_addr <= iob_address;
   
iob_dq_g: for i in 0 to 15 generate
   begin
iob_dq_iob: IOBUF
   generic map (DRIVE => 12, IOSTANDARD => "LVTTL", SLEW => "FAST")
   port map ( O  => sdram_din(i),   -- Buffer output
              IO => sdram_data(i),  -- Buffer inout port (connect directly to top-level port)
              I  => iob_data(i),    -- Buffer input
              T  => iob_dq_hiz      -- 3-state enable input, high=input, low=output 
   );
end generate;
   
                                                 
capture_proc: process(clk) 
   begin
     if rising_edge(clk) then
         captured_data      <= sdram_din;
      end if;
   end process;
   

main_proc: process(clk) 
   begin
      if rising_edge(clk) then
         captured_data_last <= captured_data;
      
         ------------------------------------------------
         -- Default state is to do nothing
         ------------------------------------------------
         iob_command     <= CMD_NOP;
         iob_address     <= (others => '0');
         iob_bank        <= (others => '0');
         iob_dqm         <= (others => '1');  

         -- countdown for initialisation
         startup_wait_count <= startup_wait_count-1;
         
         -- Logic to decide when to refresh
         if refresh_count /= refresh_max then
            refresh_count <= refresh_count + 1;
         else
            refresh_count <= (others => '0');
            if state /= s_startup then
               pending_refresh <= '1';
            end if;
         end if;
         
         ---------------------------------------------
         -- If we are ready for a new transaction 
         -- and one is being presented, then accept it
         -- remember what we are reading or writing
         ---------------------------------------------
         if ready_for_new = '1' and cmd_enable = '1' then
            save_row         <= addr_row;
            save_bank        <= addr_bank;
            save_col         <= addr_col;
            save_wr          <= cmd_wr; 
            save_d_in        <= cmd_data_in;
            save_byte_enable <= cmd_byte_enable;
            got_transaction  <= '1';
            ready_for_new    <= '0';
         end if;

         ------------------------------------------------
         -- Read transactions are completed when the last
         -- word of data has been latched. Writes are 
         -- completed when the data has been sent
         ------------------------------------------------
         data_out_ready <= '0';
         if data_ready_delay(0) = '1' then
            data_out <= captured_data & captured_data_last;
            data_out_ready <= '1';
         end if;

         -- update shift registers used to present data read from memory
         data_ready_delay <= '0' & data_ready_delay(data_ready_delay'high downto 1);
         
         --
         case state is 
            when s_startup =>
               ------------------------------------------------------------------------
               -- This is the initial startup state, where we wait for at least 100us
               -- before starting the start sequence
               -- 
               -- The initialisation sequence is 
               --  * de-assert SDRAM_CKE
               --  * 100us wait, 
               --  * assert SDRAM_CKE
               --  * wait at least one cycle, 
               --  * PRECHARGE
               --  * wait 2 cycles
               --  * REFRESH, 
               --  * tRFC wait
               --  * REFRESH, 
               --  * tRFC wait 
               --  * LOAD_MODE_REG 
               --  * 2 cycles wait
               ------------------------------------------------------------------------
                  iob_CKE <= '1';
               
               if startup_wait_count = 21 then      
                   -- ensure all rows are closed
                  iob_command     <= CMD_PRECHARGE;
                  iob_address(10) <= '1';  -- all banks
                  iob_bank        <= (others => '0');
               elsif startup_wait_count = 18 then   
                  -- these refreshes need to be at least tRFC (66ns) apart
                  iob_command     <= CMD_REFRESH;
               elsif startup_wait_count = 11 then
                  iob_command     <= CMD_REFRESH;
               elsif startup_wait_count = 4 then    
                  -- Now load the mode register
                  iob_command     <= CMD_LOAD_MODE_REG;
                  iob_address     <= MODE_REG;
               else
                  iob_command     <= CMD_NOP;
               end if;

               pending_refresh    <= '0';

               if startup_wait_count = 0 then
                  state           <= s_idle;
                  ready_for_new   <= '1';
                  got_transaction <= '0';
               end if;
            when s_idle_in_9 => state <= s_idle_in_8;
            when s_idle_in_8 => state <= s_idle_in_7;
            when s_idle_in_7 => state <= s_idle_in_6;
            when s_idle_in_6 => state <= s_idle_in_5;
            when s_idle_in_5 => state <= s_idle_in_4;
            when s_idle_in_4 => state <= s_idle_in_3;
            when s_idle_in_3 => state <= s_idle_in_2;
            when s_idle_in_2 => state <= s_idle_in_1;
            when s_idle_in_1 => state <= s_idle;

            when s_idle =>
               -- Priority is to issue a refresh if one is outstanding
               if pending_refresh = '1' then
                  ------------------------------------------------------------------------
                  -- Start the refresh cycle. 
                  -- This takes tRFC (66ns), so 6 idle cycles are needed @ 100MHz
                  ------------------------------------------------------------------------
                  state            <= s_idle_in_6;
                  iob_command      <= CMD_REFRESH;
                  pending_refresh  <= '0';
               elsif got_transaction = '1' then
                  --------------------------------
                  -- Start the read or write cycle. 
                  -- First task is to open the row
                  --------------------------------
                  state       <= s_open_in_2;
                  iob_command <= CMD_ACTIVE;
                  iob_address <= save_row;
                  iob_bank    <= save_bank;
               end if;               
            ------------------------------------------
            -- Opening the row ready for read or write
            ------------------------------------------
            when s_open_in_2 => state <= s_open_in_1;

            when s_open_in_1 =>
               -- still waiting for row to open
               if save_wr = '1' then
                  state              <= s_write_1;
                  iob_dq_hiz         <= '0';
                  iob_data           <= save_d_in(15 downto 0); -- get the DQ bus out of HiZ early
               else
                  iob_dq_hiz         <= '1';
                  state              <= s_read_1;
                  ready_for_new      <= '1'; -- we will be ready for a new transaction next cycle!
                  got_transaction    <= '0';
               end if;

            ----------------------------------
            -- Processing the read transaction
            ----------------------------------
            when s_read_1 =>
               state              <= s_read_2;
               iob_command     <= CMD_READ;
               iob_address     <= save_col; 
               iob_address(10) <= '0'; -- A10 actually matters - it selects auto precharge
               iob_bank        <= save_bank;
               
               -- Schedule reading the data values off the bus
               data_ready_delay(data_ready_delay'high)   <= '1';
               
               -- Set the data masks to read all bytes
               iob_dqm         <= (others => '0');    -- For CAS = 2
               
            when s_read_2 =>
               state              <= s_read_3;
               -- Set the data masks to read all bytes
               iob_dqm         <= (others => '0');   -- For CAS = 2 or CAS = 3

            when s_read_3 => state <= s_read_4;
               -- iob_dqm         <= (others => '0');    -- For CAS = 3
            when s_read_4 => state <= s_precharge;

            -------------------------------------------------------------------
            -- Processing the write transaction
            -------------------------------------------------------------------
            when s_write_1 =>
               state              <= s_write_2;
               iob_command     <= CMD_WRITE;
               iob_address     <= save_col; 
               iob_address(10) <= '0'; -- A10 actually matters - it selects auto precharge
               iob_bank        <= save_bank;
               iob_dqm         <= NOT save_byte_enable(1 downto 0);    
               iob_data        <= save_d_in(15 downto 0);
               ready_for_new   <= '1';
               got_transaction <= '0';
            when s_write_2 =>
               state           <= s_write_3;
               iob_dqm         <= NOT save_byte_enable(3 downto 2);    
               iob_data        <= save_d_in(31 downto 16);
         
            when s_write_3 =>  -- must wait tRDL, hence the extra idle state
               iob_dq_hiz         <= '1';
               state              <= s_precharge;

            -------------------------------------------------------------------
            -- Closing the row off (this closes all banks)
            -------------------------------------------------------------------
            when s_precharge =>
               state           <= s_idle_in_9;
               iob_command     <= CMD_PRECHARGE;
               iob_address(10) <= '1'; -- A10 actually matters - it selects all banks or just one

            -------------------------------------------------------------------
            -- We should never get here, but if we do then reset the memory
            -------------------------------------------------------------------
            when others => 
               state <= s_startup;
               ready_for_new       <= '0';
               startup_wait_count  <= to_unsigned(10100,16);
         end case;
         
         -- Sync reset
         if reset = '1' then
            state               <= s_startup;
            ready_for_new       <= '0';
            startup_wait_count  <= to_unsigned(10100,16);
         end if;
      end if;      
   end process;
end Behavioral;

Implementation details

SDRAM Command set

Although SDRAM's control signals are much like those of async RAM, they are better thought of as forming one command channel. Here are the commands:

Command             Action
NOP                 Do nothing
AUTO REFRESH        Perform a refresh cycle - no row should be active.
ACTIVE              Retrieve a row of data from the memory array
READ                Read a burst of words from the current row
WRITE               Write a burst of words to the current row
PRECHARGE           Save the current row back into the memory array
TERMINATE           Stop the current burst transfer
LOAD MODE REGISTER  Load the SDRAM's internal mode register (data is on the ADDRESS bus)

The basic operation is pretty easy - issue an ACTIVE, then perform some READs and WRITEs, and finally issue a PRECHARGE to store the values away in the memory array. In addition an occasional AUTO REFRESH cycle is needed to ensure that the DRAM cells keep the correct values.

The actual commands are passed using three control signals when CS is asserted (CS, RAS, CAS and WE are all active low):

Command                                                  RAS  CAS  WE
NO OPERATION (NOP)                                       H    H    H
ACTIVE (select bank and activate row)                    L    H    H
READ (select bank and column, and start READ burst)      H    L    H
WRITE (select bank and column, and start WRITE burst)    H    L    L
BURST TERMINATE                                          H    H    L
PRECHARGE (Deactivate row in bank or banks)              L    H    L
AUTO REFRESH or SELF REFRESH (enter self refresh mode)   L    L    H
LOAD MODE REGISTER                                       L    L    L

Command Latencies

The hard bit about dealing with SDRAM is that nearly all the commands have timing restrictions. The art to designing a good memory controller is in ensuring that these restrictions are met, and minimal NOPs are needed to maximise throughput. This memory controller will have little artistic merit!

Here are the restrictions:

Command | Before command | During command | After command
NO OPERATION (NOP) | No restriction | No restriction | No restriction
ACTIVE (select bank and activate row) | No bank may have activated a row within the last tRRD (14ns) | No restriction | Takes three cycles here (covers tRCD)
READ (select bank and column, and start READ burst) | Row must be active | No restriction | Data bus must be available when transfer occurs
WRITE (select bank and column, and start WRITE burst) | Row must be active; no read transfer in progress | No restriction | (however a gap is needed before a PRECHARGE)
BURST TERMINATE | A READ or WRITE should be in progress | No restriction | Takes ? cycles to take effect
PRECHARGE (Deactivate row in bank or all banks) | A row must be active. No write in the two cycles beforehand (tRDL) | No restriction | Takes 15ns (tRP)
AUTO REFRESH | No rows can be active | 8192 must be issued every 64ms | Takes at least 66ns (tRFC)
LOAD MODE REGISTER | No rows can be active. At least two refresh cycles must have occurred | No restriction | Takes tMRD to take effect
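Restrictions quoted in nanoseconds have to be turned into whole clock cycles before they can be used in a state machine. A minimal sketch of that conversion, assuming a 100MHz clock and using the 15ns (tRP) and 66ns (tRFC) figures quoted on this page; the package and constant names are made up for the example:

```vhdl
-- Hypothetical package: timings are in picoseconds so that plain
-- integer maths can be used.
package sdram_timing_sketch is
   constant CLK_PERIOD_PS : natural := 10000;  -- 100MHz

   constant T_RP_PS       : natural := 15000;  -- PRECHARGE to next command
   constant T_RFC_PS      : natural := 66000;  -- AUTO REFRESH to next command

   -- Ceiling division: cycles from a command to the next allowed command
   constant T_RP_CYCLES   : natural := (T_RP_PS  + CLK_PERIOD_PS - 1) / CLK_PERIOD_PS;  -- 2
   constant T_RFC_CYCLES  : natural := (T_RFC_PS + CLK_PERIOD_PS - 1) / CLK_PERIOD_PS;  -- 7

   -- Idle (NOP) cycles needed in between the two commands
   constant T_RP_NOPS     : natural := T_RP_CYCLES  - 1;  -- 1
   constant T_RFC_NOPS    : natural := T_RFC_CYCLES - 1;  -- 6
end package sdram_timing_sketch;
```

The six NOPs for tRFC agree with the "6 idle cycles are needed @ 100MHz" comment in the controller source.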

Ensuring stable timing of signals

To ensure minimal signal skew, all signals should be registered in the I/O blocks of the FPGA. This can be quite painful, and introduces another clock cycle of latency.

This can be done by adding attributes to the signals:

 attribute IOB: string;
 attribute IOB of iob_command: signal is "true";

And can be verified by looking at the IOB Properties report:

Iob props.jpg

Startup sequence

The initialisation sequence for a SDRAM module is quite complex. Here are the requirements:

  1. Apply power
  2. Keep CKE at a LVTTL logic LOW
  3. Provide stable CLOCK signal
  4. Wait at least 100μs prior to issuing any command other than a NOP, or keep CS high
  5. Starting at some point during this 100μs period, bring CKE HIGH.
  6. Apply one or more NOP commands
  7. Perform a PRECHARGE ALL command.
  8. Wait at least tRP time, issuing NOP commands
  9. Issue an AUTO REFRESH command.
  10. Wait at least tRFC time, issuing NOP commands
  11. Issue an AUTO REFRESH command.
  12. Wait at least tRFC time, issuing NOP commands
  13. Program the mode register with a LOAD MODE REGISTER command.
  14. Wait at least tMRD time, issuing NOP commands
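The simple controller's startup state issues these commands when startup_wait_count reaches 21, 18, 11 and 4. The spacing can be checked against the waits in the list above with a simulation-only sketch like this one. The entity name is made up, the clock is assumed to be 100MHz, and the two-clock value for tMRD is an assumption that should be checked against the datasheet:

```vhdl
-- Hypothetical simulation-only check (no ports, nothing to synthesise).
entity startup_schedule_sketch is
end entity startup_schedule_sketch;

architecture sim of startup_schedule_sketch is
   constant CLK_PERIOD : time := 10 ns;           -- 100MHz
   constant T_RP       : time := 15 ns;           -- quoted in the latency table
   constant T_RFC      : time := 66 ns;           -- quoted in the controller source
   constant T_MRD      : time := 2 * CLK_PERIOD;  -- assumption: two clocks

   -- startup_wait_count values at which the source issues each command
   constant AT_PRECHARGE : natural := 21;
   constant AT_REFRESH_1 : natural := 18;
   constant AT_REFRESH_2 : natural := 11;
   constant AT_LOAD_MODE : natural := 4;
   constant AT_IDLE      : natural := 0;
begin
   -- 3 cycles = 30ns
   assert (AT_PRECHARGE - AT_REFRESH_1) * CLK_PERIOD >= T_RP
      report "PRECHARGE to first REFRESH is shorter than tRP" severity failure;
   -- 7 cycles = 70ns
   assert (AT_REFRESH_1 - AT_REFRESH_2) * CLK_PERIOD >= T_RFC
      report "First to second REFRESH is shorter than tRFC" severity failure;
   -- 7 cycles = 70ns
   assert (AT_REFRESH_2 - AT_LOAD_MODE) * CLK_PERIOD >= T_RFC
      report "Second REFRESH to LOAD MODE REGISTER is shorter than tRFC" severity failure;
   -- 4 cycles = 40ns
   assert (AT_LOAD_MODE - AT_IDLE) * CLK_PERIOD >= T_MRD
      report "LOAD MODE REGISTER to idle is shorter than tMRD" severity failure;
end architecture sim;
```

Changing CLK_PERIOD shows how far the clock could be pushed before the fixed countdown values no longer cover the required waits.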

Refresh cycles

To ensure that the DRAM does not lose its contents, 8192 AUTO REFRESH commands need to be issued every 64ms, and no row may be active when an AUTO REFRESH command is issued. After a refresh command is issued you must wait at least 66ns (tRFC) before issuing any other commands.

This can be quite painful as it interrupts what can otherwise be a smooth flow of data. The interruption is far bigger than the 66ns, as you have to PRECHARGE any open row, then ACTIVATE it again before the next READ or WRITE.
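To put numbers on this, assume a 100MHz clock and that the row is already closed when the refresh is due, as it always is when the simple controller is idle. Each refresh then costs the REFRESH command plus six idle cycles:

```latex
\begin{align*}
\text{refresh interval} &= \frac{64\ \text{ms}}{8192} = 7.8125\ \mu\text{s} \approx 781\ \text{cycles at } 100\ \text{MHz} \\
\text{refresh overhead} &\approx \frac{7\ \text{cycles}}{781\ \text{cycles}} \approx 0.9\%
\end{align*}
```

With a row held open, the PRECHARGE and ACTIVE either side of the refresh add to the seven cycles, which is where the extra cost described above comes from.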

Progress so far

The memory controller

Update 15/10/2013 Controller pretty much finished, with v0.4 release now up on the web site.

Update 09/10/2013 Found and fixed the last known bug (thanks Jim!) - verified all 8MB of RAM with 32 different patterns - 512 million transactions per verification pass (80 seconds) for the Logi-Pi.

Update 08/10/2013 Verified for the first 1024 SDRAM words - still have an issue when accessing rows other than zero - I think I have this fixed as I was precharging the wrong bank.

Update 03/10/2013 Verified the first sixteen 32-bit addresses (addresses 0 through 31). Well on the way to working!

Update 02/10/2013 After much testing and head scratching I've found the problem with the prototype board - I've got a solder bridge!

Solder bridge.jpg

With that removed the design is showing signs of life!

Update 25/09/2013 This is just about ready for testing. I've had to set the "Pack I/O registers/latches into IOBs" option to "For Inputs and Outputs" for the project.

It still has a lot of work to be done on it, but it looks OK in simulation. Here is the trace of the initialization phase (PRECHARGE, REFRESH, REFRESH, LOAD_MODE_REG):

Mc init.jpg

And here is the trace of a write followed by a read (I've left the LOAD MODE REG on the left hand side):

Mc wr read.jpg

At the moment it has an extra 'read' state in the FSM that is there to help with testing.

References

http://download.micron.com/pdf/datasheets/dram/sdram/256MSDRAM.pdf
