Loading BRAM data

From Hamsterworks Wiki!


Loading your own programs into block RAM

There are many ways to load data into a block RAM. These fall into two general categories: pre-synthesis and post-synthesis. Assuming that you are going to place a compiled AVR project into the program memory of the AVR8 processor, you will most probably want to use both at different times: pre-synthesis is best suited to debugging test programs for the FPGA "hardware", whereas post-synthesis is really convenient for deploying new software revisions.



The pre-synthesis approach involves adding the required data into the FPGA source tree, and then building a new FPGA programming bitstream.

Pros:

  • Data is present in BRAM, allowing source-level simulation
  • Easy to visualise and understand

Cons:

  • Long turnaround time, as a full rebuild is required to change RAM contents before configuring devices
  • Data must be converted into an FPGA-tool-friendly format
  • Data must be known at design time
  • A full FPGA development toolset is needed to update the BRAM contents
  • If you use the VHDL 'GENERATE' construct to create multiple BRAM blocks they will all have the same contents, limiting you to around 1k of code
  • Only really works with BRAMs whose width matches that of the target architecture. Putting data into an 8kx8bit memory built out of eight 8kx1bit bit-planes isn't feasible
  • A new FPGA build is required for each different BRAM contents (in this case, programs)


The post-synthesis approach involves updating the contents of the FPGA bitstream to contain the correct data in the correct location.

Pros:

  • All BRAM instances can have different contents
  • Only a limited FPGA toolset is required to merge the new BRAM contents into the bitstream (the data2mem utility)
  • Fast turnaround, as no project rebuild is required
  • Can usually be integrated into a software development toolchain (e.g. a "makefile")
  • Can handle more complex memory configurations such as 'bit slices'
  • You build the hardware once and then merge many different programs quickly

Cons:

  • Cannot be used with source-level simulation, but can be used with device-level simulation
  • Far more complex to understand and implement

How to update BRAM contents pre-synthesis

The recipe for this is:

  • Convert your '.hex' file to VHDL 'INIT_xx' attributes. In my Papilio build these live in XPM8kx16.vhd
  • Replace the existing 'INIT_xx' in the PM_INST instance in the AVR8 processor
  • Rebuild the project
  • Configure the device with the resulting bit file

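Note that the 'INIT_xx' parameters are interpreted like really large integers (32-byte integers) written least-significant byte last, so the byte at the lowest address sits at the right-hand end of the string. So if you wanted to have the following bytes in memory at address 0000: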

 0000: 00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF

Your init string will be:

 INIT_00 => X"00000000000000000000000000000000FFEEDDCCBBAA99887766554433221100",

This doesn't make much sense until put in the context that BRAM blocks can have different widths.

I've written a small 'C' program (available in the Papilio playground) which takes an Intel '.hex' file and outputs the lines to cut and paste into the VHDL source.

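// hex2mem.cpp
// Converts an Intel '.hex' file into VHDL 'INIT_xx' lines.
// #include "stdafx.h"   // only needed for Visual Studio precompiled headers
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

char buffer[1000];
FILE *f;

unsigned char data[256*256];
int maxaddr = 0;

// Convert one hex digit to its value (reports and returns 0 on bad input)
int htoi(char c)
{
  if(c >= '0' && c <= '9')   return c-'0';
  if(c >= 'a' && c <= 'f')   return c-'a'+10;
  if(c >= 'A' && c <= 'F')   return c-'A'+10;
  printf("Invalid hex\n");
  return 0;
}

// Read one line from 'f' into 'buffer'; returns 0 once nothing is left
int readaline(char * buffer)
{
  int c;
  int i = 0;
  c = getc(f);
  while(c != EOF && c != '\n')   {
    if(i < 999)
      buffer[i++] = c;
    c = getc(f);
  }
  buffer[i] = '\0';
  return i > 0;
}

// Decode one ':llaaaatt<data>cc' record into the data[] array
void decodeBytes(char *buffer)
{
   int len, addr,type;
   int i;
   if(strlen(buffer) < 10)return;
   len  = (htoi(buffer[1])<<4) +htoi(buffer[2]);
   addr = (htoi(buffer[3])<<12)+(htoi(buffer[4])<<8)
        + (htoi(buffer[5])<<4) +htoi(buffer[6]);
   type = (htoi(buffer[7])<<4) +htoi(buffer[8]);
   if(type != 0) // skip the EOF marker (type 1) and other non-data records
     return;

   for(i = 0; i < len; i++) {
     if(addr+i >= (int)sizeof(data))
        break;
     data[addr+i] = htoi(buffer[9+i*2])*16+htoi(buffer[10+i*2]);
     if(addr+i > maxaddr)
        maxaddr = addr+i;
   }
}

// Print one INIT_xx line per 32 bytes, highest byte first
void dump(void)
{
  int addr = 0;
  int line = 0;
  while(addr <= maxaddr)   {
     int i;
     printf("INIT_%02X => X\"",line);
     for(i = 31; i >=0; i--)
       printf("%02X",data[addr+i]);
     printf("\",\n");
     line++;
     addr += 32;
  }
}

int main(int c, char *v[])
{
  if(c < 2) {
    printf("Usage: hex2mem file.hex\n");
    return 0;
  }
  f = fopen(v[1],"r");
  if(f == NULL) {
    printf("Unable to open file\n");
    return 0;
  }
  while(readaline(buffer))
    decodeBytes(buffer);
  fclose(f);
  dump();
  return 0;
}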

Another option (not useful in the context of the AVR8 processor) is to create a '.coe' file, which is used by the Block RAM generator wizard to set the initial values.

If you do use a '.coe' file you will need to rebuild the BRAM IP, and then the entire project, each time you change the file before the updated data appears in your project. It is very long-winded!

How to update BRAM contents post-synthesis

The recipe for this is:

  • Convert your HEX file to a ".mem" file.
  • Use the Xilinx "data2mem" program to insert the data in the '.mem' file into the '.bit' file.
  • Configure the device with the resulting bit file.

The merging process uses a "_bd.bmm" file to define the "address space" created by one or more BRAM blocks. The BMM file for my AVR8 looks like:

 ADDRESS_MAP avrmap PPC405 0
 0x00000000:0x00003FFF (16 KBytes).
   ADDRESS_SPACE rom_code RAMB16 [0x00000000:0x00003FFF]
           avr_processor/PM_Inst/RAM_Inst[0].RAM_Word [15:0] PLACED = X1Y7;

           avr_processor/PM_Inst/RAM_Inst[1].RAM_Word [15:0] PLACED = X1Y0;


           avr_processor/PM_Inst/RAM_Inst[7].RAM_Word [15:0] PLACED = X1Y6;

What first perplexed me was how to get the values for the "PLACED = XxYy" clause, but I discovered that the FPGA toolset does this for you. You first create a template "whatever.bmm" (e.g. "progmem.bmm") without the PLACED clauses, and during Place & Route this gets updated and saved as "whatever_bd.bmm". Simple really, once you know how it works.

I've chosen to implement my merge as a Windows CMD script, which copies in the ".hex" file from the project, then uses srec_cat to convert it, and finally uses data2mem to merge it with the FPGA bitstream:

 copy "c:AVR\vgatest\default\vgatest.hex" . 
 srec_cat vgatest.hex -Intel --byte-swap 2 -Data_Only -Line_Length 105 -o vgatest.mem -vmem 8 
 C:\Xilinx\12.4\ISE_DS\ISE\bin\nt\data2mem -bm progmem_bd.bmm -bd vgatest.mem -bt avr8.bit -o b vgatest.bit

Due to unexpected behaviour in data2mem the ".mem" file needs to have an even number of bytes per line. A line length of "105" is enough to have 32 bytes on each line and matches nicely with the data in the '.hex' file.

Debugging when things go wrong

As with all things, debugging is the hard bit. The data2mem utility allows you to dump the contents of a bitstream, and you can then view it in a text editor. If you are lucky enough to be running on UNIX you can then use the 'diff' utility to compare the bitstream contents before and after the data2mem.

As the AVR8 has an interrupt table at the start of memory, its regular structure is a great help. If you also supply the "_bd.bmm" file the BRAMs will be named with their instance names:

C:\Xilinx\12.4\ISE_DS\ISE\bin\nt\data2mem -bm progmem_bd.bmm -bt avr8.bit -d

Gives me:

  BRAM data, Column 01, Row 07. Design instance "avr_processor/PM_Inst/RAM_Inst[0].RAM_Word". 

00000000:   94 0C 00 30 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52   ...0...R...R...R...R...R...R...R   
00000020:   94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52 94 0C 00 52   ...R...R...R...R...R...R...R...R

And there you have it! Hope it saves you a few days of banging your head against what feels like a brick wall.
