The EDA tools I'm using are verilator, gtkwave and vivado.

  • verilator is a verilog hardware simulator
  • gtkwave is a viewer for the signals generated by verilator
  • vivado is the Xilinx/AMD tool to compile verilog code and configure FPGAs

verilator and gtkwave are open source and vivado is free for small FPGAs. These tools are not well integrated with each other, but they are free and good enough for small designs.

Verilator

GTKwave

Vivado

Related: System with the c16 CPU and Gpu1, C16 CPU, Gpu1

Install

gtkwave is available in apt:

sudo apt-get install gtkwave

verilator is also available in apt but it is better to compile it from source and use the latest version to avoid issues.

sudo apt-get install verilator

I compile verilator with these commands:

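git clone https://github.com/verilator/verilator
# build dependencies; on a fresh system the verilator install guide also lists bison, g++ and make
sudo apt install autoconf flex help2man
cd verilator/
autoconf
./configure
make -j "$(nproc)"
# check the freshly built binary before installing it
cd bin
./verilator --version
cd -
sudo make install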

vivado needs a specific version of a linux distribution, so I created a virtual machine with Ubuntu 24.04 LTS and installed vivado in my home directory. I don't update the system because each update risks making it incompatible with vivado. Here is the compatibility table:

OS                Versions      2025.1 2025.2 2026.1
Ubuntu Linux 24   24.04   LTS   Yes    Yes    Yes
Ubuntu Linux 24   24.04.1 LTS   Yes    Yes    Yes
Ubuntu Linux 24   24.04.2 LTS    No    Yes    Yes
Ubuntu Linux 24   24.04.3 LTS    No     No    Yes
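
To find out which row of the table applies to an installed system, the point release can be read from /etc/os-release. The sketch below assumes the Ubuntu convention that the VERSION field holds the full release string. The os_version function name is made up for this example.

```shell
# os_version is a hypothetical helper, not part of any tool.
# It prints the VERSION field of an os-release file, for example
# "24.04.1 LTS (Noble Numbat)" on Ubuntu.
# The file path is a parameter so the function can be tried on a copy.
os_version() {
  # Subshell: sourcing the file must not leak variables into the caller.
  ( . "${1:-/etc/os-release}" && printf '%s\n' "${VERSION:-unknown}" )
}

# Print the release of the running system when the file exists.
if [ -r /etc/os-release ]; then
  os_version
fi
```

The printed point release (24.04, 24.04.1, ...) is the value to look up in the Versions column.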

The virtual machine has 32GB of RAM, which is enough for small designs: compiling my design with vivado takes less than 16GB.

If the design is too big for the target FPGA, vivado doesn't stop with an error. It keeps trying to fit the design and uses more than 50GB of RAM after running for 3 days.

vivado compiles small designs in less than 10 minutes using 8 cores in some steps.
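
Since a design that doesn't fit can keep vivado busy for days, one option is to run the batch flow under a time limit. This is a minimal sketch that assumes GNU coreutils timeout is installed. The run_limited name and the 1 hour limit are choices made for this example, not vivado settings, and how cleanly vivado stops on SIGTERM is not covered here.

```shell
# run_limited is a hypothetical wrapper around GNU coreutils `timeout`.
# Usage: run_limited DURATION COMMAND [ARGS...]
# timeout sends SIGTERM when DURATION expires and then exits with status 124.
run_limited() {
  limit=$1
  shift
  timeout "$limit" "$@"
}

# Example (not run here): give the vivado batch run at most 1 hour.
# run_limited 1h vivado -mode batch -script flow.tcl
```

An exit status of 124 then means the limit was hit, which for a small design most likely means it does not fit the target FPGA.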

Compiling verilog code

I usually compile the design with both verilator and vivado because they don't report the same errors. When vivado generates a bit stream, the design is ok.

I simulate the verilog code with verilator by compiling it to an executable and then running the executable:

verilator --binary -j 0 -Wall -Wno-BLKSEQ --timing --trace-fst tb.v
./obj_dir/Vtb
# top is tb module

I prefer having a testbench written in verilog, with the design under test instantiated inside the testbench. I use --trace-fst to save the signals in fst format instead of vcd. With fst, the traces are smaller because fst is binary and compressed whereas vcd is text and not compressed.

I compile the design with vivado using batch mode:

source Xilinx/2025.1/Vivado/settings64.sh
vivado -mode batch -script flow.tcl

flow.tcl looks like this:

set outputDir ./out
file mkdir $outputDir

set_param general.maxThreads 8

read_verilog systop.v
read_xdc io.xdc

synth_design -top systop -part xc7a100tcsg324-2
write_checkpoint -force $outputDir/post_synth.dcp
report_utilization -file $outputDir/synth_report.txt

opt_design
place_design
write_checkpoint -force $outputDir/post_place.dcp
route_design
write_checkpoint -force $outputDir/post_route.dcp
write_bitstream -force $outputDir/stream.bit
quit

The compilation result is ./out/stream.bit which should be uploaded to the FPGA.

io.xdc looks like:

set_property BITSTREAM.GENERAL.COMPRESS True [current_design]

set_property -dict { PACKAGE_PIN E3     IOSTANDARD LVCMOS33 } [get_ports { clk }];
#100mhz
create_clock -add -name sys_clk_pin -period 10.00 -waveform {0 5} [get_ports { clk }];

set_property -dict { PACKAGE_PIN K1     IOSTANDARD LVCMOS33 } [get_ports { rst }];

Use latest verilator from git repo

When I was using verilator 5.006 from apt, I encountered an error initializing an array with this code:

  reg [15:0] mem[0:127];

  always @ (posedge clk) begin
    if (rst) begin
      integer i;
      for (i = 0 ; i < 128 ; i = i+1) begin
        mem[i] <= 'hD800;
      end
    end
  end

%Error-BLKLOOPINIT: fetchmem.v:25:16: Unsupported: Delayed assignment to array inside for loops (non-delayed is ok - see docs)
   25 |         mem[i] <= 'hD800;
      |                ^~
                   fetchmem_tb.v:2:1: ... note: In file included from 'fetchmem_tb.v'
                   ... For error description see https://verilator.org/warn/BLKLOOPINIT?v=5.020
%Error: Exiting due to 1 error(s)
        ... See the manual at https://verilator.org/verilator_doc.html for more assistance.

This error is fixed in verilator 5.032 2025-01-01.
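
A script can check the installed version before relying on this fix. The sketch below assumes GNU sort with -V (version sort) and that verilator --version prints the version number as its second word, as in "Verilator 5.032 2025-01-01". version_ge is a hypothetical helper name.

```shell
# version_ge A B: succeed when version A is greater than or equal to B.
# Hypothetical helper; relies on GNU sort -V (version sort).
version_ge() {
  # The smaller of the two versions sorts first; A >= B when that is B.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v verilator >/dev/null 2>&1; then
  # Assumed output format: "Verilator 5.032 2025-01-01 rev ..."
  installed=$(verilator --version | awk '{print $2}')
  if version_ge "$installed" 5.032; then
    echo "verilator $installed is recent enough"
  else
    echo "verilator $installed is older than 5.032, build it from source"
  fi
fi
```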

Writing verilog code

Unused signals have to be used somehow, otherwise verilator reports a warning:

Signal is not used: 'rdata1'
                                            : ... note: In instance 'tb'
   11 |   reg [63:0] rdata1;
      |              ^~~~~~
                      ... For warning description see https://verilator.org/warn/UNUSEDSIGNAL?v=5.020
                      ... Use "/* verilator lint_off UNUSEDSIGNAL */" and lint_on around source to disable this message.

At the end of the tb file, add:

  wire _unused_ok = &{1'b0,
    rdata1,
    1'b0};

The signal connected to a module output has to be declared as a wire in the instantiating module; vivado requires it.

regs have to be declared outside always blocks; vivado requires it. I use the -Wno-BLKSEQ option for verilator because I'm ok with blocking assignments in sequential blocks, which is what this warning flags:

https://verilator.org/guide/latest/warnings.html
 BLKSEQ

    This indicates that a blocking assignment (=) is used in a sequential block. Generally, non-blocking/delayed assignments (<=) are used in sequential blocks, to avoid the possibility of simulator races. It can be reasonable to do this if the generated signal is used ONLY later in the same block; however, this style is generally discouraged as it is error prone.

Big arrays take a lot of resources in the FPGA. They should be accessed through a RAM interface so that vivado maps them to block RAM.

vivado doesn't support $fopen and $fread to load data, so use $readmemh("data.mem", mem); instead. data.mem is a text file with one hexadecimal value per line, one line per memory address.

I convert binary files to a mem text file with hexdump:

hexdump -v -e '1/2 "%04x\n"' c16.out > c16.mem
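
To make the format string concrete, here is a small worked example on a four-byte file. It assumes a little-endian host, since hexdump reads each 2-byte unit in host byte order. bin_to_mem and the sample bytes are made up for the illustration.

```shell
# bin_to_mem is a hypothetical helper: it prints one 16-bit word per line,
# as 4 hex digits, using the same hexdump format string as above.
bin_to_mem() {
  hexdump -v -e '1/2 "%04x\n"' "$1"
}

# Build a 4-byte sample file holding the bytes 00 d8 34 12
# (written here as octal escapes, which POSIX printf understands).
sample=$(mktemp)
printf '\000\330\064\022' > "$sample"

# On a little-endian host this prints d800 and then 1234:
# the second byte of each pair is the high byte of the word.
bin_to_mem "$sample"
```

The -v option matters for memory images: without it hexdump collapses repeated lines into a single *, which $readmemh cannot parse.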

blog post about converting binary files

vivado maps modules like this to block ram:

module blockram #(parameter WIDTH = 32, parameter ADDR_SIZE = 10/*bits*/)(
  input clk,
  input [ADDR_SIZE-1:0] addr,
  input [WIDTH-1:0] wdata,
  input wen,
  output reg [WIDTH-1:0] rdata
);

  reg [WIDTH-1:0] mem[0:(1<<ADDR_SIZE)-1];

  initial begin
    $readmemh("data.mem", mem);
  end

  // With an asynchronous (combinational) read like one of these:
  // assign rdata = mem[addr];
  // always @*
  //   rdata = mem[addr];
  //
  // Vivado doesn't infer block ram for the module,
  // it uses LUTs instead

  always @ (posedge clk) begin
    if (wen) begin
      mem[addr] <= wdata;
      rdata <= wdata;
    end
    else begin
      rdata <= mem[addr];
    end
  end

endmodule

The read data is available one clock cycle after the address is presented.

Links

Small FPGA designs