It is quite common for a DUT to have two or more interfaces whose independent monitors send data to a scoreboard at the same simulation time. Because this is done from parallel threads, the order in which the data arrives in our scoreboard is random. If the order in which the data is processed is important, then those parallel threads can become a big headache.
In this post I will present one way of tackling the problem of parallel threads in a UVM SystemVerilog environment.
Let’s consider that we need to verify a very simple FIFO with two interfaces: one for pushing data in the FIFO and one for popping data out of it.
The FIFO has a width of 16 bits and a depth of 4. The push and pop communication protocols are very similar and quite simple: a transaction starts when the req signal is asserted and completes when the ack signal is also asserted. At that point data is considered to have a valid value.
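The handshake described above can be sketched as a small standalone simulation. This is only an illustration: the interface name cfs_handshake_if, the signal names clk, req, ack and data, and the collect() task are assumptions, not the code of the actual PUSH and POP agents.

```systemverilog
// Hypothetical interface: all names here are assumptions made for this sketch
interface cfs_handshake_if(input logic clk);
  logic        req;
  logic        ack;
  logic [15:0] data;
endinterface

module handshake_monitor_sketch;
  logic clk = 0;
  always #5 clk = ~clk;

  cfs_handshake_if vif(clk);

  // Collect one transaction: it starts when req is asserted and
  // completes when ack is asserted as well; data is sampled at that point
  task automatic collect(output logic [15:0] data);
    do begin
      @(posedge vif.clk);
    end while (vif.req !== 1'b1);

    while (vif.ack !== 1'b1) begin
      @(posedge vif.clk);
    end

    data = vif.data;
  endtask

  initial begin
    logic [15:0] d;

    vif.req  = 0;
    vif.ack  = 0;
    vif.data = '0;

    fork
      begin
        // Drive one transfer: req first, ack two clock cycles later
        @(negedge clk);
        vif.req  = 1;
        vif.data = 16'hCAFE;
        repeat (2) @(negedge clk);
        vif.ack = 1;
        @(negedge clk);
        vif.req = 0;
        vif.ack = 0;
      end
      begin
        collect(d);
        $display("[%0t] collected data: %0h", $time, d);
      end
    join

    $finish;
  end
endmodule
```

A real monitor would wrap the sampled value in an item such as cfs_push_item_mon and send it through an analysis port. The sketch only shows when the data is considered valid.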
For this FIFO, our model/scoreboard looks very simple.
Whenever there is some push transaction, we first make sure that there is room in the FIFO in which to place the data. Then, we do a simple push in a SystemVerilog queue:
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Action triggered when there is a PUSH into the FIFO
  virtual protected function void do_push(cfs_push_item_mon item);
    if(is_full()) begin
      `uvm_error("DUT_ERROR", "Trying to push while FIFO is full")
    end

    fifo.push_back(item.data);
  endfunction

  ...
endclass
Whenever there is some pop transaction, we first make sure that there is some data available in our FIFO model. Then, we compare the value popped from our model with the value received from the DUT:
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Action triggered when there is a POP from the FIFO
  virtual protected function void do_pop(cfs_pop_item_mon item);
    cfs_fifo_data exp_data;

    if(is_empty()) begin
      `uvm_error("DUT_ERROR", "Trying to pop while FIFO is empty")
    end

    exp_data = fifo.pop_front();

    if(exp_data != item.data) begin
      `uvm_error("DUT_ERROR", $sformatf(
        "Data mismatch - expected: %0h, received: %0h", exp_data, item.data))
    end
  endfunction

  ...
endclass
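The helpers is_full() and is_empty() used by do_push() and do_pop() are not shown in the snippets. Here is a minimal standalone sketch of how they could look, assuming the model is a SystemVerilog queue limited to the depth of 4 stated earlier. The class name fifo_model_sketch, the constant FIFO_DEPTH and the 16-bit definition of cfs_fifo_data are assumptions.

```systemverilog
// Assumed definition: 16 bits wide, matching the FIFO width
typedef bit [15:0] cfs_fifo_data;

// Standalone sketch, no UVM needed
class fifo_model_sketch;
  // Depth of the modeled FIFO (assumed constant name)
  localparam int unsigned FIFO_DEPTH = 4;

  // Expected FIFO content, oldest entry at index 0
  protected cfs_fifo_data fifo[$];

  function bit is_full();
    return fifo.size() >= FIFO_DEPTH;
  endfunction

  function bit is_empty();
    return fifo.size() == 0;
  endfunction

  function void push(cfs_fifo_data data);
    fifo.push_back(data);
  endfunction

  function cfs_fifo_data pop();
    return fifo.pop_front();
  endfunction
endclass

module fifo_model_sketch_tb;
  initial begin
    fifo_model_sketch model;
    model = new();

    // Fill the model up to its depth
    for (int i = 0; i < 4; i++) begin
      model.push(i[15:0]);
    end
    $display("after 4 pushes: full=%0b empty=%0b", model.is_full(), model.is_empty());

    // Remove one entry so the model is neither full nor empty
    void'(model.pop());
    $display("after 1 pop:    full=%0b empty=%0b", model.is_full(), model.is_empty());

    $finish;
  end
endmodule
```

After the four pushes the model reports full=1 and empty=0. After the single pop both indicators are 0, which is the "neither full nor empty" state in which the processing order does not matter.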
The only thing left to do at this point is to declare the analysis imp ports through which the scoreboard gets the information from the PUSH and POP agents:
`uvm_analysis_imp_decl(_push)
`uvm_analysis_imp_decl(_pop)

class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Analysis port for getting the pushed data from the PUSH monitor
  uvm_analysis_imp_push#(cfs_push_item_mon, cfs_fifo_scoreboard) port_push;

  //Analysis port for getting the popped data from the POP monitor
  uvm_analysis_imp_pop#(cfs_pop_item_mon, cfs_fifo_scoreboard) port_pop;

  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);

    port_push = new("port_push", this);
    port_pop  = new("port_pop", this);
  endfunction

  //Function associated with the analysis port connected to the PUSH monitor
  virtual function void write_push(cfs_push_item_mon item);
    do_push(item);
  endfunction

  //Function associated with the analysis port connected to the POP monitor
  virtual function void write_pop(cfs_pop_item_mon item);
    do_pop(item);
  endfunction

  ...
endclass
Here comes the tricky part: the functions write_push() and write_pop() are called from independent threads, one in each of the two agents (PUSH and POP agents). This means that if a push and a pop transaction happen at the same simulation time, we cannot say which of our functions will be called first.
This is also mentioned in the SystemVerilog LRM, in section 4.7 Nondeterminism:
One source of nondeterminism is the fact that active events can be taken off the Active or Reactive event region and processed in any order.
There are two corner case scenarios relevant for our thread ordering problem:
- While the FIFO is empty and there is a push and a pop transaction at the same time, if write_pop() is called before write_push(), our model will complain that there was a POP while the FIFO was empty, so there is nothing to pop out of the FIFO.
- While the FIFO is full and there is a push and a pop transaction at the same time, if write_push() is called before write_pop(), our model will complain that there was a PUSH while the FIFO was full, so there is no room to push the data.
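The ordering problem behind these two scenarios can be reproduced without UVM. The following sketch uses two always blocks as stand-ins for the PUSH and POP monitor threads. The module name and the order queue are assumptions made for illustration.

```systemverilog
module same_time_race_sketch;
  logic clk = 0;
  always #5 clk = ~clk;

  // Order in which the two "monitors" called into the "scoreboard"
  string order[$];

  function automatic void write_push();
    order.push_back("push");
  endfunction

  function automatic void write_pop();
    order.push_back("pop");
  endfunction

  // Stand-in for the PUSH monitor thread
  always @(posedge clk) write_push();

  // Stand-in for the POP monitor thread
  always @(posedge clk) write_pop();

  initial begin
    @(posedge clk);
    // Move past the clock edge so that both threads have run
    #1;
    $display("call order: %s then %s", order[0], order[1]);
    $finish;
  end
endmodule
```

Both always blocks wake up on the same clock edge, and the LRM allows the simulator to run them in either order. The printed order may therefore differ between simulators, or even between versions of the same simulator.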
I built a small verification environment on EDA Playground to showcase this problem: How to Handle Data Coming From Parallel Threads 1. You can see that changing the simulator yields different errors based on the order in which write_push() and write_pop() are called. There is no problem with the simulators, all of them behave correctly; we just need to solve this nondeterminism in our code.
In order to solve this problem we need to implement the following behavior in our scoreboard/model:
- When the FIFO is empty we want the push transaction to be processed before the pop transaction
- When the FIFO is full we want the pop transaction to be processed before the push transaction
- In any other case we do not care about the order between the two.
One possible solution for our problem is to do the following steps:
- Buffer transactions: when the PUSH and POP monitors are sending transactions, do not process them immediately, but buffer them for later use.
- Wait for all: wait until all the monitors have had their chance to send the transactions happening at the same simulation time.
- Compute a priority: determine for each of the buffered transactions the priority with which it should be processed.
- Process transactions: use the priority of the buffered transactions to determine the order in which to process them.
Step #1 – Buffer Transactions
The first step in our small algorithm is to buffer, in a queue, all the transactions sent by the PUSH and POP monitors at the same simulation time. This will allow us in the next steps to process them in the order that we need.
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //List of pending transactions to be processed
  protected uvm_object pending_items[$];

  //Handle the information coming from the monitors, regardless if
  //it is a push or a pop item
  protected virtual function void handle_item(uvm_sequence_item item);
    pending_items.push_back(item.clone());
  endfunction

  //Function associated with the analysis port connected to the PUSH monitor
  virtual function void write_push(cfs_push_item_mon item);
    handle_item(item);
  endfunction

  //Function associated with the analysis port connected to the POP monitor
  virtual function void write_pop(cfs_pop_item_mon item);
    handle_item(item);
  endfunction
endclass
Step #2 – Wait For All
In this step we need to wait some time so that all the monitors manage to send their information to the scoreboard. However, please keep in mind that this “waiting” is for transactions happening in the same simulation time.
In our case, the PUSH and POP agents are working on the same clock signal. This means that it is safe to assume that the simulator will handle the two threads in random order, but within the same batch of events that it processes for that clock edge. So waiting for the next NBA region relative to the first thread handled should give “enough time” to the simulator to call both threads.
First, we need to start a task, called parse_items(), after every item pushed into the pending_items queue:
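To see why waiting for the NBA region is enough when both threads run on the same clock edge, here is a standalone sketch without UVM. The task wait_for_nba_region() below is a hand-written stand-in that relies on a nonblocking assignment to leave the Active region. It is an assumption made for illustration and not the code of uvm_wait_for_nba_region(). All other names are hypothetical as well.

```systemverilog
module nba_wait_sketch;
  logic clk = 0;
  always #5 clk = ~clk;

  // Only ever written with nonblocking assignments
  int nba_counter;

  // Items reported in the current time step
  string seen[$];

  // Set while a collector thread is already waiting in this time step
  bit collector_running;

  // Block the caller until the NBA update of the current time step
  task automatic wait_for_nba_region();
    int next_value = nba_counter + 1;
    nba_counter <= next_value;
    @(nba_counter);
  endtask

  // Called by both "monitors"; only the first call in a time step
  // starts the collector thread
  function automatic void report(string name);
    seen.push_back(name);

    if (!collector_running) begin
      collector_running = 1;
      fork
        begin
          wait_for_nba_region();
          $display("[%0t] items seen in this time step: %0d", $time, seen.size());
          seen.delete();
          collector_running = 0;
        end
      join_none
    end
  endfunction

  // Stand-ins for the PUSH and POP monitor threads
  always @(posedge clk) report("push");
  always @(posedge clk) report("pop");

  initial begin
    repeat (2) @(posedge clk);
    #1 $finish;
  end
endmodule
```

Whichever always block runs first starts the collector. Because both blocks are triggered in the Active region by the same clock edge, the collector should find both items in the queue once it resumes after the NBA update.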
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Handle the information coming from the monitors, regardless if
  //it is a push or a pop item
  protected virtual function void handle_item(uvm_sequence_item item);
    pending_items.push_back(item.clone());

    fork
      begin
        parse_items();
      end
    join_none
  endfunction

  ...
endclass
Next, we need to make sure that the job done by parse_items() is done only once for a given simulation time, regardless of how many times parse_items() is called in the same simulation time.
This can be achieved by saving a reference to the process associated with task parse_items() and only allowing the main code of the task to be executed for the first call of the task, when the process is still null:
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Process associated with task parse_items
  local process process_parse_items;

  //Task which waits for all the transactions to be queued
  //and then start parsing them
  protected virtual task parse_items();
    fork
      begin
        if(process_parse_items == null) begin
          process_parse_items = process::self();

          //Give time to the simulator to call both threads:
          //one from the PUSH monitor and one from the POP monitor
          uvm_wait_for_nba_region();

          //TODO: do the parsing - this is implemented in the next steps

          //Clear the reference to the process associated with parse_items()
          //task so that in the next clock cycle everything is clean.
          process_parse_items = null;
        end
      end
    join
  endtask

  ...
endclass
You can make use of uvm_wait_for_nba_region() only if all the agents are working on the same clock signal. If you have a scenario in which the monitors are working, for example, on divided but synchronized clocks then it would be safer to use some physical delay like “#1ps”.
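The divided-clock case can be sketched as follows. The assumption here is that the slower clock is generated from the fast one with a nonblocking assignment, so its edge only appears in the NBA region. A collector that waits only for the NBA region would then race with the monitor on the divided clock, while a small physical delay avoids the race. All names in the sketch are hypothetical.

```systemverilog
module divided_clock_sketch;
  timeunit 1ns;
  timeprecision 1ps;

  logic clk      = 0;
  logic clk_div2 = 0;

  always #5 clk = ~clk;

  // Divided clock: its edges show up in the NBA region of the time step
  always @(posedge clk) clk_div2 <= ~clk_div2;

  // Items reported in the current time step
  string seen[$];

  // Set while a collector thread is already waiting
  bit collector_running;

  function automatic void report(string name);
    seen.push_back(name);

    if (!collector_running) begin
      collector_running = 1;
      fork
        begin
          // Physical delay instead of waiting for the NBA region
          #1ps;
          $display("[%0t] items seen: %0d", $time, seen.size());
          seen.delete();
          collector_running = 0;
        end
      join_none
    end
  endfunction

  // Monitor stand-ins: one on the fast clock, one on the divided clock
  always @(posedge clk)      report("fast");
  always @(posedge clk_div2) report("slow");

  initial begin
    repeat (2) @(posedge clk);
    #1 $finish;
  end
endmodule
```

The delay must be shorter than the fastest clock period, otherwise transactions from the next cycle would be mixed into the current batch.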
Step #3 – Compute Priority
Next, we need a function which computes the priority of a transaction based on the following variables:
- the type of the transaction – push or pop
- “is empty” indicator of the FIFO
- “is full” indicator of the FIFO
Here is one possible implementation of such a function:
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Get the priority of an item with which it will be processed by the scoreboard
  protected virtual function int unsigned get_priority(uvm_object item);
    cfs_push_item_mon push_item;
    cfs_pop_item_mon pop_item;

    if($cast(push_item, item)) begin
      if(is_full()) begin
        //SCENARIO: PUSH at FULL
        //FIFO is already full so associate a smaller priority than in "POP at FULL"
        //as a pop should be processed first in this case
        return 10;
      end
      else if(is_empty()) begin
        //SCENARIO: PUSH at EMPTY
        //FIFO is empty so associate a higher priority than in "POP at EMPTY"
        //as a pop should be processed second in this case
        return 15;
      end
      else begin
        //FIFO is neither full nor empty so the priority does not really matter
        return 6;
      end
    end
    else if($cast(pop_item, item)) begin
      if(is_full()) begin
        //SCENARIO: POP at FULL
        //FIFO is already full so associate a higher priority than in "PUSH at FULL"
        //as a pop should be processed first in this case
        return 15;
      end
      else if(is_empty()) begin
        //SCENARIO: POP at EMPTY
        //FIFO is empty so associate a smaller priority than in "PUSH at EMPTY"
        //as a pop should be processed second in this case
        return 10;
      end
      else begin
        //FIFO is neither full nor empty so the priority does not really matter
        return 5;
      end
    end
    else begin
      `uvm_fatal("ALGORITHM_ISSUE", $sformatf(
        "Trying to get priority for an unhandled class: %s", item.get_type_name()))
    end
  endfunction

  ...
endclass
There are two important things to notice in the get_priority() function:
- The scenario “PUSH at EMPTY” has priority 15, which is greater than the priority of scenario “POP at EMPTY”, which is 10. This solves the first corner case that we talked about: when the FIFO is empty, a PUSH will be processed before a POP happening at the same simulation time.
- The scenario “POP at FULL” has priority 15, which is greater than the priority of scenario “PUSH at FULL”, which is 10. This solves the second corner case that we talked about: when the FIFO is full, a POP will be processed before a PUSH happening at the same simulation time.
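The effect of these priority values on the processing order can be checked in isolation. The sketch below is a simplified stand-in and not the scoreboard code: it uses an enum kind_e instead of the item classes and passes the full and empty indicators as arguments. Those names are assumptions, while the priority values are the ones from get_priority().

```systemverilog
module priority_sort_sketch;
  // Simplified stand-in for the two item classes
  typedef enum {PUSH, POP} kind_e;

  // Same priority values as in get_priority(), with the FIFO state
  // passed in as arguments
  function automatic int unsigned get_priority(kind_e kind, bit full, bit empty);
    case (kind)
      PUSH:    return full ? 10 : (empty ? 15 : 6);
      POP:     return full ? 15 : (empty ? 10 : 5);
      default: return 0;
    endcase
  endfunction

  initial begin
    kind_e pending[$];
    kind_e first;
    kind_e second;

    // FIFO empty, POP arrived before PUSH
    pending.push_back(POP);
    pending.push_back(PUSH);
    pending.rsort(k) with (get_priority(k, 1'b0, 1'b1));
    first  = pending[0];
    second = pending[1];
    $display("empty FIFO: %s then %s", first.name(), second.name());

    // FIFO full, PUSH arrived before POP
    pending.delete();
    pending.push_back(PUSH);
    pending.push_back(POP);
    pending.rsort(k) with (get_priority(k, 1'b1, 1'b0));
    first  = pending[0];
    second = pending[1];
    $display("full FIFO:  %s then %s", first.name(), second.name());

    $finish;
  end
endmodule
```

For the empty FIFO the sorted order is PUSH then POP, and for the full FIFO it is POP then PUSH, regardless of the arrival order.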
Step #4 – Process Transactions
The final step is to process transactions in the order imposed by the get_priority() function. We fill in the code of parse_items() so that we first sort pending_items by the result of get_priority() and then we call the appropriate do_push() and do_pop():
class cfs_fifo_scoreboard extends uvm_component;
  ...

  //Task which waits for all the transactions to be queued
  //and then start parsing them
  protected virtual task parse_items();
    fork
      begin
        if(process_parse_items == null) begin
          process_parse_items = process::self();

          //Give time to the simulator to call both threads:
          //one from the PUSH monitor and one from the POP monitor
          uvm_wait_for_nba_region();

          //Sort the transactions based on their priority - from high to low
          pending_items.rsort(item) with (get_priority(item));

          while(pending_items.size() > 0) begin
            uvm_object item = pending_items.pop_front();
            cfs_push_item_mon push_item;
            cfs_pop_item_mon pop_item;

            if($cast(push_item, item)) begin
              do_push(push_item);
            end
            else if($cast(pop_item, item)) begin
              do_pop(pop_item);
            end
            else begin
              `uvm_fatal("ALGORITHM_ISSUE", $sformatf(
                "Trying to parse an unhandled class: %s", item.get_type_name()))
            end
          end

          //Clear the reference to the process associated with parse_items()
          //task so that in the next clock cycle everything is clean.
          process_parse_items = null;
        end
      end
    join
  endtask

  ...
endclass
That’s it! With this algorithm the scoreboard no longer depends on the order in which the simulator calls the two threads, and the data will be processed based on the priority determined by the function get_priority().
If you want to see the complete code running, then check out the project on EDA Playground called How to Handle Data Coming From Parallel Threads 2. Try changing the simulator and see that the order in which the PUSH and POP transactions are processed does not change and it is the one determined by the get_priority() function.
If you have other methods of tackling this problem please share in the comments section as it would be very interesting to see different solutions to this issue.
Hope you found this article useful 🙂
Interesting work, thanks!
How can we accomplish the same task without checking the process handle? Have you ever thought about it?
I wondered what happens when further handle_item() calls are made while your parse_items() has not finished. Maybe we can wait until the process handle is null again?
Let's say that in parse_items() the call to uvm_wait_for_nba_region() has returned and further handle_item() calls are made before the parsing completes. The running parse_items() will then also consume the requests registered after uvm_wait_for_nba_region() finished, so the parse_items() call started by the next handle_item() might end up doing nothing, since the queue is already empty.