Learning SystemC: #004 Primitive Channels
In this post I will talk about SystemC primitive channels. These channels help us establish easier and safer communication between SystemC modules.
Here is a list of content if you want to jump to a particular subject:
1. Why Do We Need Communication Channels?
1.1. Example
2. Mutex – sc_mutex
2.1. Example
3. Semaphore – sc_semaphore
3.1. Example
4. FIFO – sc_fifo
4.1. Example
1. Why Do We Need Communication Channels?
When you build a SystemC program, in most cases you will have to split your design into multiple modules which need to communicate with each other in some way.
If we had to build such inter-module communication with the information accumulated so far in this tutorial, the most common solution would be to use events and public fields. But this comes with some pitfalls, and the most obvious one is the situation where you start waiting for an event which has already happened. I'll try to illustrate this in the next example.
1.1. Example
Let’s say that we have a module producing random numbers, called generator, and a module which reads these numbers, called reader.
The thread for the generator might look like this:
void generator::generate() {
   for(;;) {
      cout << "[" << sc_time_stamp() << "] GENERATOR: random number began generation" << endl;

      //wait some random time
      wait(sc_time(std::rand() % 6, SC_NS));

      //generate a random value
      data = std::rand() % 100;

      //announce the reader that random data is available
      data_ready_ev.notify();

      cout << "[" << sc_time_stamp() << "] GENERATOR: random number was generated" << endl;

      //wait for the reader to read the data
      wait(data_read_ev);

      cout << "[" << sc_time_stamp() << "] GENERATOR: random number was read" << endl;
   }
};
The generator loop is pretty simple:
- wait some random time
- generate a random value
- announce the reader via data_ready_ev that random data is available
- wait for the reader to read the data via event data_read_ev
The thread for the reader might look like this:
void reader::read() {
   for(int i = 0; i < 3; i++) {
      cout << "[" << sc_time_stamp() << "] READER: prepare to read data" << endl;

      //wait for generator to finish generation
      wait(generator_ptr->data_ready_ev);

      //read the random data
      cout << "[" << sc_time_stamp() << "] READER: got data: " << generator_ptr->data << endl;

      //announce the generator that the data was read
      generator_ptr->data_read_ev.notify();

      //wait some random time before the next read
      wait(sc_time(std::rand() % 6, SC_NS));
   };
};
The reader loop is pretty simple:
- wait for generator to finish generation via event data_ready_ev
- read the random data
- announce the generator via data_read_ev that the data was read
- wait some random time before the next read
At first glance this looks OK, but the chances are pretty high that the reader thread gets stuck because it starts waiting for data_ready_ev after the generation has already finished.
[0 s] READER: prepare to read data
[0 s] GENERATOR: random number began generation
[1 ns] GENERATOR: random number was generated
[1 ns] READER: got data: 86
[1 ns] GENERATOR: random number was read
[1 ns] GENERATOR: random number began generation
[2 ns] GENERATOR: random number was generated
[4 ns] READER: prepare to read data
You can run this example on your own on EDA Playground.
A possible solution is to use a flag which keeps track of whether the data was already generated, like in this example; a minimal sketch of the idea is shown below.
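To keep the sketch concrete, here is a hedged version of the reader thread using such a flag. The flag name data_valid is my own assumption (it is a bool member of the generator, set to true right after data is updated and before data_ready_ev is notified); it may differ from the one used in the linked example.

//reader side: only wait for the event if the data is not already flagged
//as available, so a notification that already happened cannot be missed
void reader::read() {
   for(int i = 0; i < 3; i++) {
      if(!generator_ptr->data_valid) {
         wait(generator_ptr->data_ready_ev);
      }

      cout << "[" << sc_time_stamp() << "] READER: got data: " << generator_ptr->data << endl;

      //mark the data as consumed and let the generator produce the next one
      generator_ptr->data_valid = false;
      generator_ptr->data_read_ev.notify();

      wait(sc_time(std::rand() % 6, SC_NS));
   };
};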
As you saw in the example above, it is pretty easy to make mistakes in such scenarios where you have to pass information between parallel threads. Luckily, SystemC library comes with some predefined classes, called channels, which makes communication between modules much easier and safer.
In the next sections I will present several of these predefined channels.
2. Mutex – sc_mutex
In computer science, a lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy.
The above statement is actually the mutex definition from Wikipedia.
SystemC implements this mechanism via the sc_mutex class. The API is very simple, containing only three functions:
- lock() – function for locking the mutex. If the mutex is already locked it waits until it is unlocked.
- trylock() – function for trying to lock the mutex. If locking fails (e.g. already locked) it returns -1
- unlock() – function for unlocking the mutex
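As a quick illustration, here is a minimal, hypothetical sketch of how these three calls are typically used from a thread process. The module and the bus_mutex member are my own assumptions, not part of the example that follows.

void some_module::some_thread() {
   //blocking: suspend until the mutex becomes available, then take it
   bus_mutex.lock();
   //... access the shared resource ...
   bus_mutex.unlock();

   //non-blocking: trylock() returns 0 on success and -1 if the mutex is taken
   if(bus_mutex.trylock() == 0) {
      //... access the shared resource ...
      bus_mutex.unlock();
   }
};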
This mutex mechanism is perfect for solving problems like managing accesses to a register or a memory (e.g. not having a write overlapping with a read).
2.1. Example
Let’s say that we have a register for which the read and write actions last a random time period. In a real electronic digital component the accesses cannot overlap, as they are all done sequentially via the same physical bus. We can model this behavior very easily using the sc_mutex class.
The read function looks like this:
sc_int<32> simple_reg::read() {
   cout << "[" << sc_time_stamp() << "] READ: start" << endl;

   mutex.lock();
   cout << "[" << sc_time_stamp() << "] READ: mutex locked" << endl;

   wait(sc_time(std::rand() % 6, SC_NS));

   mutex.unlock();
   cout << "[" << sc_time_stamp() << "] READ: finish - value: " << value << endl;

   return value;
};
The write function looks like this:
void simple_reg::write(sc_int<32> new_value) {
   cout << "[" << sc_time_stamp() << "] WRITE: start" << endl;

   mutex.lock();
   cout << "[" << sc_time_stamp() << "] WRITE: mutex locked" << endl;

   wait(sc_time(std::rand() % 6, SC_NS));
   value = new_value;

   mutex.unlock();
   cout << "[" << sc_time_stamp() << "] WRITE: finish - value: " << value << endl;
};
I added some messages before and after locking the mutex so we can see how the mutex works. Here is the output of this example:
[0 s] WRITE: start
[0 s] WRITE: mutex locked
[0 s] READ: start
[4 ns] WRITE: finish - value: 83
[4 ns] READ: mutex locked
[5 ns] READ: finish - value: 83
[7 ns] WRITE: start
[7 ns] WRITE: mutex locked
[10 ns] READ: start
[11 ns] WRITE: finish - value: 35
[11 ns] READ: mutex locked
[14 ns] READ: finish - value: 35
Notice how both threads were started at time 0, but throughout the simulation they never overlapped due to the usage of the mutex.
Some of the use-cases for the sc_mutex class are:
- bus modeling for managing sequential access to some resources, like registers or memory
- arbitration to manage the blocking period in which a winner holds the arbiter busy
If you have more use-cases in mind for the sc_mutex class don’t hesitate to leave your thoughts in a comment on this post.
3. Semaphore – sc_semaphore
In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple processes in a concurrent system such as a multiprogramming operating system.
The above statement is actually the semaphore definition from Wikipedia.
SystemC implements this mechanism via sc_semaphore class. The API is very simple:
- sc_semaphore(int init_value) – constructor to initialize the semaphore with the number of available slots
- wait() – function for locking one slot of the semaphore. When no slot is available it waits until one is released.
- trywait() – function for trying to lock a slot of the semaphore. If locking fails (e.g. no slot available) it returns -1
- post() – function for unlocking (releasing) one slot of the semaphore
- get_value() – function for getting the number of available slots in the semaphore
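To make the calls above concrete, here is a minimal, hypothetical sketch; the module and the cores member (an sc_semaphore initialized with 2 slots) are my own assumptions and not part of the example in the next section.

void some_module::worker_thread() {
   //blocking: take one slot, suspending if none is free
   cores.wait();
   //... use the shared resource ...
   cores.post();

   //non-blocking: trywait() returns 0 if a slot was taken, -1 otherwise
   if(cores.trywait() == 0) {
      cout << "free slots left: " << cores.get_value() << endl;
      cores.post();
   }
};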
Some of the use-cases for the sc_semaphore class are:
- modeling of a multi-core processor
- memories with multiple busses for reads and/or writes
If you have more use-cases in mind for the sc_semaphore class don’t hesitate to leave your thoughts in a comment on this post.
3.1. Example
Let’s say that we have a very simple processor which can do only one operation: increment. However, this processor has two cores, so it can do up to two operations in parallel.
We can use the semaphore to ensure that only up to two increments are done in parallel.
To see the semaphore in action we have four threads which run in parallel and all of them want to access the processor.
The code of the processor looks like this:
SC_MODULE(processor) {
private:
   sc_semaphore semaphore;

public:
   //function for incrementing a value
   int unsigned increment(int unsigned value, const char* thread_name);

   SC_CTOR(processor) : semaphore(2) {
   };
};
int unsigned processor::increment(int unsigned value, const char* thread_name) {
   semaphore.wait();

   int unsigned duration = std::rand() % 6;

   cout << "[" << sc_time_stamp() << "] PROCESSOR: @ " << thread_name << " - start incrementing value: " << value << ", estimated duration: " << duration << " ns" << endl;

   wait(sc_time(duration, SC_NS));

   int unsigned result = value + 1;

   semaphore.post();

   return result;
};
The code for the parallel threads requesting increments at the same time looks like this:
void example::thread(const char* thread_name) {
   for(int i = 0; i < number_of_accesses; i++) {
      int unsigned value = std::rand() % 100;

      cout << "[" << sc_time_stamp() << "] " << thread_name << ": sending incrementing request for value: " << value << endl;

      value = processor_inst.increment(value, thread_name);

      cout << "[" << sc_time_stamp() << "] " << thread_name << ": received incremented value: " << value << endl;

      wait(sc_time(std::rand() % 6, SC_NS));
   }
};

void example::thread_1() {
   thread("THREAD_1");
};

void example::thread_2() {
   thread("THREAD_2");
};

void example::thread_3() {
   thread("THREAD_3");
};

void example::thread_4() {
   thread("THREAD_4");
};
Here is the output of this example:
[0 s] THREAD_1: sending incrementing request for value: 83
[0 s] PROCESSOR: @ THREAD_1 - start incrementing value: 83, estimated duration: 4 ns
[0 s] THREAD_2: sending incrementing request for value: 77
[0 s] PROCESSOR: @ THREAD_2 - start incrementing value: 77, estimated duration: 1 ns
[0 s] THREAD_3: sending incrementing request for value: 93
[0 s] THREAD_4: sending incrementing request for value: 35
[1 ns] THREAD_2: received incremented value: 78
[1 ns] PROCESSOR: @ THREAD_3 - start incrementing value: 93, estimated duration: 0 ns
[1 ns] THREAD_3: received incremented value: 94
[1 ns] PROCESSOR: @ THREAD_4 - start incrementing value: 35, estimated duration: 1 ns
[2 ns] THREAD_4: received incremented value: 36
[4 ns] THREAD_1: received incremented value: 84
From this output we can make a few important remarks:
- all four threads start in parallel at time 0 ns
- only threads #1 and #2 are immediately served by the processor at time 0 ns
- when thread #2 receives its result at 1 ns, thread #3 is served by the processor
- when thread #3 receives its result immediately at 1 ns (its duration was 0 ns), thread #4 is served by the processor
As you can see, with a semaphore it is very easy to control the maximum number of threads which access a resource at the same time.
You can run this example on your own on EDA Playground.
4. FIFO – sc_fifo
FIFO is an acronym for First In, First Out, a method for organizing and manipulating a data buffer, where the oldest (first) entry, or ‘head’ of the queue, is processed first.
The above statement is actually the FIFO definition from Wikipedia.
SystemC implements this mechanism via template class sc_fifo. The template parameter is the type of information stored in the FIFO. This means that you can have a FIFO which stores simple information like sc_int or complex information like your own class.
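For instance, here is a hypothetical sketch of a FIFO storing a user-defined type. The packet struct and the router module are my own assumptions; note that the stored type is expected to be default-constructible and assignable, and most SystemC implementations also require an operator<< because sc_fifo's print()/dump() functions stream the stored values.

//a small user-defined type to be stored in the FIFO
struct packet {
   int unsigned address;
   int unsigned payload;
};

//needed so that sc_fifo<packet>::print()/dump() can stream the values
inline std::ostream& operator<<(std::ostream& os, const packet& p) {
   return os << "addr: " << p.address << ", data: " << p.payload;
}

SC_MODULE(router) {
   //a named FIFO of packets with 8 slots
   sc_fifo<packet> packet_fifo;

   SC_CTOR(router) : packet_fifo("packet_fifo", 8) {
   };
};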
The API is more complex compared with the previous channels, but I'll list here the functions that I consider relevant for us at this point in our learning.
Constructors:
- sc_fifo(int size = 16) – initialize a FIFO with a given size (notice that size has a default value)
- sc_fifo(const char* name, int size = 16) – initialize a FIFO with a size and a name
Destructor:
- ~sc_fifo() – this one is important as it frees the internal array holding the actual values
Access Functions:
- void read(T&) – blocking read. Result is received via reference.
- T read() – blocking read
- bool nb_read(T&) – non-blocking read. The function returns true if the read was successful.
- void write(const T&) – blocking write
- bool nb_write(const T&) – non-blocking write. The function returns true if the write was successful.
Status Functions:
- int num_available() – returns the number of elements available for reading inside the FIFO
- int num_free() – returns the number of free slots available for writing inside the FIFO
It is worth mentioning at this point that the sc_fifo class also provides events which notify when reads and writes are performed. You can get references to those events via these functions:
- sc_event& data_written_event() – returns a reference to the write event
- sc_event& data_read_event() – returns a reference to the read event
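As a hypothetical sketch, a monitoring thread could block on one of these events instead of polling. The monitor_thread below is my own addition and is not part of the example that follows; fifo is assumed to be an sc_fifo member of the module.

void example::monitor_thread() {
   for(;;) {
      //suspend until something is written into the FIFO
      wait(fifo.data_written_event());

      //the event fires after the update phase, so the new element is already visible
      cout << "[" << sc_time_stamp() << "] FIFO now holds " << fifo.num_available() << " element(s)" << endl;
   };
};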
4.1. Example
To see the FIFO in action we can use the classic example of a producer and a consumer.
In our case the producer generates data four times as fast as the consumer can digest it. We can rely on the FIFO to put a “brake” on the producer and wait for the consumer to do its job.
void example::producer_thread() {
   int unsigned number_of_accesses = 4;

   for(int i = 0; i < number_of_accesses; i++) {
      sc_int<32> value(std::rand() % 100);

      cout << "[" << sc_time_stamp() << "] writing to FIFO value: " << value << ", free: " << fifo.num_free() << endl;

      fifo.write(value);

      cout << "[" << sc_time_stamp() << "] wrote to FIFO value: " << value << endl;

      wait(1, SC_NS);
   };
};
As you can see in the code snippet above, the producer thread tries to send four random values (see number_of_accesses) to the consumer.
We have a print before and after the actual write() invocation. This will show us how much time the write() consumed.
void example::consumer_thread() {
   sc_int<32> value(0);

   for(;;) {
      wait(4, SC_NS);

      fifo.read(value);

      cout << "[" << sc_time_stamp() << "] read from FIFO value: " << value << endl;
   };
};
The consumer thread is much simpler: it waits 4 ns and then performs a read().
Here is the output of this example:
[0 s] writing to FIFO value: 83, free: 2
[0 s] wrote to FIFO value: 83
[1 ns] writing to FIFO value: 86, free: 1
[1 ns] wrote to FIFO value: 86
[2 ns] writing to FIFO value: 77, free: 0
[4 ns] read from FIFO value: 83
[4 ns] wrote to FIFO value: 77
[5 ns] writing to FIFO value: 15, free: 0
[8 ns] read from FIFO value: 86
[8 ns] wrote to FIFO value: 15
[12 ns] read from FIFO value: 77
[16 ns] read from FIFO value: 15
There are a few important aspects to note in this output:
- when there is free space available in the FIFO (@0ns, @1ns) the write happens instantaneously
- when there is no free space available in the FIFO (@2ns, @5ns) the write is delayed until a read is performed
You can run this example on your own on EDA Playground.
Warning
If you play with the status functions num_available() and num_free() you might get some very confusing results. We will go into more detail in the next lesson, but the short explanation is this:
sc_fifo makes use of the update phase of the simulator to update the number of readable entries. This update phase is the equivalent of non-blocking assignment from Verilog.
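As a small illustration of this behavior, here is a hypothetical probe thread (my own addition, assuming fifo is the sc_fifo member used in the example): the readable count only changes after the next delta cycle.

void example::status_probe_thread() {
   fifo.write(42);                        //the value is buffered until the update phase
   cout << fifo.num_available() << endl;  //still reports the old count in this delta

   wait(SC_ZERO_TIME);                    //let the update phase and a delta cycle run
   cout << fifo.num_available() << endl;  //now the newly written value is counted
};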
That’s it for this lesson, hope you found it useful 🙂
Next Lesson: Learning SystemC: #005 Signal Channels
Previous Lesson: Learning SystemC: #003 Time, Events and Processes
