Learning SystemC: #004 Primitive Channels

In this post I will talk about SystemC primitive channels. These channels make communication between SystemC modules easier and safer.

Here is a table of contents if you want to jump to a particular subject:

1. Why Do We Need Communication Channels?
   1.1. Example
2. Mutex – sc_mutex
   2.1. Example
3. Semaphore – sc_semaphore
   3.1. Example
4. FIFO – sc_fifo
   4.1. Example



1. Why Do We Need Communication Channels?

When you build a SystemC program, in most cases you will have to split your design into multiple modules, and those modules will have to communicate with each other in some way.

If we have to build such inter-module communication with only the information accumulated so far in this tutorial, the most common solution is to use events and public fields. But this comes with some pitfalls, the most obvious being the situation in which you wait for an event that has already happened. I’ll try to illustrate this in the next example.

1.1. Example

Let’s say that we have a module producing random numbers, called generator, and a module which reads these numbers, called reader.

The thread for the generator might look like this:

void generator::generate() {
  for(;;) {
    cout << "[" << sc_time_stamp() << "] GENERATOR: random number began generation" << endl;
   
    //wait some random time  
    wait(sc_time(std::rand() % 6, SC_NS));
 
    //generate a random value
    data = std::rand() % 100;

    //announce the reader that random data is available
    data_ready_ev.notify();
     
    cout << "[" << sc_time_stamp() << "] GENERATOR: random number was generated" << endl;
 
    //wait for the reader to read the data
    wait(data_read_ev);
   
    cout << "[" << sc_time_stamp() << "] GENERATOR: random number was read" << endl;
  }
};

Process for generating random numbers

The generator loop is pretty simple:

  • wait some random time
  • generate a random value
  • notify the reader via data_ready_ev that random data is available
  • wait, via the event data_read_ev, for the reader to read the data

The thread for the reader might look like this:

void reader::read() {
  for(int i = 0; i < 3; i++) {
    cout << "[" << sc_time_stamp() << "] READER: prepare to read data" << endl;
   
    //wait for generator to finish generation
    wait(generator_ptr->data_ready_ev);
   
    //read the random data
    cout << "[" << sc_time_stamp() << "] READER: got data: " << generator_ptr->data << endl;
     
    //announce the generator that the data was read
    generator_ptr->data_read_ev.notify();
 
    //wait some random time before the next read
    wait(sc_time(std::rand() % 6, SC_NS));
  };
};

Process for reading the random numbers

The reader loop is pretty simple:

  • wait for generator to finish generation via event data_ready_ev
  • read the random data
  • notify the generator via data_read_ev that the data was read
  • wait some random time before the next read

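At first glance this looks OK, but the chances are pretty high that the reader thread gets stuck, because it starts waiting for data_ready_ev after the generation has already finished. In the output below, the second number is generated at 2 ns, but the reader only starts waiting for it at 4 ns, so it never sees the notification: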

[0 s] READER: prepare to read data
[0 s] GENERATOR: random number began generation
[1 ns] GENERATOR: random number was generated
[1 ns] READER: got data: 86
[1 ns] GENERATOR: random number was read
[1 ns] GENERATOR: random number began generation
[2 ns] GENERATOR: random number was generated
[4 ns] READER: prepare to read data

You can run this example on your own on EDA Playground.
A possible solution is to use a flag which records whether the data has already been generated, like in this example.
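
To make the flag idea concrete, here is a minimal sketch that does not use the SystemC library at all. It is only a model built on one assumption: an event has no memory, so a notification reaches only a process that is already waiting. The names ModelEvent, ModelGenerator, data_ready and reader_gets_data are hypothetical and do not come from the linked example.

```cpp
#include <cassert>

// Library-free model of the race above (not SystemC code; all names are
// hypothetical). An event has no memory: notify() only releases a process
// that is already waiting on it.
struct ModelEvent {
  bool waiting = false;  // a process is currently blocked on this event
  bool woken = false;    // the blocked process has been released

  void start_wait() {
    assert(!waiting);  // this model tracks a single waiting process
    waiting = true;
    woken = false;
  }

  void notify() {
    if (waiting) {
      woken = true;
      waiting = false;
    }
  }
};

struct ModelGenerator {
  ModelEvent data_ready_ev;
  bool data_ready = false;  // the flag: stays set after the notification
  int data = 0;

  void generate(int value) {
    data = value;
    data_ready = true;
    data_ready_ev.notify();
  }
};

// Returns true if the reader ends up with the data, false if it would be
// stuck waiting forever.
bool reader_gets_data(bool generator_runs_first, bool use_flag) {
  ModelGenerator gen;
  if (generator_runs_first) gen.generate(86);

  // With the flag, the reader waits only if the data is not there yet.
  bool must_wait = !(use_flag && gen.data_ready);
  if (must_wait) gen.data_ready_ev.start_wait();

  if (!generator_runs_first) gen.generate(86);

  return !must_wait || gen.data_ready_ev.woken;
}
```

Without the flag, reader_gets_data(true, false) returns false: the generator finished first and the reader waits for a notification that will never come again. With the flag the reader gets the data in both orderings. In the SystemC reader thread the same guard would be a check of the flag before calling wait(generator_ptr->data_ready_ev), assuming the generator gets an extra public field such as data_ready that the reader clears after reading.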

 
As you saw in the example above, it is pretty easy to make mistakes in scenarios where you have to pass information between parallel threads. Luckily, the SystemC library comes with some predefined classes, called channels, which make communication between modules much easier and safer.
In the next sections I will present several of these predefined channels.

2. Mutex – sc_mutex

In computer science, a lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy.

The above statement is actually the mutex definition from Wikipedia.

SystemC implements this mechanism via the sc_mutex class. The API is very simple, containing only three functions:

  • lock() – locks the mutex. If the mutex is already locked, the calling process waits until it is unlocked.
  • trylock() – tries to lock the mutex without waiting. If locking fails (e.g. the mutex is already locked) it returns -1.
  • unlock() – unlocks the mutex

This mutex mechanism is perfect for solving problems like managing accesses to a register or a memory (e.g. not having a write overlapping with a read).

2.1. Example

Let’s say that we have a register for which the read and write actions last a random time period. In a real digital component the accesses cannot overlap, as they are all done sequentially via the same physical bus. We can model this behavior very easily using the sc_mutex class.

The read function looks like this:

sc_int<32> simple_reg::read() {
  cout << "[" << sc_time_stamp() << "] READ: start" << endl;
 
  mutex.lock();
 
  cout << "[" << sc_time_stamp() << "] READ: mutex locked" << endl;
 
  wait(sc_time(std::rand() % 6, SC_NS));
 
  mutex.unlock();
 
  cout << "[" << sc_time_stamp() << "] READ: finish - value: " << value << endl;
 
  return value;
};

The write function looks like this:

void simple_reg::write(sc_int<32> new_value) {
  cout << "[" << sc_time_stamp() << "] WRITE: start " << endl;
 
  mutex.lock();
 
  cout << "[" << sc_time_stamp() << "] WRITE: mutex locked" << endl;
 
  wait(sc_time(std::rand() % 6, SC_NS));
 
  value = new_value;
 
  mutex.unlock();
 
  cout << "[" << sc_time_stamp() << "] WRITE: finish - value: " << value << endl;
};

I added some messages before and after locking the mutex so we can see how the mutex works. Here is the output of this example:

[0 s] WRITE: start
[0 s] WRITE: mutex locked
[0 s] READ: start
[4 ns] WRITE: finish - value: 83
[4 ns] READ: mutex locked
[5 ns] READ: finish - value: 83
[7 ns] WRITE: start
[7 ns] WRITE: mutex locked
[10 ns] READ: start
[11 ns] WRITE: finish - value: 35
[11 ns] READ: mutex locked
[14 ns] READ: finish - value: 35

Notice how both threads were started at time 0, but throughout the simulation they never overlapped due to the usage of the mutex.
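
The timestamps in this output can be checked by hand: every access gets the mutex at the later of two moments, the time it asked for it and the time the previous holder released it. Here is a minimal sketch of that arithmetic. It is a library-free model, not SystemC code, and the names Access, Grant and serialize are hypothetical. The durations (4, 1, 4 and 3 ns) are read off the log above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Library-free model of mutex-serialized accesses (hypothetical names).
struct Access {
  int request_ns;   // time at which lock() is called
  int duration_ns;  // time spent holding the mutex
};

struct Grant {
  int locked_ns;    // time at which lock() returns
  int unlocked_ns;  // time at which unlock() is called
};

// Accesses must be listed in the order in which they obtain the mutex.
std::vector<Grant> serialize(const std::vector<Access>& accesses) {
  std::vector<Grant> grants;
  int free_at_ns = 0;  // the mutex starts unlocked
  for (const Access& a : accesses) {
    assert(a.duration_ns >= 0);
    // Wait for the previous holder if it has not released the mutex yet.
    int locked = std::max(a.request_ns, free_at_ns);
    free_at_ns = locked + a.duration_ns;
    grants.push_back({locked, free_at_ns});
  }
  return grants;
}
```

Feeding it the four accesses from the log, {0, 4}, {0, 1}, {7, 4} and {10, 3}, gives lock times 0, 4, 7 and 11 ns and unlock times 4, 5, 11 and 14 ns, which are the "mutex locked" and "finish" timestamps printed by the simulation.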

Some of the use-cases for the sc_mutex class are:

  • bus modeling for managing sequential access to some resources, like registers or memory
  • arbitration to manage the blocking period in which a winner holds the arbiter busy

If you have more use-cases in mind for the sc_mutex class, don’t hesitate to leave your thoughts in a comment on this post.

3. Semaphore – sc_semaphore

In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple processes in a concurrent system such as a multiprogramming operating system.

The above statement is actually the semaphore definition from Wikipedia.

SystemC implements this mechanism via the sc_semaphore class. The API is very simple:

  • sc_semaphore(int init_value) – constructor to initialize the semaphore with the number of available slots
  • wait() – takes one slot of the semaphore. If no slot is available, the calling process waits until one is released.
  • trywait() – tries to take one slot without waiting. If this fails (e.g. no slot is available) it returns -1.
  • post() – releases one slot of the semaphore
  • get_value() – returns the number of available slots in the semaphore

Some of the use-cases for the sc_semaphore class are:

  • modeling of a multi-core processor
  • memories with multiple busses for reads and/or writes

If you have more use-cases in mind for the sc_semaphore class, don’t hesitate to leave your thoughts in a comment on this post.

3.1. Example

Let’s say that we have a very simple processor which can do only one operation: increment. However, this processor has two cores so it can do in parallel up to two operations.
We can use the semaphore to ensure that only up to two increments are done in parallel.

To see the semaphore in action we have four threads which run in parallel and all of them want to access the processor.

The code of the processor looks like this:

SC_MODULE(processor) {
private:
 
  sc_semaphore semaphore;

public:
   
  //function for incrementing a value
  int unsigned increment(int unsigned value, const char* thread_name);
   
  SC_CTOR(processor) : semaphore(2) {
   
  };
 
};

processor.h

int unsigned processor::increment(int unsigned value, const char* thread_name) {
 
  semaphore.wait();
 
  int unsigned duration = std::rand() % 6;
 
  cout << "[" << sc_time_stamp() << "] PROCESSOR: @ "
    << thread_name << " - start incrementing value: " << value  
    << ", estimated duration: " << duration << " ns" << endl;
 
  wait(sc_time(duration, SC_NS));
 
  int unsigned result = value + 1;
 
  semaphore.post();
 
  return result;
};

processor.cpp

The code for the parallel threads requesting increments at the same time looks like this:

void example::thread(const char* thread_name) {
  for(int i = 0; i < number_of_accesses; i++) {
    int unsigned value = std::rand() % 100;
   
    cout << "[" << sc_time_stamp() << "] " << thread_name
      << ": sending incrementing request for value: " << value  << endl;
 
    value = processor_inst.increment(value, thread_name);
   
    cout << "[" << sc_time_stamp() << "] " << thread_name
      << ": received incremented value: " << value  << endl;
 
    wait(sc_time(std::rand() % 6, SC_NS));
  }
};

void example::thread_1() {
  thread("THREAD_1");
};

void example::thread_2() {
  thread("THREAD_2");
};

void example::thread_3() {
  thread("THREAD_3");
};

void example::thread_4() {
  thread("THREAD_4");
};

Parallel threads trying to request increments at the same time

The output of this example is this one:

[0 s] THREAD_1: sending incrementing request for value: 83
[0 s] PROCESSOR: @ THREAD_1 - start incrementing value: 83, estimated duration: 4 ns
[0 s] THREAD_2: sending incrementing request for value: 77
[0 s] PROCESSOR: @ THREAD_2 - start incrementing value: 77, estimated duration: 1 ns
[0 s] THREAD_3: sending incrementing request for value: 93
[0 s] THREAD_4: sending incrementing request for value: 35
[1 ns] THREAD_2: received incremented value: 78
[1 ns] PROCESSOR: @ THREAD_3 - start incrementing value: 93, estimated duration: 0 ns
[1 ns] THREAD_3: received incremented value: 94
[1 ns] PROCESSOR: @ THREAD_4 - start incrementing value: 35, estimated duration: 1 ns
[2 ns] THREAD_4: received incremented value: 36
[4 ns] THREAD_1: received incremented value: 84

From this output we can make a few important remarks:

  • all four threads start in parallel at time 0 ns
  • only threads #1 and #2 are immediately served by the processor at time 0 ns
  • when thread #2 receives its result at 1 ns, thread #3 is served by the processor
  • thread #3 receives its result immediately, also at 1 ns (its duration is 0 ns), and thread #4 is then served by the processor

As you can see, with a semaphore it is very easy to control the maximum number of threads which access a resource at the same time.
You can run this example on your own on EDA Playground.
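
The schedule in the output above can also be reproduced with a few lines of plain C++. The sketch below is a library-free model, not SystemC code, and the names Request, Service and serve are hypothetical. It assumes that waiting requests are served in the order in which they are listed, which matches this particular log but is a simplification of the model.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Library-free model of a counting semaphore (hypothetical names).
struct Request {
  int request_ns;   // time at which wait() is called
  int duration_ns;  // time spent holding a slot
};

struct Service {
  int start_ns;   // time at which wait() returns
  int finish_ns;  // time at which post() is called
};

// Assumption of this model: requests are served in the listed order.
std::vector<Service> serve(const std::vector<Request>& requests, int slots) {
  assert(slots > 0);
  std::vector<int> slot_free_at_ns(slots, 0);  // every slot starts free
  std::vector<Service> result;
  for (const Request& r : requests) {
    // wait(): take the slot that becomes free first.
    auto slot = std::min_element(slot_free_at_ns.begin(), slot_free_at_ns.end());
    int start = std::max(r.request_ns, *slot);
    int finish = start + r.duration_ns;
    *slot = finish;  // post(): the slot is released at this time
    result.push_back({start, finish});
  }
  return result;
}
```

With two slots and the four requests from the log (all issued at 0 ns, with durations of 4, 1, 0 and 1 ns), the model gives start times 0, 0, 1 and 1 ns and finish times 4, 1, 1 and 2 ns, the same values printed by the simulation.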

4. FIFO – sc_fifo

FIFO is an acronym for First In, First Out, a method for organizing and manipulating a data buffer, where the oldest (first) entry, or ‘head’ of the queue, is processed first.

The above statement is actually the FIFO definition from Wikipedia.

SystemC implements this mechanism via template class sc_fifo. The template parameter is the type of information stored in the FIFO. This means that you can have a FIFO which stores simple information like sc_int or complex information like your own class.
The API is more complex compared with the previous channels, but I’ll try to list here the functions that I consider relevant for us at this point in our learning.

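Constructors:
  • sc_fifo(int size = 16) – creates a FIFO with the given capacity (16 entries by default)
  • sc_fifo(const char* name, int size = 16) – same as above, with an explicit channel name
Destructor:
  • ~sc_fifo() – this one is important to clear the internal array holding the actual values
Access Functions:
  • void read(T&) and T read() – blocking read; waits while the FIFO is empty
  • bool nb_read(T&) – non-blocking read; returns false if the FIFO is empty
  • void write(const T&) – blocking write; waits while the FIFO is full
  • bool nb_write(const T&) – non-blocking write; returns false if the FIFO is full
Status Functions:
  • int num_available() – number of entries available for reading
  • int num_free() – number of free slots available for writing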
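It is worth mentioning at this point that the sc_fifo class also provides events which notify when reads and writes are performed. You can get references to those events via these functions:
  • data_written_event() – event notified when data is written to the FIFO
  • data_read_event() – event notified when data is read from the FIFO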

4.1. Example

To see the FIFO in action we can use the classic example of a producer and a consumer.

In our case the producer generates data four times as fast as the consumer can digest it. We can rely on the FIFO to put a “brake” on the producer and make it wait for the consumer to do its job.

void example::producer_thread() {
  int unsigned number_of_accesses = 4;
 
  for(int i = 0; i < number_of_accesses; i++) {
    sc_int<32> value(std::rand() % 100);
   
    cout << "[" << sc_time_stamp() << "] writing to FIFO value: " << value <<
       ", free: " << fifo.num_free() << endl;
   
    fifo.write(value);
   
    cout << "[" << sc_time_stamp() << "] wrote to FIFO value: " << value << endl;
   
    wait(1, SC_NS);
  };
};

Producer thread

As you can see in the code snippet above, the producer thread tries to send four random values (see number_of_accesses) to the consumer.
We have a print before and after the actual write() function invocation. This will show us how much time the write() consumed.

void example::consumer_thread() {
  sc_int<32> value(0);
 
  for(;;) {
    wait(4, SC_NS);
   
    fifo.read(value);
 
    cout << "[" << sc_time_stamp() << "] read from FIFO value: " << value << endl;
  };
};

Consumer thread

The consumer thread is much simpler: it waits for 4 ns and then performs a read().

The output of this example is this one:

[0 s] writing to FIFO value: 83, free: 2
[0 s] wrote to FIFO value: 83
[1 ns] writing to FIFO value: 86, free: 1
[1 ns] wrote to FIFO value: 86
[2 ns] writing to FIFO value: 77, free: 0
[4 ns] read from FIFO value: 83
[4 ns] wrote to FIFO value: 77
[5 ns] writing to FIFO value: 15, free: 0
[8 ns] read from FIFO value: 86
[8 ns] wrote to FIFO value: 15
[12 ns] read from FIFO value: 77
[16 ns] read from FIFO value: 15

There are a few important aspects to note in this output:

  • when there is free space available in the fifo (@0ns, @1ns) the write happens instantaneously
  • when there is no free space available in the fifo (@2ns, @5ns) the write is delayed until a read is performed

You can run this example on your own on EDA Playground.
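
To see where the timestamps in this output come from, here is a minimal sketch of the blocking rules in plain C++. It is a library-free model, not SystemC code, and the names FifoTimeline and simulate are hypothetical. The FIFO depth of 2 is an assumption inferred from the "free: 2" printed at time 0, since the post does not show how the FIFO is constructed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Library-free model of a blocking bounded FIFO (hypothetical names).
struct FifoTimeline {
  std::vector<int> write_done_ns;  // time at which each write() returns
  std::vector<int> read_done_ns;   // time at which each read() returns
};

FifoTimeline simulate(int items, int depth, int producer_gap_ns, int consumer_gap_ns) {
  assert(items >= 0 && depth > 0);
  FifoTimeline t;
  int write_attempt_ns = 0;               // the producer writes right away
  int read_attempt_ns = consumer_gap_ns;  // the consumer waits before each read
  for (int i = 0; i < items; i++) {
    // Write i needs a free slot: once the FIFO has been filled, it blocks
    // until item (i - depth) has been read.
    int write_done = write_attempt_ns;
    if (i >= depth) write_done = std::max(write_done, t.read_done_ns[i - depth]);
    t.write_done_ns.push_back(write_done);
    write_attempt_ns = write_done + producer_gap_ns;

    // Read i needs item i to be in the FIFO.
    int read_done = std::max(read_attempt_ns, write_done);
    t.read_done_ns.push_back(read_done);
    read_attempt_ns = read_done + consumer_gap_ns;
  }
  return t;
}
```

Calling simulate(4, 2, 1, 4) gives writes that complete at 0, 1, 4 and 8 ns and reads that complete at 4, 8, 12 and 16 ns, the same timestamps as the "wrote to FIFO" and "read from FIFO" lines above. The third and fourth writes are the ones held back by a full FIFO.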

Warning

If you play with the status functions num_available() and num_free() you might get some very confusing results. We will go into more detail in the next lesson, but the short explanation is this:

sc_fifo makes use of the update phase of the simulator to update the number of readable entries. This update phase is the equivalent of a non-blocking assignment in Verilog.

That’s it for this lesson, hope you found it useful 🙂

Next Lesson: Learning SystemC: #005 Signal Channels
Previous Lesson: Learning SystemC: #003 Time, Events and Processes



Cristian Slav
