Stream Accessing Management
If you are compiling a system, the fourth page of the compilation wizard lets you manage how streams are accessed. From here you can add management information for input and output streams, such as the number of stream channels and the maximum number of outstanding memory requests at any time. Let’s take a look at how this is done.
Using the Stream Accessing Management Page
On the Stream Accessing Management page, the top table holds the number of channels and outstanding memory requests for each input stream. The bottom table holds the number of channels for each output stream.
If a stream has no entry in either table, the compiler assumes one channel and, for an input stream, one outstanding memory request. To add specific details about any of your input streams, select “Add” next to the top table.
This will open a new window that allows you to specify the details for this input stream. Type in the name of the stream you wish to associate these values with, and give a positive integer for both the number of input channels and the number of outstanding memory requests.
Once you have put in the values for your stream, select “Finish” at the bottom.
This window will close, returning you to the Stream Accessing Management page, where the values you just entered should now appear in the top table. Once these values are in the table, you can edit them by double-clicking individual cells and changing the values.
You can continue adding details for input streams by selecting “Add” at the top until you have filled out all the input stream details you need. When you want to start adding details about the output streams, select the “Add” button located next to the bottom table.
The page that pops up for output stream details will look very similar to the one for the input stream details. It will ask for the name of the stream these details are about and the number of output channels. You do not need to specify the number of outstanding memory requests for output streams.
Go ahead and fill out the values on that page and press “Finish” at the bottom.
This will add the output stream info to the table on the bottom of the Stream Accessing Management page. You can continue adding details about other output streams until you have described all that you need.
There are some caveats to take into account when specifying the number of channels and memory requests. The number of outstanding memory requests must be greater than or equal to the number of stream channels. In addition, the number of stream channels must be a factor of both the size of the window that the stream accesses its data from and the step size of the loop.
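The caveats above can be sketched as a small check in C. This is only an illustration of the rules as stated; the function name and parameters are hypothetical and are not part of ROCCC, and the window size and loop step are assumed to be counted in stream elements.

```c
#include <stdbool.h>

/* Hypothetical helper (not a ROCCC API): returns true when the values
 * entered for a stream satisfy the caveats described in the text.
 * window_size and loop_step are assumed to be counted in elements. */
bool stream_settings_valid(int channels, int outstanding_requests,
                           int window_size, int loop_step)
{
    /* The wizard asks for positive integers. */
    if (channels < 1 || outstanding_requests < 1 ||
        window_size < 1 || loop_step < 1) {
        return false;
    }
    /* Outstanding requests must be >= the number of channels. */
    if (outstanding_requests < channels) {
        return false;
    }
    /* The channel count must be a factor of the window size... */
    if (window_size % channels != 0) {
        return false;
    }
    /* ...and a factor of the loop step size. */
    if (loop_step % channels != 0) {
        return false;
    }
    return true;
}
```

For example, four channels with four outstanding requests, a window of four elements, and a loop step of four pass the check, while four channels with only two outstanding requests do not.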
Understanding the Stream Channels
Now that we know how to specify the number of channels for each stream, let’s see how they can be used. For example, let’s say we coded our system to accept 16-bit input and output streams, and the platform we are going to synthesize it on has 16-bit memory interfaces. Everything should hook up nicely and achieve maximum throughput, since the bit sizes of our streams match those of the platform’s memory interfaces.
Now let’s say that we want to take the same code and port it to a different platform that has 64-bit memory interfaces instead. If we left our code the same, we would be fetching 16 bits at a time from a 64-bit interface. This is inefficient, and it also requires complicated glue logic to extract the 16-bit parts from the 64-bit memory values. We would be utilizing only 1/4 of our bandwidth, which is not optimal.
We can get past this problem by leaving our C code exactly the same and using LoopUnrolling to give our system loop four times as many loop bodies. We are then accessing four times the data from our input and output streams per cycle. Next, we specify on the Stream Accessing Management page that the input streams and output streams have four parallel channels each. Now, instead of getting only 16 bits of data, we are getting 64 bits of data, which utilizes all the bandwidth of our 64-bit machine and obviates the need for any complicated glue logic.
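To make the arithmetic concrete, here is a minimal sketch in plain C. It is not ROCCC stream code, and the function names are hypothetical. The first function computes how many 16-bit elements fit in one 64-bit memory word, and the two loops show a body written once next to a hand-written equivalent of what unrolling by four produces. In practice the compiler performs the unrolling, so the source stays in the first form.

```c
#include <stddef.h>
#include <stdint.h>

/* How many stream elements fit in one memory word,
 * e.g. 64-bit interface / 16-bit elements = 4 channels. */
int channels_per_word(int interface_bits, int element_bits)
{
    return interface_bits / element_bits;
}

/* Loop as written: one 16-bit element read and written per iteration. */
void process_rolled(const uint16_t *in, uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = (uint16_t)(in[i] + 1u);
    }
}

/* Hand-written illustration of the same loop unrolled by four:
 * four bodies per iteration, so 4 x 16 = 64 bits are consumed and
 * produced each time. Assumes n is a multiple of 4. */
void process_unrolled4(const uint16_t *in, uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        out[i]     = (uint16_t)(in[i]     + 1u);
        out[i + 1] = (uint16_t)(in[i + 1] + 1u);
        out[i + 2] = (uint16_t)(in[i + 2] + 1u);
        out[i + 3] = (uint16_t)(in[i + 3] + 1u);
    }
}
```

Both loops compute the same result; the unrolled form simply exposes four independent accesses per iteration, which is what lets four channels be filled from a single 64-bit memory word.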
We were able to tune our generated hardware to different platforms without having to touch our C code at all.
Multiple Outstanding Memory Requests
As seen on the Stream Accessing Management page, we can specify the maximum number of outstanding memory requests for any given input stream. This is useful because memory latency may be high on a specific platform. To make up for this, ROCCC can send multiple memory requests without waiting for a response, which allows the hardware to stay fully pipelined.
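As a rough guide to choosing this value, the following sketch applies a general rule of thumb (Little's law: requests in flight = latency × issue rate). This is an assumption used for illustration, not a formula taken from ROCCC, and the function name and parameters are hypothetical.

```c
/* Hypothetical estimate (not a ROCCC API): the number of requests that
 * must be in flight to keep issuing requests_per_cycle every cycle when
 * each response takes latency_cycles to arrive. Falls back to 1, the
 * compiler's default, for non-positive inputs. */
int min_outstanding_requests(int latency_cycles, int requests_per_cycle)
{
    if (latency_cycles < 1 || requests_per_cycle < 1) {
        return 1;
    }
    return latency_cycles * requests_per_cycle;
}
```

For instance, with a 10-cycle memory latency and one request issued per cycle, about 10 requests would need to be outstanding before the first response arrives. The actual latency depends on the platform, so treat the result as a starting point.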