DB4HLS

Domain-Specific Language for Design Space Exploration

Generating the configurations associated with a Design Space Exploration (DSE) by hand is tedious and error-prone. We therefore developed a Domain-Specific Language (DSL) that defines configuration spaces automatically and concisely through Configuration Space Descriptors (CSDs).

Each line of a descriptor encodes a knob, which comprises a directive type, a label corresponding to its location in the design C/C++ code, and one or more sets of values. The number of sets equals the number of parameters required by the directive type. Values can be numerical, when expressing optimizations such as loop unrolling or array partitioning factors, or categorical, when selecting the type of FPGA resource to employ, such as a BRAM type. A shorthand is provided for expressing regular value series (e.g., a succession of power-of-two values). Finally, we provide a @bind decorator, which constrains the values associated with different directives.

The picture below shows an example of a configuration space descriptor and its elements:


Guided example

Code snippet of the last_step_scan function included in DB4HLS:

    void last_step_scan(int bucket[SIZE], int sum[RADIX]) {
        int i, j, k;
        loop_1:for(i = 0; i < RADIX; i++) {
            loop_2:for(j = 0; j < BLOCK; j++) {
                k = (i * BLOCK) + j;
                bucket[k] = bucket[k] + sum[i];
            }
        }
    }

Example of CSD for the last_step_scan function included in DB4HLS, describing 1600 different configurations:

    resource;last_step_scan;bucket;{RAM_2P_BRAM}
    resource;last_step_scan;sum;{RAM_2P_BRAM}
    array_partition;last_step_scan;bucket;1;{cyclic,block};{1->512,pow_2}
    array_partition;last_step_scan;sum;1;{cyclic,block};{1->128,pow_2}@bind_a
    unroll;last_step_scan;last_1;{1->128,pow_2}@bind_a
    unroll;last_step_scan;last_2;{1,2,4,8,16}
    clock;{10}

The snippets above show the function last_step_scan and an example CSD describing the configuration space for its DSE. The CSD defines seven knobs. Line 1 defines a knob with a single value: it maps the array bucket, an input of the function, to a dual-ported BRAM. Similarly, line 2 maps the array sum to a dual-ported BRAM. Line 3 defines a knob governing the array_partition directive for bucket: its values are all the pairs whose first component is one of the two partitioning strategies (cyclic and block) and whose second component is one of the ten partitioning factors (the powers of two from 1 to 512). Line 4 does the same for sum, with a different set of partitioning factors (the powers of two from 1 to 128). Lines 5 and 6 define the unrolling factors to consider for loop_1 and loop_2 (referenced in the CSD by the labels last_1 and last_2): the powers of two from 1 to 128 and from 1 to 16, respectively. Lines 4 and 5 both carry the binding decorator @bind_a, which specifies that the array partitioning factor and the unrolling factor must be equal in every configuration described by the CSD. Finally, line 7 defines the target clock.
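The field layout walked through above can be made concrete with a small parser. The following is a minimal sketch in Python, not the DB4HLS implementation: the helper names expand_value_set and parse_knob are hypothetical, and the grammar is inferred only from the example CSD in this section (fields separated by ";", value sets enclosed in braces, "lo->hi,pow_2" as the power-of-two shorthand, and a trailing "@tag" as the bind decorator).

```python
import re


def expand_value_set(text):
    """Expand one '{...}' value set into a list of ints and/or strings."""
    body = text.strip()[1:-1]  # drop the surrounding braces
    series = re.fullmatch(r"(\d+)->(\d+),pow_2", body)
    if series:
        # Shorthand: every power of two between lo and hi (inclusive).
        lo, hi = int(series.group(1)), int(series.group(2))
        values, v = [], 1
        while v <= hi:
            if v >= lo:
                values.append(v)
            v *= 2
        return values
    items = [item.strip() for item in body.split(",")]
    return [int(item) if item.isdigit() else item for item in items]


def parse_knob(line):
    """Split one CSD line into directive, labels, value sets, and bind tag."""
    line, _, bind = line.strip().partition("@")
    fields = line.split(";")
    value_sets = [expand_value_set(f) for f in fields if f.startswith("{")]
    labels = [f for f in fields[1:] if not f.startswith("{")]
    return {
        "directive": fields[0],
        "labels": labels,          # e.g. function name, array or loop label
        "value_sets": value_sets,  # one list per directive parameter
        "bind": bind or None,      # e.g. "bind_a", or None if undecorated
    }


knob = parse_knob(
    "array_partition;last_step_scan;sum;1;{cyclic,block};{1->128,pow_2}@bind_a"
)
print(knob["value_sets"][1])  # [1, 2, 4, 8, 16, 32, 64, 128]
```

Under these assumptions, line 4 of the CSD yields two value sets (two strategies and eight factors) and the bind tag bind_a, matching the description above.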

The DSL generates the set of configurations of the design space as the Cartesian product of all knob values: CS = K_1 × K_2 × ... × K_N, where N is the number of knobs and K_i is the set of values of knob i, i.e., the set of values that the directive associated with knob i can assume, taking into account the restrictions imposed by the bind decorator. For a directive with multiple parameters, K_i is itself the Cartesian product of the value sets of its parameters. The size of the configuration space is given by its cardinality, |CS|. The configuration space descriptor of last_step_scan describes a configuration space with 1600 configurations: 20 (array_partition on bucket) × 2 (strategies for sum) × 8 (the factor shared by lines 4 and 5) × 5 (unroll on the inner loop), with the single-valued knobs each contributing a factor of 1. Without the binding decorator, lines 4 and 5 would vary independently (16 × 8 instead of 2 × 8 combinations), and the cardinality would be 12800.
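The two cardinalities can be checked by enumerating the space directly. The sketch below is an illustration under stated assumptions, not the generator shipped with DB4HLS: the KNOBS table is a hand-written transcription of the example CSD, the function name configurations is hypothetical, and the bind decorator is assumed to constrain the last value set of each decorated line.

```python
from itertools import product

POW2_TO_128 = [2 ** e for e in range(8)]   # 1, 2, 4, ..., 128
POW2_TO_512 = [2 ** e for e in range(10)]  # 1, 2, 4, ..., 512

# One entry per CSD line: the knob's value sets and its bind tag (if any).
KNOBS = [
    {"sets": [["RAM_2P_BRAM"]], "bind": None},                       # resource bucket
    {"sets": [["RAM_2P_BRAM"]], "bind": None},                       # resource sum
    {"sets": [["cyclic", "block"], POW2_TO_512], "bind": None},      # array_partition bucket
    {"sets": [["cyclic", "block"], POW2_TO_128], "bind": "bind_a"},  # array_partition sum
    {"sets": [POW2_TO_128], "bind": "bind_a"},                       # unroll last_1
    {"sets": [[1, 2, 4, 8, 16]], "bind": None},                      # unroll last_2
    {"sets": [[10]], "bind": None},                                  # clock
]


def configurations(knobs, honor_bind=True):
    """Return every configuration as a tuple holding one value tuple per knob."""
    per_knob = [list(product(*knob["sets"])) for knob in knobs]  # each K_i
    result = []
    for combo in product(*per_knob):  # K_1 x K_2 x ... x K_N
        seen = {}
        consistent = True
        for knob, values in zip(knobs, combo):
            tag = knob["bind"]
            if honor_bind and tag is not None:
                # Assumption: the bound parameter is the last value set on the line.
                if seen.setdefault(tag, values[-1]) != values[-1]:
                    consistent = False
                    break
        if consistent:
            result.append(combo)
    return result


print(len(configurations(KNOBS)))                    # 1600
print(len(configurations(KNOBS, honor_bind=False)))  # 12800
```

Filtering the full product is the simplest way to express the constraint; a real generator would more likely build bound knobs jointly to avoid materializing discarded combinations.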

Code

The current implementation of the DSL supports the generation of configurations targeting the Vivado HLS synthesis tool.
The directive types currently accepted by the DSL are the following:

  • Loop unrolling: expressed as unroll in the DSL;
  • Array partitioning: expressed as array_partition in the DSL;
  • Resource type: expressed as resource in the DSL;
  • Function inlining: expressed as inline in the DSL;
  • Clock period: expressed as clock in the DSL.
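To give an idea of what a generated configuration could look like, the sketch below maps a single knob value to a Vivado HLS Tcl command. It is a hedged illustration only: the function to_tcl is hypothetical, the scripts actually emitted by DB4HLS may differ, and the fourth field of an array_partition line is assumed to be the array dimension. The inline directive is left out because its CSD layout is not shown in this document.

```python
def to_tcl(directive, labels, values):
    """Render one chosen knob value as a Vivado HLS Tcl command (sketch)."""
    if directive == "unroll":
        function, loop = labels
        (factor,) = values
        return f'set_directive_unroll -factor {factor} "{function}/{loop}"'
    if directive == "array_partition":
        function, array, dim = labels  # assumption: third label is the dimension
        kind, factor = values
        return (
            f"set_directive_array_partition -type {kind} -factor {factor} "
            f'-dim {dim} "{function}" {array}'
        )
    if directive == "resource":
        function, array = labels
        (core,) = values
        return f'set_directive_resource -core {core} "{function}" {array}'
    if directive == "clock":
        (period,) = values
        return f"create_clock -period {period}"
    raise ValueError(f"directive not covered by this sketch: {directive}")


print(to_tcl("unroll", ["last_step_scan", "last_1"], [4]))
print(to_tcl("array_partition", ["last_step_scan", "sum", "1"], ["cyclic", 4]))
print(to_tcl("resource", ["last_step_scan", "bucket"], ["RAM_2P_BRAM"]))
print(to_tcl("clock", [], [10]))
```

Keeping the mapping in one function per tool is one way to support other synthesis tools: only the rendering step would change, not the CSD or the enumeration.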
The DSL can be adapted to generate configuration scripts for other synthesis tools and to target different directive types. If you are interested in expanding the DSL, please contact Lorenzo Ferretti at [email protected].

Author

Lorenzo Ferretti, Ph.D., currently PostDoc at Università della Svizzera italiana (USI).
You can find more information about my work here, and you can contact me with any questions at [email protected].