The code generator introduces registers when you specify certain block implementations or use certain settings. You can follow these guidelines to learn more about these registers and how you can use them to optimize the timing of your design.
Each guideline has a severity level that indicates the level of compliance requirements. To learn more, see HDL Modeling Guidelines Severity Levels.
3.2.1
Informative
In most cases, the code generator introduces the registers in regions that run slower than the clock rate. To avoid or minimize additional latency, you can run these registers at the fast clock rate by using clock-rate pipelining. You can use clock-rate pipelining with these optimizations:
Input and output pipelining
Multi-cycle block implementations for complex math operations such as Sqrt and Reciprocal
Floating-point library mapping
Delay balancing
Resource sharing and streaming
In addition, for designs with multiple levels of hierarchy, enable the HDL block property FlattenHierarchy on the top-level Subsystem to improve opportunities for clock-rate pipelining.
To learn more about clock-rate pipelining and blocks that act as barriers to this optimization, see Clock-Rate Pipelining.
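As a minimal sketch of the settings described above, the following commands enable clock-rate pipelining for a model and flatten the hierarchy of its top-level Subsystem from the command line. The model name my_model and the Subsystem path my_model/DUT are hypothetical placeholders; the parameter names are the command-line names as commonly documented for hdlset_param, so verify them against your release.

```matlab
% Hypothetical model and DUT names; replace them with your own.
mdl = 'my_model';
dut = [mdl '/DUT'];
load_system(mdl);

% Model-level setting: insert pipeline registers at the clock rate
% instead of the slower data rate.
hdlset_param(mdl, 'ClockRatePipelining', 'on');

% Block-level setting on the top-level Subsystem: flatten the hierarchy
% so that the optimization can work across subsystem boundaries.
hdlset_param(dut, 'FlattenHierarchy', 'on');

% Read the values back to confirm that they were applied.
hdlget_param(mdl, 'ClockRatePipelining')
hdlget_param(dut, 'FlattenHierarchy')
```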
3.2.2
Recommended
Distributed pipelining is a speed optimization that reduces the critical path by moving existing delays in your design while preserving the functional behavior.
To use this optimization for a Subsystem, set the DistributedPipelining HDL block property to on.
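The same block property can also be set programmatically. This is a minimal sketch that assumes a hypothetical model my_model with a Subsystem at my_model/DUT.

```matlab
% Hypothetical model and Subsystem path; replace them with your own.
load_system('my_model');
dut = 'my_model/DUT';

% Enable distributed pipelining on the Subsystem.
hdlset_param(dut, 'DistributedPipelining', 'on');

% Read the property back to confirm the setting.
hdlget_param(dut, 'DistributedPipelining')
```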
To more effectively use this optimization, in the Configuration Parameters dialog box, on the HDL Code Generation > Optimization pane, you can specify these additional settings.
ConstrainedOutputPipeline: Make sure that the total number of delays that are inserted, including any input and output pipelining that you specify, is greater than or equal to the value that you specify for ConstrainedOutputPipeline on the Subsystem.
Hierarchical distributed pipelining: Select this option if you want to apply the distributed pipelining optimization across multiple levels of subsystem hierarchy. Make sure that the top-level Subsystem and each subsystem in the hierarchy have the DistributedPipelining HDL block property set to on.
Note
If you cannot enable DistributedPipelining on the top-level Subsystem, you can enable FlattenHierarchy, which enables pipelining with other blocks at a lower model hierarchy.
Clock-rate pipelining: Select this option if you want the code generator to insert registers at the clock rate instead of the data rate.
Allow clock-rate pipelining of DUT output ports: Select this option if you want the code generator to insert registers at the clock rate instead of the data rate at the DUT output ports.
Preserve design delays: Select this option if you do not want the code generator to move the delays that you added to your design. With this option selected, the optimization moves only pipeline registers.
Distributed pipelining priority: Specify whether you want the priority to be Numerical Integrity or Performance. If you use Performance, make sure that the simulation results match. In some cases, this setting moves registers into blocks that have initial values such as constants, which can affect simulation results.
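The settings in this list also have command-line equivalents. The sketch below is illustrative only: my_model and my_model/DUT are hypothetical names, the pipeline counts are arbitrary example values, and the parameter names are the ones commonly documented for hdlset_param. Some of these options have changed between releases, so check the names and allowed values for your version before relying on them.

```matlab
% Hypothetical model and DUT names; replace them with your own.
mdl = 'my_model';
dut = [mdl '/DUT'];
load_system(mdl);

% Block-level pipelining on the Subsystem. The example values keep the
% number of inserted delays (2 + 2) at or above ConstrainedOutputPipeline (2).
hdlset_param(dut, 'DistributedPipelining', 'on');
hdlset_param(dut, 'InputPipeline', 2);
hdlset_param(dut, 'OutputPipeline', 2);
hdlset_param(dut, 'ConstrainedOutputPipeline', 2);

% Model-level optimization settings (names assumed to match your release).
hdlset_param(mdl, 'HierarchicalDistPipelining', 'on');     % across hierarchy
hdlset_param(mdl, 'ClockRatePipelining', 'on');            % clock-rate registers
hdlset_param(mdl, 'ClockRatePipelineOutputPorts', 'on');   % also at DUT outputs
hdlset_param(mdl, 'PreserveDesignDelays', 'on');           % keep design delays in place
hdlset_param(mdl, 'DistributedPipeliningPriority', 'NumericalIntegrity');

% List the HDL parameters that now differ from their defaults.
hdldispmdlparams(mdl);
```

If you switch the priority to Performance, compare the simulation results of the generated model against the original, as noted above.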
The Subsystem for which you want to apply the optimization must meet these requirements:
Make sure that the Subsystem does not contain any feedback loops.
Use blocks that are supported for distributed pipelining. For a list of unsupported blocks, see Limitations of Distributed Pipelining. As a workaround:
Place some of the unsupported blocks such as Dot Product inside another Subsystem that does not have distributed pipelining enabled.
Change the Distributed pipelining priority to Performance for certain blocks such as Enabled Subsystem.
The Sample Time of the blocks must be discrete. If you have blocks with Sample Time set to Inf, change them to -1.
To identify and change the sample time programmatically, see Change Block Parameters by Using find_system and set_param.
Remove any input ports on Scope blocks to avoid generation of infinite sample time.
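One possible way to apply the sample time requirement programmatically is sketched below with find_system and set_param. The model name my_model is a hypothetical placeholder. The search matches only blocks that expose a SampleTime parameter and, with default search options, does not look inside library links or masked subsystems.

```matlab
% Hypothetical model name; replace it with your own.
mdl = 'my_model';
load_system(mdl);

% Find blocks whose SampleTime parameter is set to inf (any capitalization).
blks = find_system(mdl, 'CaseSensitive', 'off', 'SampleTime', 'inf');

% Change each matching block to an inherited sample time (-1).
for k = 1:numel(blks)
    set_param(blks{k}, 'SampleTime', '-1');
end

% Report how many blocks were changed.
fprintf('Updated %d block(s) in %s.\n', numel(blks), mdl);
```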