siddhakarana: Gate Level Simulations : A Necessary Evil - Part 2
A complete SDF with correct timing values is typically not available until a late stage of verification. In that situation, early gate-level timing simulations that use an SDF that is not timing-clean are expensive to run with timing checks enabled, especially if the resulting violations must be waived or suppressed instance by instance at runtime.
In this case, it is ideal to compile the netlist with the option that removes all timing checks from the design. Most tools also provide a run-time option that prevents timing checks from being simulated. However, we recommend removing them at compile time, because the simulator can then apply additional optimizations once the timing check statements are gone.
Other ways to improve gate-level simulation performance include the following.
Use a coarser time resolution. If possible, set up simulations to run at a coarser resolution such as ps rather than at a finer resolution such as fs.
Do not log cell internals. If cell internals are logged, simulation performance can be severely impacted (a multiple-X slowdown), and the resulting debug database can end up being large and difficult to work with.
Replace verified gate-level simulation blocks with RTL or stubs. As a full-chip configuration is built through IP and sub-system integration, it is a good idea to replace verified netlist blocks with equivalent RTL blocks or even stubs with appropriate port connectivity. This process can provide a significant performance boost because you simulate only what still needs to be verified.
How to improve throughput for gate-level simulation
Switch timing corners at run time. A very effective way to improve gate-level simulation throughput when simulating multiple timing corners is to select the SDF when the simulation starts rather than when the design is compiled. This saves precious recompilation time in gate-level designs.
On large-scale designs, the simulation time to run ATPG tests can vary from a few hours to a few weeks. The duration depends on the design size, the scan-chain size, and the number of patterns tested.
This is true whether users are doing stuck-at-fault simulation or chain-integrity tests, and whether they are doing serial or parallel pattern testing. Verification time is enormous for some teams, and they are actively looking to shrink the time they spend running simulation. Fortunately, there are a few ways to improve throughput in this context. ATPG test regressions can often be categorized into different test configurations, where multiple tests share the same test configuration.
ATPG simulation can also be split into two phases: the test-setup phase and the pattern-simulation phase. In an ATPG test regression, multiple tests share the same test-setup phase and then start the pattern-simulation phase. During the pattern-simulation phase, users often have multiple patterns in a single test that are simulated serially. Here is how that process works. First, identify the tests that share the same test configuration and group them into a test-set. Then create one such test-set for each test configuration.
For each test-set, run the simulation until the test setup phase is done and then checkpoint or save the state of the simulation. Now users can run multiple tests directly by restoring the state and resuming simulations. All these tests can be run in parallel to efficiently use the grid system. A single test with multiple patterns can be split to run those patterns in parallel.
Based on the number of patterns the test has, our recommendation is to create sets of at least three or four patterns to get the best throughput efficiency. With a single serial run, debugging a failing pattern means first waiting for the entire simulation to finish to find out which pattern failed, and only then starting the debug process.
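The pattern-splitting step can be sketched in Python. This is only an illustration under assumptions: the function name `split_patterns`, the pattern names, and the default group size are hypothetical, and the actual checkpoint save and restore commands depend on the simulator and are not shown.

```python
from typing import List, Sequence


def split_patterns(patterns: Sequence[str], min_per_job: int = 4) -> List[List[str]]:
    """Split one test's patterns into consecutive groups for parallel jobs.

    Each group is meant to become one simulation job that restores the
    checkpoint saved after the test-setup phase and then simulates only
    its own patterns. Groups hold at least min_per_job patterns whenever
    the test has that many patterns in total.
    """
    if min_per_job < 1:
        raise ValueError("min_per_job must be at least 1")
    if not patterns:
        return []
    # Use as many jobs as possible without dropping below min_per_job each.
    n_jobs = max(1, len(patterns) // min_per_job)
    base, extra = divmod(len(patterns), n_jobs)
    groups: List[List[str]] = []
    start = 0
    for i in range(n_jobs):
        # The first `extra` groups take one additional pattern.
        size = base + (1 if i < extra else 0)
        groups.append(list(patterns[start:start + size]))
        start += size
    return groups


if __name__ == "__main__":
    names = [f"pattern_{i}" for i in range(10)]
    for job_id, group in enumerate(split_patterns(names, min_per_job=3)):
        print(job_id, group)
```

Keeping the groups consecutive preserves the original pattern order, so a failure in one job maps directly back to a small, known range of patterns.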
Tests that run on physical testers in seconds of real-time at GHz speeds can take hours to days in simulation.
However, several approaches can speed up these simulations. Figure 3 shows a typical ATPG test flow. The red boxes highlighted in the graphic indicate qualitatively where the majority of simulation time is spent.
Serial patterns take n clock cycles to scan in and n cycles to scan out. Even with optimized scan chains, there can be thousands of clock cycles per pattern.
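As a rough sketch of that cycle cost, the following Python function counts clock cycles for serial simulation. It assumes that scan-in and scan-out of consecutive patterns do not overlap and that each pattern needs a configurable number of capture cycles; the function name and the `capture_cycles` parameter are illustrative assumptions, not taken from any tool.

```python
def serial_scan_cycles(chain_length: int, num_patterns: int,
                       capture_cycles: int = 1) -> int:
    """Estimate clock cycles needed to simulate patterns in serial mode.

    Each pattern costs chain_length cycles to scan in, capture_cycles
    cycles to capture (an assumed value), and chain_length cycles to
    scan out, with no overlap between consecutive patterns.
    """
    if chain_length < 0 or num_patterns < 0 or capture_cycles < 0:
        raise ValueError("arguments must be non-negative")
    per_pattern = chain_length + capture_cycles + chain_length
    return num_patterns * per_pattern


if __name__ == "__main__":
    # Hypothetical numbers: 2000-flop scan chains, 100 patterns.
    print(serial_scan_cycles(chain_length=2000, num_patterns=100))
```

The count grows linearly with both chain length and pattern count, which is why the shift cycles dominate serial simulation time.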
Parallel load and unload techniques can be used to drastically reduce the simulation time in these cases.
Unit delay simulations can be run before full timing. SDF-annotated timing simulations with advanced-node libraries run for a very long time and uncover both timing-related and non-timing-related issues. Because timing simulations can take four to five times more wall-clock time and memory than unit delay simulations, debugging time lengthens accordingly.
Pre-layout unit delay simulations can catch errors early. For example, a few functional tests should be simulated to build confidence in the test insertion. In addition, a single ATPG pattern should be run in serial mode that exercises all scan chains to verify scan chain integrity.
Depending on the compute resources available for the job at the time, a few additional top-ranking patterns should be selected for simulation, based on the coverage grading produced by the test tool. A hardware acceleration solution can be used to verify the functional integrity between RTL and pre-layout netlists, because acceleration may run orders of magnitude faster than GLS.
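The selection of top-ranking patterns can be sketched as a simple ranking step. The sketch assumes the coverage grading has already been exported as a mapping from pattern name to a numeric grade; that data format, the function name, and the sample values are hypothetical and do not reflect any particular test tool's report.

```python
from typing import Dict, List


def select_top_patterns(coverage_grade: Dict[str, float],
                        max_patterns: int) -> List[str]:
    """Return up to max_patterns pattern names, highest grade first.

    Ties are broken by pattern name so the result is deterministic.
    """
    if max_patterns <= 0:
        return []
    ranked = sorted(coverage_grade.items(),
                    key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:max_patterns]]


if __name__ == "__main__":
    # Hypothetical grades, e.g. percent of faults detected per pattern.
    grades = {"p1": 0.5, "p2": 12.0, "p3": 3.2, "p4": 12.0}
    print(select_top_patterns(grades, max_patterns=2))
```

Here `max_patterns` stands in for the compute budget: the fewer machines are free, the fewer extra patterns are simulated.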
Simulating a single pattern could take several hours in serial mode. If 10 to 12 patterns are chained together in a single simulation run, a simulation run could take days. Using a compute-farm to break down long serial runs into several shorter simulation runs to run in parallel will shrink the overall regression time.
Doing so trades compute efficiency for faster turnaround when validating late design changes and for shorter debug time. A calculation can determine the right tradeoff. First, measure the simulation times of a few patterns and average them to obtain the average time per pattern. Note that pattern simulation time will vary because different patterns produce different event densities.
Next, measure the amount of time it takes to start a simulation on the farm, and call it TSimStart. Constraint solving can then be used to identify the minimum number of machines required to achieve a target regression time, as shown in the table below (all numbers are in minutes).
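The calculation can be sketched in Python as a brute-force search instead of a constraint solver. The cost model is an assumption: patterns are spread evenly over the machines, each machine runs one job, and a job takes TSimStart plus its pattern count times the average time per pattern. All names and numbers in the sketch are hypothetical.

```python
import math
from typing import Optional, Tuple


def min_machines(num_patterns: int, t_pattern: float, t_sim_start: float,
                 target: float) -> Optional[Tuple[int, float]]:
    """Find the fewest machines that meet a target regression time.

    Returns (machines, regression_time), or None if even one pattern
    per machine cannot meet the target. All times share one unit,
    for example minutes.
    """
    if num_patterns <= 0:
        raise ValueError("num_patterns must be positive")
    for machines in range(1, num_patterns + 1):
        # The slowest job holds the largest share of the patterns.
        per_job = math.ceil(num_patterns / machines)
        regression_time = t_sim_start + per_job * t_pattern
        if regression_time <= target:
            return machines, regression_time
    return None


if __name__ == "__main__":
    # Hypothetical inputs: 12 patterns, 180 min per pattern on average,
    # 10 min to start a simulation, 600 min target regression time.
    print(min_machines(12, 180, 10, 600))
```

Because every job pays TSimStart, adding machines shortens the regression but raises the total compute spent, which is the efficiency tradeoff described above.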
Table 1 shows that partitioning long serial simulations into multiple shorter simulations can achieve faster regression turnaround time. This partitioning must be planned up front so that it does not become the critical constraint right before tape-out.
At first, the EDA vendor reaction was to just build a bigger hammer in the form of a faster gate-level simulator. While this is necessary, it isn't sufficient.
We need to apply new techniques to speed DFT and timing simulations as outlined in this paper. Whether you are deep into this space at 14nm, or just entering at 40nm, the good news is that many of these new techniques are available today. Yes, you do need a bigger hammer.