All verification techniques can be effective given the right scope and level of abstraction. At least that’s the argument I started in Portable Stimulus and Integrated Verification Flows with a graphic that plots the effectiveness of several techniques as a function of scope and abstraction. More people have agreed with the idea than disagreed, so I’ve carried on with it. I decided to dig a little deeper into exactly where and how well each technique applies.
For reference, I’ve pasted in the original map with effectiveness plotted for each of unit testing, directed testing, constrained random, portable stimulus and integrated HW/SW testing (reminder that this is partly based on the forward-looking assumption that portable stimulus actually becomes a practical mainstream verification technique).
I see scope as the determining factor when choosing the right technique. Given a particular scope, verification engineers can then choose the most effective technique and an appropriate level of abstraction. For example, verifying some low level detail early in development is best done in a unit test or a directed test with interactions modeled at the wire or method level, while verifying entire feature sets closer to release could involve complex scenarios modeled with portable stimulus.
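To make that a little more concrete, here is a minimal sketch of the kind of wire-level directed test I have in mind. The FIFO DUT, its ports and its DEPTH parameter are hypothetical placeholders, not something from the map itself.

```systemverilog
// Hypothetical 4-deep FIFO used only to illustrate a wire-level directed test
module fifo_directed_tb;
  logic       clk = 0, rst_n, push, pop, full, empty;
  logic [7:0] din, dout;

  fifo #(.DEPTH(4)) dut (.*);   // assumed unit under test

  always #5 clk = ~clk;

  initial begin
    rst_n = 0; push = 0; pop = 0; din = '0;
    repeat (2) @(posedge clk);
    rst_n = 1;

    // One low level detail, checked directly at the pins:
    // full must assert after exactly DEPTH pushes with no pops
    repeat (4) begin
      @(posedge clk);
      push <= 1;
      din  <= $urandom;
    end
    @(posedge clk) push <= 0;
    @(posedge clk);
    if (!full) $error("full not asserted after DEPTH pushes");
    $finish;
  end
endmodule
```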
I like how this effectiveness map has evolved, but scope is still pretty theoretical the way I have it. A ‘detail’, for example, is an arbitrary name I chose for the lowest level design scope. Likewise for functions, features and the rest. Given all the room for interpretation, I figured breaking down scope into labels that correspond to specific design characteristics and intentions would add some clarity. For that I chose expressions, branches, transitions, interactions, communication, integration, interoperability and normal operation.
As you can see, I didn’t stop there. I saw that more meaningful labels for scope would allow us to grade each technique at a finer granularity; I suggest ratings from ‘most effective’ down to ‘counterproductive’ for each. With this, verification engineers can look specifically at what they’re verifying (interactions between two design functions, for example) and choose the most effective technique (a constrained random test and/or a handful of directed tests).
This detailed scope breakdown should still be treated as an approximation; even though it’s more descriptive, it’s still subjective and context dependent. But I think linking scope to a set of concrete characteristics could be quite useful. It gives developers criteria for recognizing (a) ‘sweet spots’ where each technique is most effective; (b) transition points where teams should consider moving from one technique to the next; and (c) situations where techniques should be avoided.
To that last point, I think recognizing where techniques become ineffective is even more useful than knowing where they are effective. You can look at this and easily deduce that only crazy people would count on unit tests for an entire system (even I wouldn’t do that!) or an integrated HW/SW platform for pin-level interaction.
It also makes it easier to assert that no one technique covers the entire verification spectrum. I’m thinking of constrained random here; it’s been the default choice in our industry for a while now. Constrained random may have a sweet spot around transaction-level interactions, but that effectiveness tapers off quickly as scope widens to the system (inefficient) or narrows to the details (haphazard and unpredictable).
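For anyone picturing that sweet spot, a transaction-level constrained random test typically looks something like the sketch below. The bus_txn class, its fields and the address ranges are made-up examples, not part of the map.

```systemverilog
// Hypothetical transaction class to show the transaction-level sweet spot
class bus_txn;
  rand bit        is_write;
  rand bit [31:0] addr;
  rand bit [7:0]  len;

  // Constraints keep randomization inside the legal, interesting space
  constraint legal_c {
    addr[1:0] == 2'b00;                            // word aligned
    len inside {[1:16]};                           // bounded burst lengths
    addr inside {[32'h0000_0000:32'h0000_FFFF],    // config space
                 [32'h1000_0000:32'h1FFF_FFFF]};   // memory space
  }
endclass

module cr_tb;
  initial begin
    bus_txn t = new();
    repeat (100) begin
      if (!t.randomize()) $fatal(1, "randomize failed");
      // a driver/BFM would convert t into pin wiggles here
      $display("txn: %s addr=%h len=%0d", t.is_write ? "WR" : "RD", t.addr, t.len);
    end
  end
endmodule
```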
With that, I’ll leave it to you to decide whether or not I’ve scored each technique appropriately. I’m off to think more about how techniques can be used to complement each other as part of a start-to-finish verification flow.
-neil
PS: I’ve deliberately left formal out here because I personally don’t know enough about it to place it. I suspect it sits near unit testing and directed testing, though I’ll leave it up to any formal experts out there who’d like to offer a more informed opinion.
I think Formal really “breaks” things (but in a *good* way). A lot of unit testing and directed testing can get replaced by Formal. Though perhaps another way of thinking about it is: Formal is merely a different way to implement unit testing and directed testing. But it has a big enough effect on schedule that it makes a measurable impact on the big picture.
Additionally, in a study of the SW industry, 1/3 of all defects in products were traced to problems with the requirements. Merely implementing formal techniques (writing assertions) early in the project and using them in unit tests and directed tests will help flush out problems with the requirements. And requirements problems usually imply DUT refactoring, including changing interfaces.
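As a rough illustration of what “writing assertions early” can look like, here is a sketch against a hypothetical arbiter; the signal names and the 8-cycle bound are placeholders, not from any real spec.

```systemverilog
// Requirement-level properties written early, before much RTL exists
module arb_props (input logic clk, rst_n, req, grant);
  // Requirement as written: every request is granted within 8 cycles
  property p_grant_latency;
    @(posedge clk) disable iff (!rst_n) req |-> ##[1:8] grant;
  endproperty
  a_grant_latency: assert property (p_grant_latency)
    else $error("req not granted within 8 cycles");

  // Requirement as written: no grant without a pending request
  a_no_spurious_grant: assert property (
    @(posedge clk) disable iff (!rst_n) grant |-> req)
    else $error("grant without req");
endmodule
```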
Also: where does the reference model come from for directed-random? If it’s not provided by another group, then the Verification team will need to spend the same order of magnitude of effort coding the reference model as the Design team spent coding the RTL. So a project that starts that reference model early (and preferably can use something developed by the system-architecture group) can start directed-random sooner and with a lot less effort.
Register-based stimulus and verification is far better done by the SW team that is developing the product code than by the Verification team. The SW team has to learn the HW anyway, and if they start early and implement the bringup routines, etc., they’ll flush out issues with the HW/SW interface and remove the duplicate effort of the Verification team coding bringup in SystemVerilog and then the SW team coding it again in C++. That causes a big shift in when bugs are found and when “system” testing starts.
So perhaps multiple graphs are needed, depending on what approaches the project (across all teams) is willing to take.
“Register-based stimulus and verification is far better done by the SW team that is developing the product code than by the Verification team. The SW team has to learn the HW anyway, and if they start early and implement the bringup routines, etc., they’ll flush out issues with the HW/SW interface and remove the duplicate effort of the Verification team coding bringup in SystemVerilog and then the SW team coding it again in C++. That causes a big shift in when bugs are found and when “system” testing starts.”
True, but not all effort is uniformly distributed in terms of value. It’s much better to find bugs before they are manufactured. I’d much rather my pre-silicon team find a bug before production than have a SW team find one later, when it would require a metal fix. It’s more costly and it’s duplicate effort, but it’s necessary unless you want to expose bugs to customers. No one said verification was cheap 🙂
Hi Hendrik,
I think there’s been a misunderstanding. What I meant was that the SW team starts far earlier and develops the device driver (in C/C++) instead of the Verification team doing it in SystemVerilog/SystemC. That allows:
1. SW team finds HW/SW interface architectural and requirements issues early on.
2. Removes duplication of work on creating the device driver.
3. Allows the UVM-knowledgeable people to work on the testbench and all the stimulus generators for the non-register-bus interfaces.
Hi Erik,
I don’t think having the SW team write the device driver instead of the verification team writing UVM sequences to configure the device works. We tried it a few projects ago and everyone hated it. Instead of one code base to test, you now have two (RTL and C/C++). When something isn’t working, the verification team has to spend a lot of time figuring out whose fault it is, the SW guys’ or the RTL designer’s. It is far easier to deal with the RTL at the register level directly when you need to debug. You know exactly how the RTL is supposed to behave and you can control it with precision instead of jumping through hoops in the SW layer.
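Just to illustrate what register-level control from the testbench can look like, here is a rough sketch; the interface, register map and DMA block are hypothetical, not from this discussion.

```systemverilog
// Simple register bus interface driven directly from the testbench
interface reg_bus_if (input logic clk);
  logic        wr_en;
  logic [31:0] addr, wdata;

  // Single-cycle register write
  task automatic reg_write(input logic [31:0] a, input logic [31:0] d);
    @(posedge clk);
    wr_en <= 1; addr <= a; wdata <= d;
    @(posedge clk);
    wr_en <= 0;
  endtask
endinterface

// Bring-up at the register level: every write is visible, ordered and
// precisely controlled, with no SW layer in between
task automatic configure_dma(virtual reg_bus_if bus);
  bus.reg_write(32'h10, 32'h1);      // enable
  bus.reg_write(32'h14, 32'h1000);   // source address
  bus.reg_write(32'h18, 32'h2000);   // destination address
  bus.reg_write(32'h1C, 32'd256);    // transfer length in bytes
endtask
```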
We have to catch RTL bugs before tape-out, but SW bugs can wait and are cheap to fix. Time wasted debugging SW code is much worse than the duplication of work in creating the device driver.
The effectiveness map concept is really nice. The ratings can still evolve, but it’s a good visualisation and a step in the right direction.