Coverage hangover.
That’s what sets in after we’ve hit all the easy stuff and moved on to targeting the more obscure corners of a coverage model. The pace slows. Progress slows. We have lots of review meetings to debate the merits of coverpoints, some of which we may not even understand. Through trial and error we plod along as best we can until someone says whatever we have is good enough (because 100% coverage is impossible). Then we shrug our shoulders, add a few exclusions, write up a few waivers, shake off the hangover and move on.
I’ve had coverage hangover several times. I’m sure we all have. With some devices – the really massive SoCs – there are verification engineers that live through coverage hangover for months at a time. Their only reprieve, if you can call it that, tends to be bug fixing. If they’re lucky, they’ll get to implement a new feature now and then. Otherwise, it’s a cycle of regression, analysis, tweak and repeat.
The worst part of a coverage hangover is that the next hangover is guaranteed to be worse because the next device is always bigger. At least that’s what happens with current verification strategies. I’d like to propose we break common practice with a reset. Regrettably, it won’t change the fact that coverage space will continue to grow. But it will give some relief for the next few generations while folks smarter than me find better ways to define, collect and analyze coverage.
In closing coverage, our focus and timing have always struck me as being out of tune with the reality of the mission. I’m proposing we change that by breaking coverage into a series of steps that we can focus on independently, moving from one type to the next as features mature.
Code coverage is first – line coverage at a minimum but also expression coverage – collected, analyzed and closed immediately after a line of RTL is written. Closing code coverage first is a major deviation from the norm and months earlier than usual.
There are options for closing code coverage this early. The best option, option ‘A’ in my opinion, is a designer-written unit test suite that’s built concurrently with the RTL. Second best are unit tests and directed tests written by the verification team. The fallback would be code coverage collected from a constrained random test suite (i.e. what most of us do now).
I’m aware that most people will see option ‘A’ and push back with the idea that designers should not be trusted – or bothered for that matter – to exhaustively test their own code before handing it to the verification team. That and it sounds like the least efficient means of verification possible. If that’s your response, I acknowledge it’s a stretch but I stand by the suggestion. The absolute best person to verify the intent of some line of RTL is the person that wrote that line of RTL. Further, I think having designers test their own code as they write it helps all subsequent verification activity proceed much more smoothly than would any other option. By a mile. So it’s ok to push back, but if you’re quoting extra effort as the reason, keep in mind it’s effort you’ll spend one way or another. And the cost of that effort increases dramatically for every day a line of RTL goes untested.
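For anyone wondering what that looks like in practice, here’s a minimal sketch of a designer-written unit test. The fifo8 block and its ports are hypothetical stand-ins for whatever RTL was just written; the point is a small, self-checking test that exists the same day the code does.

```systemverilog
// minimal sketch of a designer-written unit test; 'fifo8' and its ports
// are hypothetical stand-ins for whichever block of RTL was just written
module fifo8_unit_test;
  logic       clk = 0, rst_n;
  logic       push, pop;
  logic [7:0] wdata, rdata;
  logic       full, empty;

  fifo8 dut (.*);           // assumes the DUT port names match the signals above
  always #5 clk = ~clk;

  initial begin
    rst_n = 0; push = 0; pop = 0; wdata = '0;
    repeat (2) @(posedge clk);
    rst_n <= 1;

    // hit the reset/empty logic the day it's written
    @(posedge clk);
    if (!empty) $error("expected empty after reset");

    // push one word, pop it back, check the data path
    @(posedge clk); push <= 1; wdata <= 8'hA5;
    @(posedge clk); push <= 0; pop  <= 1;
    @(posedge clk); pop  <= 0;
    @(posedge clk);           // rdata timing depends on the actual fifo
    if (rdata !== 8'hA5) $error("read data mismatch: %0h", rdata);

    $finish;
  end
endmodule
```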
Atomic functional coverage is next, where functional snapshots of a design are captured and analyzed. Atomic functional coverage would apply to individual properties and be derived from atomic transactions. Then comes atomic cross-coverage, derived from the same transactions but across a combination of properties. Last are temporal functional coverage and temporal cross-coverage, derived over multiple transactions.
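To make those terms a little more concrete, here’s a sketch in SystemVerilog. The transaction fields and the txn_done strobe are placeholders I’ve made up; what matters is the progression from coverpoints on individual properties (atomic), to a cross of those same properties (atomic cross), to a cover property that spans multiple transactions (temporal).

```systemverilog
// placeholder signals: 'kind' and 'length' are properties of an atomic
// transaction, 'txn_done' strobes when a transaction completes
module txn_coverage (input logic       clk,
                     input logic       txn_done,
                     input logic [1:0] kind,
                     input logic [3:0] length);

  // atomic functional coverage: individual properties of one transaction
  // atomic cross-coverage: combinations of those same properties
  covergroup atomic_cg @(posedge clk iff txn_done);
    cp_kind       : coverpoint kind;
    cp_length     : coverpoint length;
    kind_x_length : cross cp_kind, cp_length;
  endgroup
  atomic_cg cg = new();

  // temporal functional coverage spans multiple transactions, e.g. two
  // different kinds of transaction observed within a few cycles of each other
  cover property (@(posedge clk)
    (txn_done && kind == 2'd0) ##[1:5] (txn_done && kind == 2'd1));
endmodule
```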
Up to here, code and functional coverage apply to exhaustive verification of subsystems and IP. I think there’s value in planning, implementing, collecting, analyzing, refining and closing each type of coverage in order because they build on each other and compartmentalizing increases focus by limiting the problem space. A further compartmentalization – necessary in my mind – is approaching each type of coverage on a feature-by-feature basis (i.e. incrementally closing coverage on features as they are built instead of closing coverage on entire designs).
After subsystem coverage comes integration coverage.
An integration coverage model can be derived exclusively from device and subsystem IO – which probably includes a good portion of the software interface – and contains a combination of code coverage and functional coverage focusing on (a) connectivity and (b) functionality.
The connectivity aspect of integration coverage is easier to define. We have toggle coverage for subsystem connectivity, complemented by a minimum atomic functional coverage subset on each interface. Pin-level activity under adequate load gives us confidence in connectivity.
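As a rough idea of what that minimum functional subset could look like, here’s a sketch meant to be bound onto a single subsystem interface. The valid/ready handshake and dest_id field are assumptions on my part; the toggle coverage itself would come from tool settings on the subsystem boundary rather than from code.

```systemverilog
// minimum functional subset for connectivity on one (hypothetical) interface
module connect_coverage (input logic       clk,
                         input logic       valid,
                         input logic       ready,
                         input logic [2:0] dest_id);
  // one accepted beat per destination is enough to show the route is alive
  covergroup connect_cg @(posedge clk iff (valid && ready));
    cp_dest : coverpoint dest_id;
  endgroup
  connect_cg cg = new();
endmodule

// bound onto the interface of interest, e.g.:
// bind subsystem_top connect_coverage u_connect_cov
//   (.clk(clk), .valid(m_valid), .ready(m_ready), .dest_id(m_dest));
```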
The functional aspect of integration coverage is more of a grey area that segues into our next topic of constraining the size of coverage models. Before we get there, I think a mandatory rule of thumb is that a system level functional coverage model must be derived from customer use cases. A system level functional coverage model must never be defined as some aggregation of subsystem functional coverage.
(fwiw… this blog is just an excuse to write that last sentence. If there’s one takeaway here, please make it be that).
Compartmentalizing coverage could be a great way to relieve the coverage hangover, but reducing the amount of coverage we collect would bring even more relief.
Our HVL coverage constructs make it very easy to create massive coverage models – hence massive coverage hangovers – with just a few lines of code. For example (see the sketch after this list), it’s easy to:
- cover all values of ‘A’
- cover all values of ‘B’
- cross cover ‘A’ and ‘B’
- cross cover a history of ‘A’ and ‘B’
- if ‘A’ and ‘B’, why not ‘C’
- and ‘D’…
- etc…
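To put some numbers on that list, here’s a sketch with made-up 8-bit fields; with the default auto_bin_max of 64, the bin counts explode exactly as advertised.

```systemverilog
// with the default auto_bin_max of 64, each 8-bit coverpoint below gets
// 64 automatic bins
module coverage_blowup (input logic clk, input logic [7:0] a, b);
  logic [7:0] prev_a;
  always_ff @(posedge clk) prev_a <= a;       // one-cycle 'history' of a

  covergroup blowup_cg @(posedge clk);
    cp_a      : coverpoint a;                 // 64 bins
    cp_b      : coverpoint b;                 // 64 bins
    a_x_b     : cross cp_a, cp_b;             // 64 x 64 = 4,096 bins from one line
    cp_prev_a : coverpoint prev_a;            // 64 more
    history   : cross cp_a, cp_prev_a, cp_b;  // 64^3 = 262,144 bins from one more
  endgroup
  blowup_cg cg = new();
endmodule
```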
My one suggestion here is to think more critically about the functional coverage models we create, with an emphasis on keeping them as small as we reasonably can. Differentiate between ‘must have’ coverage items that observe necessary functionality and derivative ‘nice to have’ coverage items that are easy to observe but don’t add much in terms of design quality and confidence. This is a tough distinction to make given our penchant for risk aversion paired with the coverage hyperbole we’re constantly subjected to, but it’s necessary to contain the cost of verification.
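By way of contrast with the sketch above, a constrained version of the same covergroup might look something like this. The bin names and ranges are purely illustrative; the point is that the ‘must have’ behaviour is spelled out and the derivative combinations are ignored or left out entirely.

```systemverilog
// same made-up fields as the sketch above, with the 'must have' bins
// spelled out; bin names and ranges are illustrative only
module coverage_constrained (input logic clk, input logic [7:0] a, b);
  covergroup constrained_cg @(posedge clk);
    cp_a : coverpoint a {
      bins zero  = {0};
      bins small = {[1:15]};
      bins large = {[16:254]};
      bins max   = {255};
    }
    cp_b : coverpoint b {
      bins modes[] = {0, 1, 2};   // only the modes the design actually defines
    }
    // cross only where the combination maps to necessary functionality
    a_x_b : cross cp_a, cp_b {
      ignore_bins dont_care = binsof(cp_a.zero);
    }
  endgroup
  constrained_cg cg = new();
endmodule
```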
The lines may blur between each of these coverage steps, but trying to compartmentalize our coverage nightmare as much as we can would go a long way toward mitigating our coverage hangover. Likewise for constraining the size of our coverage models.
-neil
I don’t agree with the “100% coverage is impossible” statement and the idea that we should strive for good enough. Having had the privilege of being on critical-path projects a couple of times and being micro-managed by the boss of the design team, I’ve heard the phrase “I don’t want to see 100% coverage that is too focused; I would rather see 90% coverage that is more spread out”. I’d like to hear a solution to the question “is that the right 90% we’re seeing?” that doesn’t involve reviewing everything. Guess who has to spend time reviewing everything… Hint: not him.
If you’re happy with less coverage, then you can code it in such a way as to state this. This ties to what you said at the end of your post: it’s way too easy to cover all values of ‘A’ and people tend to do exactly that. If not all values of ‘A’ are equally important, if there is symmetry, if they can be grouped into categories, etc., then do that. This is possible in the language, though most of the time it’s not that easy. Still, it pays off in the end, since you’re saving time by not having to review results over and over.
Define less (which, if done properly, doesn’t mean you’re losing anything), fill it completely, and you won’t be so hung over. If you have a good testbench, you’re stimulating more than you’re covering anyway and you get an extra chance to catch bugs. Use information about those bugs to refine your coverage model and keep it manageable.
The “100% coverage is impossible” comment was sarcasm. I should have made that more clear 😉
When you say “closing coverage feature-by-feature”, do you mean working on a feature from start to finish (i.e. nothing implemented to fully implemented)? I’m not so sure developing like this is such a good idea. Polishing a feature just to have it removed/completely changed later is really frustrating (I can say from experience and probably many others can too). I’ve found that prioritizing development such that the things that get delivered are the things your “customers” need is much more efficient. Customer doesn’t have to be someone buying the full chip. I do a lot of block level verification and our “customers” are the integration team, who need some features sooner than others, but they don’t need them to be so mature that they work in absolutely all cases, as only a subset of those are relevant to verify integration aspects.
I’ll stand by feature-by-feature development – or maybe a small set of features at a time is more practical – because I think having everything half done is partly to blame for things building up to a coverage hangover in the first place. Polish as you go is the best way to focus and maintain productivity. If features get chucked, they get chucked; that’ll always be frustrating. As far as delivering to your integration team goes, it’s fine to break features down so that you can polish just what they need and do the rest later.
It becomes difficult to polish if the integration team moves faster than your team. Or if the back end team wants other features so that they can do area/power/etc. estimations.
A lot of the frustration in filling coverage is writing the tests that will hit the coverage holes. I’ve started thinking about what my coverage goals could be a lot earlier and building controllability into the verification environment so I can easily reach them. This makes writing pseudo-directed tests (that are more constrained than random) a blast. Such tests are ideal when you want to start off slow developing a new feature, as they make designers very productive.
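To give an idea of what I mean by “more constrained than random”, here’s a rough sketch; the class names, fields and constraints are just placeholders.

```systemverilog
// hypothetical base transaction the environment would normally randomize freely
class base_txn;
  rand bit [1:0] mode;
  rand bit [3:0] length;
  constraint c_legal { length inside {[1:12]}; }   // keep stimulus legal
endclass

// pseudo-directed variant: pin feature A down to mode m1, leave the rest random
class feature_a_mode_m1_txn extends base_txn;
  constraint c_pin { mode == 2'd1; }
endclass

module pseudo_directed_example;
  initial begin
    feature_a_mode_m1_txn t = new();
    repeat (10) begin
      if (!t.randomize()) $error("randomize failed");
      $display("mode=%0d length=%0d", t.mode, t.length);
    end
  end
endmodule
```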
They also make up a great smoke-testing suite which you can use as a gate for commits, either to the DUT or to the testbench. This way you’re always making incremental progress and avoid debug hell. You never get into the situation that basic stuff that was working yesterday is broken today. It’s also very easy to debug failures. If you have a couple of test fails for tests called “test_feature_a_in_mode_m1”, “test_feature_a_in_mode_m2”, etc. it gives you a pretty good idea of what you broke, even before you run a simulation.
At the same time, having pseudo-directed tests which you constrained so precisely that certain scenarios are guaranteed to happen is pretty advantageous from a verification planning perspective. It gives you a picture of what you tested and is working (kind of like what coverage does) without having to write any coverage. I’m not advocating doing away with coverage, please don’t get me wrong. The issue is that managers who have a superficial view of what you need from a planning perspective (see my earlier comment w.r.t. 90% coverage but broad) and who are in love with coverage will frown at you. I’m guessing this is because they think you’re advocating going back to the dark ages, when all tests were directed, had no variation and took a lot of time to develop.
The main point is you can get a lot done and get pretty good confidence even before writing/filling any coverage. I’d rather spend a bit more time at the beginning writing a few more constrained test variants for things I know I definitely want to see – especially if I want to see them in the smoke-testing suite – than start defining coverage too early, start filling it and notice that I have to write those tests anyway, since more randomness means less chance I’ll hit my coverage points. Going test first has a better chance of finding bugs earlier and keeps the design team occupied. Hopefully, by the time we start writing coverage, we’ve shaken out so many bugs that coverage is mostly just “paperwork”.
//I’d rather spend a bit more time at the beginning writing a few more constrained test variants for things I know I definitely want to see – especially if I want to see them in the smoke-testing suite – than start defining coverage too early//
Isn’t your constrained stimulus driven by your coverage plan?
If not, what is the reference and how do you define your constrained stimulus?
Having everything random and finding a legal subset among the crosses wouldn’t be ideal.
I would personally like to see a correlation between your testplan, coverage plan and stimulus.