The First Step is Acceptance (Hardware Verification is Broken)

A couple of weeks ago, I had the chance to do a lunch-n-learn seminar for about 20 verification engineers in Mountain View. It was an hour-long talk about an incremental approach to functional verification, one that I’ve given about half a dozen times to various teams.

I like giving this talk because I find people on the whole are surprisingly receptive to the ideas. There’s also been some skepticism, though, that I’ve seen more than once. Seeing as how it’s still fresh in my mind, I figured now would be a good time to pass some of this skepticism on as food-for-thought.

“These are interesting ideas, but this isn’t how we do it.”

That’s a common response to many of the ideas in the talks I give on agile hardware development and functional verification. Not surprising since much of what I talk about is quite different from what people are used to. After someone points out that this isn’t how we do it, there’s usually a follow-up description of what their design process looks like: big up front design (BUFD), documentation first, design-then-test, solid deadlines, etc.

Because I’ve heard it more than once, this time around I added slides to the talk to specifically address the “this isn’t how we do it” comment. The first has data from Mentor Graphics’ verification survey that shows 36% of a verification engineer’s time wasted on debug. The second is a slide from the hardware planning survey I did with Catherine Louis last year that shows about 85% of hardware teams releasing behind schedule. Using those two references specifically, my response to the “this isn’t how we do it” people is this: we waste a lot of effort debugging issues we create ourselves and we regularly release late, so don’t assume what you’re doing now is right. When organizations mature to the point of being able to accept that, there’ll be room for new ideas (hopefully like those I propose in my talk) and drastic improvement. Until then, organizations will continue to optimize what I see as a broken approach.

“Are you suggesting more directed tests? That’s an inefficient step backward.”

This is a functional verification talk and a big part of what I suggest is more directed testing before constrained random tests (not lightly constrained, entirely directed… as in stimulus directed at one specific condition with no randomization). The problem with this is that some people still believe in the great mythical constrained random test and intelligent testbench. They see constrained random tests as efficient and directed tests as inefficient.

Now the reason I suggest more directed tests is that I think the mythical constrained random test – the one where rapid progress is immediate and sustained until it tapers off to 100% feature coverage – isn’t something that actually exists. What I have seen is that once a person’s testbench is done, they use that all-powerful testbench to take wee little baby steps through their feature space. Real progress is gradual because defect rates are far too high to achieve the kind of rapid progress the old constrained random vs. directed testing graphic suggests. In effect, the technique that promises efficiency doesn’t end up delivering it.

Last week, for the first time, I found myself calling what we do in hardware verification “the misapplication of constrained random verification”. Currently my opinion is that there’s nothing necessarily wrong with the technique itself; the problem is when we use it. You don’t need fancy stimulus to find issues in defect-ridden code. Defect-ridden code is what we start with, which is why directed testing is the right technique early on. When our code is solid and we’re ready to look for the unknown, that’s when constrained random may be more useful.

So yes, directed tests are a step backward… but they’re a step backward that makes us more efficient, not less.
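To make the distinction concrete, here is a minimal sketch in Python standing in for a real testbench. Everything in it is hypothetical and chosen only for illustration: the 8-bit saturating adder, the `buggy_add` design that wraps instead of saturating, and the two test functions. It is not taken from the talk.

```python
import random

MAX = 255  # 8-bit unsigned ceiling (illustrative assumption)

def reference_add(a, b):
    """Reference model: 8-bit add that saturates at MAX."""
    return min(a + b, MAX)

def buggy_add(a, b):
    """Hypothetical design under test: wraps on overflow instead of saturating."""
    return (a + b) & MAX

def directed_overflow_test(dut):
    """Directed test: one stimulus aimed at one condition, no randomization."""
    return dut(MAX, 1) == MAX

def constrained_random_test(dut, trials, seed):
    """Constrained random test: operands drawn from the legal 8-bit range."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        a, b = rng.randint(0, MAX), rng.randint(0, MAX)
        if dut(a, b) != reference_add(a, b):
            mismatches.append((a, b))
    return mismatches

directed_passed = directed_overflow_test(buggy_add)
random_mismatches = constrained_random_test(buggy_add, trials=100, seed=1)
print("directed overflow test passed:", directed_passed)
print("constrained random mismatches:", len(random_mismatches))
```

In this toy case the directed test fails on its single stimulus and the failure names the condition being tested. Every mismatch the random run reports traces back to the same overflow defect, and each one still has to be looked at before that becomes clear.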

“Test our testbench code? Whoa, we don’t do that.”

Testing testbench code is the easiest idea in this talk to dismiss but it’s also the easiest to justify. It’s rare that organizations invest in the testing of testbench code so the change in approach is seen as extra work that they can’t afford. Problem is, these same organizations are already investing in fixing their testbenches as part of a death-by-a-1000-cuts debt repayment arrangement where people spend a few hours a day, every day, debugging their code.
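For a sense of scale, testing a testbench component can be as small as the sketch below. It is written in Python as a stand-in for whatever language your testbench uses, and the `Scoreboard` class and its three tests are hypothetical, invented here for illustration.

```python
from collections import deque

class Scoreboard:
    """Hypothetical testbench component: checks observed items against expected ones, in order."""

    def __init__(self):
        self.expected = deque()
        self.errors = []

    def expect(self, item):
        self.expected.append(item)

    def observe(self, item):
        if not self.expected:
            self.errors.append(f"unexpected item {item!r}")
            return
        want = self.expected.popleft()
        if item != want:
            self.errors.append(f"expected {want!r}, got {item!r}")

    def finish(self):
        """Return True only if nothing mismatched and nothing is left over."""
        if self.expected:
            self.errors.append(f"{len(self.expected)} expected item(s) never observed")
        return not self.errors

def test_matching_traffic_passes():
    sb = Scoreboard()
    sb.expect(0xAB)
    sb.observe(0xAB)
    assert sb.finish()

def test_mismatch_is_reported():
    sb = Scoreboard()
    sb.expect(0xAB)
    sb.observe(0xCD)
    assert not sb.finish()

def test_missing_output_is_reported():
    sb = Scoreboard()
    sb.expect(0xAB)
    assert not sb.finish()

for test in (test_matching_traffic_passes, test_mismatch_is_reported, test_missing_output_is_reported):
    test()
print("scoreboard tests passed")
```

The third test is the kind that tends to matter: a scoreboard that silently passes when the design never produces output is a testbench defect that no amount of design debug will reveal.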

If you’re in an organization that doesn’t see the value in testing your testbench code, here are a few things to consider that’ll hopefully change your mind…

Your testbench is the benchmark against which the quality of your design is measured. If your testbench is poor quality (unverified), how do you think that translates to the quality of your design?

Designers are smart people and yet designers produce defects – lots of them – which is why we verify designs. We verification engineers are pretty smart, too. We also produce our fair share of defects. Yet for some reason we don’t see the need for verifying testbench code. That doesn’t make sense. We verify designs because we want defect free designs. We should be verifying testbench code for the same reason.

If neither of those changes your mind, consider the numbers I proposed in this EETimes article. You’ll see that a third of a verification engineer’s time multiplied by the size of your verification team multiplied again by over a year of development quickly adds up to hundreds of thousands of dollars. We’re already sinking huge amounts of money into poor code, to the point where an ounce or two of prevention shouldn’t really be up for debate.
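The multiplication above is easy to run with your own numbers. In this sketch only the one-third debug fraction comes from the text; the team size, the loaded cost per engineer, and the 15-month schedule are placeholder assumptions, so swap in your own.

```python
def debug_cost(team_size, loaded_cost_per_year, debug_fraction, years):
    """Money spent on debug across a team over the length of a project."""
    return team_size * loaded_cost_per_year * debug_fraction * years

# Team size, cost per engineer, and duration are illustrative assumptions,
# not survey data. The one-third debug fraction is the figure from the text.
cost = debug_cost(
    team_size=5,
    loaded_cost_per_year=150_000,
    debug_fraction=1 / 3,
    years=1.25,
)
print(f"Estimated debug cost: ${cost:,.0f}")
```

With these placeholder inputs the estimate comes to about $312,500 for a five-person team, which is the “hundreds of thousands of dollars” range even before the team gets large.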

There are other comments that come up, but those are the big ones. Hopefully they’ve been enough to convince you we can do better in hardware development… or at least get you started with a little skepticism!

-neil

PS: if you’re interested in critiquing this incremental approach to functional verification first hand, let me know at neil.johnson@agilesoc.com.

