If you’ve read Does Constrained Random Verification Really Work and Functional Verification Doesn’t Have to be a Sideshow, you’ll know that I’ve become a bit of a skeptic when it comes to constrained random. My opinion hasn’t changed much since those posts and I think I’ve got a couple of visuals that will help people see the point I was arguing in Functional Verification Doesn’t Have to be a Sideshow: that a successful constrained random verification effort starts with directed testing… a lot of directed testing.
To recap… here’s the graphic we’ve all seen more than once.
It was probably in the early 2000s when this was shown quite often in sales/marketing pitches and in technical papers. Every time someone referenced this graphic, it was used to note how much better constrained random verification was relative to directed testing. For just a little more effort up front, you get far faster and more rigorous coverage of your design state space later. I used to think this graphic was pretty accurate. Around 2001, when I started with Vera and started building constrained random testbenches, there was lots of new stuff to learn (which was great) and I was finding more bugs – more odd, corner case bugs – than I ever had before. Yes, it was costing more effort to build the testbench but the extra effort seemed worth it. Just as the graphic suggests.
Time has passed and now I think the blue – ideal – constrained random curve that we’ve all been shown numerous times is unattainable for most teams. It’s a great idea. It sells the technique, but I don’t think it’s at all realistic and I don’t think many teams reap the advertised benefits. In my opinion, what hardware teams see in reality is probably more like the dashed red line in this next graphic, with a ramp-up and productivity that both fall short of the ideal.
What makes the dashed red line more probable than the blue line? Poor code quality (the Quality Lag) and long debug cycles (the Debug Lag).
So you’re “done” your testbench (if you want to know what I mean by “done”, you can read about it here) and you’ve just written your first constrained random test. Your test constraints are fairly lax because you want to cover as much of the state space as possible as quickly as possible. You run the test with your fingers crossed.
The first issue you see is a time-out. Nuts! It takes a while, but you find the cause of the time-out, fix it up and run the test again. Another time-out. Bah!! Find it, fix it, run the test again. Null object access in your BFM. Forgot to initialize a transaction. Fix that, run it again. Still a null access, only this time it’s in your coverage object hanging off the BFM. Fix that, run it again. etc, etc, etc.
For constrained random tests, in order to make the steep progress immediately (the point at which you abruptly shoot northward from the initial development flatline), you need high quality code (aka: a bug-free testbench). The problem is, we don’t build bug-free testbenches. The development techniques we use, namely big upfront planning and coding everything all in one shot with little or no testing, produce bug-riddled testbenches. Just to clarify, bug-riddled testbenches are the opposite of what we need when we run that first random test. What usually ends up happening is what I’ve described above: several days of debugging silly 1st order issues like misconnects and null objects, several days or weeks of debugging 2nd order issues (that ‘+1’ should be a ‘+2’ except when x==10) and several days or weeks re-writing code due to 3rd order issues (oops… I didn’t know <it> worked like <that>). These are the 1st, 2nd and 3rd order issues I talk about in Functional Verification Doesn’t Have to be a Sideshow.
From the point you’re “done” coding your testbench, there’s always extra effort required to improve your code to the point it’s solid enough to make decent progress (as the blue curve suggests). Initially, the testbench quality is so poor that you’re just barely making progress. But over time and as you fix more bugs, quality improves and you’re able to make decent progress. Quality lag is the time between being “done” and reaching that point.
Here’s another scenario… you have a constrained random test failing and you don’t really know why. You’re confident it’s a design issue though so you file a bug report with the designer. The designer comes to you a day later…
designer: “can you tell me more about the bug you’re seeing?”
you: “not really. I know this instruction isn’t being handled properly but I’m not sure why.”
designer: “OK. Can you tell me what your test is doing?”
you: “well… it’s a random test. I know what instructions I’m sending in and I’m pretty sure the problem happens on the 27th instruction… but that’s about it.”
designer: “Can we build a test that sends in that instruction on its own?”
you: “The test I wrote with that instruction on its own passes. So I guess we can assume the problem in the random test has nothing to do with the actual instruction.”
designer: “Is there any other way to isolate what’s going on?”
you: “Not sure. I still don’t really know what’s going on.”
The other characteristic the constrained random curve depends on is tight debug cycles so you can keep the coverage momentum going. One problem though: constrained random is… well… random. It’s not always obvious what’s going on and you don’t always know where to look when something goes wrong. Basically, the problem space with constrained random testing can be immense and an immense problem space doesn’t facilitate tight debug cycles. The time required to navigate the problem space is debug lag.
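To make the size of that problem space concrete, here is a minimal Python sketch of the conversation above. Everything in it is an assumption made up for illustration: the four opcodes, the `dut_passes` stand-in for a simulation run, and the hypothetical bug (a MUL issued right after a LOAD is mishandled). It replays a failing random test from its seed, then greedily drops instructions until only the ones needed to reproduce the failure remain.

```python
import random

# Hypothetical instruction set for a toy design.
OPCODES = ["ADD", "LOAD", "MUL", "STORE"]


def dut_passes(program):
    """Toy stand-in for a simulation run: True means the test passed.

    Assumed bug: a MUL immediately after a LOAD is mishandled, so MUL on
    its own passes but some random sequences fail.
    """
    return not any(a == "LOAD" and b == "MUL"
                   for a, b in zip(program, program[1:]))


def random_program(seed, length=40):
    """Build a reproducible random instruction stream from a seed."""
    rng = random.Random(seed)
    return [rng.choice(OPCODES) for _ in range(length)]


def find_failing_seed(max_seeds=1000):
    """Run seeds in order until one produces a failing test."""
    for seed in range(max_seeds):
        if not dut_passes(random_program(seed)):
            return seed
    raise RuntimeError("no failing seed found")


def reduce_failure(program):
    """Drop one instruction at a time, keeping each drop that still fails.

    Repeats until no single instruction can be removed, which leaves a
    small directed test that reproduces the failure.
    """
    current = list(program)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if not dut_passes(candidate):
                current = candidate  # still fails without instruction i
                changed = True
            else:
                i += 1  # instruction i is needed to reproduce the failure
    return current


seed = find_failing_seed()
failing = random_program(seed)      # same seed, same stream, every time
minimal = reduce_failure(failing)
print(f"seed={seed} original_length={len(failing)} minimal={minimal}")
```

In this toy, a 40-instruction random failure shrinks to a two-instruction directed test that a designer can actually reason about. Real failures rarely reduce this cleanly, and every reduction step costs a simulation run, which is part of why debug lag adds up.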
But Constrained Random Is Better Than Directed Testing, Right?
Yes and no. I think they both have their strengths, which is why I present the hybrid approach of directed testing for 1st and 2nd order bugs with constrained random for the 3rd order bugs in Functional Verification Doesn’t Have to be a Sideshow. When constrained random verification is the goal, we don’t often see directed testing as the first step in achieving that goal… but it should be.
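As a rough illustration of the ordering only (not the strategy from that post), here is a small Python sketch in which directed tests gate the random run. The design, the reference model, and the names `dut_increment`, `reference_model` and `run_hybrid` are all hypothetical. The toy spec borrows the earlier example: add 2, except add 1 when x == 10.

```python
import random


def dut_increment(x):
    """Toy design under test (assumed spec): +2, except +1 when x == 10."""
    return x + 1 if x == 10 else x + 2


def reference_model(x):
    """Testbench predictor that matches the assumed spec."""
    return x + 1 if x == 10 else x + 2


# Hand-picked directed stimulus, including the special case and its neighbours.
DIRECTED_CASES = [0, 1, 9, 10, 11]


def run_directed(model):
    """Return the directed inputs where the model disagrees with the design."""
    return [x for x in DIRECTED_CASES if model(x) != dut_increment(x)]


def run_random(model, seed, count=100):
    """Return the random inputs where the model disagrees with the design."""
    rng = random.Random(seed)
    failures = []
    for _ in range(count):
        x = rng.randrange(0, 256)
        if model(x) != dut_increment(x):
            failures.append(x)
    return failures


def run_hybrid(model, seed=1):
    """Run directed tests first; start random testing only once they pass."""
    directed_failures = run_directed(model)
    if directed_failures:
        return {"stage": "directed", "failures": directed_failures}
    return {"stage": "random", "failures": run_random(model, seed)}


# A predictor that forgot the x == 10 exception (a 2nd order bug).
print(run_hybrid(lambda x: x + 2))
# A correct predictor gets through to the random stage.
print(run_hybrid(reference_model))
```

The buggy predictor is stopped at the directed stage with a single, obvious failing input (10) instead of surfacing somewhere inside a random run where it would be harder to pin down.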
This is what the hybrid approach looks like relative to the constrained random and directed curves.
If you want to read more about the mechanics and motivations behind it, I’d really suggest going back to Functional Verification Doesn’t Have to be a Sideshow to see the strategy I present near the end. It has ideas for when and why you use directed tests and how we overcome the quality and debug lags. If your team sinks a lot of time into debug, I think this approach can really help.
I think there’s huge value in this hybrid approach. It’s realistic, it acknowledges the strengths and weaknesses of directed and constrained random testing and it addresses 2 major issues that seem to plague all verification teams: quality lag and debug lag.
You shouldn’t have to wait for quality and you shouldn’t be wasting time with debug.
PS: quality lag and debug lag are issues I address in my Expediting Development With An Incremental Approach To Functional Verification lunch-n-learn. I do that talk remotely for teams that are interested. If you and your team are interested, let me know at firstname.lastname@example.org!