If you saw last week’s post you’ll know that I’m pondering a DVCon proposal. As it stands, my proposal centers around unit testing in hardware development. Trouble is, unit testing on its own – never mind TDD, which is where I really want to go – has been a tough sell for hardware developers. Lucky for me, I’ve recently found that if I show off unit testing within a context people actually care about (i.e. UVM), suddenly they take an interest. Odd how that happens :). So that’s where I am: an abstract that describes the value of unit testing as we’ve applied it to UVM. It’s called How UVM-1.1d Makes the Case for Unit Testing. The proposal is not a sure thing yet. If you’re on board with it, you can help make it happen over here.
Assuming people like the abstract, I’m looking for an ice breaker or two that, in addition to what we’ve done with UVM, helps convey the value of unit testing. That’s taken me back again to Mentor’s 2012 Wilson Research Group Functional Verification Study and a twist on the time-wasted-debugging graphic that I love so much. Here’s a snapshot with the graphic and Harry Foster’s analysis from Mentor’s website. I’ve shown the graphic so many times already. Let’s focus on the analysis this time, shall we?
For such a strong indicator, I think Harry is taking it easy on us; he probably doesn’t want to hurt anyone’s feelings! I’m going to offer a translation that’s a little more direct…
Let’s look at the mean time verification engineers spend wasting their time. Unless you’re blind, you’ll see verification engineers burn an incredible amount of time fixing defects that they and their colleagues create through a careless disregard for code quality. Ideally, if we realized that people aren’t perfectly molded futuristic efficiency drones, that they make mistakes and that unit testing code is a great way to catch these mistakes before they become a huge time-suck, we’d be free to find more entertaining ways to spend almost 3 hours/day, like windsurfing or smelling flowers or doing chin-ups. Yet, unfortunately, because we don’t generally unit test our code to eliminate defects in a focused and disciplined way (or even make the effort to keep track of where we inject these defects), the time required to find them can vary significantly from project to project – big surprise there – which means project managers, team leaders and executives are often left wondering what the hell could be taking us so long to finish up, considering we told them we were done writing the RTL and building our testbench several months ago.
There… the colour and analysis I think that graphic deserves. And depending on the interest I get, a nice opening for a DVCon paper describing the benefits of unit testing in hardware. If it sounds good to you, you could go read and vote for the abstract… or you could just vote for it here.
-neil
PS: As I’ve said before, reading through Mentor’s 2012 Wilson Research Group Functional Verification Study is time well spent. Lots of important little nuggets in there. This comes from part 6. This morning I see Harry has just posted part 9.
It is an interesting breakdown of time spent. However, it seems very open to interpretation.
It doesn’t actually show how much of this debug time is spent debugging RTL (notionally the entire point of verification) vs. time spent debugging verification code (the necessary evil that TDD and unit testing might tend to accelerate).
I assume from your perspective you are looking to minimize that ‘debug’ slice. While, from the commentary above the slide, it would appear that Harry thinks the goal is to maximize the size of that ‘debug’ slice.
Unfortunately, you can’t usefully read it both ways like that; in fact, either view is flawed, as it doesn’t break down RTL debugging vs. TB debugging time.
There is a lot of wasted debugging effort going on. There is a lot of useful debugging going on too, but this chart doesn’t give much of an indication of the ratio between the two.
Similarly, with unit testing a large, heavily reused library like the UVM – it certainly makes a lot of sense to expend effort testing all the corners and making it as robust as possible. It may be a less compelling argument for a one-time testbench, which after all is already the unit test infrastructure for the real product – the RTL. The UVM BCL is a ‘real product’ in its own right, too.
Is a.n.other testbench?
Gordon, thanks for the comment! You’re right that it is open to interpretation. I hope people take the time to form their own interpretation as opposed to sloughing it off. In my mind, the only interpretations that are incorrect would be ones that either a) suggest that high defect rates/wasted effort are reasonable; and/or b) assert that continuing on with the techniques we use currently will adequately address high defect rates/wasted effort. But that’s just me and my interpretation :).
For anyone else reading along, I’ll suggest debug time between RTL and testbench doesn’t matter. Both are a time-suck. As for where unit testing is applicable, I suggest it’s meant for code you want to work. That’s a blanket statement that applies to UVM (obviously) but also the one-time testbench. Tricky thing is that the compelling evidence comes from people trying it for themselves, not so much me telling people about it.
Thanks again!
-neil
And as a follow-on comment – I’m writing this based on my experience developing a unit test framework and unit tests for several Vera testbenches I’ve written in the past.
It was nice to have an additional framework for refactoring. It probably did enhance the quality of the testbench. It then becomes a question of ‘so what?’ unless it enhances the quality and decreases the total time to working, tested RTL. I’m not so convinced on that front.
You can also use the existing RTL as a framework for refactoring, if you have sufficient functionality working. That can also be used to prove the quality of the testbench (while actually moving towards a final working product).
Unit testing is typically about ‘the product’ and ‘the tests’. I just have a difficult time finding the value in ‘the product’, ‘the tests’ and ‘the tests for the tests’ that is being added in here. I think we need to consider what we are actually doing, rather than just saying ‘unit tests for verification are just like software unit tests’.
There’s a different relationship that needs to be properly considered and addressed.
Gordon, lots of questions in there… which is great :). Instead of trying to answer them all, I’ll take a step back to clarify my definition of unit and unit tests to see if that changes anything.
A unit for me is a piece of the puzzle (i.e. individual module/class/interface) as opposed to a collection of pieces (i.e. block/sub-system) or the entire puzzle (i.e. chip/large sub-system). With that definition of ‘unit’, unit tests would therefore be built around those individual pieces. If you’re building a testbench, you’d have tests around each/all of your bfms/sequences/models/checkers/etc, *not* tests applied to the collection of those pieces (i.e. your testbench). It’s the granularity that’s different. Not a lot of hardware developers are testing at the unit level.
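To make that granularity concrete, here’s a rough sketch of what a test around one of those pieces could look like with SVUnit. The packet_checker class and its crc_ok() function are made up for illustration, and the surrounding template (build/setup/teardown plus the SVTEST/FAIL macros) is the standard SVUnit boilerplate written out from memory, so treat the details as approximate:

```systemverilog
`include "svunit_defines.svh"

// unit test wrapped around ONE testbench component: a hypothetical packet_checker class
module packet_checker_unit_test;
  import svunit_pkg::svunit_testcase;

  string name = "packet_checker_ut";
  svunit_testcase svunit_ut;

  packet_checker uut;  // the unit under test: a single class, not the whole testbench

  function void build();
    svunit_ut = new(name);
    uut = new();
  endfunction

  task setup();
    svunit_ut.setup();
  endtask

  task teardown();
    svunit_ut.teardown();
  endtask

  `SVUNIT_TESTS_BEGIN

    // a payload with a corrupt CRC should be flagged
    `SVTEST(flags_bad_crc)
      `FAIL_IF(uut.crc_ok(8'hde, 8'h00))
    `SVTEST_END

    // a payload with the matching CRC should pass (8'ha7 assumed correct for this sketch)
    `SVTEST(passes_good_crc)
      `FAIL_UNLESS(uut.crc_ok(8'hde, 8'ha7))
    `SVTEST_END

  `SVUNIT_TESTS_END

endmodule
```

Each test pokes at one behaviour of one component; the same pattern would wrap a bfm, a sequence or a scoreboard rather than the assembled testbench.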
Do I find many sw people writing unit tests? Yes. I’ve taken *all* my cues from sw folks. Hard to generalize, but from what I gather, unit testing (and/or tdd… the superior approach) is widely used.
Not sure if that changes any of your other questions or not?? Yes/no?
-neil
You misunderstand my question about sw people writing unit tests.
Typically with SW unit testing there are two entities:
The code being developed (let’s call that the ‘application under test’ or AUT) and the test code (the ‘unit tests’ or UT).
So, for the typical software application of unit testing we have
UT => AUT
In hardware, we typically have the device under test (DUT) and the testbench (TB)
TB => DUT
In the software world, the AUT is the final product and in the verification world, the DUT is the final product.
Now, if we add unit tests to the TB, we have
UT => TB => DUT
I.e., the unit tests in the verification case are one step further removed from the actual product being developed.
I was asking whether, in the software case, you see people writing unit tests for their test code, which is what we’d be doing in the hardware case – writing tests that are further removed from the actual shipping product than I think most software people usually bother with.
Now, if your Verification IP is an actual shipping product, then unit tests for the VIP make a lot of sense, e.g., UT => UVM BCL or UT => VIP
Similarly for BFMs – does it make sense to write unit tests, or does it make sense to have a testing harness (e.g., a master and slave BFM test case where they actually talk to / stimulate each other)?
There’s a financial and time cost to writing unit tests. Does it make sense to invest a lot of time and effort writing test code for your test code, when neither of those pieces is the point of verification (working RTL being the actual goal)?
Got it.
Like you say, I think the common arrangement is UT and AUT. Typically unit tests are directed at specific features/details of the AUT and do not rely on the scaffolding/utilities that are typical in hardware. That’s the simplest case.

Also common, from what I understand, would be UT that rely on mocks/test doubles to model interactions/simplify test writing, which takes us a little closer to how we approach the problem in hardware. In those cases, I’d guess the mock would be tested as/prior to it being used in UT, OR the mock is created as part of a refactoring exercise where, after some time, duplicate/useful functionality is consolidated in the mock/test double and then you carry on. If the mock is built in isolation, I’d expect the same kinds of trade-offs there that we’d have to contemplate. Some people may not see the need. Others, especially if they are using tdd to build the mock, would very likely have unit tests for everything (this is how I approached the uvm report mocking in SVUnit even though it’s relatively simple).

Next level from there would be sw devs doing model based testing that looks more like what we’re used to, where your tests rely heavily on functionality provided by the model/harness. I don’t think you’d call this unit testing anymore, however. At this point you’d be talking acceptance testing (integration/product level testing) where the underlying model/harness would have already been unit tested.
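For anyone that hasn’t seen a test double before, here’s a minimal, entirely made-up sketch of the pattern in SystemVerilog. The bus_writer/reg_programmer classes are hypothetical (and this isn’t SVUnit’s uvm report mock, just the general idea): the mock stands in for a real dependency and records what the unit under test does to it.

```systemverilog
// hypothetical dependency the unit under test would normally drive
virtual class bus_writer;
  pure virtual function void write_word(int unsigned addr, int unsigned data);
endclass

// hand-rolled mock: records calls instead of touching a real bus
class mock_bus_writer extends bus_writer;
  int unsigned addrs[$];
  int unsigned datas[$];

  virtual function void write_word(int unsigned addr, int unsigned data);
    addrs.push_back(addr);
    datas.push_back(data);
  endfunction
endclass

// hypothetical unit under test: turns a configuration call into bus writes
class reg_programmer;
  bus_writer writer;

  function new(bus_writer writer);
    this.writer = writer;
  endfunction

  function void program_threshold(int unsigned value);
    writer.write_word(32'h0000_0010, value);  // made-up register offset
  endfunction
endclass
```

A unit test builds reg_programmer with a mock_bus_writer, calls program_threshold(), then checks the recorded address/data; no bus interface, no DUT and no simulation time required.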
Another thing to note… sw devs use the concepts of acceptance and unit tests. Unit tests apply to the minute details, acceptance tests to the product. In hardware we typically have chip/product level tests and block level tests. Acceptance tests in sw serve the same purpose as chip/product tests; sw unit tests and hardware block level tests, however, are different. I think the sw people have it right here. Very detailed unit tests are better suited to verifying functionality and intent. With our block level testing, I think we’re stuck in this limbo-like state where it’s not focused enough to be productive but also not high level enough to be considered acceptance tests.
Hopefully that’ll help the conversation and not make it more confusing.
Final point: you have people that use tdd to produce everything. That’s the only way they write code, so they do have unit tests for everything. Once you’re at tdd, perspective changes and ut is no longer a testing exercise; it’s a development exercise. When you talk about the financial cost of writing tests to these people, they talk about the financial cost of not writing unit tests… which is where I’m at, if you haven’t noticed! With a little practice, I’ve convinced myself that the cost of using tdd is less than the cost of not using tdd. But again, that’s something you can’t tell people. It’s something they have to see for themselves. People get stuck thinking more code equals longer development time. But I haven’t found that to be the case at all. May not be the same for everyone, but that’s where I’m at.
…so once I’ve got the entire hardware world converted to unit testing, next step is to convert the converts to tdd 🙂
-neil
The thing is, do you find many software people writing unit tests for their unit test code?
Writing test code for test code for RTL starts to feel a bit removed from the purpose of verification. If your ‘product’ is the verification code (UVM BCL, VIP etc.) then it makes a lot of sense. If your ‘product’ is working RTL, then you are already writing unit tests for the RTL – it’s called the testbench.
Now, maybe people could write more effective testbench code, but adding that scaffolding layer of unit tests to the testbench just means we have three sets of things to debug and get to agree, rather than the more typical two sets (test code and code under test), which might be unit tests and application, or testbench and RTL.
Unit tests, testbench and RTL make an interesting triple set to debug (plus the fourth, slightly more ephemeral ‘specification’ that we are also often testing at the same time).
I think Neil might have already said this in his discussion of unit tests and acceptance tests, but let me put it more simply: equating unit tests that software people write to the complicated constrained random testbenches that we (hardware verification people) write is a mistake.
In my experience software teams have simple, direct unit tests focused on testing small units of the code and they *also* have larger more complicated full-system tests. When those more complicated full-system tests strive to be as exhaustive as our testbenches, they are (surprise!) written by a separate QA or test team. It’s not unheard of for those QA/test teams to have unit tests for the code they are writing.
One other important point is how SVUnit tests compile and run in a matter of about 20 seconds or less, while any real SystemVerilog testbench including RTL, even a smaller module-level testbench, takes at least 3 or 4 minutes. Making a code change, running a sim to verify the code change, and repeating is much quicker with unit tests. They save a lot of time and they really are a development tool used right alongside your text editor, code browsing tools, and revision control (which all have costs associated with them as well, but we know the cost is worth it).
Hardware simulation and software really are the same thing.
I think ‘Hardware simulation and software really are the same thing.’ is a bit too easy. It sounds good, but I don’t think it is true.
The cost of shipping a ‘bad compile’ differs, for example – it is a bit easier to issue a patch for software than to respin a chip.
There is a whole world of techniques that can be applied from software to hardware design. I don’t doubt that. But what is interesting to me is how the constraints change in the hardware design case and how that makes things change.
Just making the assumption that ‘everything is the same and hardware people are dumb’ doesn’t really get to the bottom of what’s going on.
I can agree that my statement that hardware simulation and software really are the same thing is comparing a very specific task (simulating a digital design) to a very broad field (software). What I probably should say is that hardware simulation is a domain of software development, right along with other domains such as web development, embedded software, operating systems development, network programming, compiler development, etc., etc. Yes, they all have varying goals and constraints. What saddens me is that there are common tools and techniques that almost all software development projects of all types use that hardware simulation people seem to dismiss out of hand because, “we aren’t writing software.”
I did not say that hardware people are dumb. I would like to see more awareness and open-mindedness (and you, Gordon, appear to be one of the more aware and open people in the field from your blog posts and such, so nothing personal) of software tools and techniques that we should be evaluating.
I like this idea of thinking of hardware simulation as another domain of software development. We’re obviously different from other domains, though I think we suffer more than we benefit by believing we’re *that* different.
Cost of shipping a bad compile vs. respinning… there’s an argument of convenience for me there. Catastrophic failure in hardware is more difficult to recover from; ASICs that start on fire don’t make for useful products. That’s obvious. Hardware and software developers I talk to both understand that. The difference: when sw devs hear me suggest techniques for adding rigor to hardware dev, they say “Of course. That makes sense. The cost of failure is high so the extra *investment* is worth it.” When I suggest the same to hardware developers, on one hand we balk at the extra *cost* and say we can’t afford it, while at the same time we use the high cost of failure to justify complex solutions for delivering quality (i.e. xxM/constrained random/formal/etc), all of which are extremely expensive (time/effort/expertise) to deploy. The cost argument doesn’t make sense to me. If we didn’t test anything, then yes, it would make sense. But since we already sink so much money into testing, I don’t see why we aren’t looking for simpler/cheaper solutions.
Good discussion guys. Nice to have people opening up about this stuff.
-neil