For me, this is a very exciting post because I think I’ve made some pretty important headway regarding TDD for hardware designers.
My big side project as of late has been a real pilot dedicated to using TDD to write RTL. I’ve blogged about some of the things I’ve learned already but the big eye-opener that I haven’t talked about yet is how TDD helps us with design partitioning and testability.
The design I’m working on is a video processing addition to the Agile2014 demo I built with Soheil a few months ago. To recap: the demo shows how TDD can be used to write embedded applications, firmware and RTL. The good news is that we were successful on the application and firmware. Our RTL, however, was trivial so it failed the proof-of-concept litmus test miserably. This new video processing block I’m working on now is meant to change that.
I blogged about the design I started a couple months ago. Originally, my hand-drawn design sketch looked like this…
In the sketch, you’ll see some ingress logic receiving 1080×1920 HDMI video frames from a streaming AXI4 interface and writing them to a memory. In the middle, there’s a cloud of video processing logic that adds a glow around live cells in our Conway’s Game of Life application. Finally, there’s some egress logic that pulls modified frames out of memory and passes them down the line via another streaming AXI4.
When I started, I envisioned all this logic in a single module because it’s neither large nor super complicated. But 2 minutes in I found that in order to write focused unit tests, I’d need access to a lot of the internals which meant I’d either have to a) probe internals by accessing them hierarchically or b) add a lot of new outputs to pull out everything I was interested in.
I didn’t like the idea of probing internals because I’d be risking very tight coupling between my design and unit tests. Even minor changes to the design would also ripple through my test suite and turn into a potential maintenance nightmare. I also didn’t like the idea of creating new outputs for the signals I needed access to because it didn’t seem right to needlessly add outputs solely for the sake of testing. Finding neither choice acceptable, the idea of a single module design broke down immediately.
It was at this point, 2 minutes in, that I started thinking “what would this design look like if it were software?”. If it were software, I figured, I wouldn’t have a single function to do everything. Because there are 3 distinct responsibilities in this design (ingress, processing and egress), I’d also have 3 distinct functions. This line of thinking led to my first partitioning decision: instead of 1 module for everything, I’d have 3 separate modules each with a singular purpose.
By splitting 1 module into 3, the critical nets I required access to became outputs naturally and the interconnections between the 3 modules offered the visibility I needed to write focused unit tests. That was benefit number 1 (aka: the expected benefit). Benefit number 2 however (aka: the unexpected benefit) was that I had also isolated the ingress, processing and egress logic from each other. Coming back to the software analogy, I now had 3 separate functions and unfettered access to all the relevant input and output arguments such that I could build and test each independently.
Major (major) bonus.
I started with the ingress module, verifying that it properly wrote streaming AXI4 frames from the input through to the memory. Also that it would properly signal fill thresholds. I then tested the inverse function for the egress; that the egress module would properly pull frames from memory and send them out the streaming AXI4. When both were done, I put them together and wrote a simple acceptance test to verify they worked when connected. Worked like a charm (save for an issue with how the fill thresholds were created which was easy to clear up with my acceptance test). Referring back to my original design diagram, that acceptance test verified a step 1 milestone of basic memory/ingress/egress interaction.
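To make the software analogy concrete, here is a minimal Python sketch of that first increment. It is not the RTL or the actual test suite: `FrameMemory`, `ingress`, `egress`, the depth and the threshold value are all hypothetical stand-ins, and the streaming AXI4 interfaces are reduced to plain lists of pixels. The point is only to show two unit tests that each exercise one function at its boundary, followed by an acceptance test with both connected through the memory.

```python
from collections import deque


class FrameMemory:
    """Hypothetical stand-in for the shared frame memory."""

    def __init__(self, depth, threshold):
        self.depth = depth          # maximum number of stored pixels
        self.threshold = threshold  # fill level that raises the flag
        self.words = deque()

    @property
    def above_threshold(self):
        return len(self.words) >= self.threshold


def ingress(stream, memory):
    """Write every pixel from the input stream into memory."""
    for pixel in stream:
        if len(memory.words) >= memory.depth:
            raise OverflowError("memory full")
        memory.words.append(pixel)


def egress(memory, count):
    """Pull `count` pixels out of memory, oldest first."""
    return [memory.words.popleft() for _ in range(count)]


# Unit test 1: ingress alone, checked at the memory boundary.
mem = FrameMemory(depth=8, threshold=4)
ingress([10, 11, 12], mem)
assert list(mem.words) == [10, 11, 12]
assert not mem.above_threshold
ingress([13], mem)
assert mem.above_threshold

# Unit test 2: egress alone, with the memory preloaded by the test.
mem = FrameMemory(depth=8, threshold=4)
mem.words.extend([1, 2, 3])
assert egress(mem, 2) == [1, 2]

# Acceptance test: ingress and egress connected through the memory.
mem = FrameMemory(depth=8, threshold=4)
frame = [5, 6, 7, 8]
ingress(frame, mem)
assert egress(mem, len(frame)) == frame
```

Neither unit test reaches inside the function under test; everything it checks is visible at the boundary between the pieces, which is the visibility the 3-module split provides.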
My first increment was done. The big lesson learned: TDD helps us think of hardware logic as a series of software functions that can be isolated, then built and tested independently.
Step 2, and the addition of the video processing module, would confirm this lesson. A few tests in, I realized the video processing module that I thought performed 1 function actually performed 2. The first function pulls blocks of pixels from memory and organizes them. The 2nd function modifies pixels and writes them back to memory. I split the processing module into 2 and built and tested them independently.
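The same split can be sketched in Python as two functions. This is an illustration under assumptions, not the real algorithm: the 3×3 block size, the zero padding at the frame edges, the `LIVE`/`GLOW`/`DEAD` pixel encodings and the glow rule itself are all hypothetical. What matters is that the block-gathering function (proc) and the pixel-modifying function (calc) can each be tested without the other.

```python
LIVE, GLOW, DEAD = 2, 1, 0  # hypothetical pixel encodings


def fetch_block(frame, row, col):
    """proc: gather the 3x3 block centred on (row, col), padding with DEAD."""
    height, width = len(frame), len(frame[0])
    return [
        [
            frame[r][c] if 0 <= r < height and 0 <= c < width else DEAD
            for c in range(col - 1, col + 2)
        ]
        for r in range(row - 1, row + 2)
    ]


def calc_pixel(block):
    """calc: live cells stay live; a dead cell beside a live one glows."""
    if block[1][1] == LIVE:
        return LIVE
    has_live_neighbour = any(p == LIVE for line in block for p in line)
    return GLOW if has_live_neighbour else DEAD


# proc tested alone: only the organisation of pixels is checked.
frame = [[DEAD, DEAD, DEAD],
         [DEAD, LIVE, DEAD],
         [DEAD, DEAD, DEAD]]
assert fetch_block(frame, 0, 0) == [[DEAD, DEAD, DEAD],
                                    [DEAD, DEAD, DEAD],
                                    [DEAD, DEAD, LIVE]]

# calc tested alone: blocks are handed in directly, no frame needed.
assert calc_pixel([[DEAD] * 3, [DEAD, LIVE, DEAD], [DEAD] * 3]) == LIVE
assert calc_pixel([[DEAD] * 3, [DEAD] * 3, [DEAD, DEAD, LIVE]]) == GLOW
assert calc_pixel([[DEAD] * 3] * 3) == DEAD
```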
If you’re counting along, this means my original design of 1 untestable module is now 4 independent modules (ingress, egress, proc and calc), all of which are relatively small and highly testable. Here’s a sketch of the updated design diagram…
This idea of isolating hardware logic into single purpose modules in the same way that software developers partition a design between single purpose functions is the most important thing I’ve learned from this RTL proof-of-concept exercise. In my experience, most designers are pretty good at partitioning designs into logical chunks that make sense but I think focusing on testability of each chunk takes things a step further to a finer granularity than people are used to now. The immediate benefit, as I’ve seen, is high quality RTL that works as intended… even when a verification engineer is writing it :).
I think having more small modules, each serving a single purpose that can be isolated and tested independently, is absolutely the key to enabling TDD of hardware. After some initial hiccups and learning this key ingredient, the TDD cycle became very natural, just as it did when I started using it to build verification IP… even though it’s hardware. In terms of mechanics, each of the steps looks very similar to what’s done in software: write a test, watch the test fail, write the RTL, watch the test pass.
I’ve finally seen the TDD cycle in action on a non-trivial design… and it certainly appears to work very well.
-neil
Although I didn’t use TDD, this sounds similar to what happened when I retrospectively put tests onto a “simple” SSR output manager in VHDL. The tests rapidly got more complicated, so I had to break it down into 3 parts, each an easy section with easy tests. Keep up the good work!
The problem with this approach is the relative hassle of writing a “test” (aka testbench) for RTL compared to software. A software function has a set of inputs and a set of outputs which take values determined by the test being written. To test a piece of RTL, however small, you have to control and check the temporal relationship between input and output events on multiple signals, which is inevitably a lot more difficult. If you go for this ultra low level approach you have a massive proliferation of testbenches. How long did each “test” take you to write?
Also there is a whole large class of bugs related to the interaction between the very small parts that have already been tested. How did you catch those?
I think the pragmatic approach is to write testbenches at various levels, considering the nature of the DUT. I typically approach this from both ends: Low level testbenches for only those sub-blocks with difficult control logic or backpressure schemes, and a high level testbench to test the interaction of the whole system.
jon, I had similar concerns when I started this. I knew it was doable, but I was skeptical wrt tdd being *productive* considering the pin level interface and timing. I thought the pins would require a lot of support logic to manage interaction/capture response. I thought the timing would make test writing awkward (relative to a software test). on top of that I thought the lower level of detail would turn into a maintenance nightmare. had any of these been the case, I’d have changed the name of this post to “Now I Know For Sure That TDD Doesn’t Work For RTL”… or not blogged about it at all. but my concerns didn’t turn into the worst case scenario I was afraid of. I wrote simple accessor tasks to manage the pins (imagine the absolute simplest bfms possible). I also am able to manage the timing pretty well through general purpose synchronization tasks and partitioning. maintenance has also not been an issue (but only b/c I’m constantly refactoring for readability and to keep the code clean).
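as a rough illustration of what I mean by accessor tasks, here is a Python sketch rather than the actual SystemVerilog. everything in it is hypothetical: `IngressModel` is a toy cycle-based stand-in for a module with a valid/data input, and `step`, `send_word` and `idle` play the role of the accessor and synchronization tasks. the test at the bottom never touches a pin or counts a clock edge directly.

```python
class IngressModel:
    """Hypothetical cycle-based stand-in for a small module with pins."""

    def __init__(self):
        self.valid = 0    # input pin
        self.data = 0     # input pin
        self.count = 0    # output pin: words accepted so far
        self.last = None  # output pin: most recently accepted word

    def clock(self):
        """Advance one clock edge: sample the input pins."""
        if self.valid:
            self.last = self.data
            self.count += 1


# Synchronization helper: advance the clock by a number of cycles.
def step(dut, cycles=1):
    for _ in range(cycles):
        dut.clock()


# Accessor helpers: the simplest possible bus functional models.
def send_word(dut, word):
    dut.valid, dut.data = 1, word
    step(dut)
    dut.valid = 0


def idle(dut, cycles):
    dut.valid = 0
    step(dut, cycles)


# A test written in terms of the helpers reads like a software test.
dut = IngressModel()
send_word(dut, 0xAB)
idle(dut, 3)
send_word(dut, 0xCD)
assert dut.count == 2
assert dut.last == 0xCD
```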
In terms of time to write tests… last night I started testing a new module. I built the test harness, implemented some control logic in the design and wrote 12 passing tests (with required accessor tasks) in 90min. the tests run in less than 5sec and I ran them about 30 times in that 90min. I know for sure that I caught 2 bugs that would have otherwise slipped through (these were scenarios that I didn’t think of at first but came back to with tests I didn’t initially plan). all this was only possible b/c I had svunit (i.e. the unit test framework). without a test framework this would obviously have been a different story.
beyond the unit tests, I did mention that I’m writing acceptance tests as well against everything connected and that I found a flow control bug in one acceptance test. I’m not ignoring the fact that integration bugs can exist but acknowledging that it’s not tdd that’ll find those.
as you point out, there are other ways to skin this cat. but I think it’s important people realize tdd is 1 way that can be very productive with a little practice.
-neil
Great approach, but lots of simulation micro-testbenches are a large overhead. You really cannot get around that.
However, there is another way. Here’s the way I do TDD in ASIC design:
Write a (SystemVerilog) module (or VHDL entity/arch), and write “assertion properties” for that module. Name these sensibly so that you can constrain undriven external signals to conform with the assertions by changing them to “assumes”, e.g. for a master/slave interface, call the assertions master_*, slave_*. Then run your favourite formal proof checker on the module: with this you turn the required asserts into assumes with one line of Tcl, and bingo, that’s your testbench.
And all these asserts are valid and useful in higher-level simulation and formal proofs. Nothing is wasted. And of course, being formally proven, they provide far more comprehensive verification than simulation.
So in summary, in my experience, using assertions and formal proof checking tools as your TDD “testbench” is the fastest and most effective development process. AND nothing is wasted when it comes to high level verification done by the separate verification-by-simulation team when they catch up.
jon, not sure I agree with your “micro-testbenches are a large overhead” comment b/c I’m not really seeing that. nonetheless, your formal proof checking is a good direction to go as well. I wish we could do a comparison when it comes to the overhead of building the supporting simulation infrastructure vs. the overhead of debugging sva assertions. the sva constructs are pretty gross so I’d imagine the “wasted” effort would be comparable. good point about portability for properties. I think that’s a benefit you won’t get from unit tests.
In the text you say:
“I didn’t like the idea of probing internals because I’d be risking very tight coupling between my design and unit tests. Even minor changes to the design would also ripple through my test suite and turn into a potential maintenance nightmare. I also didn’t like the idea of creating new outputs for the signals I needed access to because it didn’t seem right to needlessly add outputs solely for the sake of testing.”
I don’t like it either. However, I found that it is the only choice when designing an FSM with TDD… or do you know of any alternative? It is also useful for testing corner cases, isn’t it?
Great work!