Neil Johnson – Principal Consultant, XtremeEDA
The functional boundaries we impose on ourselves in IC development are a detriment to quality and productivity. This article challenges the traditional structure and roles found in most IC development teams as well as the restrictions that many teams place on who writes and maintains code.
IC development is an energetic, dynamic and highly competitive industry with ever-evolving technical standards and practices. Like many industries, however, IC development teams, despite their high-pressure, fast-paced environments, hold on to traditions that are hard to abandon, even with evidence that those traditions may be holding us back.
Lessons From An Outsider’s View Of IC Development
Why do we organize IC development teams the way we do: design teams, modeling teams, verification teams, software teams, physical implementation teams and others? Is this type of arrangement optimal or is it holding us back? How does our view of an IC development team differ from the view of our customers and what can we learn from these different views? Could we even learn something from the perspective of an end user?
My father, Dan Johnson, would qualify as your average end user. Dan is a small-town doctor in his early sixties, fairly intelligent, and generally very patient and relaxed. Dan is a gadget guy; he loves his Mac Pro at home, MacBook at the cottage and iPhone in his holster.
Despite his love for gadgets, however, Dan hardly qualifies as a power user. In fact, his patience and lack of technical savvy–not to mention an occasional absent-mindedness that even he admits to–give him a rare and uncanny skill that I am just starting to appreciate. This is the type of skill that makes him the perfect test subject for IT organizations looking to root out unqualified new hires. A few hours on a support call with my dad and any person without the will for IT support would easily crumble and run for the exit.
IT Support: “How can I help you, sir?”
My Dad: “My computer doesn’t work”.
IT Support: “Sir, is your computer plugged in?”
My Dad: “Yes… it is definitely plugged in”.
[hours pass]
IT Support: “Sir, are you sure your computer is plugged in?”
My Dad: “I’m pretty sure. What do you mean by plugged in?”
IT Support: sigh
This is similar to a conversation that comes up every time I describe my job as a functional verification engineer to my dad. Not surprisingly, it can be difficult to get the point across.
My Dad: “So explain this to me again… you don’t actually design the chips do you?”
Me: “Not really. I just make sure they work.”
My Dad: “So you have to use one of them circuit board thingies then to make sure the thing works?”
Me: “Not really. The thing doesn’t actually exist yet. We just simulate with a computer program before it’s built so it’s not a huge waste of money.”
My Dad: “What do you mean by simulate? If it doesn’t exist yet, how do you actually test to see that it works? Do you write the computer program? Are you a software guy like at Microsoft?”
Me: “Not really.”
My Dad: “I don’t get it.”
Me: sigh
It would be easy to blame my dad for not understanding my watered down description of functional verification but the truth is that conversation is my fault, not his. My dad represents the end user. To him there are only desktop computers, laptops, internet, iPhones, televisions, email, word processors and satellite dishes. Unfortunately, functional verification is something that does not actually mean much on its own; it is effectively intangible or at least difficult to quantify to people outside IC development.
By describing a day in the life of a functional verification engineer, I am giving my dad way too much information that he will never understand. However, if I were to simply tell him that I was an IC developer–and maybe crack open his computer to point out the processor–that would probably be enough to get a reasonable picture. He knows what his computer can do and he can see that I have helped deliver a small part of it.
Though our immediate customers are no doubt more up to date with IC development practices than my dad, what they care about is not actually much different than what my dad cares about. We like to see IC development teams as a system architect with the ideas, a design team with the RTL, a verification team with the testbench, a software team with the drivers or application and the physical design team working on the implementation. For our customers, however, it is usually much simpler: they give us money; we–the team–deliver a chip.
So why the emphasis on different roles? Many teams strive for an organizational structure that they feel gives them a defined and repeatable development process; it is convenient to define a process based on roles instead of people. Roles and skill-sets appear interchangeable, so a process defined around roles is often seen as optimal.
Of course, past experience, personality, project circumstances, stress, employee turnover, leadership and other highly dynamic factors very often make it obvious that people are not interchangeable. Add to that the fact that IC development is unarguably a process that takes a certain amount of creativity that cannot be planned in advance, and the defined process that many organizations strive for ends up being practically unattainable.
Rather than blindly continue the tradition of labeling people, teams should strive for more general roles of developer or contributor and emphasize the aspects of teamwork, collaboration and shared responsibility. We should be resisting the urge to structure a team as a bunch of individuals simply satisfying the requirements of their role. Instead, we can build teams that collectively take responsibility for delivering a chip. Each person would have their area of expertise, obviously, but the traditional restrictions applied to RTL designers, verification engineers, software developers, physical implementation engineers and others would cease to exist. Operating outside one’s area of expertise would not only be allowed, but encouraged if doing so helped the team meet their development goals. Simply put, if an individual is in a position to deliver quality to a customer or end user, they do so regardless of what that entails.
On most teams, RTL designers are responsible for writing RTL while functional verification engineers are tasked with environment development and test writing.
While test writing may at times be a shared responsibility, RTL implementation and environment development are very rarely shared. This arrangement is not entirely surprising because the two areas do require altogether different expertise and mindsets. Let us not forget also that we are constantly reminded of the eleventh commandment: “thou shalt not test your own code”. This is a rule that is certainly valid but can be mistakenly preached as “verification engineers shall not touch my RTL” or “designers shall not change my environment”. The strict and at times heavily enforced ownership of each domain does little to foster teamwork and collective responsibility.
Contrary to conventional IC development practices, extreme programming (XP) takes a different approach to code ownership. In XP, code is not owned by an individual, it is owned by the team; the entire team. Anyone may edit anyone else’s code at any time. If bugs need fixing or enhancements are necessary, anyone can do it. This is one of the primary practices in XP known as shared code (Beck 2004).
To many IC developers, this type of arrangement would seem chaotic and counterproductive. To some, it may even be an invasive violation of privacy.
There are significant and undeniable benefits to shared code, however, when it is practiced with the required prudence and respect. First, and most obvious, is that there are more eyes on the code, which translates to a level of rigor not present for code with a single owner. Second, a shared code base has people gaining a broader understanding of the implementation, thereby turning knowledge transfer into an inherent part of development. Lastly, people are less likely to become bottlenecks during development or debug because development and debug responsibilities are distributed across the team.
There are times where a shared code base and fewer restrictions on who modifies what code can bring serious advantages in IC development.
During RTL debug, for example, there are handoffs of responsibility that typically occur between a verification engineer and an RTL designer. Normally, the verification engineer is responsible for writing a test and identifying a bug. From there, responsibility passes to the RTL designer to implement a solution. The buck then passes back to the verification engineer to validate the solution. This passing of responsibility can be painfully disconnected or neat and efficient depending entirely on how individuals define their responsibilities within the team.
Individual Ownership
In the least productive case, neither the verification engineer nor the RTL designer is able or willing to assume responsibility beyond their own domain. The verification engineer may notify the RTL designer of a failing test and supply test context. The designer is left to fully diagnose the failing test and, with considerable effort, either implements a solution or notifies the verification engineer that there is no RTL bug, leaving the verifier to debug and fix the test. Communication in this case may happen through email or a bug tracking system, making the debug cycle somewhat long and cumbersome.
Overlapping Responsibility
In the average case, the verification engineer and RTL designer are willing to share observations beyond their expertise. Along with the failing test and context, the verification engineer may point to related areas in the RTL where the potential problem exists. The extra guidance expedites the first handoff, where it is assumed that the time to a complete diagnosis of the failure is minimized relative to the individual ownership example. The designer either implements a solution or returns similar guidance to the verification engineer as to why the test may be pushing the design into an illegal state. Communication is still likely to happen through email or a bug tracking system, though it may be complemented by face-to-face consultation.
Shared Responsibility
Where verifier and designer feel a shared responsibility for the code, both are open to root cause analysis and problem solving beyond their domain of expertise.
In an improvement to the overlapping responsibility scenario, the verifier does not stop at identifying a potential bug in the RTL; he or she also suggests a solution. The obvious advantage to suggesting a solution is that the designer is given information that enables a very fast turnaround between identifying a bug and fixing it. The less obvious but potentially greater benefit is that in analyzing the code and formulating a solution, the verifier builds a white-box understanding of the design. Ideally, that white-box understanding translates to a more comprehensive exploration of the state space and more rigorous testing.
In the case where the verifier has done enough in-depth analysis to find a potential solution to an RTL bug, there should be an increased confidence that the stimulus provided by the environment is in fact valid and an RTL bug is indeed present. If there is an oversight by the verifier, however, the designer may reciprocate by identifying a verification bug with a possible solution. Advantages are present here as well, where suggesting a possible solution hastens the debug cycle and the designer gains a similar white-box view of the verification environment.
Engraining a sense of shared responsibility between design and verification experts would no doubt create cohesive development teams and improve debug cycles and test rigor immensely. Having a true, team-wide responsibility for all code that ends up in the hands of a customer is a noble goal, but teams can do better.
Note that while the time to make the transition is assumed to improve with a greater sense of shared responsibility, the debug cycle in all three scenarios thus far still goes through the same handoffs–from verifier to designer and back to verifier. Even with the best intentions, the handoff itself may be problematic, leading to potential confusion and delay. Removing a handoff, therefore, would be one way to improve further on the shared responsibility scenario.
If the verifier has gone to the effort of identifying a fix for an RTL bug, why not just implement the fix? If the designer finds a bug in the verification environment, why not correct it? Both of these examples change the debug cycle in a fundamental way by combining the discovery and solution into one step. The second step then becomes confirmation, either by the person who originally wrote the code or by someone else qualified to validate the solution.
Of course, if the solution is not adequate, the original arrangement would still apply: a new solution would be applied and validation would be the responsibility of the person that found the bug.
Pair programming reaches beyond the scope of this article, but it is worth mentioning here as an eventual complement to shared code. Pair programming is a technique where two developers design and write code as a team, typically at a single terminal where one writes code and the other observes and analyzes the code as it is written. Close cooperation ensures defects are found at the earliest time possible: as they are coded. The overall pace at which code is written would likely be lower with pair programming, but proponents of pair programming–of which there are many in the world of software development–claim that productivity is greater when it comes to defect-free code.
While pair programming in the software world is normally associated with initial development, it is still applicable within the context of this discussion. Productivity may be greatest when designer and verifier, for example, work side-by-side through the diagnosis, solution and validation as opposed to individually.
Pair programming is true code sharing and cooperation. For more on pair programming see the reference material at the end of this article.
The shared code scenario from the previous section is one that many teams are not at all comfortable with. At the root of this discomfort, perhaps, is a genuine feeling that others are not able to understand or maintain the code we write. Further, for some there may be a deep-seated misconception that they actually own the code they write. Both could not be further from the truth. We certainly do not own the code we write; our code is the property of our company, clients and/or customers. And on teams with even mediocre talent, there will absolutely be people who can understand and maintain code that they themselves have not written.
Even for the most optimistic of IC developers, it should be obvious that instituting a shared code free-for-all is certain to end in failure. An appropriate rollout plan is essential. If an IC development team is going to practice shared code, ground rules must be put in place to see that it is done constructively.
Mutual Respect
There must be mutual respect between functional experts, and any code that is modified must be modified in a respectful manner. Harshly or passive-aggressively criticizing others or their coding style is certain to undermine any kind of shared responsibility. Competition for individual recognition is also sure to be detrimental to the team’s success.
The golden rule of “do unto others as you would have them do unto you” should apply to code as well as people. Part of building mutual respect is leaving code in a better state than you found it.
Training
There are many different areas of expertise on an IC development team. In order for individuals to cross the functional boundaries that exist in most present day IC development teams, cross-functional training may be necessary.
Experience and Mentoring
For more experienced individuals, teams may consider delegating tasks beyond their domain in order to build real world experience and insight. To help build cross-functional expertise, consider temporary assignments where, for example, an RTL designer would design and code a simple testbench component, a verification engineer would build software drivers, or individuals would take on other work that builds confidence and expertise outside their comfort zone.
Code reviews and mentoring arrangements are another way to build cross-functional expertise. Several options exist here as well, like having a verification engineer demonstrate the test environment to a software developer or having physical implementation engineers explain tools and scripting techniques to middle managers. Shadowing and pair programming are other great ways to share and build cross-functional expertise.
Inline Documentation
Everyone agrees that adding comments to code helps readability. For code with more than one owner, however, just making code readable is inadequate; comments should be used to convey reasoning and understanding. Comments should be treated as inline documentation that captures important assumptions, critical timing or synchronization details, assumed protocol rules and violations, warnings about problematic corner cases, etc.
Inline documentation should be encouraged to make the code more understandable. As opposed to sketching notes in a log book or drawing timing diagrams on a whiteboard, assume that others will find your thought process helpful and embed such notes and diagrams in the code. Initial such additions so others can easily identify and contact you for further discussion.
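To make this concrete, here is a minimal sketch of the kind of inline documentation meant here. The module, signal names and interface rules are invented for illustration; they are not taken from any particular design.

    module ready_ctrl (
      input  logic clk,
      input  logic rst_n,
      input  logic buffer_full,
      input  logic flush_active,
      output logic ready
    );
      // ASSUMPTION (NJ): the upstream agent never asserts 'valid' in the same
      //   cycle as 'flush_active'; this logic does not handle that case.
      // TIMING (NJ): 'ready' is registered, so the producer must account for
      //   one cycle of latency on back-to-back transfers.
      always_ff @(posedge clk or negedge rst_n) begin
        if (!rst_n)
          ready <= 1'b0;
        else
          // Corner case: a flush while the buffer is full must not re-open the
          // pipe early; hold 'ready' low until the flush completes.
          ready <= !buffer_full && !flush_active;
      end
    endmodule

Notes like these are exactly what a second owner needs before touching the code, and the initials make it obvious who to consult.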
To better leverage the effort dedicated to inline documentation, tools like Doxygen and NaturalDocs can be used to extract documentation for teams that prefer a more formal approach to documentation.
One thing to avoid whenever possible, however, is committing inline documentation and functional changes in the same changelist. Inline documentation and other cosmetic changes (e.g. removing tabs, fixing indentation) can overwhelm changelist descriptions, making it difficult to pick out functional changes in comparisons with other file revisions.
Test Driven Development
When modifying shared code, a person must ensure that there is a corresponding test to validate it. If a test does not exist, consider creating it before modifying code. This approach of building tests before changing code–or even writing code in the first place–is called test driven development (TDD). TDD is another primary practice of XP teams (Beck 2004). It is widely accepted as a best practice for keeping bugs from entering the code–as opposed to discovering bugs with testing after they have been committed to mainstream development (Poppendieck M., Poppendieck T. 2006). It is also considered a valuable design technique where, through the creation of the test, valuable insight is gained into the environment in which the code is expected to function. In particular, scenarios that push the design to the edges of the state space are thought to be easier to visualize through the creation of the test as opposed to the implementation of the design (Poppendieck M., Poppendieck T. 2006).
TDD can be applied in the construction of the verification environment (Morris, Saxe, 2009) and development of RTL.
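As a rough sketch of what test-first can look like for a small piece of RTL, the test below could be written before the design exists. The module name parity_gen and its ports are assumptions made for the example; an svunit-based test (Morris, Saxe, 2009) would serve the same purpose in a more structured way.

    // Written first: this bench will not even elaborate until parity_gen exists,
    // and it fails functionally until the implementation is correct.
    module parity_gen_test;
      logic [7:0] data;
      logic       parity;
      logic [7:0] vectors [4] = '{8'h00, 8'hFF, 8'h01, 8'hA5};

      // Device under test; its port list is being defined here, by the test.
      parity_gen dut (.data(data), .parity(parity));

      initial begin
        foreach (vectors[i]) begin
          data = vectors[i];
          #1; // allow the combinational output to settle
          // Expected behaviour: parity is the XOR reduction of the data bits
          if (parity !== ^data)
            $error("data=%0h: expected parity=%0b, got %0b", data, ^data, parity);
        end
        $display("parity_gen_test complete");
        $finish;
      end
    endmodule

Writing the checks first forces the corner cases (all zeros, all ones) to be considered before a single line of the design is coded.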
Cautious Optimization
Optimize code in a way that does not compromise readability and understanding. Always remember that the most concisely written code is not necessarily the most easily understood.
- Tackle IC development as a series of small problems as opposed to a single large problem
- As a team, agree on descriptive naming conventions
- Avoid long or complex expressions
- Use intermediate states, properties and signals instead of expressions or bit slices (see the sketch following this list)
- Avoid multi-line macros
- Use dedicated properties and nets to capture status, interrupts and error conditions
- Avoid over-generalizing code in the name of reuse
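To make the intermediate-signals guideline concrete, here is a small before-and-after sketch; the state encodings and signal names are invented for illustration only.

    module grant_logic (
      input  logic [1:0] state,
      input  logic [3:0] count,
      input  logic       req,
      input  logic       err,
      output logic       grant
    );
      localparam logic [1:0] ST_FLUSH = 2'd2, ST_DRAIN = 2'd3;

      // Hard to review: the entire condition packed into one expression
      //   assign grant = req && !((state == ST_FLUSH) ||
      //                           ((state == ST_DRAIN) && (count != 0))) && !err;

      // Easier to share and debug: named intermediate signals document intent
      // and give a second owner something meaningful to probe in a waveform.
      logic flushing, draining, pipe_busy;
      assign flushing  = (state == ST_FLUSH);
      assign draining  = (state == ST_DRAIN) && (count != 0);
      assign pipe_busy = flushing || draining;
      assign grant     = req && !pipe_busy && !err;
    endmodule

The two versions are logically equivalent; the named signals simply make the intent, and any bug, far easier to find.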
For a final note on optimization, every team should take a practical approach where optimization follows two general rules:
- Optimization should happen in a closed loop system where measurable data or deficiencies are used to initiate it
- Optimization should not compromise convenience or understanding
No matter how elegant it may be, if an optimal solution is cumbersome or convoluted, it is no longer optimal!
Summary
When judging the value of a piece of hardware, customers are not grading the individual roles we traditionally associate with IC development. If the hardware works, they see success; if not, they see failure. While how we organize ourselves obviously affects the outcome, there is no ranking system of RTL design vs. functional verification, nor is there hope for an all-star modeling team paired with a ragtag crew handling the physical implementation. IC development is a team game; we are judged as a team and we succeed as a team, so we must interact and take collective responsibility for delivery as a team.
Breaking away from traditional organizational structures, titles, boundaries and responsibilities is imperative for team success; challenging the idea of code ownership is a great place to start.
For many teams, shared code is a major deviation from the norm. To build an objective opinion of how it may improve your team, have people discuss the following questions:
- Is it annoying to guide others through your code or helpful to have others analyzing it?
- Do you care if someone else fixes or improves your code?
- Are code ownership and permissions holding back capable contributors?
- How smoothly is responsibility transferred during the debug cycle?
- Does a feeling of code ownership regularly cause the debug cycle to stall?
Finally, imagine yourself explaining the idea of code ownership to a customer (or if you are up to the challenge, you could try and explain it to my dad). Customers care about quality and time to market. If there are people on the development team capable of improving quality or expediting time to market, customers want that ability harnessed. If that means verification engineers fixing RTL, RTL designers writing tests or software developers building models, it is not going to matter to them. Should it really matter to us? After all, in the end it’s our customers that own the code, not us!
References
Beck, K., Extreme Programming Explained: Embrace Change (2nd edition), Addison-Wesley Professional, 2004.
Morris, B., Saxe, R., svunit: Bringing Agile Methods Into Functional Verification, SNUG San Jose 2009.
Poppendieck, M., Poppendieck, T., Implementing Lean Software Development: From Concept to Cash, Addison-Wesley Professional, 2006.
(c) Neil Johnson 2010 – all rights reserved