Showing posts with label compilers. Show all posts
Showing posts with label compilers. Show all posts

Wednesday, May 1, 2024

Static Analysis Ranked Defect List

 Crazy idea of the day: Static Analysis Ranked Defect List.


Here is a software analysis tool feature request/product idea: So many times we see the problem that a static analysis tool or other way of automatically finding bugs inundates developers with so many possible bugs they turn it off in frustration. Or maybe they have a requirement to ship only "clean" code so they don't run the tool because then there is nothing to clean up. Put your favorite head-in-the-sand organizational dysfunction here.


I'd love to be able to recommend a static analysis or other tool that has a feature that reports that top 10 defects currently in the code base, ranked by likely risk. Go ahead and use machine learning for the ranking; fine with me. Regardless of methodology, fixing only some is a guess after all.


Ultimately, all the static analysis warnings should be cleaned up. Having a "top K warnings" feature would allow a team to make progress over time instead of simply sweeping the entire mess under the carpet and ignoring what is often very valuable information that predicts defect escapes to deployment.


For tools with many warnings an approximation can be to gradually turn on a series of warning flags over time and/or just run the warnings on a subset of modules. But a global "these are the 10 scariest warnings" would be a nice feature.


Thoughts? Does some tool already do this?

Monday, August 1, 2011

Proper use of .h and .c files

I recently worked with some embedded system teams who were struggling with the best way to use .c and .h files for their source code. As I was doing this, I remembered that back when I was learning how to program that it took me quite a while to figure all this out too!  So, here are some guidelines on a reasonable way to use .c and .h files for organizing your source code. They are listed as a set of rules, but it helps to apply common sense too:

  • Use multiple .c files, not just one "main.c" file. Every .c file should have a set of variables and functions that are tightly related to each other, and only loosely related to the other .c files.  "Main.c" should only have the main loop in it.
  • .C files allocate storage and define executable code. Only .c files have a function defined in them.  Only .c files allocate storage for variables
  • .H files define external storage and define function prototypes The .h files give other modules the information they need to work with a particular .c file, but don't actually define storage and don't actually define code.  The keyword "extern" should generally be showing up in .h files only.
  • Every .c file should have a corresponding .h file. The .h file provides the external interface information to other modules for using the corresponding .c file.

Here is an example of how this works. Let's say you have files:  main.c    adc.c   output.c  process.c  watchdog.c

main.c would have the main loop that polls the A/D converter, processes values, sends outputs, and pets the watchdog timer.  This would look like a sequence of consecutive subroutine calls within an infinite loop.  There would be a corresponding main.h that might have global definitions in it  (you can also have "globals.h" although I prefer not to do that myself).   Main.c would #include  main.h, adc.h, output.h, process.h and watchdog.h because it needs to call functions from all the corresponding .c files.

adc.c would have the A/D converter code and functions to poll the A/D and store most recent A/D values in a data structure.  The corresponding adc.h would have extern declarations and function prototypes for any other routine that uses A/D calls (for example, a call to look up a recent A/D value from polling).  Adc.c would #include adc.h and perhaps nothing else.

output.c would take values and send them to outputs.  The corresponding output.h would have information for calling output functions.  Output.c would probably just #include output.h.  (Whether main.c actually calls something in output.c depends on your particular code structure, but I'm assuming that outputs have two steps: process.c queues outputs, and the main.c call actually sends them.)

process.c would take A/D values, compute on them, and queue results for output. It might need to call functions in adc.c to get recent values, and functions in output.c to send results out. For that reason it would #include process.h, adc.h, output.h

watchdog.c would set up and service the watchdog timer. It would #include watchdog.h


One wrinkle is that some compilers can only optimize in a single .c file (e.g., only do "inline" within a single file). This is no reason to put everything in a single .c file!  Instead, you can just #include all the other .c files from within main.c. (You might have to make sure you include each .h file once depending on what's in them, but often that isn't necessary.) You should avoid #including a .c file from within another .c file unless there is a compelling reason to do so such as getting your optimizer to actually work.

This is only intended to convey the basics. There are many hairs to split depending on your situation, but if you follow the above guidelines you're off to a good start.

NOTE: Michael Barr published another, compatible, take on this topic in May.  I just saw it at:  http://www.embedded.com/electronics-blogs/barr-code/4215934/What-belongs-in-a-header-file

Wednesday, November 10, 2010

Embedded Software Risk Areas -- Implementation -- Part 2

Series Intro: this is one of a series of posts summarizing the different red flag areas I've encountered in more than a decade of doing design reviews of industry embedded system software projects. You can read more about the study here. If one of these bullets applies to your project, you should consider whether that presents undue risk to project success (whether it does or not depends upon your specific project and goals). The results of this study inspired the chapters in my book.

Here are some of the Implementation red flags (part 2 of 2):
  • Ignoring compiler warnings
Programs compile with ignored warnings and/or the compilers used do not have robust warning capability. A static analysis tool is not used to make up for poor compiler warning capabilities. The result can be that software defects which could have been caught by the compiler must be found via testing, or miss detection entirely. If assembly language is used extensively, it may contain the types of bugs that a good static analysis tool would have caught in a high level language.
  • Inadequate concurrency management
Mutexes or other appropriate concurrent data access approaches aren’t being used. This leads to potential race conditions and can result in tricky timing bugs.
  • Use of home-made RTOS
An in-house developed RTOS is being used instead of an off-the-shelf operating system. While the result is sometimes technically excellent, this approach commits the company to maintaining RTOS development skills as a core competency, which may not be the best strategic use of limited resources.

Static Analysis Ranked Defect List

  Crazy idea of the day: Static Analysis Ranked Defect List. Here is a software analysis tool feature request/product idea: So many times we...