Cppcheck for better QA in Debian, sources, deborphan

This is a longish post about how to find C programs in Debian, how to run cppcheck on them to find problems, and how to report those problems upstream.

Confession time: I am not a C programmer by any stretch of the imagination. I can read C, but that’s about it.

Let’s first check what this tool is all about. Aptitude shows :-


[$] aptitude show cppcheck
Package: cppcheck
State: installed
Automatically installed: no
Version: 1.68-1
Priority: optional
Section: devel
Maintainer: Reijo Tomperi
Architecture: amd64
Uncompressed Size: 2,896 k
Depends: libc6 (>= 2.15), libgcc1 (>= 1:4.1.1), libpcre3 (>= 1:8.35), libstdc++6 (>= 4.9), libtinyxml2-2 (>= 2.0.2)
Description: tool for static C/C++ code analysis
Cppcheck is a command-line tool that tries to detect bugs that your C/C++ compiler doesn't see. It is versatile, and can check non-standard code including various compiler extensions, inline assembly code, etc. Its internal preprocessor can handle includes, macros, and several preprocessor commands. While Cppcheck is highly configurable, you can start using it just by giving it a path to the source code.

It includes checks for:
* pointers to out-of-scope auto variables;
* assignment of auto variables to an effective parameter of a function;
* out-of-bounds errors in arrays and STL;
* missing class constructors;
* variables not initialized by a constructor;
* use of memset, memcpy, etcetera on a class;
* non-virtual destructors for base classes;
* operator= not returning a constant reference to itself;
* use of deprecated functions (mktemp, gets, scanf);
* exceptions thrown in destructors;
* memory leaks in class or function variables;
* C-style pointer cast in C++ code;
* redundant if;
* misuse of the strtol or sprintf functions;
* unsigned division or division by zero;
* unused functions and struct members;
* passing parameters by value;
* misuse of signed char variables;
* unusual pointer arithmetic (such as "abc" + 'd');
* dereferenced null pointers;
* incomplete statements;
* misuse of iterators when iterating through a container;
* dereferencing of erased iterators;
* use of invalidated vector iterators/pointers;
Homepage: http://cppcheck.wiki.sourceforge.net/

Tags: devel::lang:c, devel::lang:c++, devel::testing-qa, implemented-in::c++, interface::commandline, role::program, scope::utility, security::TODO, use::analysing, use::checking, works-with::software:source

So, what all of the information above tells us is that cppcheck is a command-line tool which takes some source-code written in C or C++ and tells you whatever issues it finds in that code.

Now our first problem is how and where to find some source-code to try this out. Hang on, aren’t we supposed to have these so-called wonderful GNU/Linux distributions where the source-code is open and free? Let’s see how much of that is true.

Looking around, it becomes clear that I need to add deb-src lines to my /etc/apt/sources.list if I want to be anywhere near the elusive source-code, so I do that.


[$] cat /etc/apt/sources.list

#### stable #########
deb http://httpredir.debian.org//debian/ jessie main contrib non-free
deb-src http://httpredir.debian.org//debian/ jessie main contrib non-free

##### stable-updates #########
deb http://httpredir.debian.org//debian jessie-updates main contrib non-free
deb-src http://httpredir.debian.org//debian jessie-updates main contrib non-free

######### Security updates #########
deb http://security.debian.org/ jessie/updates main contrib non-free
deb-src http://security.debian.org/ jessie/updates main contrib non-free

There are many more entries but the above are good enough to get started. So do an apt-get update or an aptitude update. The addition of the deb-src lines is not going to bring in any new packages, just the possibility of downloading the source-code of existing ones.

Now, we know that Debian is a huge, huge repository of several thousand packages, so trying to find a simple C package sounds simple and yet is a daunting task, as it’s not easy to know at a glance what language a package is written in. Sometimes the language is mentioned in the long description, but more often than not it isn’t. So how do we find out?

Enter debtags. I had already written about debtags a couple of years back. With debtags you can simply frame a query within its taxonomy (which needs help, as more packages need to be tagged, but that’s a different issue) and you can probably find your quarry. I looked through the choice of keywords it has (that too needs to be cleaned up, but that’s a topic for another time) and settled on "implemented-in::c". I framed the query as :-

$ debtags search "implemented-in::c" > packages_which_use_c.txt

and voila, got something like 4252 packages, which would be roughly 8-9%. Later on, Stephen Kitt (I came to know he’s a Debian Developer, a DD) shared that around 40% of the lines of source code in the Debian archive are C and 23.1% are C++, which was surprising as I was under the impression that the newer languages were making a lot of gains, but this simply belies that. Those numbers are based on the stats at sources.debian.net/stats/, which was also a resource shared by Stephen.

Now comes the unenviable task of choosing a package to try out from that huge, incomplete list. After trawling through the list, I settled on deborphan, simply because I have used the tool a bit, know about it and, more importantly, the development of the tool happens within Debian. As always, before we start doing anything, let’s see what the tool is about :-


[$] aptitude show deborphan

Package: deborphan
State: installed
Automatically installed: no
Version: 1.7.28.8-0.1
Priority: optional
Section: admin
Maintainer: deborphan devel team
Architecture: amd64
Uncompressed Size: 266 k
Depends: libc6 (>= 2.14)
Recommends: apt, dialog, gettext-base
Description: program that can find unused packages, e.g. libraries
deborphan finds "orphaned" packages on your system. It determines which packages have no other packages depending on their installation and shows you a list of these packages. It is most useful when finding libraries, but it can be used on packages in all sections.

This package also includes orphaner, a text menu frontend to deborphan. Please install the recommended packages dialog, gettext-base and apt when you want a working and fully featured orphaner.

Tags: admin::package-management, implemented-in::c, interface::commandline, role::program, scope::utility, suite::debian, use::checking, use::organizing, works-with::software:package

So, the next bit of work is to download the source-code. To do that, simply make a directory called deborphan (or named after whichever package source you are downloading, which keeps things tidy), cd into it and then type the following command :-


$ mkdir deborphan
$ cd deborphan
~/deborphan $ apt-get source deborphan

Depending on how your system is set up and what version of deborphan is in the archive, you will get something like this :-


~/deborphan $ ls

deborphan-1.7.28.8 deborphan_1.7.28.8-0.1.dsc deborphan_1.7.28.8-0.1.tar.gz

Although the listing does not show it, the first entry is a directory, the next is the .dsc description file and the last one is the original .tar.gz compressed archive. So we go into the deborphan-1.7.28.8 directory and see


~/deborphan/deborphan-1.7.28.8 $ ls

aclocal.m4 config.guess configure COPYING depcomp include intl Makefile.in mkinstalldirs po src util autogen.sh config.sub configure.in debian doc install-sh Makefile.am missing NEWS README THANKS

I have made the listing a bit more readable. Most of what is listed is needed more or less for building and packaging, but as today we just want to see some .c files, we are going to look at the src directory.


~/deborphan/deborphan-1.7.28.8/ $ cd src
~/deborphan/deborphan-1.7.28.8/src $ ls
deborphan.c exit.c file.c keep.c libdeps.c Makefile.am Makefile.in pkginfo.c set.c string.c xalloc.c

This is precisely what we were looking for.

Now, being a good boy, I read through the manpage. While I won’t go through all the options, the simplest check to start with is style. As the manpage tells :-

--enable=<id>
Enable additional checks. The available ids are:

style
Enable all coding style checks. All messages with the severities 'style', 'performance' and 'portability' are enabled.

So let’s see if that gives some output :-


~/deborphan/deborphan-1.7.28.8/src $ cppcheck --enable=style deborphan.c
Checking deborphan.c...
[deborphan.c:74]: (style) The scope of the variable 'j' can be reduced.
[deborphan.c:215]: (portability) scanf without field width limits can crash with huge input data on some versions of libc.
Checking deborphan.c: ALL_PACKAGES_IMPLY_SECTION...
Checking deborphan.c: DEBFOSTER_KEEP...
Checking deborphan.c: DEBUG...
[deborphan.c:368]: (warning) %d in format string (no. 1) requires 'int' but the argument type is 'size_t {aka unsigned long}'.
Checking deborphan.c: DEFAULT_NICE...
Checking deborphan.c: ENABLE_NLS...
Checking deborphan.c: HAVE_CONFIG_H...
Checking deborphan.c: HAVE_ERRNO_H...
Checking deborphan.c: HAVE_GETOPT_H...
Checking deborphan.c: IGNORE_DEBFOSTER...
Checking deborphan.c: LOW_MEM...
Checking deborphan.c: USE_XALLOC...

Now I know where the style, portability and warning messages are. I don’t need to know anything other than the fact that something may be wrong here and needs to be looked at. If I were a C developer, I would have gone into that C file, seen what cppcheck is talking about, made a patch and shared it with the developer. That I cannot do so is no cause for worry; I can still run all the checks and share the results with the developer, and sooner or later s/he will fix the ones that s/he thinks are true and important.
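For readers who, like me, do not write C every day, here is a minimal sketch of what the (warning) and (portability) messages above are pointing at. It is not deborphan's actual code: the functions format_count and read_number are hypothetical names made up for illustration, and sscanf stands in for scanf so that the example does not need typed input. I am also assuming the flagged scanf call reads a number; the general remedy of adding a field width is the same either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical example, not taken from deborphan.
 * cppcheck warns when "%d" is given a size_t argument, because "%d"
 * expects an int while size_t is an unsigned type that is often wider.
 * "%zu" is the conversion that matches size_t (C99 and later). */
int format_count(char *out, size_t out_size, size_t count)
{
    assert(out != NULL);
    return snprintf(out, out_size, "%zu", count);
}

/* Hypothetical example, not taken from deborphan.
 * A conversion without a field width consumes as much input as it is
 * given. "%9d" reads at most nine characters, so the parsed value
 * always fits in a 32-bit int. */
int read_number(const char *input, int *value)
{
    assert(input != NULL && value != NULL);
    return sscanf(input, "%9d", value);
}
```

With these definitions, read_number("1234567890123", &value) stores 123456789, because only the first nine digits are consumed.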

So, the next step is obviously to report what was found. I reported it as http://bugs.debian.org/783871

Some points :-

a. Those errors, warnings etc. do not mean that the code is bad. There may be some false positives, but as I’m not a C programmer I am not able to tell. If there are, I will know shortly, and then a bug can be filed either by me or by the developer, which will improve Debian (which is what we need/want in the end).

b. No developer wants to write bad, inefficient and insecure code. If you run cppcheck on some code, the first thing is to make sure it is production-quality code. That is the reason I chose deborphan: the specific version is in jessie, which makes it production quality.

c. Ideally, I would run it on all the different .c files and share the results one after the other, but in the real world it is better to start with one file, take feedback from the developer on whether something could have been done better, and then apply that to the other files to make deborphan, and ultimately Debian, better.
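To illustrate point (a), here is a small sketch of the kind of declaration that a message like "The scope of the variable 'j' can be reduced" is about. Again, this is not deborphan's code: both function names are hypothetical, and I am only assuming the flagged variable follows this general pattern. The two versions return the same result, which is why such a finding is a style hint rather than proof of a bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not taken from deborphan.
 * 'square' is declared at the top of the function although it is only
 * ever used inside the loop body. */
int sum_of_squares_wide_scope(const int *values, size_t n)
{
    int square;
    int total = 0;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        square = values[i] * values[i];
        total += square;
    }
    return total;
}

/* Same logic with 'square' declared where it is used, so a reader can
 * see at a glance that nothing outside the loop depends on it. */
int sum_of_squares_narrow_scope(const int *values, size_t n)
{
    int total = 0;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int square = values[i] * values[i];
        total += square;
    }
    return total;
}
```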

That’s all for today.

2 thoughts on “Cppcheck for better QA in Debian, sources, deborphan”

  1. Just a small correction: the stats you mention are calculated on the total number of lines of code, not the number of packages. So around 40% of the lines of source code in all of Debian are written in C; that doesn’t mean that 40% of the packages in Debian are written in C…
