[vbox-dev] reserved identifier violation

Knut St. Osmundsen bird at sun.com
Thu Feb 19 22:34:54 GMT 2009


Hi Markus,

This discussion didn't quite get onto the right track due to some  
misinformation from our side. I'm very sorry about that and my only  
excuse is that I didn't spot the thread until late this afternoon.  
I'll try jump in and try correct some of this without causing too much  
trouble... Bear with me when I quote parts from several mails in the  
exchange here.

Let me start saying that we're actively trying to avoid clashes with  
the C and C++ namespaces. We're aware that the current code contains  
violations, __BEGIN_DECLS being a prominent example, but they will be  
corrected before too long I hope.

I'd also like to point out that we have several people on the team  
which knows the C, C++ and Unix standards very well and/or have been  
(/ are still) working on various C/C++ compilers and/or runtimes at  
some point in their life. We have discussed section 7.1.3 of the ISO/ 
IEC 9899:TC2 in depth several years back and did take a number of  
actions to cleaning up the code as a result. One of them was to ban  
the use of single and double underscores as prefix and postfix. But as  
you can see there are few remaining ones that still needs weeding out.

Let me use the two other examples from the trac ticket to explain how  
we read the double underscore part of section 7.1.3.  The examples are  
___VBox_BiosLogo_h and ____H_VIRTUALBOXBASEIMPL. That's 3 and 4  
leading underscores accordingly, which after our understand does not  
infringe on the reserved namespace.  The standard says: "All  
identifiers that begin with an underscore and either an uppercase  
letter or another underscore are always reserved for any use." We  
ended up read that as excluding identifiers starting with three or  
more underscores. I have no problems seeing how it can be read to  
include two or more leading underscores, but our decision was _not_ to  
read it that way.

On Feb 19, 2009, at 21:00, Markus Elfring wrote:
>>
>> Sure, there's other opportunity for quality assurance, e.g removing  
>> the
>> last few warnings, using lint, etc that's on our TODO list which is  
>> far
>> more likely fixing real problems instead of perceived ones.
>
> It is normal that other open issues usually get a higher priority.
>
>
>> To paraphrase a valuable engineering guideline: We're not fixing  
>> things
>> that ain't broken.
>
> I have got the view that the wrong application of naming conventions  
> might
> result in errors. Can this aspect become a security concern during a  
> detailed
> source code review?

My view is that it can cause portability problems and nightmarish  
build issues. One example of this is how both we and the linux kernel  
kind of wants to define the C++ true/false/bool combo (see iprt/ 
types.h). It would have made more sense to use our own boolean type  
and defines, but since most of our code is compiled by the C++  
compiler we thought it better to amend C with the standard C++ boolean  
stuff.

On Feb 16, 2009, at 14:38, Markus Elfring wrote:

> Thanks for your detailed explanation.
>
>
> Jens Schweikhardt wrote:
>> Any library (or API) provider faces a dilemma because he must  
>> invade one
>> of the two namespaces. VirtualBox chose to not invade the application
>> name space where practical. This means use of underscores as prefix  
>> in
>> places where the application writer is not supposed to use such  
>> identifiers.
>
> I find this a questionable design decision.

As mentioned above, we're not attempting to invade the compiler's  
namespace at all, sorry for that. But yes, we are knowingly invading  
the application's namespace.

The general idea is that we prefix identifiers with a component  
specific string that we deem to be relatively uncommon in the big wide  
world. This not only helps making conflicts with other software and  
system/language APIs less likely, it also helps organizing our own  
code. The C++ API is a little bit different and whether [I]Session,  
[I]USBDevice and so on are really good names choices if the purpose is  
to avoid clashes, is certainly questionable. For instance the  
USBDevice class had to be renamed when we ported VirtualBox to Mac OS  
X because it was clashed with a class in I/O Kit that we needed.  
That's life. Where possible we make use of namespace to avoid these  
kinds of issues.

>> While this is technically causing undefined behavior, it is
>> a conscious decision because we support compilation of VirtualBox  
>> only
>> on a very limited number of implementations, the implementation
>> namespace of which is known to us.
>
> Would you like to avoid more situations to build on undefined  
> behavior?
>
>
>> The __BEGIN_DECLS macro is a good example for this.
>
> I imagine that this symbol might be too short to avoid all potential  
> conflicts.

As mentioned at the start of this mail, __BEGIN_DECLS is a good  
example of a macro we never should have used. It's still around  
because it a bit of job to eliminate it and nobody got around to it  
yet. The reason why we used it in the first place is that, as  
mentioned earlier, some team members have a history from C/C++/Unix  
runtime development and found sys/cdefs.h and its __BEGIN_DECLS (from  
*BSD) to be a useful replacement for the three line #ifdef __cplusplus 
\n extern "C" {\n #endif block. (As you can see, there are both iprt/ 
cdefs.h and VBox/cdefs.h in that tradition.)

__BEGIN_DECLS is a problematic thing as it is defined by the system on  
several platforms. We mostly just detect this and define it  
conditionally, knowing that the system will define it the way we want  
it to be. However it requires more work when porting the code to a new  
platform and whenever a new version of the platform becomes available,  
or someone wish to use a different libc implementation or some such  
thing. The plan is to replace it with RT_BEGIN_DECLS one of these  
days...

>
>> As for header include guards, using __FOO_H is a common way to not
>> invade the app name space (see libvirt or the python binding for  
>> XPCOM
>> and many more).
>
> I have got the impression that this is just a bad habit if the  
> reserved naming pattern is used in source code which does not belong  
> to the implementation of a (C/C++) compiler.
> http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html
>
> I suggest to use longer names to make important identfiers unique.
> http://en.wikipedia.org/wiki/Include_guard#Difficulties

We use ___[DIR_]FOO_H, not __FOO_H. That's miscommunication from our  
side. We do not see this as an invasion of the compiler / system  
namespace. When possible the guard macro will include the component or/ 
and subdirectory in addition to the file name to avoid clashes with  
software we're using or that is using our code.

Hope this was readable and helpful. :-)

Kind Regards,
knut

---

Sun Microsystems GmbH        Knut St. Osmundsen
Werkstrasse 24               Senior Staff Engineer, VirtualBox
71384 Weinstadt, Germany     mailto:bird at sun.com


================================================
Sitz der Gesellschaft: Sun Microsystems GmbH,
Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder,
Wolfgang Engels, Dr. Roland Boehmer
Vorsitzender des Aufsichtsrates: Martin Haering
================================================





More information about the vbox-dev mailing list