Thursday, February 7, 2008

Microsoft VS2005 C++ non-compliance issues (Part I)

I'm currently doing some research on the standard conformance of the C++ compiler and standard library supplied with MS VS2005 SP1 (compiler version 14.00.50727.762). The main reason for creating this blog entry is that so far I have been unable to find a Web page that provides a good list of known non-compliance issues with the aforementioned compiler (I'd appreciate a link, if anyone could provide one). I'd also like to note that for me this is part of the process of transitioning from MS VC6 to VS2005 and, for this reason, I'll begin by testing VS2005 against the "usual suspects": the simplest and most obvious deviations from the C++ specification present in VC6 SP5. Maybe later I'll update this blog entry (or create an additional one) with more complex issues, possibly specific to VS2005 only.

By default I will evaluate the compiler's behavior from the point of view of the C++98 standard, trying to keep in mind the known issues within the document. I don't expect VS2005 to observe the changes introduced in TC1, but of course it is more than welcome to follow the new specification.

Just for starters I'd like to say that I was impressed by the number of compiler bugs fixed in VS2005. (This is, once again, compared to VC6. To those who got to use VS2003 most of this might be old news.) Almost everything I tried checked out just fine right away, and some things that looked wrong initially could be fixed by the [infamous] /Za switch. The latter appears to be much more useful than it used to be (i.e. as opposed to being completely useless in VC6), since the compiler now appears to be able to compile its own system and standard library headers in the presence of /Za, although I can't say that I thoroughly tested this.

OK, here comes the list of what is still wrong.

1. Issues not fixed by /Za

1.1. String literals are thrown as 'char*' values

This issue is inherited from VC6 unchanged. The following code catches the exception as 'char*' in VS2005, while a compliant compiler must not do so

    try {
        throw "Hello";
    } catch (char*) {
        // VS2005 catches it here...
    } catch (const char*) {
        // ... while it is supposed to be caught here
    }

The problem here is not with 'catch', since it can be easily demonstrated that 'catch' in VS2005 can reliably distinguish between const and non-const pointer types. The problem is with 'throw', which manages to lose the const-qualification of the result of array-to-pointer conversion applied to string literal.
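
The claim that 'catch' itself is not at fault can be checked with a small sketch along the following lines. The function names are made up for illustration and are not part of the original test. The first helper throws a variable whose type is explicitly 'const char*', the second throws the literal directly. On a conforming compiler both are expected to end up in the 'const char*' handler.

```cpp
#include <cassert>

// Hypothetical helper: reports which handler catches a thrown 'const char*'.
// Returns 1 for 'catch (char*)' and 2 for 'catch (const char*)'.
int handler_for_const_pointer()
{
    const char* message = "Hello";  // the const-qualification is explicit here
    try {
        throw message;              // throws an object of type 'const char*'
    } catch (char*) {
        return 1;                   // would drop 'const', so it must not match
    } catch (const char*) {
        return 2;                   // the only matching handler
    }
    return 0;                       // not reached
}

// Same question for the literal itself, as in the sample above.
int handler_for_literal()
{
    try {
        throw "Hello";              // array-to-pointer conversion yields 'const char*'
    } catch (char*) {
        return 1;
    } catch (const char*) {
        return 2;
    }
    return 0;                       // not reached
}

// Expectations for a conforming compiler.
void check_handlers()
{
    assert(handler_for_const_pointer() == 2);
    assert(handler_for_literal() == 2);
}
```

Calling 'check_handlers()' from 'main' is enough to run it. If only the second assertion fails, the handlers are fine and the const-qualification was lost in the 'throw' expression.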

It is interesting to note that VS2005 does realize that the type of a string literal is 'array of const char', which is a welcome change from VC6. VC6 firmly believed that string literals have 'array of char' type, which led to incorrect behavior in many other contexts, in addition to the one being considered. I checked a few of these contexts in VS2005 and they worked fine. For example, overload resolution now works correctly

    void foo(const char*);
    void foo(char*);
    ...
    foo("Hello"); // Calls 'foo(const char*)' as it is supposed to, while
                  // VC6 would incorrectly call 'foo(char*)'

Also 'typeid' now behaves properly

    assert(typeid("Hello") == typeid(const char[6]));
    // Holds in VS2005 and fails in VC6

    assert(typeid("Hello") == typeid(char[6]));
    // Holds in VC6 and fails in VS2005

    // VS2005 exhibits the correct behavior

Meanwhile, 'throw' is still broken. Setting the /Za option doesn't make it work the way it is supposed to. It is hard to say how this issue managed to survive, given that VS2005 now sees the type of a string literal correctly. Was it preserved intentionally for backward compatibility? How come /Za has no effect on it then?

1.2. Exception specifications are not checked at compile time

I know that exception specifications in VS2005 are "parsed and ignored" with the exception of the empty one 'throw()', which does have some beneficial effect on the generated code. Moreover, I'm not a big fan of run-time functionality of exception specifications and as far as I know I'm not alone. However, together with run-time effects of exception specifications MS compilers so far managed to ignore the compile-time ones. More precisely, VS2005 (just like VC6) fails to enforce the requirements imposed on the exception specification of a virtual overrider. For example, the following code is ill-formed, but compiles without any diagnostic messages in VS2005

    struct A {
        virtual void foo() throw();
    };

    struct B : A {
        void foo(); // ill-formed, not caught by VS2005
    };

In C++ the exception specification of an overriding virtual function must be at least as restrictive as the exception specification of the corresponding function in the parent class, i.e. in the above example 'B::foo()' is required to be specified as 'throw()'. The problem with this is that what appears to be correct code in VS2005 might fail to compile on any other platform. For some it is not a big deal, but it happens to be one for me.

One can argue that since exception specifications are mostly useless, they should not be used at all and the problem in question will never arise. There are two things that can be said in response to this argument. Firstly, as has been said above, the empty specification 'throw()' is actually useful. Secondly, the standard library does use exception specifications, which can lead to unexpected errors even if the users themselves avoid them in their code. For example, this innocent-looking code is ill-formed and VS2005 doesn't detect the problem

    #include <exception>
    #include <string>

    class my_exception : public std::exception {
        std::string s;
    };

The culprit is the implicitly-declared virtual destructor of the 'my_exception' class. Since the destructor of the only data member 's' has an unrestricted exception specification, the implicitly declared destructor of 'my_exception' also has an unrestricted exception specification. At the same time the virtual destructor of the base class 'std::exception' is specified as 'throw()'. The destructor of the derived class therefore attempts to extend the exception specification of the virtual destructor it overrides, and the code is ill-formed.
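
One way to keep such a class portable is to declare the destructor explicitly and repeat the empty specification of the base class. The following is only a sketch under assumptions: the 'MY_NOTHROW' macro, the constructor and the 'what()' override are additions for illustration and not part of the original sample. The macro is there so that the same text also builds with newer compilers, where 'noexcept' takes the place of 'throw()'.

```cpp
#include <cassert>
#include <cstring>
#include <exception>
#include <string>

// Hypothetical portability macro (not from the original post): C++98 spells
// the empty specification 'throw()', later standards use 'noexcept' instead.
#if __cplusplus >= 201103L
#define MY_NOTHROW noexcept
#else
#define MY_NOTHROW throw()
#endif

class my_exception : public std::exception {
public:
    explicit my_exception(const std::string& text) : s(text) {}

    // User-declared destructor that repeats the base class specification,
    // so the overrider is no looser than 'std::exception::~exception()'.
    virtual ~my_exception() MY_NOTHROW {}

    // 'what()' is virtual in the base class too and gets the same treatment.
    virtual const char* what() const MY_NOTHROW { return s.c_str(); }

private:
    std::string s;
};

// Throws and catches through the base class to show the class is usable.
void check_my_exception()
{
    try {
        throw my_exception("boom");
    } catch (const std::exception& e) {
        assert(std::strcmp(e.what(), "boom") == 0);
    }
}
```

Written this way the class does not depend on whether a particular compiler checks the specification of the implicitly declared destructor.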

1.3. Two-phase name lookup is still not implemented

Name lookup for non-dependent names used in template definitions is still delayed till the moment (and point) of actual instantiation. The following perfectly valid code sample will not compile in VS2005

    int foo(void*);

    template<typename T> struct S {
        S() { int i = foo(0); }
        // A standard-compliant compiler is supposed to
        // resolve the 'foo(0)' call here (i.e. early) and
        // bind it to 'foo(void*)'
    };

    void foo(int);

    int main() {
        S<int> s;
        // VS2005 will resolve the 'foo(0)' call here (i.e.
        // late, during instantiation of 'S::S()') and
        // bind it to 'foo(int)', reporting an error in the
        // initialization of 'i'
    }

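For contrast, here is a sketch that puts both phases side by side. All names in it ('Tag', 'pick', 'describe') are invented for this illustration. The non-dependent call 'pick(0)' is expected to be bound when the template is defined, while the dependent call 'describe(t)' is resolved at instantiation, where argument-dependent lookup can also see a function declared after the template.

```cpp
#include <cassert>

struct Tag {};                          // hypothetical class type used as 'T'

int pick(void*) { return 1; }           // the only 'pick' visible in the template

template <class U>
int describe(const U&) { return 1; }    // generic fallback, declared early

template <typename T>
struct S {
    // Non-dependent call: bound at the point of definition, so 'pick(void*)'.
    int non_dependent() const { return pick(0); }

    // Dependent call: resolved at instantiation, with argument-dependent lookup.
    int dependent(const T& t) const { return describe(t); }
};

int pick(int) { return 2; }             // declared too late for 'pick(0)' above
int describe(const Tag&) { return 2; }  // found through ADL at instantiation

// Expectations for a compiler that implements two-phase lookup.
void check_two_phase_lookup()
{
    S<Tag> s;
    assert(s.non_dependent() == 1);     // early binding to 'pick(void*)'
    assert(s.dependent(Tag()) == 2);    // late binding to 'describe(const Tag&)'
}
```

A compiler that delays all lookup until instantiation would be expected to pick 'pick(int)' in the first call as well, which is the behavior described above for VS2005.
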
1.4. Explicit template argument specification is ignored in qualified template names used as default arguments

In order to reproduce this problem one needs to mix several "ingredients": a qualified name of a function template should be used as a default argument in another function template declaration. Under these conditions explicitly specified arguments of the former template are ignored. The following code sample illustrates the problem

    namespace N {
        template <typename T> T foo();
    }

    template <typename U> void bar(int i = N::foo<int>());

    int main() {
        bar<int>();
        // VS2005 fails to compile the call, complaining about
        // not being able to deduce the template argument for
        // 'N::foo'
    }

In this case VS2005 complains about not being able to deduce the template argument for 'N::foo' call, even though the argument in question is specified explicitly. It is fairly easy to demonstrate that the problem is caused by the fact that the explicitly specified template argument is simply ignored. The compiler makes an attempt to deduce the argument and fails, since it is non-deducible in the above code sample. If we modify the code to make it deducible, the compiler will "prefer" the deduced argument, once again ignoring the explicitly specified one

    namespace N {
        template <typename T> T foo(T t);
    }

    template <typename U> void bar(int i = N::foo<int>(0.0));

    int main() {
        bar<int>();
        // VS2005 specializes the 'N::foo' in the default
        // argument as 'N::foo<double>'
    }
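
Which specialization actually gets used can be made observable at run time. The sketch below follows the shape of the sample above, but the function 'specialized_for_int' and the 'bool' return types are assumptions introduced only for this check.

```cpp
#include <cassert>
#include <typeinfo>

namespace N {
    // Hypothetical probe: reports whether it was specialized for 'int'.
    template <typename T>
    bool specialized_for_int(T) { return typeid(T) == typeid(int); }
}

// A qualified template-id with an explicit argument, used as a default
// argument of another function template, as in the sample above.
template <typename U>
bool bar(bool result = N::specialized_for_int<int>(0.0)) { return result; }

// Expectations for a conforming compiler.
void check_explicit_argument()
{
    // The explicit argument wins: 'T' is 'int' and 0.0 is converted to 'int'.
    assert(bar<int>());

    // Without the explicit argument, deduction from 0.0 gives 'double'.
    assert(!N::specialized_for_int(0.0));
}
```

If the explicit argument were ignored in favor of deduction, the first assertion would be the one to fail.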

The problem doesn't seem to be tied to namespaces in any way, since it reproduces just as well with a function template declared as a class member (as opposed to namespace member)

    struct N {
        template <typename T> static T foo();
    };
    ...

With a namespace declaration, a using-declaration can be used to eliminate the need for a qualified name, thus making the problem go away, as shown in the code sample below

    namespace N {
        template <typename T> T foo();
    }

    using N::foo;

    template <typename U> void bar(int i = foo<int>());

    int main() {
        bar<int>(); // Compiles correctly
    }

1.5. Name lookup refuses to search for data member names from within a data member declaration

When defining a class, the declaration of each member can refer to the previously declared members of the same class. For example, in the following code snippet

    struct S {
        enum { size = 100 };
        int a[size];
    };

the declaration of data member 'S::a' refers to the previously declared enumeration constant 'S::size'. The generic lookup mechanism for member names is, of course, implemented in VS2005. However, it appears to be artificially restricted to some kinds of class members and refuses to look up others, such as non-static data members. Apparently the compiler authors believed that there's no valid context in which the compiler would have to look for a non-static data member. In reality, such contexts do exist. A reference to a non-static data member name can be used as a template argument in a member declaration, as in the following code sample

    template <class S, int S::*> struct T {};

    struct S {
        int x;
        T<S, &S::x> t;
        // VS2005 fails to compile the declaration, refusing
        // to look up the non-static data member name
    };

The compiler responds with the error messages 'error C2327: 'S::x' : is not a type name, static, or enumerator' and 'error C2065: 'x' : undeclared identifier', while in fact the code is perfectly valid.

It is interesting to note that in an entirely analogous context the lookup for a member function name works perfectly fine, even though a member function name is not "a type name, static, or enumerator" either

    template <class S, void (S::*)()> struct T {};

    struct S {
        void f();
        T<S, &S::f> t; // compiles fine
    };

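To show that such a declaration is more than a curiosity, here is a sketch in which the pointer-to-data-member template argument is actually put to use. The 'Field' template and the member names are hypothetical and serve only as an illustration of the kind of code that needs this lookup.

```cpp
#include <cassert>

// Hypothetical accessor: reads the data member selected by the template argument.
template <class C, int C::*Member>
struct Field {
    static int get(const C& object) { return object.*Member; }
};

struct S {
    int x;
    int y;
    Field<S, &S::x> first;   // refers to a previously declared non-static member
    Field<S, &S::y> second;
};

// Expectations for a compiler that accepts the member declarations above.
void check_member_lookup()
{
    S s;
    s.x = 10;
    s.y = 20;
    // Extra parentheses keep the comma of the template-id away from 'assert'.
    assert((Field<S, &S::x>::get(s) == 10));
    assert(s.second.get(s) == 20);
}
```

The declarations of 'first' and 'second' have the same form as the declaration of 't' in the failing sample above.
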
1.6. Const-qualified array types work incorrectly as template arguments

In C++, when a constant array type (i.e. an array of constants) is matched against a const-qualified template type-parameter, the compiler has to be able to interpret the constness of the array elements as the constness of the entire array, and properly match it with the const-qualifier on the template type-parameter. For example, when an argument of type 'const int[10]' is passed for a function parameter declared as 'const T&', the type 'T' shall stand for 'int[10]', not for 'const int[10]'. Unfortunately, VS2005 gets it wrong and interprets 'T' in this case as 'const int[10]'. The following code demonstrates the incorrect behavior

    #include <iostream>
    #include <typeinfo>

    template <class T> void foo(const T& c) {
        std::cout << (typeid(T) == typeid(int[10])) << std::endl;
        std::cout << (typeid(T) == typeid(const int[10])) << std::endl;
    }

    int main() {
        const int a[10] = {};
        foo(a);
    }

The above program outputs 0 and 1, while the correct behavior is to output 1 and 0.

The consequences of this problem are apparent in template argument deduction. For example, VS2005 fails to compile the following code

    template <class T> void foo(T&, const T&);

    int main() {
        int l[10];
        const int r[10] = {};
        foo(l, r);
        // VS2005 fails to compile the call, complaining about
        // ambiguous template parameter 'T'
    }

because argument deduction for the first and second arguments produces different values of 'T' ('int[10]' and 'const int[10]'), while in reality it should produce the same 'T' (just 'int[10]' for both arguments) and the code should compile.

Another context where the same problem arises is partial specialization, as in the following code sample

    template <class T> struct S {};
    template <class T> struct S<const T> { int i; };

    int main() {
        S<const int[10]> s;
        s.i = 0;
        // VS2005 reports an error, insisting that there's
        // no 's.i'
    }

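The three contexts above can be folded into one self-checking sketch. The helper templates 'strip_const' and 'same_type' are hypothetical C++98-style traits written out for this purpose, so that types are compared by template matching only.

```cpp
#include <cassert>

// Hypothetical trait: removes a top-level 'const' by partial specialization.
template <class T> struct strip_const          { typedef T type; };
template <class T> struct strip_const<const T> { typedef T type; };

// Hypothetical trait: 'value' is 1 only when both arguments are the same type.
template <class A, class B> struct same_type       { enum { value = 0 }; };
template <class A>          struct same_type<A, A> { enum { value = 1 }; };

// Reports whether 'T', deduced from a 'const T&' parameter, is 'int[10]'.
template <class T>
bool deduces_plain_array(const T&)
{
    return same_type<T, int[10]>::value == 1;
}

// Can only be called if both parameters agree on a single 'T'.
template <class T>
bool same_deduction(T&, const T&) { return true; }

// Expectations for a conforming compiler.
void check_const_array_matching()
{
    int l[10] = {};
    const int r[10] = {};

    // 'const int[10]' matches 'const T' with 'T' being 'int[10]'.
    assert((same_type<strip_const<const int[10]>::type, int[10]>::value == 1));
    assert(deduces_plain_array(r));   // deduction from 'const T&' drops the const
    assert(same_deduction(l, r));     // both arguments yield 'T' = 'int[10]'
}
```

Going by the description above, VS2005 would be expected to fail the first two assertions and to reject the 'same_deduction(l, r)' call outright.
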
1.7. Name lookup problem with 'using' directive and out-of-namespace definitions

When a function declared as a member of a namespace is defined outside of its namespace, any 'using' directives present in the declaration namespace "spill out" into the definition namespace, causing potential ambiguities during name lookup. In the following code sample

    typedef int T;

    namespace A {
        typedef int T;
    }

    namespace B {
        using namespace A;
        void foo();
    }

    void B::foo() {}

    T i; // VS2005 reports an error, complaining about ambiguous
         // name 'T', with '::T' and 'A::T' being two possible
         // candidates

function 'B::foo' is defined in the global namespace, which makes the names from namespace 'A' (nominated by the 'using' directive in 'B') pollute the global namespace and cause name conflicts. If we put the declaration of 'i' before the definition of 'B::foo', the problem disappears. Also, as one would expect, removing the 'using namespace A' directive from namespace 'B' fixes the problem.

1.8. Zero-initialization of static objects is implemented naively

It is well known that the language specification requires all objects with static storage duration to be zero-initialized before any other initialization begins. Zero-initialization, as always, means initialization with logical zeros. Since for almost all data types on our "everyday" platforms the logical zero coincides with the physical zero, most of the time the implementation can get away with filling the corresponding memory region with the all-zero bit pattern. However, one often-overlooked category of types usually has different representations for its logical and physical zeros: pointers of pointer-to-data-member type. Since internally they are usually implemented as mere offsets, the physical zero value is actually useful, and some other physical value must be reserved to represent the null-pointer value of the type. The all-ones bit pattern (0xFFFF...) is usually chosen for that purpose. This is the case in the VS2005 implementation as well. However, the compiler happens to forget that objects of such types have to be initialized with that all-ones pattern at program startup. In the following simple program

    #include <cassert>
    #include <cstddef>

    struct S { int i; };

    int S::*p;

    int main() {
        assert(p == NULL);
        p = &S::i;
        assert(p != NULL);
    }

the first assertion will fail, even though it is not supposed to. Adding an explicit zero initializer to the declaration of 'p' fixes the issue.
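
The difference between the logical and the physical zero can be inspected directly. In the sketch below the variable names and the 'all_bytes_zero' helper are made up for illustration. The assertions rely only on what the language requires, while the result of 'all_bytes_zero' depends on the implementation and is therefore left unchecked.

```cpp
#include <cassert>
#include <cstddef>

struct S { int i; };

int S::*implicit_null;       // static storage, no initializer: must start out null
int S::*explicit_null = 0;   // the workaround mentioned above: explicit initializer

// Hypothetical helper: true if every byte of the object representation is zero.
template <class P>
bool all_bytes_zero(const P& value)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t n = 0; n < sizeof(P); ++n)
        if (bytes[n] != 0)
            return false;
    return true;
}

// Expectations for a conforming compiler.
void check_member_pointer_zero_init()
{
    assert(implicit_null == 0);   // null value, whatever its bit pattern is
    assert(explicit_null == 0);

    int S::*first = &S::i;
    assert(first != 0);           // a valid pointer, even though 'i' sits at offset 0
}
```

On an implementation that reserves the all-ones pattern for the null value, 'all_bytes_zero(implicit_null)' would be expected to return false, and the same call on a pointer to a member at offset 0 would be expected to return true.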
