Oct 30 2009
 

I’ve seen this issue mentioned in some random and hard-to-reach places on the Net, so I thought I’d re-express it here for those who find Google sending them this way.

UPDATE: According to the discussion at https://trac.macports.org/ticket/27237, the real problem here is not fully dynamic strings, but the use of _GLIBCXX_DEBUG. So I recommend ignoring what follows, as it will not help you on Snow Leopard or Lion with gcc 4.6 and above.

On Snow Leopard, Apple decided to build g++ and the standard C++ library with “fully dynamic strings” enabled. What this means for you relates to the empty string.

When fully dynamic strings are off (as was true in Leopard), there exists a single global variable representing the empty string. This variable lives in the data segment of libstdc++, and so it does not exist on the heap. Whenever a string is destroyed, the standard library checks whether that string’s address matches the empty string’s: if so, it does nothing; if not, it calls free.

With fully dynamic strings on, there is no global empty string. All strings are on the heap, and once their reference count goes to zero, they get deallocated. Where this creates a problem is if you mix and match code. If a library that does have fully dynamic strings enabled (aka the standard library) receives an empty string from code which does not have it enabled (aka, the app you just built), it will try to free it and your application will crash.
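
To make the mismatch concrete, here is a minimal toy model in C++. It is not libstdc++ code: ToyRep, shared_empty_rep, and ToyString are hypothetical names, and the boolean template parameter merely stands in for compiling the same header with or without -D_GLIBCXX_FULLY_DYNAMIC_STRING. Reference counting is left out, and instead of crashing, the model reports when a dispose would call free on memory that never came from malloc.

```cpp
#include <cassert>
#include <cstdlib>

// Toy stand-in for basic_string's _Rep; NOT the real libstdc++ layout.
struct ToyRep {
    int refcount;
    bool heap_allocated;  // lets the model detect a bad free instead of crashing
};

// The single shared empty representation, living in static storage.
inline ToyRep& shared_empty_rep() {
    static ToyRep rep = {0, false};
    return rep;
}

// Models two builds of the same header (an assumption of this sketch):
// FullyDynamic == true plays the role of -D_GLIBCXX_FULLY_DYNAMIC_STRING.
template <bool FullyDynamic>
struct ToyString {
    // Create the representation of an empty string.
    static ToyRep* construct_empty() {
        if (!FullyDynamic)
            return &shared_empty_rep();  // reuse the static singleton
        ToyRep* rep = static_cast<ToyRep*>(std::malloc(sizeof(ToyRep)));
        assert(rep != 0);
        rep->refcount = 0;
        rep->heap_allocated = true;
        return rep;
    }

    // Returns true if disposing would free memory that was never malloc'ed.
    static bool dispose_is_bad_free(ToyRep* rep) {
        if (!FullyDynamic && rep == &shared_empty_rep())
            return false;                // singleton recognized and skipped
        if (!rep->heap_allocated)
            return true;                 // would call free() on static storage
        std::free(rep);
        return false;
    }
};
```

Under these assumptions, ToyString<true>::dispose_is_bad_free(ToyString<false>::construct_empty()) returns true, which is the crashing combination described above; the other three pairings return false.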

Here’s a reproducible case for this issue using Boost:

#include <sstream>
#include <string>
#include <boost/variant.hpp>

int main()
{
  std::ostringstream buf;
  boost::variant<bool, std::string> data;
  data = buf.str();
  data = false;
  return 0;
}

In this case (which really happened to me), I created an empty string by calling ostringstream::str(). Since I don’t have fully dynamic strings on, its address is in data space, not on the heap. I pass this string to boost::variant, which makes a copy of that address. Later, when the variant is reassigned false, it calls ~basic_string to destroy the string. Since my standard library is compiled with fully dynamic strings, the destructor for basic_string doesn’t recognize that it’s the “special” empty string, so it tries to free it.

The solution to this problem is three-fold:

  1. You must be using the g++ that comes with Xcode, or if you build your own (say, via MacPorts), you must configure it using --enable-fully-dynamic-string. I’ve already submitted a patch to this effect to the MacPorts crew.

  2. All libraries must be compiled with -D_GLIBCXX_FULLY_DYNAMIC_STRING.

  3. Your own code must be compiled with -D_GLIBCXX_FULLY_DYNAMIC_STRING.
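
One way to confirm that a given translation unit actually saw the flag from steps 2 and 3 is to test the macro directly. This is only a sketch: the function names are made up for illustration, and it mirrors the #ifndef checks in the libstdc++ 4.2 headers quoted in the comments below, so other library versions may interpret the macro differently.

```cpp
#include <cassert>
#include <string>  // include a standard header first so library config is visible

// Reports whether this translation unit was compiled with the macro defined.
// (Hypothetical helper; it says nothing about how libstdc++ itself was built.)
inline bool fully_dynamic_string_defined() {
#ifdef _GLIBCXX_FULLY_DYNAMIC_STRING
    return true;
#else
    return false;
#endif
}

// Debug-build guard: aborts early if the flag was forgotten.
inline void require_fully_dynamic_string() {
    assert(fully_dynamic_string_defined() &&
           "compile with -D_GLIBCXX_FULLY_DYNAMIC_STRING");
}
```

Calling require_fully_dynamic_string() at the start of main() makes a debug build stop immediately if the flag is missing. It only checks the translation unit it is compiled in, so every library still has to be checked separately.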

You’ll know if this issue is biting you by looking at a stack trace in gdb. You’ll see a crash somewhere inside basic_string’s _M_destroy (which calls free). Move up the trace a bit and check whether the string it’s trying to free is 0 bytes long.

To recap: what’s happened is that an empty string constructed by code without fully dynamic strings got deallocated by code that had them enabled. That is, most likely you, or a library you built, handed an empty std::string to the system library.

 Posted at 4:35 am

  9 Responses to “A C++ gotcha on Snow Leopard”

  1. Thanks for posting this, although I confess I didn’t come across it until after I had spent many hours chasing the problem…

    For my own system I’m fixing the problem by patching /usr/include/c++/4.2.1/x86_64-apple-darwin10/bits/c++config.h, to ensure that `_GLIBCXX_FULLY_DYNAMIC_STRING` is always set.

    One thing that still confuses me about this bug is the fact that sometimes you *don’t* see it. For example, my test case was even simpler than yours and the failure was caused simply by *linking* to a boost library: http://article.gmane.org/gmane.comp.lib.boost.user/53178

    I’m struggling to come up with a better reason than “linker randomness” to explain this behaviour.

  2. According to me, Apple has implemented something called blocks, which can be used to create both closures and anonymous functions. Both of these are something that C and C++ programmers could really use – the Standard Template Library would be massively improved by these new types of objects.

  3. Something like this is slated for the next release of C++.

  4. First, thank you so much for the post. This has been invaluable in giving us a starting point in tracking down similar issues.

    Second, to put it bluntly, I think you’re wrong. :-) libstdc++ in Snow Leopard was compiled with _GLIBCXX_FULLY_DYNAMIC_STRING turned on.

    We originally had the same issue crop up, where we were getting these same errors like crazy. We eventually found this post and other information on the internet indicating that turning on _GLIBCXX_FULLY_DYNAMIC_STRING would fix the problem. We did that. It helped.

    However, even after it was turned on, the errors still occurred, but much less frequently and seemingly from different places than they originally occurred. It wasn’t completely spamming us anymore, so we ignored them for a while. But with the release of the new Safari in Lion (that doesn’t install a SIGABRT handler whereas the old Safari did, grrrr….) we were suddenly getting crashes instead of just log messages and were forced to face the issue.

    I was finally able to reproduce the errors with _GLIBCXX_FULLY_DYNAMIC_STRING turned on in a simple example project where an empty string is created via libstdc++, _GLIBCXX_FULLY_DYNAMIC_STRING is set in the project’s code, and the project deallocates the empty string, giving the same error you describe.

    To reproduce this, here are the steps to follow. Note that I’m on Snow Leopard (10.6.8) and am using Xcode 3.2.6.

    In Xcode, create a new “Command Line Tool” project.

    Replace the contents of the main.cpp file with the following:
    #include <iostream>
    #include <string>

    int main() {
        std::string myvar;
        std::cout << myvar << std::endl;
        return 0;
    }

    If you now compile this as Release and run the code, it will work fine. You’ll be given a blank line of output and no errors.

    Next, do “Get Info” on your project’s only Target. Make sure the Configuration at the top is “Release”. Go down to the “GCC 4.2 – Preprocessing” section and under “Preprocessor Macros” add “'_GLIBCXX_FULLY_DYNAMIC_STRING=1'”. Also, go down to the “GCC 4.2 – Code Generation” section and under “Optimization Level” change it from the default (“Fastest, Smallest [-Os]”) to “Fastest [-O3]”.

    This enables fully dynamic strings in the project and changes the optimization level (I’ll get to why that is necessary at the end). Now, if you recompile you will get the blank line of output and the familiar error. Here was mine:

    test-strings3(744) malloc: *** error for object 0x7fff7080c500: pointer being freed was not allocated
    *** set a breakpoint in malloc_error_break to debug
    Abort trap

    But wait! We compiled with fully dynamic strings enabled! That’s supposed to fix the issue, not cause it. Why are we getting the error in this case then? The root issue, as you correctly stated, is that something creates a std::string and uses the empty singleton string instead of allocating an empty string (in code with _GLIBCXX_FULLY_DYNAMIC_STRING off). This std::string is then passed to some other code that tries to free the singleton string (with _GLIBCXX_FULLY_DYNAMIC_STRING on). It boils down to a question of which functions in the templated std::basic_string class are compiled where (libstdc++ vs. the project binary), whether _GLIBCXX_FULLY_DYNAMIC_STRING is enabled there or not, and which versions of these functions are actually called (libstdc++’s vs. the project’s binary’s). This question is actually quite complicated and subtle because whether or not templated functions are inlined or linked to is complicated and subtle.

    To explain what’s going on, I’m going to walk through the creation and destruction functions of basic_string in detail. This is all templated C++ code in the standard library, so it’s all available in headers that you can access in Xcode. Primarily, it is in the basic_string.h and basic_string.tcc files, both of which you can get to in Xcode by using the command+shift+D shortcut.

    Let’s start with the basic_string default constructor (basic_string.h line 2025):
    template<typename _CharT, typename _Traits, typename _Alloc>
    inline basic_string<_CharT, _Traits, _Alloc>::
    basic_string()
    #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
    : _M_dataplus(_S_empty_rep()._M_refdata(), _Alloc()) { }
    #else
    : _M_dataplus(_S_construct(size_type(), _CharT(), _Alloc()), _Alloc()) { }
    #endif

    This code is as you would expect: if _GLIBCXX_FULLY_DYNAMIC_STRING is off, _S_empty_rep()._M_refdata() is called to get the singleton empty string value, and if _GLIBCXX_FULLY_DYNAMIC_STRING is on, _S_construct is called to allocate a new string. It’s worth next looking at _S_construct (basic_string.tcc line 162):

    template<typename _CharT, typename _Traits, typename _Alloc>
    _CharT*
    basic_string<_CharT, _Traits, _Alloc>::
    _S_construct(size_type __n, _CharT __c, const _Alloc& __a)
    {
    #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
        if (__n == 0 && __a == _Alloc())
            return _S_empty_rep()._M_refdata();
    #endif
        // Check for out_of_range and length_error exceptions.
        _Rep* __r = _Rep::_S_create(__n, size_type(0), __a);
        if (__n)
            _M_assign(__r->_M_refdata(), __n, __c);

        __r->_M_set_length_and_sharable(__n);
        return __r->_M_refdata();
    }

    Somewhat surprisingly, there is also a check in here for _GLIBCXX_FULLY_DYNAMIC_STRING. In the case where _GLIBCXX_FULLY_DYNAMIC_STRING is off, the length of the string being allocated is 0, and the allocator is a default allocator, then once again _S_empty_rep()._M_refdata() is called to get the singleton. I guess they needed to cover the case of std::string(""). These two places are the only places I’ve found in the code that hand out the singleton empty string when _GLIBCXX_FULLY_DYNAMIC_STRING is off. This means that any time the error in question crops up, one of these two functions was compiled with _GLIBCXX_FULLY_DYNAMIC_STRING turned off.

    Now, let’s follow destruction of an empty basic_string. The basic_string destructor is in basic_string.h line 471:

    ~basic_string()
    { _M_rep()->_M_dispose(this->get_allocator()); }

    That _M_dispose function is the next place to look, in the _Rep inner class of basic_string. It’s in basic_string.h, line 220:

    void
    _M_dispose(const _Alloc& __a)
    {
    #ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
        if (__builtin_expect(this != &_S_empty_rep(), false))
    #endif
            if (__gnu_cxx::__exchange_and_add(&this->_M_refcount, -1) <= 0)
                _M_destroy(__a);
    } // XXX MT

    As you would expect, there is a check when _GLIBCXX_FULLY_DYNAMIC_STRING is off to see if “this” is the empty string rep, and if it is, the code to decrement the reference count and subsequently destroy the string representation is not called. The takeaway is that any time the error in question crops up, the version of _M_dispose being called must have been compiled with _GLIBCXX_FULLY_DYNAMIC_STRING on.

    The last place to look is _M_destroy (basic_string.tcc line 423):

    template<typename _CharT, typename _Traits, typename _Alloc>
    void
    basic_string<_CharT, _Traits, _Alloc>::_Rep::
    _M_destroy(const _Alloc& __a) throw ()
    {
        const size_type __size = sizeof(_Rep_base) +
                                 (this->_M_capacity + 1) * sizeof(_CharT);
        _Raw_bytes_alloc(__a).deallocate(reinterpret_cast<char*>(this), __size);
    }

    Note that there is no check for _GLIBCXX_FULLY_DYNAMIC_STRING here. I only point this out because when the error occurs, this is usually the last basic_string related function on the stack, and the deallocate function is calling “free()” with memory not allocated through “malloc()”, causing the error.

    Now that we’ve gone through the code, I’ll quickly recap: In order for the error to happen, either the basic_string constructor or the basic_string::_S_construct function called was compiled with _GLIBCXX_FULLY_DYNAMIC_STRING off and the basic_string::_Rep::_M_dispose function called was compiled with _GLIBCXX_FULLY_DYNAMIC_STRING on.

    So where am I going with all of this? Since in the example I gave above, _GLIBCXX_FULLY_DYNAMIC_STRING was on for all of the project’s code, that means that if the error is happening, then the basic_string constructor or _S_construct is being called from libstdc++, and that version is compiled with _GLIBCXX_FULLY_DYNAMIC_STRING off. A direct contradiction of your initial assertion!

    You can even double check this using MacDependency. If you look at the compiled binary from the example when the error occurs in MacDependency, you will see that it actually imports the following symbols:

    std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct(unsigned long, char, std::allocator<char> const&)
    std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_destroy(std::allocator<char> const&)

    So that means it’s definitely calling libstdc++’s implementation of _S_construct and not calling libstdc++’s implementation of _M_dispose (since it isn’t imported and _M_destroy, called from _M_dispose, is called). Now, remember that change to the optimization level WAY back up at the beginning. If you change the optimization level back to -Os and leave _GLIBCXX_FULLY_DYNAMIC_STRING on, the error does not occur. If you look at the resulting binary in MacDependency, you’ll see imported:

    std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct(unsigned long, char, std::allocator<char> const&)
    std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_dispose(std::allocator<char> const&)

    So, in this case, _M_dispose is not inlined into the code in the project and instead is linked to in libstdc++. Since it matches the compilation of _S_construct still being called, the singleton empty string isn’t freed and is instead ignored, so the error no longer occurs.

    And if you still don’t believe me, open up the example I gave in the debugger and step into the initialization of that empty string. You’ll eventually end up in an implementation of _S_construct in libstdc++. As I stepped through it, it was pretty clear that it was doing a simple check (for the length being 0) and then moving a single address (the address we keep seeing in our error messages) before returning. Very clearly running the code that is generated when _GLIBCXX_FULLY_DYNAMIC_STRING is disabled.

    Ultimately, in our project when the error was coming up with std::string and we’d already enabled _GLIBCXX_FULLY_DYNAMIC_STRING, we started setting the optimization level to -Os instead of -O3. However, this solution seems pretty arbitrary, as we’re relying on the compiler’s decision about whether a function should be inlined. At the very least, it works, and we understand the root issue a lot better now. But at some point, if there is a change in the compiler, this could potentially break again.

    However, this leaves me very puzzled about your initial problem, John. And puzzled about our initial problem here that we mostly fixed by enabling _GLIBCXX_FULLY_DYNAMIC_STRING. And puzzled about why enabling _GLIBCXX_FULLY_DYNAMIC_STRING seems to fix this error for so many people in Snow Leopard (and probably in Lion now as well). I can see why at first blush it appears that libstdc++ must have changed to enable _GLIBCXX_FULLY_DYNAMIC_STRING, but as I’ve shown, that can’t possibly be the case. Is it possible it was changed back in a later version of Snow Leopard? Is it possible that some 3rd party library actually turns this on somewhere? I’ve noticed that in all the examples I could find that trigger this error, all of them included some 3rd party library (boost or something else). My example here is the only example I’ve seen that does not use a 3rd party library and still triggers the error message. That leads me to lean more toward the 3rd party library being the issue, but I don’t have any evidence one way or the other on that.

    Anyway, I wanted to post this, at the very least, to give some perspective for anyone who is having this error even after enabling _GLIBCXX_FULLY_DYNAMIC_STRING like we did, as it looks like this issue is a lot more subtle and complicated than it originally appeared (and at first blush it’s already very subtle and complicated).

  5. Dammit… I spent a couple hours editing that comment, and then got the very first assertion wrong. I meant to say:

    “Second, to put it bluntly, I think you’re wrong. :-) libstdc++ in Snow Leopard was compiled with _GLIBCXX_FULLY_DYNAMIC_STRING turned off.”

  6. I found this post as a good Google hit for an identical headache developing for Android (NDK, native code, JNI and multiple libraries etc). It’s not quite your case but I think elaborating on my problem could help someone somewhere, or be interesting:

    If I had a better feel for how stl works and what it means to be linking with the stl library statically vs as a shared library, I may have avoided it, but in my case the error message wasn’t obvious to me:
    05-24 16:33:12.736: A/libc(25191): @@@ ABORTING: INVALID HEAP ADDRESS IN dlfree
    05-24 16:33:31.227: A/libc(25191): Fatal signal 11 (SIGSEGV) at 0xdeadbaad (code=1)

    Debugging showed that the problem was an (the) empty std::string being returned as the result of a call between my test library and a different shared library of mine. The segfault was the empty string then being disposed and destroyed at the end of the method scope.

    The explanation is as you found: this empty string object is implemented as something that shouldn’t be freed. And I’ve tried to statically link with Android’s libgnustl_static. I didn’t realize it, but evidently my two libraries have different concepts of what this special empty string data is, and so the test library didn’t detect that the object in this case must not be freed.

  7. THANK you. I was having this issue all over the place and only by debugging into the place where this occurs did I even come close.

    I had an issue very closely related to this thread, where a string assignment failed for seemingly no reason – a

    myStr = "reports/"

    line failed, even though I debugged and watched the constructor create the object. After some tracing through a debugger, it once went to the place this macro is defined and I found this thread.

    Thanks for some insight!

  8. Just a follow up to the great post and comments.

    We recently faced an issue with a crash in the destructor of an empty string. For us, the problem was only exhibited when the string class (in our case, a custom subclass of the std::basic_string with uint16 characters) was defined in a dynamic library. We had previously removed the _GLIBCXX_FULLY_DYNAMIC_STRING define, which solved the crash in most executables. However, one application continued to crash in the destructor of an empty string even without _GLIBCXX_FULLY_DYNAMIC_STRING defined. For us, the final problem was solved by setting GCC_SYMBOLS_PRIVATE_EXTERN = NO.

  9. I bet that this is another instance of an issue I found where libc++ (clang/OS X’s new C++ standard library) tries to be ABI-compatible with libstdc++ (the GCC C++ standard library) but gets it wrong and ends up deallocating a string while there are still references to it. Since libSystem pulls in libc++ on newer versions of OS X (I checked on 10.8, but earlier versions likely do the same), this ABI-compatibility issue affects *all* programs, even those built with gcc and libstdc++, since you always link against libSystem, and hence the broken code from libc++ always gets pulled in. I’ve filed the bug with Apple as bug 13100815. The following shell script produces a broken executable:

    #!/bin/bash

    echo "===== CREATING bug.cpp"
    cat > bug.cpp <<EOD
    #include <stdexcept>

    struct err : public std::runtime_error {
    err(const std::runtime_error& e) :std::runtime_error(e) {}
    };

    int main(int argc, char** argv) {
    std::runtime_error rt("error");
    err e(rt);
    }
    EOD

    echo "===== BUILDING"
    clang++ -mmacosx-version-min=10.5 -g -o bug bug.cpp

    echo "===== RUNNING"
    if ! ./bug; then
    echo "---> FAILED, as expected"
    else
    echo "---> SUCCEEDED?"
    fi

    Running the produced binary with

    DYLD_PRINT_BINDINGS=1 ./bug | c++filt

    shows which symbols are pulled in from which library. On 10.8, I see

    dyld: bind indirect sym: bug:__ZNSt13runtime_errorC2ERKS_$lazy_ptr = libc++.1.dylib:std::runtime_error::runtime_error(std::runtime_error const&), *0x100001078 = 0x7fff8e9698b8
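
The failure described in comment 6 above, where two libraries each carry their own copy of the special empty string, can be modeled with a short C++ sketch. Everything here is hypothetical (EmptyRep, RuntimeCopy, would_free, and the two library names are invented for illustration), and it assumes each statically linked copy of the runtime owns a private singleton, as that comment suggests.

```cpp
#include <cassert>

// Toy stand-in for the empty-string representation; not the real layout.
struct EmptyRep {
    int refcount;
};

// Each instantiation models one shared library that statically links its own
// copy of the C++ runtime, and therefore owns a private singleton (assumption).
template <int LibraryId>
struct RuntimeCopy {
    static EmptyRep& empty_rep() {
        static EmptyRep rep = {0};
        return rep;
    }
    // What a default-constructed string would point at in this library.
    static EmptyRep* make_empty() { return &empty_rep(); }
    // The address comparison a destructor uses to decide "do not free".
    static bool recognizes_empty(const EmptyRep* rep) {
        return rep == &empty_rep();
    }
};

typedef RuntimeCopy<1> TestLibrary;   // hypothetical names
typedef RuntimeCopy<2> OtherLibrary;

// Returns true when destroying `rep` inside Library would attempt a free().
template <typename Library>
bool would_free(const EmptyRep* rep) {
    assert(rep != 0);
    return !Library::recognizes_empty(rep);
}
```

In this model, would_free<TestLibrary>(OtherLibrary::make_empty()) is true even though neither side enabled fully dynamic strings: the address check only works when both sides share a single singleton.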
