The missing C++ string class
cstr is the string class you expected to find in the STL, but didn't. The design goal of cstr is to make all the string handling functions from the standard C library available within a single C++ class.
The terse names of the original C functions have been preserved to ease the transition for C programmers, but the "str" prefix has been dropped. For example, strlen(mystring) is now mystring.len(). New functions have been given short but less cryptic names in the same style (one word, all lower case).
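The naming rule can be illustrated with a small sketch. The class below is a hypothetical, non-owning wrapper written only for this illustration; it is not the real cstr class, and apart from len() the method names (cmp, chr) are assumptions derived from the rule described above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical illustration only: a thin, non-owning wrapper.
// This is NOT the real cstr class.
class mini_str {
public:
    explicit mini_str(const char* text) : text_(text) {
        assert(text != nullptr);
    }

    // strlen(s) becomes s.len()
    std::size_t len() const { return std::strlen(text_); }

    // strcmp(a, b) becomes a.cmp(b) (assumed name, same naming rule)
    int cmp(const char* other) const { return std::strcmp(text_, other); }

    // strchr(s, c) becomes s.chr(c) (assumed name); this sketch returns
    // an index instead of a pointer, or -1 if the character is not found
    long chr(char c) const {
        const char* hit = std::strchr(text_, c);
        return hit ? static_cast<long>(hit - text_) : -1;
    }

private:
    const char* text_;
};
```

With this sketch, mini_str("hello").len() replaces strlen("hello"), and the other calls follow the same pattern.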
Most cstr functions work as they do in the standard C library, or slightly better. Some of the improvements:
- Automatic buffer allocation and deallocation.
- The resulting string is always zero terminated.
- All functions are thread safe in the sense that the class (but not an individual object) can be used by multiple threads concurrently.
- Pre-conditions are tested using assert(), which makes it possible to trap "out of bounds" errors in debug builds, with no speed penalty in release builds.
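The assert() technique from the last point can be sketched as follows. The helper char_at() is hypothetical and not part of cstr; it only shows how a pre-condition check is active in a debug build and compiled out when NDEBUG is defined, as is typical for release builds.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of cstr: returns the character at `index`.
// Pre-condition: `index` must lie inside the string (terminator excluded).
inline char char_at(const char* text, std::size_t index) {
    assert(text != nullptr);
    // Traps an out-of-bounds index in debug builds. When NDEBUG is defined,
    // the assert() macro expands to nothing, so the check costs no time.
    assert(index < std::strlen(text));
    return text[index];
}
```

Calling char_at("hello", 9) would abort with a diagnostic in a debug build; in a build with NDEBUG defined the check is absent and the call would read out of bounds.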
Download
Click here to download the source code. It is free to use and distribute. The ZIP archive is only 8kB and includes a full set of unit tests. The last update was October 2008.
Memory Allocation
Memory is allocated on the heap when a string object is created, and released when the object is destroyed. Memory is allocated in blocks of a fixed size. More memory is automatically allocated if a string grows out of its current allocation. Memory is not released when a string shrinks unless garb() is called. The block size is controlled by MEM_ALLOC_SIZE and can be changed to find the right balance between memory consumption and memory fragmentation.
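The allocation policy described above can be modelled in a few lines. This is a sketch of the policy only, not the actual cstr implementation: the value 16 for MEM_ALLOC_SIZE and the names capacity_for and capacity_model are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed value for illustration; the real default in cstr may differ.
constexpr std::size_t MEM_ALLOC_SIZE = 16;

// Bytes needed for `length` characters plus the zero terminator,
// rounded up to a whole number of blocks.
constexpr std::size_t capacity_for(std::size_t length) {
    return ((length + 1 + MEM_ALLOC_SIZE - 1) / MEM_ALLOC_SIZE) * MEM_ALLOC_SIZE;
}

// Hypothetical model that tracks sizes only; it allocates no memory.
struct capacity_model {
    std::size_t capacity = MEM_ALLOC_SIZE; // one block on creation
    std::size_t length = 0;

    // Growing allocates more blocks; shrinking keeps the allocation.
    void resize(std::size_t new_length) {
        const std::size_t needed = capacity_for(new_length);
        if (needed > capacity) capacity = needed;
        length = new_length;
    }

    // Releases the blocks that the current string no longer needs.
    void garb() {
        assert(capacity >= capacity_for(length));
        capacity = capacity_for(length);
    }
};
```

In this model, growing to 40 characters reserves 48 bytes, shrinking to 3 characters keeps those 48 bytes, and garb() brings the allocation back to a single 16-byte block.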
Performance
The performance of cstr depends on the performance of the string and memory handling functions in the standard C library on the target system. The cstr class does not use any copy-on-write optimization, since it has been shown that such optimizations do not work well in a multi-threaded environment.
If you have a performance problem related to string creation, you can try to re-use string objects instead of destroying old objects and creating new ones. If you are building long strings in small increments, a bigger MEM_ALLOC_SIZE will improve performance by reducing the number of re-allocations.
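The effect of the block size on the number of re-allocations can be estimated with a small simulation. The function below is a hypothetical model, assuming that the buffer starts with one block and grows to the smallest sufficient multiple of the block size; the real cstr growth logic may differ in detail.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: counts how often a buffer must grow while a string is
// built by appending `pieces` chunks of `piece_len` characters each, when
// memory is allocated in fixed blocks of `block_size` bytes.
inline std::size_t count_reallocations(std::size_t pieces,
                                       std::size_t piece_len,
                                       std::size_t block_size) {
    assert(block_size > 0);
    std::size_t capacity = block_size; // initial allocation: one block
    std::size_t length = 0;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < pieces; ++i) {
        length += piece_len;
        const std::size_t needed = length + 1; // room for the terminator
        if (needed > capacity) {
            // Grow to the smallest multiple of block_size that fits.
            capacity = ((needed + block_size - 1) / block_size) * block_size;
            ++reallocations;
        }
    }
    return reallocations;
}
```

Under these assumptions, appending 100 pieces of 10 characters takes 62 re-allocations with a block size of 16, but only 3 with a block size of 256.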
Localization
The cstr class uses the char data type to represent individual characters. This means that cstr is limited to character sets (code pages) that contain no more characters than can be represented by a single char variable. The cstr class relies on the current locale setting to identify the current character set. The "C" locale is usually the default locale. The character set of the "C" locale is 7-bit ASCII, which is only suitable for English text. The locale setting can be changed with the setlocale() function in the C library. This is usually done only once, at the beginning of a program.
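The following sketch shows how the locale setting affects character classification in the C library. It uses only standard functions (setlocale, isalpha); the helper names is_letter, use_c_locale and use_environment_locale are made up for this example and are not part of cstr.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>

// Hypothetical helper: asks the C library whether `c` is a letter in the
// current locale. The cast to unsigned char matters, because passing a
// negative char value to isalpha() is undefined behaviour.
inline bool is_letter(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

// Selects the "C" locale, which every implementation provides.
inline void use_c_locale() {
    const char* result = std::setlocale(LC_ALL, "C");
    assert(result != nullptr);
    (void)result; // avoids an unused-variable warning when NDEBUG is defined
}

// Selects the user's preferred locale from the environment.
// Returns false if that locale is not available on the system.
inline bool use_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

In the "C" locale, is_letter('\xE9') is false because only the 52 ASCII letters count as alphabetic. After switching to a single-byte locale in which that byte is a letter (for example 'é' in ISO 8859-1), the same call may return true, depending on which locales are installed.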
Pitfalls
Certain tradeoffs have been made that have to do with complex C++ issues such as operator overloading. These tradeoffs create a couple of pitfalls that are best described by example:
// The following code will compile:
cstr s("hello");
if(s[0] == 'h') s.set(0,'H');
// The following code will NOT compile:
cstr s("hello");
if(s[0] == 'h') s[0] = 'H'; // Error
// The following code will compile:
cstr s1("Hello");
cstr s2 = s1 + " world!";
// The following code will NOT compile:
cstr s2 = "Hello" + " world!";
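One plausible explanation for both pitfalls can be sketched with a stand-in class. The class tiny_str below is hypothetical and built on std::string for brevity; it assumes that operator[] returns the character by value and that operator+ requires a class object as its left operand, which matches the behaviour shown above but is not confirmed to be how cstr is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for cstr, built on std::string for brevity.
class tiny_str {
public:
    tiny_str(const char* text) : data_(text) {}

    // Returns a copy of the character, not a reference. Assigning to the
    // result, as in `s[0] = 'H'`, is therefore rejected by the compiler.
    char operator[](std::size_t index) const {
        assert(index < data_.size());
        return data_[index];
    }

    // Writing goes through an explicit, bounds-checked setter instead.
    void set(std::size_t index, char ch) {
        assert(index < data_.size());
        data_[index] = ch;
    }

    const char* c_str() const { return data_.c_str(); }

    // This overload is only considered when the left operand is a tiny_str.
    // In `"Hello" + " world!"` both operands are character arrays that decay
    // to pointers, and C++ does not allow adding two pointers.
    friend tiny_str operator+(const tiny_str& left, const char* right) {
        return tiny_str((left.data_ + right).c_str());
    }

private:
    std::string data_;
};
```

With this stand-in, wrapping the first literal is enough to make the concatenation compile: tiny_str s2 = tiny_str("Hello") + " world!"; The same workaround, cstr("Hello") + " world!", is likely to work with cstr, but that has not been verified here.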
Feedback
Feedback about cstr can be sent to the following email address: