Category Archives: C / C++ / C++11

Self Expanding Loops in C

Imagine you want to define a C preprocessor macro like this:

#define NUM_NOPS(i) .....

and later on, in your code:

NUM_NOPS(5);

and you want the preprocessor to expand it into the following code:

asm volatile("nop");
asm volatile("nop");
asm volatile("nop");
asm volatile("nop");
asm volatile("nop");

This is not as easy as it looks. One possibility is to use Boost’s Preprocessor Repetition utility: <boost/preprocessor/repetition/repeat.hpp>.
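
A sketch of that approach could look like this (ONE_NOP is just a hypothetical helper name; BOOST_PP_REPEAT invokes it once per repetition, and the count has to be a plain integer literal):

#include <boost/preprocessor/repetition/repeat.hpp>

// helper macro, expanded once per repetition by BOOST_PP_REPEAT
#define ONE_NOP(z, n, data) asm volatile("nop");
// NUM_NOPS(5) then expands to five nop statements
#define NUM_NOPS(i) BOOST_PP_REPEAT(i, ONE_NOP, ~)

But let’s assume you don’t want to use any third-party libraries. There is one further possibility: you can use GCC’s unroll-loops functionality: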

#pragma GCC push_options
#pragma GCC optimize ("unroll-loops")
#pragma GCC optimize ("O2")
void nops() {
  int i = 0;
  for(; i < 10 ; i++) {
    asm volatile("nop");
  }
}
#pragma GCC pop_options

int main(void) {
  nops();
}

You should compile it with:

gcc -c -o test.o -O3 test.c

A look at the disassembly proves that the loop was unrolled:

objdump -d test.o

test.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <nops>:
   0:   90                      nop
   1:   90                      nop
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   c3                      retq   

Disassembly of section .text.startup:

0000000000000000 <main>:
   0:   31 c0                   xor    %eax,%eax
   2:   e8 00 00 00 00          callq  7 <main+0x7>
   7:   f3 c3                   repz retq 

Gentoo: Using splitdebug for specific packages

It is often favourable to keep debugging information for certain packages.

Gentoo lets you keep those symbols by adding

CFLAGS="${CFLAGS} -ggdb"
CXXFLAGS="${CXXFLAGS} -ggdb"
FEATURES="${FEATURES} splitdebug"

to your make.conf.

However, this enables debugging information for all packages. To keep debugging symbols only for certain packages, you can define a splitdebug environment in /etc/portage/env/:

/etc/portage/env/splitdebug.conf:

CFLAGS="${CFLAGS} -ggdb"
CXXFLAGS="${CXXFLAGS} -ggdb"
FEATURES="${FEATURES} splitdebug"

Now you can add your packages to /etc/portage/package.env:

sys-libs/glibc splitdebug.conf

So when compiling glibc, portage will use the splitdebug environment and keep debugging symbols.
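
For example, to rebuild glibc with the new environment and check where the detached symbols ended up (the exact subdirectories depend on your system, so treat the ls path as an example):

emerge --oneshot sys-libs/glibc
ls /usr/lib/debug        # splitdebug typically installs the separated symbols here

gdb looks in /usr/lib/debug for separate debug files by default, so it should pick them up automatically.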

Example Implementation of a Least Recently Used Cache in C++11

This is yet another example implementation of a Least Recently Used (LRU) cache written in C++11.
It is not designed to be thread-safe, but thread safety could easily be achieved by locking the
same mutex at the beginning of every public method (a small sketch of such a wrapper follows after the header below).

This cache uses std::map and std::list as container types. Copy and move constructors are implemented as well.
You may also download it here.
Usage:

LRUCache<int, string> cache(10); // Max. 10 Elements

cache[4] = "Foobar";
cache.insert(2, "Barfoo");

cout << cache[2] << endl;

auto tmp = cache.getCopy(2);
if(tmp == nullptr) {
  cout << "Not found..." << endl;
} else {
  cout << *tmp << endl;
}

LRUCache.h:

#ifndef LRUCACHE_H_
#define LRUCACHE_H_

#include <memory>
#include <map>
#include <list>

template <typename Key, typename Value>
class LRUCache {

private:
	typedef std::list<Key> HistoryType;
	typedef typename HistoryType::iterator HistoryTypeIterator;
	HistoryType _history;

	typedef std::map<Key, std::pair<Value, HistoryTypeIterator> > CacheType;
	typedef typename CacheType::iterator CacheTypeIterator;
	CacheType _cache;

	size_t _maxCapacity;

	void evict(size_t numElements = 1) {
		if(numElements > _history.size()) {
			numElements = _history.size();
		}

		while(numElements--) {
			auto it = _cache.find( _history.front() );
			_cache.erase(it);
			_history.pop_front();
		}
	}

	void updateHistory(const CacheTypeIterator& it) {
		_history.splice(_history.end(), _history, it->second.second);
	}
public:
	// if maxCapacity = 0 -> unlimited capacity
	LRUCache(size_t maxCapacity) : _maxCapacity(maxCapacity) {

	}

	~LRUCache() {

	}

	LRUCache(const LRUCache& other) : _maxCapacity(other._maxCapacity) {
		std::cout << "Copy Con" << std::endl;
		_cache = other._cache;
		_history = other._history;

		// Adjust Iterators
		for(auto it = _history.begin() ; it != _history.end() ; ++it)
			_cache.find(*it)->second.second = it;
	}

	LRUCache(LRUCache&& other) : _maxCapacity(other._maxCapacity) {
		_history = std::move(other._history);
		_cache = std::move(other._cache);
	}

	void dropCache(size_t maxRemainingElements = 0) {
		if(maxRemainingElements == 0) {
			_cache.clear();
			_history.clear();
		} else if (_history.size() > maxRemainingElements) {
			evict(_history.size() - maxRemainingElements);
		}
	}

	size_t size() {
		return _history.size();
	}

	void setMaxCapacity(const size_t& maxCapacity) {
		// problematic case: shrinking the cache (also when coming from unlimited capacity)
		if(maxCapacity != 0 && (_maxCapacity == 0 || maxCapacity < _maxCapacity))
			dropCache(maxCapacity);
		_maxCapacity = maxCapacity;
	}

	std::unique_ptr<Value> getCopy(const Key& id) {
		auto it = _cache.find(id);
		std::unique_ptr<Value> retval = nullptr;
		if(it != _cache.end()) {
			updateHistory(it);
			// Copy the element
			retval = std::unique_ptr<Value>(new Value(it->second.first));
		}

		return retval;
	}

	// Inserts new element. If the element already exists, it will be overwritten.
	// Returns a reference to the inserted object
	Value& insert(const Key& id, Value c) {
		// Check if the element is already existing
		auto it = _cache.find(id);
		if(it != _cache.end()) {
			// If the element exists, overwrite it
			it->second.first = c;
			updateHistory(it);
		} else {
			if(_maxCapacity != 0 && _history.size() == _maxCapacity) {
				evict();
			}
			auto end = _history.insert(_history.end(), id);
			auto newelem = _cache.insert( std::make_pair(id, std::make_pair(std::move(c), end ) ) );
			it = newelem.first;
		}

		return it->second.first;
	}

	Value&	operator[](const Key& id) {
		auto it = _cache.find(id);
		if(it != _cache.end()) {
			updateHistory(it);
			return it->second.first;
		}

		// Create new empty element
		return insert(id, Value());
	}
};
#endif
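
To see the eviction order in action, here is a small test program (hypothetical, not part of the original post) that fills a cache of capacity three, touches one entry and then inserts a fourth, so that the least recently used entry gets evicted:

#include <cassert>
#include <string>

#include "LRUCache.h"

int main() {
	LRUCache<int, std::string> cache(3);  // room for three entries

	cache.insert(1, "one");
	cache.insert(2, "two");
	cache.insert(3, "three");

	cache[1];                 // touch 1, so 2 becomes the least recently used entry
	cache.insert(4, "four");  // capacity reached -> 2 gets evicted

	assert(cache.getCopy(2) == nullptr);  // 2 is gone
	assert(cache.getCopy(1) != nullptr);  // 1 survived because it was touched
	return 0;
}

And, as mentioned above, a thread-safe variant could simply wrap the cache and lock one mutex in every public method. A minimal sketch of that idea (only two methods shown; methods that hand out references to cached values are left out on purpose, since a returned reference would escape the lock):

#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>

#include "LRUCache.h"

template <typename Key, typename Value>
class SynchronizedLRUCache {
	LRUCache<Key, Value> _cache;
	std::mutex _mutex;

public:
	SynchronizedLRUCache(size_t maxCapacity) : _cache(maxCapacity) {}

	void insert(const Key& id, Value c) {
		std::lock_guard<std::mutex> lock(_mutex);
		_cache.insert(id, std::move(c));
	}

	std::unique_ptr<Value> getCopy(const Key& id) {
		std::lock_guard<std::mutex> lock(_mutex);
		return _cache.getCopy(id);
	}
};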

Linux’s strange memory allocation strategy

As I learned today, Linux has a really strange memory allocation strategy. Have a look at this piece of code:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  while(1) {
    char *foo = (char*)malloc(1024);
    if(foo == NULL) {
      printf("Couldn't alloc\n");
      fflush(stdout);
      return 0;
    }
  }
  return 0;
}

According to the malloc reference, it should return NULL if it is unable to allocate the memory:

“If the function failed to allocate the requested block of memory, a null pointer is returned.”

So I expected my process to write Couldn’t alloc to stdout and exit cleanly. But it didn’t: when my system ran out of memory, the process was killed by the kernel with SIGKILL. So why does it behave like this? Wikipedia writes about Out of memory:

“A process which exceeds its per-process limit and then attempts to allocate further memory – with malloc(), for example – will return failure. A well-behaved application should handle this situation gracefully; however, many do not. An attempt to allocate memory without checking the result is known as an “unchecked malloc”.”

Well… yes… of course you should always check whether malloc() returned NULL, but under normal conditions it will practically never return NULL, because of Linux’s memory overcommitting. By default, Linux has an optimistic memory allocation strategy: malloc() returns a pointer right away, but the allocation is only backed by physical pages once they are actually written to. In my opinion this is really strange behaviour, because memory that gets allocated will usually be used eventually. Here’s a short example that shows the effect:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MALLOC_SIZE (1024*1024*10)

int main(void) {
  int i = 0;
  while(1) {
    char *foo = (char*)malloc(MALLOC_SIZE);
    //memset(foo, 0xaa, MALLOC_SIZE);
    printf("Allocation No. %d\n", i++);
    if(foo == NULL) {
      printf("Couldn't alloc\n");
      fflush(stdout);
      return 0;
    }
  }
  return 0;
}

With the memset line commented out, the output is:

ralf@Pegasus:C$ ./test|tail
Allocation No. 1841101
Allocation No. 1841102
Allocation No. 1841103
Allocation No. 1841104
Allocation No. 1841105
Allocation No. 1841106
Allocation No. 1841107
Allocation No. 1841108
Allocation No. 1841109
Allocation Nzsh: killed     ./test |
zsh: done       tail

And with the memset line enabled, the output is:

ralf@Pegasus:C$ ./test|tail
Allocation No. 1275
Allocation No. 1276
Allocation No. 1277
Allocation No. 1278
Allocation No. 1279
Allocation No. 1280
Allocation No. 1281
Allocation No. 1282
Allocation No. 1283
Allocazsh: killed     ./test |
zsh: done       tail

This is really funny: you can allocate tons of memory you don’t actually have. That’s creepy. But this behaviour can be changed with:

echo 2 > /proc/sys/vm/overcommit_memory

“Since 2.5.30 the values are: 0 (default): as before: guess about how much overcommitment is reasonable, 1: never refuse any malloc(), 2: be precise about the overcommit – never commit a virtual address space larger than swap space plus a fraction overcommit_ratio of the physical memory. Here /proc/sys/vm/overcommit_ratio (by default 50) is another user-settable parameter. It is possible to set overcommit_ratio to values larger than 100. (See also Documentation/vm/overcommit-accounting.)” Source
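
To query the current value, or to make the change survive a reboot, you can also use sysctl instead of writing to /proc directly (a sketch; adapt it to your distribution):

sysctl vm.overcommit_memory                            # show the current value
sysctl -w vm.overcommit_memory=2                       # same effect as the echo above
echo "vm.overcommit_memory = 2" >> /etc/sysctl.conf    # make it persistent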

With overcommit_memory set to 2, both runs (with and without the memset) produce:

ralf@Pegasus:C$ ./test|tail
Allocation No. 433
Allocation No. 434
Allocation No. 435
Allocation No. 436
Allocation No. 437
Allocation No. 438
Allocation No. 439
Allocation No. 440
Allocation No. 441
Couldn't alloc

And this is exactly the behaviour I originally expected. Without this change, malloc() will practically never return NULL, and a process has no chance to shut down gracefully when the system runs out of memory.
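
If changing the system-wide overcommit policy is not an option, a per-process limit on the address space has a similar effect for a single program. Here is a minimal sketch (the 256 MiB limit is an arbitrary value chosen for illustration): once the limit is exhausted, malloc() returns NULL even with overcommitting enabled.

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

int main(void) {
  /* Arbitrary example limit: cap this process' virtual address space at 256 MiB.
     Once it is used up, malloc() fails with NULL instead of the process
     getting killed later by the OOM killer. */
  struct rlimit lim;
  lim.rlim_cur = 256UL * 1024 * 1024;
  lim.rlim_max = 256UL * 1024 * 1024;
  if (setrlimit(RLIMIT_AS, &lim) != 0) {
    perror("setrlimit");
    return 1;
  }

  while (1) {
    char *foo = (char*)malloc(1024 * 1024);
    if (foo == NULL) {
      printf("Couldn't alloc\n");
      return 0;
    }
  }
}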

Here’s a nice anecdote on that topic.