Monday, August 6, 2012

Reference Counting

The code related to the memory management is generally collected under mem/. The reference counting has two parts to it:

  • The objects that can be managed by reference counting.
  • The references that do the counting.

The managed objects come in two varieties: single-threaded and multi-threaded. The single-threaded objects lead their whole life in a single thread, so their reference counts don't need locking. The multi-threaded objects can be shared by multiple threads, so their reference counts are kept thread-safe by using the atomic integers (if the NSPR library is available) or by using a lock (if NSPR is not used). That whole implementation of atomic data with or without NSPR is encapsulated in the class AtomitInc in mem/Atomic.h.

The way a class selects whether it will be single-threaded or multi-threaded is by inheriting from the appropriate class:

Starget (defined in mem/Starget.h) for single-threaded;
Mtarget (defined in mem/Mtarget.h) for multi-threaded.

If you do the multiple inheritance, the [SM]target has to be inherited only once. Also, you can't change the choice along the inheritance chain. Once chosen, you're stuck with it. The only way around it is by encapsulating that inner class's  object instead of inheriting from it.

The references are created with the template Autoref<>, defined in mem/Autoref.h. For example, if you have an object of class RowType, the reference to it will be Autoref<RowType>. There are are some similar references in the Boost library, but I prefer to avoid the avoidable dependencies (and anyway, I've never used Boost much).

The target objects are created in the constructors with the reference count of 0. The first time the object pointer is assigned to an Autoref, the count goes up to 1. After that it stays above 0 for the whole life of the object. As soon as it goes back to 0 (meaning that the last reference to it has disappeared), the object gets destroyed.No locks are held during the destruction itself. After all the references are gone, nobody should be using it, and destroying it is safe without any extra locks.

An important point is that to do all this, the Autoref must be able to execute the correct destructor when it destroys the object that ran out of references. Starget and Mtarget do not provide the virtual destructors. If you don't use the polymorphism for some class, you don't have to use the virtual destructors. But if you do use it, i.e. create a class B inheriting from A, inheriting from [SM]target, and then assign something like

Autoref<A> ref = new B;

then the class A (and by extension all the classes inheriting from it) must have a virtual destructor to get everything working right.

It's also possible to mess up the destruction with the use of pointers. For example, look at this sequence:

Autoref<RowType> rt = new RowType(...);
RowType *rtp = rt; // copies a reference to a pointer
rt = NULL; // reference cleared, count down to 0, object destroyed
Autoref <RowType> rt2 = rtp; // resurrects the dead pointer, corrupts memory

The lesson here is that even though you can mix the references with pointers to reduce the overhead (the reference assignments change the reference counters, the pointer assignments don't), and I do it in my code, you need to be careful. A pointer may be used only when you know that there is a reference that holds the object in place. Once that reference is gone, the pointer can't be used any more, and especially can't be assigned to another reference. Be careful.

There are more varieties of Autoref<>:

Onceref<>
const_Autoref<>
const_Onceref<>

The Onceref is an attempt at optimization when passing the function arguments and results. It's supposed to work like the standard auto_ptr: you assign a value there once, and then when that value gets assigned to an Autoref or another Onceref, it moves to the new location, leaving the reference count unchanged and the original Onceref as NULL. This way you avoid a spurious extra increase-decrease. However in practice I haven't got around to implementing it yet, so for now it's a placeholder that is defined to be an alias of Autoref.

const_Autoref<> is a reference to a constant object. Essentially, const_Autoref<T> is equivalent to Autoref<const T>, only it handles the automatic type casts much better. The approach is patterned after the const_iterator. The only problem with const_Autoref is that when you try to assign a NULL to it, that blows the compiler's mind. So you have to write an explicit cast of (T*)NULL of (const T*)NULL to help it out.

Finally, const_Onceref is the const version of Onceref.

No comments:

Post a Comment