Fake binary trees
... rants, ramblings and occasional good idea ...

Automatic memory management myth

At a recent interview, a job candidate ticked me off when we reached one of the topics that is very dear to my heart. We are mostly a C++ shop, but many of our job applicants come from a .NET background. So during the interview we came to the topic of explicit memory management vs. automatic memory management (a.k.a. garbage collection), and the guy (with substantial C++ experience) started ranting about how troublesome explicit memory management is, with all the extra caution required to call delete for every new, as opposed to the simplicity of platforms that support GC.

Well, if you happen to agree with this guy, then you are not doing it right :)

Although GC can be done in C++ (see here), that is not the point: the main issue is that many people are still writing C code with a C++ compiler.

Modern C++ is a very different beast from C, and as such provides different patterns for common problems. Smart pointers take over much of the manual work that is otherwise needed to handle allocations correctly. The STL comes with std::auto_ptr, which makes it trivial to ensure proper deallocation in the face of exceptions or normal scope exit.
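Here is a minimal sketch of that idea. Note the assumptions: it uses std::unique_ptr, the later standard replacement for std::auto_ptr (which was deprecated in C++11 and removed in C++17), and Widget, process and leaked_after_both_paths are hypothetical names invented only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical type that counts live instances so the cleanup is observable.
struct Widget {
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Allocates a Widget and optionally fails; there is no delete anywhere.
void process(bool fail) {
    std::unique_ptr<Widget> w(new Widget);
    assert(Widget::live == 1);
    if (fail) {
        throw std::runtime_error("something went wrong");
    }
    // Normal scope exit: the destructor of w deletes the Widget here.
}

// Runs both exit paths and returns the number of Widgets still alive.
int leaked_after_both_paths() {
    process(false);
    try {
        process(true);
    } catch (const std::runtime_error&) {
        // The Widget was already destroyed during stack unwinding.
    }
    return Widget::live;
}
```

Calling leaked_after_both_paths() should report zero live objects: the smart pointer releases the Widget on the normal path and on the exception path alike.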

If you need to handle an array, the STL is your friend again: std::vector owns its storage and takes care of resizing and reallocation transparently, so you don't have to worry about it.
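As a small illustration (squares is a made-up helper, not something from a library), a growing buffer needs no new[] or delete[] at all:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a buffer of n squares without any manual allocation.
std::vector<int> squares(std::size_t n) {
    std::vector<int> v;
    for (std::size_t i = 0; i < n; ++i) {
        // push_back grows the storage (and reallocates) whenever needed.
        v.push_back(static_cast<int>(i * i));
    }
    assert(v.size() == n);
    return v;  // the storage is freed when the vector that owns it is destroyed
}
```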

What is (currently) lacking in the STL is a smart pointer with shared-ownership semantics. However, there are plenty of such implementations, most notably the excellent boost::shared_ptr, a reference-counted implementation which makes sure that the allocated object is deleted when the last reference to it goes away. It will also become part of the new C++ standard, and some vendors support it already.
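A rough sketch of shared ownership follows. It assumes std::shared_ptr, the form in which boost::shared_ptr was adopted into C++11; Texture and live_after_last_owner_gone are hypothetical names used only for the demonstration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical shared resource that counts its live instances.
struct Texture {
    static int live;
    Texture()  { ++live; }
    ~Texture() { --live; }
};
int Texture::live = 0;

// Two owners share one Texture; it is deleted together with the last owner.
int live_after_last_owner_gone() {
    {
        std::shared_ptr<Texture> a = std::make_shared<Texture>();
        {
            std::shared_ptr<Texture> b = a;  // reference count is now 2
            assert(a.use_count() == 2);
        }                                    // b is gone, count drops to 1
        assert(Texture::live == 1);          // still alive, a owns it
    }                                        // a is gone, Texture is deleted
    return Texture::live;
}
```

Neither owner has to know whether it is the last one; the reference count decides.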

One area where GC (at least when implemented with a mark-and-sweep algorithm, as in Java or .NET) has an advantage over explicit memory management is circular references, i.e. when two objects hold a reference to each other. This is a real problem for reference-counted implementations, but most cases can be handled by judicious use of weak pointers. One such implementation is boost::weak_ptr, which will also be included in the new C++ standard.
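The usual way to break such a cycle can be sketched like this, assuming the standardized std::weak_ptr rather than boost::weak_ptr. Parent, Child and objects_left_after_scope are hypothetical names; the point is only that the back reference does not own its target.

```cpp
#include <cassert>
#include <memory>

struct Child;

struct Parent {
    static int live;
    std::shared_ptr<Child> child;  // owning reference
    Parent()  { ++live; }
    ~Parent() { --live; }
};

struct Child {
    static int live;
    std::weak_ptr<Parent> parent;  // non-owning back reference breaks the cycle
    Child()  { ++live; }
    ~Child() { --live; }
};

int Parent::live = 0;
int Child::live = 0;

// Builds a parent/child pair that refer to each other, then lets it go.
int objects_left_after_scope() {
    {
        std::shared_ptr<Parent> p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;

        // lock() yields a temporary owning pointer, or null if the parent is gone.
        std::shared_ptr<Parent> back = p->child->parent.lock();
        assert(back == p);
    }
    return Parent::live + Child::live;
}
```

Had Child held a std::shared_ptr<Parent> instead, the two objects would keep each other alive and the function would report two leaked objects.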

If you think this problem is a 'deal breaker' that justifies preferring platforms with GC, bear in mind that a similar problem exists even there, and that there is a very good reason why Java and .NET both provide a WeakReference class (e.g. see here for one such problem).

What I really find annoying is the fact that the only resource deemed important enough to be handled automatically is memory. For everything else, like database connections, kernel or GDI objects, clients have to explicitly call Dispose, and it seems that most developers on managed platforms have no problem with that. A coworker who recently switched from VB to .NET found out the hard way that you really have to call Dispose() on your bitmaps. What happens when you have to share such an object between multiple clients (yes, I know that there are idioms to avoid this resource sharing)? Who calls Dispose? Can you be sure that it has not already been disposed?

GC is a nice and useful thing, but it is not a silver bullet, and has its own problems.

Deterministic finalization is what makes it possible in C++ to treat all resources equally: whether it is memory, a bitmap, a COM object or a database connection, with a simple wrapper around it you can rest assured that the object will be properly released as soon as it is no longer referenced. Actually, smart pointers are only a special case of one of the most powerful concepts in C++, although a somewhat unfortunately named one: RAII.
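A minimal sketch of such a wrapper, under loud assumptions: acquire_handle and release_handle are stand-ins invented here for whatever real API hands out the resource (GDI, a database driver, and so on), and Handle is a hypothetical class name.

```cpp
#include <cassert>

// Stand-ins for a real resource API; purely hypothetical.
static int g_open_handles = 0;
int  acquire_handle()    { return ++g_open_handles; }
void release_handle(int) { --g_open_handles; }

// RAII wrapper: acquires in the constructor, releases in the destructor.
class Handle {
public:
    Handle() : id_(acquire_handle()) {}
    ~Handle() { release_handle(id_); }

    // Exactly one owner: copying would release the same handle twice.
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    int id() const { return id_; }

private:
    int id_;
};

// Uses a handle in a scope and returns how many handles are still open.
int open_handles_after_use() {
    {
        Handle h;
        assert(g_open_handles == 1);
    }  // released here, deterministically, with no Dispose call
    return g_open_handles;
}
```

If several clients need the same resource, wrapping such an object in a shared smart pointer answers the "who releases it?" question the same way it does for memory: the last owner does.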

If you are a C++ programmer, take some time to learn C++ idioms. It will make you a better programmer, and your code better code.

When you program in C++, write C++, not C.

Posted at 14:32 on April 17, 2008
Categories: C++ | Software Design

Access control based security

A natural question to ask after the previous post is: that's all fine and dandy, but how do you combine this with access control list (ACL) based security?

First, let's explain the issue here: what I refer to as 'ACL based security' is defining permissions (access rights) for individual resources, similar to the way operating systems control access to the file system. E.g. user 'xy' can see all tasks for projects he manages, but also tasks in other projects whose managers have granted access to 'xy', and tasks which are assigned to 'xy'. This changes our imaginary security API from HasPermission(user, permission) to HasPermissionFor(user, permission, object).

Although security is usually not considered business logic, the line starts to get blurry here. In my opinion, this is both a business and an infrastructure concept.

One possible solution, which unfortunately is seen all too often, is to retrieve data as usual and then throw away the resources which don't match the permissions. This approach fails miserably in many respects: performance, filtering, paging, etc.
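The paging failure is easy to see in a toy model. The sketch below is written in C++ only to keep it self-contained; Task, its visible flag (standing for the outcome of the ACL check for the current user) and both functions are hypothetical and have nothing to do with any real data access API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Task {
    int  id;
    bool visible;  // hypothetical: result of the ACL check for the current user
};

// Takes one page of rows first, then drops the rows the user may not see.
std::vector<Task> page_then_filter(const std::vector<Task>& all,
                                   std::size_t offset, std::size_t page_size) {
    std::vector<Task> out;
    for (std::size_t i = offset; i < all.size() && i < offset + page_size; ++i) {
        if (all[i].visible) out.push_back(all[i]);
    }
    return out;
}

// Applies the permission filter first (as a joined query would), then pages.
std::vector<Task> filter_then_page(const std::vector<Task>& all,
                                   std::size_t offset, std::size_t page_size) {
    std::vector<Task> out;
    std::size_t seen = 0;
    for (const Task& t : all) {
        if (!t.visible) continue;        // not permitted: never counted
        if (seen++ < offset) continue;   // skip permitted rows before the page
        if (out.size() == page_size) break;
        out.push_back(t);
    }
    assert(out.size() <= page_size);
    return out;
}
```

With five tasks of which only the second, fourth and fifth are visible, asking for a page of two gives a single row when filtering happens after paging, but a full page when the filter is part of the query.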

I am not sure if it is even possible to create a 'one size fits all' solution for this problem. However, in most systems that I have had to deal with, the following solution got me quite far.

Note: I presume that the infrastructure is already set up: additional database tables store ACL entries which define who is allowed or denied access to individual resources, so that queries which retrieve items from the database can join on these tables. This is not a trivial thing and can become quite complex, especially when you take into account resource hierarchies (e.g. project-task), but it is outside the scope of this post.

Anyway, suppose that we have an ITaskManagementService which exposes the following method:

[RequiresPermission(Permission.Edit)]
IList<Task> GetTasks( ... parameters...)
{
    // m_taskRepository is an instance of ISecureRepository<Task>
    return m_taskRepository.GetAll();
}

The service has a SecurityInterceptor implemented through Windsor/DP2, which checks for the RequiresPermission attribute and does something like:

class SecurityInterceptor : IMethodInterceptor
{
    public object Intercept(IMethodInvocation invocation, params object[] args)
    {
        // attr is the RequiresPermission attribute found on the invoked method (lookup omitted)
        CallContext[Context.Security] = new SecurityContext(CurrentUser, attr.Permission);
        return invocation.Proceed(args);
    }
}

Then, in the SecureRepository implementation, permissions are added to the query:

public IList<T> GetAll()
{
    ICriteria criteria = BuildCriteria();
    return criteria.List<T>();
}

// T is the type parameter of the repository class itself
public ICriteria BuildCriteria()
{
    ICriteria criteria = Session.CreateCriteria(typeof(T));
    SecurityContext security = (SecurityContext)CallContext[Context.Security];
    // now we modify criteria to join entity tables with ACL tables...
    AddPermissions(criteria, security.UserId, security.RequiredPermissions);
    return criteria;
}

This makes sure that the GetAll() method returns only those tasks for which the caller has sufficient permissions. The drawback of the solution is that it only works if the resources and permissions are stored in tables in the same database, so that you can join them, but this usually isn't an issue for small to medium solutions.

Posted at 18:01 on February 6, 2008
Categories: .NET | Software Design