Everyone knows the diatribe spouted by certain types of programming evangelists that a good garbage collector can give a program higher memory throughput than one under manual memory management (or some equivalent scheme like reference counting) — not only that but manually managing your own memory is pretty much a waste of good development time; no real programmer should have to lower him/herself to dealing with memory. So the appeal of Garbage Collection is pretty obvious. Why would anyone want to do their own memory management when you can let the computer do it for you, and at the same time have it work out to be more efficient?
The thing is, garbage collection isn’t a trivial problem, and it certainly hasn’t reached the point where it’s perfect. Cycles aren’t really an issue for modern tracing GCs (conservative or otherwise), not like they are with reference counting schemes, but that’s not to say they don’t still cause problems. Discovering cycles isn’t free, and in a complicated memory setup, even with a good, fully modern GC, you generally still end up with weak references / pointers like you would using simple reference counting.
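To make the weak reference point concrete, here is a minimal Java sketch. The `SceneNode` type and its methods are hypothetical, made up for this illustration. A tracing collector would reclaim a parent/child cycle on its own, so the weak back-pointer here isn't about breaking the cycle; it's there so that code holding on to a single leaf doesn't quietly keep the whole tree alive. Either way, you end up thinking about ownership, just like you would with reference counting.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node, named only for this illustration.
class SceneNode {
    final String name;
    // Strong references: a parent keeps its children alive.
    private final List<SceneNode> children = new ArrayList<>();
    // Weak back-reference: a child does not keep its parent alive.
    private WeakReference<SceneNode> parent = new WeakReference<>(null);

    SceneNode(String name) {
        this.name = name;
    }

    void addChild(SceneNode child) {
        children.add(child);
        child.parent = new WeakReference<>(this);
    }

    // Returns null if there is no parent or the parent has been collected.
    SceneNode getParent() {
        return parent.get();
    }

    int childCount() {
        return children.size();
    }
}

class WeakParentDemo {
    public static void main(String[] args) {
        SceneNode root = new SceneNode("root");
        SceneNode leaf = new SceneNode("leaf");
        root.addChild(leaf);

        // Callers must always handle the "parent is gone" case.
        SceneNode p = leaf.getParent();
        System.out.println(p == null ? "parent gone" : "parent is " + p.name);
    }
}
```

Note the cost: every caller of `getParent()` now has to cope with a `null` result, which is exactly the kind of extra bookkeeping "new and forget" was supposed to make unnecessary.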
I’m writing this because I’ve recently jumped on the GC bandwagon, as I’ve been using D quite a bit lately. Prior to D my only experience with GC was some time spent with Objective-C, 4 years of undergrad spent cursing at Java (but that doesn’t really count), and of course a myriad of interpreted languages like Python, PHP, Lua, Ruby, etc. I have used the Boehm GC with C and C++ a few times as well, but only in select cases where the scenario was a good match for a garbage collected system.
After spending some time really working with a GC language (in situations that are a less than ideal use for a GC system, i.e. a real-time interactive simulation), I’ve come up with a more rounded opinion of garbage collection, and I thought I ought to share some of the things I’ve discovered that the GC pundits don’t seem to want you to know:
- Managing memory using GC in complex data structure situations, where weak refs and cyclical dependencies are required (pretty much every largish program ever written), can be more complicated and time consuming than manual memory management. If someone tells you GC is all about “new and forget”, then even though that is true much of the time, you’re still okay to give them a good kidney punch, because they’ve never really used GC.
- Memory leaks are not impossible with GC. It doesn’t take much to fool a GC system into believing memory should be retained when it shouldn’t. This is a continuation of the above point, as it generally only happens in complicated situations, but nevertheless GC is not the panacea-all-hail-the-mother-of-memory-leak-free-programming that the pundits claim it to be. If programmers can’t be trusted to manage their own memory, and GC systems are easy to fool, what makes us think that programmers should be trusted to manage GC memory either?
- Using GC is slow. “What?! Didn’t you say using GC is more efficient and leads to higher memory throughput?” I did, but I was baiting a bit, because that’s what the propaganda says. Yes, using a good, modern GC leads to higher throughput of memory allocs and frees; this is true. What the GC evangelists don’t remind you of is that most [well designed] programs allocate a bunch of memory, use the same memory over and over again, and then release it, so most of a program’s time [usually] isn’t spent allocating and deleting. If your GC app needs to run for a lengthy period of time, chances are the GC’s mark and sweep passes are going to use up more processing power than would have been “wasted” on manually managing your own memory. Which leads me to the next point.
- GC makes it easy to fall into bad programming habits. This is a bit subjective, and obviously not true in all cases. But in using a GC language I’ve noticed a tendency in myself to new things that shouldn’t necessarily be newed. Since I don’t have to worry about memory management the odd new here or there doesn’t really matter, right? In a non-GC language I would certainly have created a temporary on the stack rather than allocate memory on the heap (which as we all know is orders of magnitude slower than a stack alloc — and it’s even worse in a GC scenario). Basically think of any Java code you’ve ever seen… how many times have you seen something like: GridCell.setPosition(new Position(x, y)) inside a loop? Ack! Granted this is sort of unavoidable in Java, but especially for people who are still learning, this is a terrible habit to pick up.
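Here is a rough Java sketch of that last habit and one way around it. Only the `GridCell.setPosition(new Position(x, y))` call comes from the example above; the class bodies, the `setPosition(int, int)` overload, and the `created` counter are my own assumptions, added purely so the difference can be counted.

```java
// Hypothetical types modeled on the GridCell.setPosition(new Position(x, y)) call.
class Position {
    // Counts constructor calls. For illustration only, not thread safe.
    static int created = 0;
    int x;
    int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
        created++;
    }
}

class GridCell {
    // One Position owned by the cell and reused for its whole lifetime.
    private final Position position = new Position(0, 0);

    // Allocating style: the caller builds a temporary just to hand over two ints.
    void setPosition(Position p) {
        position.x = p.x;
        position.y = p.y;
    }

    // Non-allocating style (assumed overload): update the existing object in place.
    void setPosition(int x, int y) {
        position.x = x;
        position.y = y;
    }

    int x() { return position.x; }
    int y() { return position.y; }
}

class AllocationDemo {
    // Returns how many Position objects were constructed by the loop.
    static int moveAllocating(GridCell cell, int steps) {
        int before = Position.created;
        for (int i = 0; i < steps; i++) {
            cell.setPosition(new Position(i, i)); // one temporary per iteration
        }
        return Position.created - before;
    }

    // Same loop, but no temporaries are constructed.
    static int moveInPlace(GridCell cell, int steps) {
        int before = Position.created;
        for (int i = 0; i < steps; i++) {
            cell.setPosition(i, i);
        }
        return Position.created - before;
    }

    public static void main(String[] args) {
        GridCell cell = new GridCell();
        System.out.println("allocating: " + moveAllocating(cell, 1000));
        System.out.println("in place:   " + moveInPlace(cell, 1000));
    }
}
```

The counter only tracks constructor calls in the source. A JIT may be able to optimize some short-lived temporaries away, so treat this as a picture of the habit rather than a benchmark.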
So given all of this, it sounds like I’m saying GC is complicated, slow, and makes you dumb. But I’m not. I still hold true to the notion that memory management is something a programmer shouldn’t often have to explicitly deal with. GC is a good thing!
But getting the whole story with garbage collection is important. Like any tool, it’s one of those things that has good use cases and bad, and knowing where and when to use it is key. Fortunately, in D’s case, it’s ridiculously easy to mix manual and GC memory management, so for me, it’s not that much of an issue. I just get sick and tired of hearing the all-pro GC arguments, because in reality GC makes life for a programmer a little bit tougher. And by tougher I mean that more knowledge is required to complete a task.
In the end, though, it’s definitely worth it. GC might not get rid of memory leaks, and it’s probably going to make your program slightly less efficient, and it might even introduce a few “bad” habits, but in some cases these are reasonable trade-offs. If you know a temporary heap allocation is going to get collected right away, and you’re not in a tight loop or something, then why not let the GC do some work for you! If using GC ends up leading to some complicated weak reference scheme, do it manually!
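As for GC not getting rid of memory leaks, here is a small Java sketch of one common way to fool a collector: a long-lived registry that still holds a reference to objects the rest of the program has forgotten about. `EventBus`, `Widget`, and `LeakDemo` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical long-lived registry. Everything in this list stays reachable.
class EventBus {
    private final List<Runnable> listeners = new ArrayList<>();

    void subscribe(Runnable listener) {
        listeners.add(listener);
    }

    boolean unsubscribe(Runnable listener) {
        return listeners.remove(listener);
    }

    int listenerCount() {
        return listeners.size();
    }
}

// Hypothetical short-lived object with a callback that captures `this`.
class Widget {
    private final byte[] pixels = new byte[1024];
    // The lambda touches an instance field, so it holds a reference to the Widget.
    final Runnable onTick = () -> { pixels[0]++; };
}

class LeakDemo {
    // "New and forget": each Widget goes out of scope, but the bus still
    // references its callback, so the collector must keep the Widget too.
    static int leaky(EventBus bus, int n) {
        for (int i = 0; i < n; i++) {
            Widget w = new Widget();
            bus.subscribe(w.onTick);
        }
        return bus.listenerCount();
    }

    // Explicit cleanup, which looks a lot like manual memory management.
    static int tidy(EventBus bus, int n) {
        for (int i = 0; i < n; i++) {
            Widget w = new Widget();
            bus.subscribe(w.onTick);
            // ... use the widget ...
            bus.unsubscribe(w.onTick);
        }
        return bus.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("retained after leaky: " + leaky(new EventBus(), 100));
        System.out.println("retained after tidy:  " + tidy(new EventBus(), 100));
    }
}
```

Nothing here is a bug in the collector. The widgets really are still reachable, so the GC is doing exactly what it was told. The fix is either the explicit `unsubscribe` or a registry built on weak references, and both require the programmer to think about object lifetimes.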
Basically it comes down to the age-old adage: know your tools. Garbage collection can save valuable development time by freeing a programmer from having to deal with the bulk of an application’s memory management, thus leaving more time to focus on those situations where you do need to get dirty with some mallocs and frees (or hopefully on things more interesting than memory management). But, if used improperly, or in the wrong situation, it’s pretty easy for GC to end up causing more problems than it solves.











