Instance variables = the new global variables?
It’s common programming wisdom that global variables are bad and should be avoided, but the question of why is rarely discussed. Thinking through the underlying reasons why we shouldn’t use global variables, it occurs to me that sometimes the same is true for instance variables in object-oriented programming.
Global variables make it hard to verify and debug programs because they have a long lifetime and wide scope. That is, as data storage areas they hang around for the entire lifetime of your program, and they are accessible by any part of your program. Thus they make it difficult to verify small chunks of program text, since faraway pieces of code can affect local correctness. This forces programmers to grok code in larger chunks, and in a large program this can quickly make debugging a nightmare.
Consider this pseudo-code example:
global filehandle fh;

function initialize ()
{
    fh = open file (...);
}

function myfunc ()
{
    ... do something with fh ...
}

function cleanup ()
{
    close (fh);
}
If you are debugging this code as part of a large, unfamiliar application, it takes a lot of work to determine whether fh is always in the right state during execution of the myfunc() function.
One way to make this code easier to understand and debug is to shorten the filehandle’s lifetime:
global filehandle fh;

function myfunc ()
{
    fh = open file (...);
    ... do something with fh ...
    close (fh);
}
The variable fh can still be touched by any part of the application, but it’s easy to verify that myfunc() is correct, since the code fragment does not depend on fh’s initial value.
Another way to improve the situation is to reduce the scope of the filehandle variable, for example:
class foo
{
    private static filehandle fh;

    public static function initialize ()
    {
        fh = open file (...);
    }

    public static function myfunc ()
    {
        ... do something with fh ...
    }

    public static function cleanup ()
    {
        close (fh);
    }
}
In this case the program is easier to debug because only the code in class foo can access the filehandle.
The ideal solution is to reduce both lifetime and scope at the same time by using a local variable to store the filehandle:
function myfunc ()
{
    filehandle fh = open file (...);
    ... do something with fh ...
    close (fh);
}
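As a rough C# sketch of the same idea, the file handle below is a local that is opened, used, and disposed inside a single method. The LineCounter class, the CountLines method, and the choice of StreamReader are illustrative assumptions, not part of the pseudo-code above.

```csharp
using System;
using System.IO;

public static class LineCounter
{
    // The reader is a local variable: it is created, used, and disposed
    // inside this method, so no other code can observe or change it.
    public static int CountLines(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            int count = 0;
            while (reader.ReadLine() != null)
            {
                count++;
            }
            return count;
        }
    }

    public static void Main()
    {
        // Write a small temporary file so the example can run on its own.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "one\ntwo\nthree\n");
        Console.WriteLine(CountLines(path));
        File.Delete(path);
    }
}
```

Verifying CountLines requires reading only CountLines; nothing outside the method can leave the reader in an unexpected state.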
Now to get to my point: In recent years, class libraries have been getting more complicated. Using Microsoft Visual Studio for example, it’s easy to create enormous classes, with large chunks of code hidden from view by default. Inheritance makes this worse. An instance variable in a widely used base class might be visible to large chunks of an application (i.e. very wide scope).
Furthermore, common coding patterns exist that result in long object lifetimes. For example, the main Form object for an application, together with its instance variables, will hang around for the lifetime of the application. In ASP.NET projects it’s even worse – a long-lived object is created for every application page. Finally, the hugely popular Singleton design pattern results in objects that last for the lifetime of the program. (There’s an interesting discussion on how hard it is to get rid of singleton objects in the book “Pattern Hatching” by John Vlissides.)
When you have mutable instance variables with wide scope and long lifetime, you have – for all practical purposes – global variables, with all their attendant problems.
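To make that concrete, here is a minimal C# sketch of a Singleton with one mutable instance variable. The Settings and Report classes and the OutputPath field are hypothetical names invented for this illustration.

```csharp
using System;

public sealed class Settings
{
    private static readonly Settings instance = new Settings();

    public static Settings Instance
    {
        get { return instance; }
    }

    private Settings() { }

    // A mutable instance variable on an object that lives as long as the
    // program and is reachable from anywhere: a global in all but name.
    public string OutputPath = "out.txt";
}

public static class Report
{
    public static string Describe()
    {
        // The result depends on every piece of code that has ever assigned
        // Settings.Instance.OutputPath, however far away it is.
        return "writing to " + Settings.Instance.OutputPath;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(Report.Describe());
        Settings.Instance.OutputPath = "elsewhere.txt"; // stands in for far-away code
        Console.WriteLine(Report.Describe());
    }
}
```

Nothing in Report.Describe() changes between the two calls, yet its result does, which is exactly the verification problem described for fh at the start of this post.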
So what should we as programmers do? I think large objects are here to stay (e.g. the ones generated automatically by Visual Studio). But we can still:
- Avoid creating new instance variables whenever there is a reasonable alternative, such as passing data around as function arguments and return values.
- When new instance variables must be created, they should be in small, simple objects rather than large, complicated ones (the goal being to minimize the amount of program text that can access the variable).
- Minimize object lifetime. Use the C# “using” statement where applicable. Use the Singleton pattern sparingly. (If you make a Singleton instance of a mutable object, you’ve created a global variable.)
- Whenever functions access public instance variables from large, long-lived objects (i.e. global-like variables), consider adding assertion checks, e.g.:
    ...
    assert (fh.IsOpen ());
    ... go ahead and use fh ...
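The first and last suggestions in the list above can be sketched in C# roughly as follows. MainWindow, its Log field, and both method names are hypothetical; the null check stands in for the IsOpen() test in the pseudo-code, since TextWriter has no such method.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical stand-in for a large, long-lived object such as a main form.
public class MainWindow
{
    // Public, mutable, and alive for the whole run: effectively a global.
    public TextWriter Log;

    // Reads the global-like field, so it asserts the state it relies on.
    // Debug.Assert is only checked in builds that define DEBUG.
    public void WriteStatusFromField(string message)
    {
        Debug.Assert(Log != null, "Log must be set before WriteStatusFromField is called");
        Log.WriteLine(message);
    }

    // Takes the writer as an argument instead, so its correctness depends
    // only on what the caller passes in.
    public static void WriteStatusTo(TextWriter log, string message)
    {
        log.WriteLine(message);
    }
}

public static class StatusDemo
{
    public static void Main()
    {
        using (StringWriter buffer = new StringWriter())
        {
            MainWindow.WriteStatusTo(buffer, "ready");
            Console.Write(buffer.ToString());
        }
    }
}
```

The argument-passing version needs no assertion, because there is no shared state for faraway code to disturb.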



November 23rd, 2005 at 12:23 pm
Just wanted to nitpick on one point. You say:
To be clear, the using statement serves a specific purpose: to ensure that Dispose() is called on IDisposable objects. It’s not a general-purpose “scope limiting” technique.
You actually can create new scopes within a method by enclosing them in curly braces, like this:
public void DoStuff()
{
    ...
    {
        int foo = 9;
    }
    Debug.Write(foo); // doesn't compile - no foo here
}
But that’s just gross.
November 23rd, 2005 at 1:01 pm
I was thinking that when you are using IDisposable objects, the “using” statement simultaneously limits lifetime and scope, eliminating the temptation to stash the object away somewhere in case you need it later. For example, using a web service this way:
using (FooService s = new FooService ())
{
    s.doSomething ();
}
rather than storing it in an instance variable.
November 23rd, 2005 at 3:40 pm
I think, on a general note, you should design your classes to be cohesive: each should serve one purpose well. If you find that you are adding more and more instance variables to a class, that is a clear sign that your class is losing cohesion, and that you may need to refactor it into two or more smaller, more cohesive classes.
November 25th, 2005 at 1:16 pm
One other reason I don’t like instance variables: they are frequently used for message passing, i.e. as a way for one part of the class to communicate with another. Message passing in this context is really just a euphemism for ‘side effect programming’. Making the message passing explicit, through the use of either method calls or events, leads to far more coherent, testable code.