It's a great scam, don't you think? Someone asks a question about how to design their code, and we have these two nebulous words to throw back at them: coupling and cohesion. We even memorize a couple of adjectives that go with the words: low and high.
Cohesion Good. Coupling, Baaaaad!
It's great because it shuts up the newbie who asks the question -- he doesn't want to appear dumb, after all -- and it gets all of those in-the-know to nod their heads in approval. "Yep, that's right. He's got it. +1."
But no one benefits from the exchange. The newbie is still frustrated, while the professional doesn't give a second thought to the fact that he probably doesn't know what he means. He's just parroting back the advice that someone gave to him. It's not malicious or even conscious, but nobody is getting smarter as a result of the practice.
Maybe we think the words are intuitive enough. Coupling means that something depends on something else, that multiple things are tied together. Cohesion means... well, maybe the person asking the question heard something about it in high school chemistry and can recall it has something to do with sticking together. Maybe they don't know at all.
Maybe, if they're motivated enough (and not that we've done anything to help in that department), they'll look it up:
Types of Cohesion and Coupling
Types of Cohesion
Coincidental cohesion (worst) is when parts of a module are grouped arbitrarily (at random); the parts have no significant relationship (e.g. a module of frequently used functions).
Logical cohesion is when parts of a module are grouped because they logically are categorised to do the same thing, even if they are different by nature (e.g. grouping all I/O handling routines).
Temporal cohesion is when parts of a module are grouped by when they are processed - the parts are processed at a particular time in program execution (e.g. a function which is called after catching an exception which closes open files, creates an error log, and notifies the user).
Procedural cohesion is when parts of a module are grouped because they always follow a certain sequence of execution (e.g. a function which checks file permissions and then opens the file).
Communicational cohesion is when parts of a module are grouped because they operate on the same data (e.g. a module which operates on the same record of information).
Sequential cohesion is when parts of a module are grouped because the output from one part is the input to another part like an assembly line (e.g. a function which reads data from a file and processes the data).
Functional cohesion (best) is when parts of a module are grouped because they all contribute to a single well-defined task of the module.
Types of Coupling
Content coupling (high) is when one module modifies or relies on the internal workings of another module (e.g. accessing local data of another module). Therefore changing the way the second module produces data (location, type, timing) will lead to changing the dependent module.
Common coupling is when two modules share the same global data (e.g. a global variable). Changing the shared resource implies changing all the modules using it.
External coupling occurs when two modules share an externally imposed data format, communication protocol, or device interface.
Control coupling is one module controlling the logic of another, by passing it information on what to do (e.g. passing a what-to-do flag).
Stamp coupling (Data-structured coupling) is when modules share a composite data structure and use only a part of it, possibly a different part (e.g. passing a whole record to a function which only needs one field of it). This may lead to changing the way a module reads a record because a field, which the module doesn't need, has been modified.
Data coupling is when modules share data through, for example, parameters. Each datum is an elementary piece, and these are the only data which are shared (e.g. passing an integer to a function which computes a square root).
Message coupling (low) is the loosest type of coupling. Modules are not dependent on each other, instead they use a public interface to exchange parameter-less messages (or events, see Message passing).
No coupling [is when] modules do not communicate at all with one another.
What does it all mean?
The Wikipedia entries mention that "low coupling often correlates with high cohesion" and "high cohesion often correlates with loose coupling, and vice versa."
However, that's not the intuitive result of simple evaluation, especially on the part of someone who doesn't know in the first place.
In the context of the prototypical question about how to improve the structure of code, one does not lead to the other. By reducing coupling, on the face of it the programmer is going to merge unrelated units of code, which would also reduce cohesion. Likewise, removing unrelated functions from a class will introduce another class on which the original will need to depend, increasing coupling.
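To make that tension concrete, here is a minimal sketch in Python. The class names (ReportBefore, Report, CurrencyFormatter) are made up for illustration and are not from any real codebase. Pulling the unrelated formatting function out of the class improves cohesion, but the report now has another class to depend on:

```python
# Before: one class, no outside dependencies, but two unrelated jobs.
class ReportBefore:
    def __init__(self, rows):
        self.rows = rows

    def total(self):
        return sum(self.rows)

    def format_currency(self, amount):
        # Has nothing to do with summing rows; it just lives here.
        return f"${amount:,.2f}"

    def summary(self):
        return f"Total: {self.format_currency(self.total())}"


# After: each class has one job, but Report now depends on a formatter.
class CurrencyFormatter:
    def format(self, amount):
        return f"${amount:,.2f}"


class Report:
    def __init__(self, rows, formatter):
        self.rows = rows
        self.formatter = formatter  # the new dependency, i.e. new coupling

    def total(self):
        return sum(self.rows)

    def summary(self):
        return f"Total: {self.formatter.format(self.total())}"
```

Both versions produce the same summary. What changed is where the knowledge lives and who has to know about whom.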
To understand how the relationships become inversely correlated requires a larger step in logic, where examples of the different types of coupling and cohesion would prove helpful.
Examples from each category of cohesion
Coincidental cohesion often looks like this:
int main(void) {
    where almost all of your code goes here;
}
In other words, the code is organized with no special thought as to how it should be organized. General helper and utility classes, God Objects, Big Balls of Mud, and other anti-patterns are epitomes of coincidental cohesion. You might think of it as the lack of cohesion: we normally talk about cohesion being a good thing, whereas we'd like to avoid this type as much as possible.
(However, one interesting property of coincidental cohesion is that even though the code in question should not be stuck together, it tends to remain in that state because programmers are too afraid to touch it.)
With logical cohesion, you start to have a bit of organization. The Wikipedia example mentions "grouping all I/O handling routines." You might think, "what's wrong with that? It makes perfect sense." Then consider that you may have one file:
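A rough idea of what such a file could look like, sketched in Python. The file name and every function name here are hypothetical, chosen only to show the shape of the problem:

```python
# io_routines.py (hypothetical): every I/O routine in the program, one file.
import sys


def read_keyboard_line(stream=sys.stdin):
    # Keyboard input.
    return stream.readline().rstrip("\n")


def write_screen(text, stream=sys.stdout):
    # Screen output.
    stream.write(text + "\n")


def read_disk_file(path):
    # Disk input.
    with open(path, encoding="utf-8") as f:
        return f.read()


def write_disk_file(path, text):
    # Disk output.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def copy_disk_file(src, dst):
    # Does both input and output in a single function.
    write_disk_file(dst, read_disk_file(src))
```

The only thing these functions have in common is that somebody filed them all under "I/O".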
While logical cohesion is much better than coincidental cohesion, it doesn't necessarily go far enough in terms of organizing your code. For one, we've got all IO in the same folder in the same file, no matter what type of device is doing the inputting and outputting. On another level, we've got functions that handle both input and output, when separating them out would make for better design.
Temporal cohesion is one where you might be thinking "duh, of course code that's executed based on some other event is tied to that event." Especially considering the Wikipedia example: a function which is called after catching an exception which closes open files, creates an error log, and notifies the user. But consider that we're not talking simply about the relationship in time. We're really interested in the code's structure. So to be temporally cohesive, your code in that error handling situation should keep the closeFile, logError, and notifyUser functions close to where they are used. That doesn't mean you'll always do the lowest-level implementation in the same file -- you can create small functions that take care of setting up the boilerplate needed to call the real ones.
It's also important to note that you'll almost never want to implement all of that directly in the catch block. That's sloppy, and the antithesis of good design. (I say "almost" because I am wary of absolutes, yet I cannot think of a situation where I would do so.) Doing so violates functional cohesion, which is what we're really striving for.
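Here is one way that might look, as a Python sketch. The function names mirror the closeFile, logError, and notifyUser idea from the Wikipedia example; the log and the user messages are plain lists purely to keep the sketch self-contained, and process and handle_failure are invented for illustration:

```python
def close_file(handle):
    handle.close()


def log_error(log, error):
    log.append(f"error: {error}")


def notify_user(messages, error):
    messages.append(f"Something went wrong: {error}")


def handle_failure(handle, log, messages, error):
    # Temporal grouping: these three steps belong together only because
    # they all happen at the moment something fails.
    close_file(handle)
    log_error(log, error)
    notify_user(messages, error)


def process(handle, log, messages):
    try:
        return int(handle.read())
    except ValueError as error:
        # The catch block stays one line long; the details live elsewhere.
        handle_failure(handle, log, messages, error)
        return None
```

Each small function still does exactly one thing, so the temporal grouping sits on top of functional cohesion rather than replacing it.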
Procedural cohesion is similar to temporal cohesion, but instead of time-based it's sequence-based. These are similar because many things we do close together in time are also done in sequence, but that's not always the case. There's not much to say here. You want to keep the definitions of functions that do things together structurally close together in your code, assuming they have a reason to be close to begin with. For instance, you wouldn't put two modules of code together if they're not at least logically cohesive to begin with. Ideally, as in every other type of cohesion, you'll strive for functional cohesion first.
Communicational cohesion typically looks like this:
some lines of code;
data = new Data();
function1(Data d) {...};
function2(Data d) {...};
some more lines of code;
In other words, you're keeping functions together that work on the same data.
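A slightly more concrete version of the same idea, sketched in Python with a made-up Customer record. The only thing the two functions have in common is the record they read:

```python
from dataclasses import dataclass


@dataclass
class Customer:
    first_name: str
    last_name: str
    balance_cents: int


# These functions sit together because they all operate on a Customer.
def full_name(customer):
    return f"{customer.first_name} {customer.last_name}"


def is_overdrawn(customer):
    return customer.balance_cents < 0
```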
Sequential cohesion is much like procedural and temporal cohesion, except the reasoning behind it is that functions would chain together where the output of one feeds the input of another.
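As a small Python sketch (all function names are invented for illustration), the assembly line might look like this, with each function's return value becoming the next one's argument:

```python
def read_lines(text):
    return text.splitlines()


def parse_numbers(lines):
    # Consumes the output of read_lines, skipping blank lines.
    return [int(line) for line in lines if line.strip()]


def total(numbers):
    # Consumes the output of parse_numbers.
    return sum(numbers)


def total_from_text(text):
    # The chain itself: output of one stage is input to the next.
    return total(parse_numbers(read_lines(text)))
```

For instance, total_from_text("1\n2\n\n3\n") returns 6.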
Functional cohesion is the ultimate goal. It's The Single Responsibility Principle in action. Your methods are short and to the point. Ones that are related are grouped together locally in a file. Even files or classes contribute to one purpose and do it well. Using the IO example from above, you might have a directory structure for each device, and within it, a class for Input and one for Output. Those would be children of abstract I/O classes that implemented all but the device-specific pieces of code.
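A minimal Python sketch of that layout follows. The class names and the console directory are assumptions for illustration, and the "device" is just any stream object so the example stays runnable:

```python
from abc import ABC, abstractmethod


class Input(ABC):
    """Everything about reading a line except the device-specific part."""

    def read_line(self):
        return self._read_raw().rstrip("\n")

    @abstractmethod
    def _read_raw(self):
        """Return one raw line from the device."""


class Output(ABC):
    """Everything about writing a line except the device-specific part."""

    def write_line(self, text):
        self._write_raw(text + "\n")

    @abstractmethod
    def _write_raw(self, data):
        """Send raw text to the device."""


# These two would live in something like console/input.py and
# console/output.py (hypothetical paths), one pair per device.
class ConsoleInput(Input):
    def __init__(self, stream):
        self._stream = stream

    def _read_raw(self):
        return self._stream.readline()


class ConsoleOutput(Output):
    def __init__(self, stream):
        self._stream = stream

    def _write_raw(self, data):
        self._stream.write(data)
```

Adding a new device means adding one small Input subclass and one small Output subclass, and nothing that already exists has to change.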
Examples from each category of coupling
Content coupling is horrific. You see it all over the place. It's probably in a lot of your code, and you don't realize it. It's often referred to as a violation of encapsulation in OO-speak, and it looks like one piece of code reaching into another, without regard to any specified interfaces or respecting privacy. The problem with it is that when you rely on an internal implementation as opposed to an explicit interface, any time that module you rely on changes, you have to change too:
module A
    data_member = 10
end

module B
    10 * A->data_member
end
What if data_member was really called num_times_accessed? Well, now you're screwed, since you're not the one calculating it.
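Here is the same trap as a runnable Python sketch. The names A, RenamedA, and scale_factor are hypothetical; the point is only to contrast reaching into a field with going through a method the module actually promises:

```python
class A:
    def __init__(self):
        self.data_member = 10  # internal detail, not a promised interface

    def scale_factor(self):
        # The explicit interface: callers depend on this, not on the field.
        return self.data_member


class RenamedA:
    # Same behavior as A after its author renamed the internal field.
    def __init__(self):
        self.num_times_accessed = 10

    def scale_factor(self):
        return self.num_times_accessed


def content_coupled(a):
    # Reaches straight into the internals; breaks when the field is renamed.
    return 10 * a.data_member


def interface_coupled(a):
    # Survives the rename because it only uses the public method.
    return 10 * a.scale_factor()
```

With A, both functions return 100. With RenamedA, interface_coupled still returns 100, while content_coupled raises AttributeError.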
Common coupling occurs all the time too. The Wikipedia article mentions global variables, but if you think about it, this could just as well be a member of a class that two or more functions rely on. It's not as bad when it's encapsulated behind an interface: instead of accessing the resource directly, you do so indirectly, which allows you to change internal behavior behind the wall and keeps your other units of code from having to change every time the shared resource changes.
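A small Python sketch of both situations, with invented names throughout. In the first half, two functions share a module-level dictionary directly; in the second, the shared state sits behind a tiny interface:

```python
# Common coupling: both functions know the shape of the shared global.
SHARED = {"request_count": 0}


def handle_request():
    SHARED["request_count"] += 1


def report():
    return f"{SHARED['request_count']} requests"


# Encapsulated: callers only know record() and count(), so how the
# count is stored can change without touching them.
class RequestStats:
    def __init__(self):
        self._count = 0

    def record(self):
        self._count += 1

    def count(self):
        return self._count


def handle_request_with(stats):
    stats.record()


def report_with(stats):
    return f"{stats.count()} requests"
```

If the dictionary key is renamed, both handle_request and report have to change. If RequestStats changes how it stores the count, neither of its callers does.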
An example of external coupling is a program where one part of the code reads a specific file format that another part of the code wrote. Both pieces need to know the format so when one changes, the other must as well.
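As a Python sketch, assume a made-up settings format: a JSON document with a version field and a settings field. Both the function names and the format are assumptions for illustration:

```python
import json

FORMAT_VERSION = 1  # both the writer and the reader depend on this format


def save_settings(settings):
    # Writer: decides what the serialized document looks like.
    return json.dumps({"version": FORMAT_VERSION, "settings": settings})


def load_settings(text):
    # Reader: has to agree with the writer on every detail of the format.
    document = json.loads(text)
    if document["version"] != FORMAT_VERSION:
        raise ValueError(f"unsupported format version: {document['version']}")
    return document["settings"]
```

Rename the settings key in save_settings and load_settings breaks, even though neither function ever calls the other. The coupling runs through the format itself.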