This article will focus on the appropriate use of IF statements, and also show you when it’s better to move into more robust formulas like LOOKUP, VLOOKUP, HLOOKUP & INDEX/MATCH.
NOTE: this article is written for Excel 2010 and its Ribbon Interface. The examples you see were created in Excel 2010, but saved as Excel 97-2003. Working examples are hosted on Microsoft’s SkyDrive service for you to download.
IF statements are one of the core formula models you can use in Excel, and they can be very powerful with regards to their logic. Very simply they follow this methodology: IF(something is True, then do something, otherwise do something else).
- =IF(A1=”Yes”,1,2). which simply says if A1 = Yes, then return a 1, otherwise return a 2.
You can also combine IF statements to evaluate multiple criteria by nesting them, however, many people try to get IF statements to do too much. Previous versions of Excel allowed up to 7 levels of nested IF statements, which means that you could evaluate up to 7 different criteria. Excel 2007+ allows 64 levels of nested IF statements, but just because you can do something doesn’t mean you should !
One of the most common uses of IF statements is to return data that matches specific criteria, or falls within a range of values. Here is a relatively common example of a nested IF statement to return letter grades based on test scores:
Another is calculating commission statements:
Note that in both formulas, the criteria need to be ordered sequentially in order for the formula to calculate correctly. So in the first example A1>=90 gets evaluated first, and if that condition is true, then the formula performs the calculation associated with that condition. If the condition isn’t true, then it moves onto the second, and so on. But if you get your conditions out of order, then one condition can invalidate the next and render your formula(s) useless. One of the inherent weaknesses with IF statements is that they need to be precise and ordered.
While both of these formulas work fine, they’re unwieldy and should be avoided if at all possible. Why? Primarily because the data in the formulas is static, so if the conditions driving the formula ever need to be changed, the formula needs to be manually adjusted as a result. Granted, the grades example isn’t too bad, because it’s not likely that information will change too often, but just imagine how much work you’d have to do if you have a lot of formulas like the commission example, and you have to change the criteria. Ideally, you’ll get into the habit of only using IF statements for Text comparisons like the earlier example. Yes/No/Maybe or Male/Female evaluations are very common, and the nice thing about them is that criteria aren’t likely to change very often. If you find yourself with situations like this then by all means use IF statements, otherwise it’s time to move up to more robust alternatives, starting with LOOKUP.
One of the primary reasons to move away from IF statements for multiple criteria is so that you can use “table based” reference data. This gives you the ability to have your data points on a worksheet, where the values can be easily changed, as opposed to hardcoded in a formula, where changing the values can be a challenge. Many workbooks have broken because of numerous IF statements that someone didn’t update. In the commission example above, what would happen if you needed to change the 2% and you had hundreds of formulas depending on that one? It wouldn’t be fun, and that’s a relatively small example. Imagine one with 64 conditions! And that’s if you can even find the formula in the first place! Table based dependencies are much easier to change on the fly, which can mean a lot especially if you’re dealing with complex models and testing multiple criteria.
Note: data tables that drive referential formulas don’t need to be housed in the open where users can change them. Very often you’ll find sensitive tables housed on hidden worksheets, where only the creator/administrator can change the data.
The first formula in the family of referential formulas is LOOKUP (they’re called referential formulas because they can return a reference to a value, or a value in a range of values). LOOKUP has plenty of uses, but it’s also the most fragile and least flexible in the Lookup family.
Following are examples of a Discount table, and a Grades table using a standard LOOKUP function:
A LOOKUP formula evaluates a value (either hardcoded or in a cell) then finds a match for it in a table that lists the values to be looked up, and the values to return. In these examples I’m using LOOKUP to return a variable range from a list of values. LOOKUP is governed by the following syntax:
- =LOOKUP(Value you want to look up, range where you want to return the rightmost value).
NOTE: LOOKUP formulas can look up a hard-coded value, or they can look up a value from a cell. =VLOOKUP(900,A2:B23) & LOOKUP(A5,A2:B23) do exactly the same thing, but the latter is much more flexible since you don’t have to change the 900 value whenever you want to change the lookup value. You should always try to use cell references in formulas as opposed to hard-coded values.
So in the example above, =LOOKUP(A5,A2:B23) will look in range A2:A23 for the value closest to what was entered in cell A5, and return the corresponding value from column B. In this case, looking for $900 results in a 2% result by matching $750, which is the largest value that’s less than or equal to the value evaluated by the formula. Had you entered a value between $1,000 and $1,249 the result would have been 3%. Note that the LOOKUP formula requires your data to be sorted in Ascending order, otherwise it’s not going to provide you with consistent or correct results. A quick note on LOOKUP and its counterpart VLOOKUP is that they can only look to the right of the lookup value; they can’t go to the left. For that you need to combine INDEX/MATCH, which will be covered later.
VLOOKUP – This is the big brother to the relatively limited LOOKUP, and gives you more flexibility with regards to what data you can return and where. VLOOKUP follows this syntax:
- =VLOOKUP(Value you want to look up, range where you want to lookup the value, number of columns to the right of the found value, Exact Match or Approximate Match – Indicated with 0/FALSE or 1/TRUE).
So where LOOKUP is going to return a value from the rightmost column in the range that you specify, VLOOKUP lets you lookup a multi-column range, and choose which column from which to return a value. VLOOKUP is very often used in finance scenarios with 12-Month periods because you can easily choose which month to return.
Note the first two formulas use static column references (2 & 3), which tell the formula to return values from the second and third columns respectively:
This is all well and good, and for many models like this one it’s ok, but it can be time consuming to have to go into each formula and adjust those column references. The latter formulas (March forward) use a trick to make the columns dynamic by using the COLUMN() function instead:
So in the case of March, it’s the 4th column to the right, and COLUMN() returns a 4, which feeds the correct number to the rest of the formula. But this example assumes that you’ll have a relatively static data set like the one above. But what if your column headers aren’t
always the same? The next example shows you how to build a dynamic VLOOKUP where you don’t have to know the headers or where they’re located, you simply have the formula do it for you.
In this example you could use: =VLOOKUP($A5,$A$9:$F$48,2,FALSE). but you would need to manually adjust the column reference accordingly, knowing that Name is column 2, Department is column 5, Earnings column 6, and Region column 4. Or you can introduce the MATCH formula. Using MATCH, you don’t need to know the column headers, because you tell the formula to find them for you.
The formulas in B5:E5 are (the red text indicates where the formulas are different):
This is very similar to using the COLUMN() function in the previous example, except this time the columns aren’t in order, so we can’t use that. Instead we use MATCH, which finds the value we specify. In this case we match the header values in B4:E4 (Name, Department, Earnings, and Region) with the corresponding headers in A8:F8. MATCH returns a numeric match from the beginning of the data series to the end, so Name is 2, Department is 5, and so on.
MATCH syntax is as follows:
- =MATCH(Value or cell you want to find, Range to look in, 0 = Only find an exact match).
NOTE: If you leave the 0 out or put a 1, the formula will find the closest approximate match.
Next in the LOOKUP family of functions is HLOOKUP. Where VLOOKUP is a vertical lookup, looking from left to right in a range, HLOOKUP is a horizontal lookup, which goes from top down in a range.
HLOOKUP’s syntax is very similar to VLOOKUP:
- = HLOOKUP(Value you want to look up, Range you want to look in, number of rows down to return a value, TRUE/FALSE)
NOTE: Here again, 1/TRUE will return an approximate match, while 0/FALSE will return an exact match.
In this case I’m doing a dynamic HLOOKUP so that I don’t have to know the positions of the rows that I want to return.
Where the formulas in C5:E5 are:
So for C5, I’m looking for the value in B5 (Qtr3) in A7:F16, and MATCH is returning the 4th row down for Gross Profit. Without the MATCH formula it would be:
And you would need to manually adjust the row references as you copied the formula, so you’d need to manually determine where Gross Profit, Net Profit and Profit % fell within that range and adjust the formula to suit.
Finally, we’re moving onto what is probably the most powerful of lookup combinations, which is when you use INDEX & MATCH together. You’ve already seen how you can use MATCH to make lookups more dynamic, but what happens if you need to do a lookup to the left, when LOOKUP’s only go to the right? INDEX/MATCH is the solution, and once you get the hang of it, you’ll be surprised how easy it is.
You know that MATCH returns a reference for a value found in a table by giving you the value’s position in the table. INDEX returns a ROW & COLUMN reference in a range based on what row & column you tell it to look in, so:
Would return the value in cell B1, where INDEX’s syntax is:
- =INDEX(Range you want to look in, # of rows to look down from the top of the range, # of columns to look to the right of the range).
But wait, you might be thinking that’s contrary to what I said about INDEX/MATCH being able to look left! Don’t worry, they can and the following example shows how:
Note: as with LOOKUP’s you can return a matching value in the same column as the lookup value, but that’s generally pointless unless you’re just trying to see if a value actually exists in a table.
In this case, we’re using the Employee Name to return the other values in the list. Some are to the right of Employee Name and some are on the left. The various formulas are as follows:
The only variation in the formulas is A5:A9, where I’m telling it what column header to find in the information table.
So the INDEX formula says Look in the range A12:F51 and first find the row number that holds the employee name in B12:B51, then find the column number that holds the header (Employee ID #, SSN, Region, etc.) in A11:F11, and return the intersection of those two values. So in this example for Sandy Stewart, for Employee ID #, I want the 4th row in the first column, which is to the left of the name column we’re looking in. So that’s how you can lookup to the left.
Now you might be thinking that’s a bit hard to fathom all at once, and frankly it can be, but there’s a trick to building complex formulas like this: you do them one step at a time. You’ll see the detail of how I got here in the sample workbook, but I’ll walk you through it just in case.
First, since I know I’ll be using Employee name as the primary driver for the rest of the formula, I start there by matching a sample name in the Name column. In this case, Sandy Stewart is row 4, which I find with a MATCH formula:
=MATCH($B$4,$B$12:$B$51,0) – So find Sandy’s name in column B and give me an exact match.
Then starting with Employee ID #, which I know is column 1, I take the 4 and add it to an INDEX formula:
- = INDEX($A$12:$F$51,4,1)
Which gives me “E033”, and that’s the fourth row in the first column.
Now I need to replace the 4 and the 1. I already have the 4, so I can add that next:
Next I need to find the 1 for Employee ID #, which I can do with another MATCH formula:
Finally, I replace the 1 with my new MATCH formula:
Now you can copy the final formula down, and as long as you got your Absolute references right it will automatically update for each cell it’s copied into.
Hopefully, that gives you a better understanding of the various ways to match criteria and retrieving data from both formulas, and tables. As always, please feel free to ask any questions you’d like and don’t forget to take a look at the example workbook I put together. You’ll find all of these examples there and you can work on all of these formulas yourself.
Author’s note: We made an interesting discovery that is worth sharing after updating our blogsite. This article was written in Microsoft Word and pasted in The SmittyPro (WordPress 3.1) blog which caused Microsoft Internet Explorer 8 and 9 to break the blog. (The site was viewable in FireFox and Chrome.) Content would stop after the title of this entry. After excruciating research, we found out that you cannot paste directly into the WordPress admin “Edit Post” window from a Microsoft Office product. The legacy code and formatting from Word (or almost any other Microsoft Office product) gets pasted along with the text you have written, and this extra code causes the Microsoft IE8 and IE9 to break. This is explained in detail on WFTCyndi.com .