Array formulas – The basics

Looking up data from tables is very important when creating dashboards or for that matter even when crunching data for any analysis. While there are already some lookup formulas, that you know of and we have covered in our Dashboard 101 tutorial as well, not all of these formulas are capable of giving you the desired result.

For example let’s look at the data table below. It contains the data for drug samples given by sales reps to physicians. In order from left to right the columns are 1) the visiting reps’ name, 2) the visited physicians’ name, 3) the date of visit and 4) the number of samples given.

Array formula example table

Now if you were to find out, in a single formula, the total number of samples left by reps containing ‘Andrew’ in their name and visit date between 1st Jan and 15th Jan 2012 (both inclusive)– how would you do that?

Well… the answer is array formula. We are not saying that array formulas are the only way – but that they are a neat way and a way you should learn to explore a completely new world of formulas.

This is our first post in the five-post series on Array formula that we have designed for you. The first two posts will mostly cover the basics and the rest three will be application (mostly). We will also touch upon the SUMPRODUCT formula at the end of the series and see how this is a possible substitute to array formulas in many cases. By the end of this series of tutorial you would not only be able to solve the question posted above but many other complex and challenging problems that would have given you a hard time if you were to use simple formulas only.

So what is an array, first of all?

Array is just a fancy name for a table and that’s how the programmers like to call it.A table can be of two types 1) which has just one row or one column, or 2) with more than one row and column. The first is called a one-dimensional array and the second a two-dimensional array.

Two dimensions are all we need for Excel and yes dimensions are just another fancy way of saying rows and columns.

But why are we so bothered about the dimensions?

Well that’s because while building an array formula we need to make sure that the dimensions of our inputs match. I will come back to this later. For now just remember that it’s important.

Let’s refresh some elementary computer science and mathematics as well –the logical tables

and & or logic

Mostly while working with array formulas, we will be evaluating things to ‘true’ and ‘false’ to get to our answer. Obviously when evaluating multiple columns the AND & OR logics will come into play. In excel the AND logic can also be reached by multiplying two logical values and similarly an OR logic can be reached by summing them up (the OR done this way is a bit different and we will cover it later).

It is also important to remember that in Excel terms, True = 1 and false =0, so the above table for AND logic can also be equated to:

true false logic

Ok, I get array and dimensions and the logic part, but why do we call it array formula and how different is it from a normal formula?

At this point if you are thinking that array formulas are a different category of formulas, different from the normal SUM, IF, etc, that’s not correct. They are the same formulas that we use day and night but we just change the way we input data in them so that they behave differently. Opposed to normal formulas which operate on a single cell, array formulas process the entire array at a time. Also the inputs, most of the times, are arrays (ranges to sound more familiar) and not single cells.

Ok, enough theory for the day. Let’s do some examples to make things easy.

Let’s imagine a one-dimensional  array/table (AD5:AD10; see image below)  that contains data on product names – with only two products – Product A and Product B. Now if I were to ask you to count the number of times Product A occurs in this table you would find it very easy and use the COUNTIF formula. But can you do it without a COUNTIF?

Yes, obviously – array formula. But how?

example of array formula

Well… our formula is simply going to be = SUM(–(AD5:AD10=”Product A”)) and we get the answer 4.

[one thing that you must know about writing array formulas is that once you have written the formula in a cell, you have to press CTRL+SHIFT+ENTER to make them work as compared to the normal ENTER that we use for normal formulas ]

So how did this work?

The first thing you would have noticed is that inside the formula I compared a complete range – AD5:AD10 – to ‘Product A’. Well that’s why it is called an array formula i.e. all the operations happen over a complete array/range/table and not just a single cell.

When I made the comparison, each cell in the range was evaluated to TRUE or FALSE based on whether it contained the required value or not. So we had an array of TRUE and FALSE now (see image above). Then I used the double negative sign ahead of the evaluated array. The double negative sign converts logical values to numeric – so TRUE becomes 1 and FALSE becomes 0. So, we now had a table with numbers in it. The rest was taken care of by the SUM formula i.e. summing up the entire array and the answer was our count or the number of times Product A appeared in the table. Ok so we just wrote our first array formula… interesting… isn’t it?

The second thing you would have noticed in the formula bar above is that there are curly braces ({=SUM(–(AD5:AD10=”Product A”))}) around the formula. That is Excel’s way of differentiating an array formula from a normal one. I would encourage you to try and duplicate this example in an Excel file as this will help you understand things better.

In the next post I will cover how mathematics works on array formulas. Till then keep Excelling.

Have a question? just leave a comment and we will answer A.S.A.P