Okay, so in the last post we created a Pivot Table and got a weird looking sheet with a Field List pane on our sheet. The listed fields are nothing but the column headers in our dataset. To carry out an analysis on our data, a Pivot Table needs the list of fields so that it can slice and dice the data according to values in those fields.
Now this slicing and dicing of our data can happen in multiple ways and a Pivot Table gives us four options to do it by simple drag and drop operations:
- Report Filter: This kind of serves as a master filter of our data as we would have in a dashboard. A field dragged into this section will be created as a drop down for filtering data. We can then filter data corresponding to our selected values in this drop down.
- Column Labels: As the name suggests, a field dragged into this section will form column labels with unique values from this field.
- Row Labels: Similar to column labels, a field dragged into this section will form rows with unique values from this field.
Crab Tip: We suggest that for better interpretability, you select a field with lesser unique values as column label and more unique values as row label.
- Σ values: This aggregates values in our dataset which lie at the intersection of row and column labels for a selected filter by performing different operations (which are user specified) like sum, count, average, etc.
So basically, we’ve laid out the ground rules for a Pivot Table. Essentially what our Pivot Table does is that it aggregates values by the filters/labels specified by us. The best part about this is that we can continuously change our selected values and look at different analyses without writing even a single formula. All of this will be clearer with the exercises that follow.
Let’s say you’re an analyst who has been given this dataset and you have to quickly examine this dataset in as many ways as possible and get a good idea about our data. Let’s try answering a few questions about our dataset:
- How many students does each school have?
- What is the number of male and female students in school MS?
- What is the average grade of male and female students in both the schools?
- What is the sum of absent-days for male and female students in each school (add absent days for each student in each school)?
Just follow the video below to get answers to these questions using a Pivot Table. Do remember to switch on the subtitles.
While these were some simple operations that we introduced to give you a basic introduction to Pivot Tables, we recommend that you explore the dataset further and play around with the Pivot Table to see what else you can do.
We hope you found this post useful. In the next post we’re going to look at some advanced features of Pivot Tables which will help you to become a pro at Pivot Tables. Stay tuned in!
Finally the wait is over. We’ve already heard a lot about ‘Pivot Table’ in the previous post. So, what is a Pivot Table?
Put simply, a Pivot table is simply a built in feature in MS Excel which takes in our raw data in a certain format and allows us to summarize it by different fields. This is a very simple definition. Going further we’ll see what more a Pivot Table can achieve for us. The best way to learn is by doing. So let’s just get to it.
For this series on Pivot Tables, we’ll be using a subset of the ‘Student Performance’ dataset taken from https://archive.ics.uci.edu/ml/datasets/Student+Performance . Click on Student_Performance to download the excel file for this series.
This dataset basically shows how the academic performance of 649 students from two schools is along with certain other parameters. We’ll use Pivot Tables to delve deeper into our data. Before we can do that, we need to create a Pivot Table. The section below details the steps of creating a Pivot Table.
- Preparing data for creating a Pivot Table-An important aspect of creating a Pivot Table is data preparation. Our entire analysis will depend upon how we structure and format our data. We need to take care of the following points while preparing data for our Pivot Table.
- Column Headers: All columns in our data should have column headers to make a Pivot Table. Without all column headers, we’ll not be able to make a Pivot Table.
- Data should not have Totals: Our raw data should not have Totals. Having Totals will cause our Pivot Table to output erroneous results.
- Avoid blank cells: While our Pivot Table can handle blank cells, it is better to not have blank cells for an effective analysis.
- Consistent number formats: Each of our columns should have the same number format i.e. values in one column should not be in multiple formats. Having multiple number formats can cause calculation errors in our Pivot Table and return wrong results.
- Arrange data by rows (not columns): This is extremely important. It means that it’s better to have repeated rows of data for certain categories instead of creating new columns for every category. By doing this, our Pivot Table will be able to slice and dice data in a more efficient way without limitations. This becomes even more important in case of date values. I’ll explain this better with an example in the next post. See image below for further explanation.
- Creating a Pivot Table-This is the easy part—just follow these steps:
Step 1: Select any cell from your data and from the Insert Tab, select Pivot Table
Step 2: On the dialog box that appears, check your data range and select ‘New Worksheet’ to place the Pivot Table
Step3: Your Pivot Table is created on a new sheet with some fancy panes on either side of the sheet
That’s it! We have created a Pivot Table on a new sheet.
In the next post we’ll do basic data exploration on our student performance dataset and answer a few interesting questions. Feel free to reach out to us in case you have any queries.
As the title suggests, In this series of posts we’ll be covering Pivot Tables starting with the very basics and then move on to creation of Dashboards using them.
The topics will be covered in two series—the first explaining working of Pivot Tables in detail and the second on dashboard creation using Pivot Tables. If you’re already familiar with Pivot Tables, you can skip the next section.
Today I will show how you can create a cool excel chart for your dashboards that responds to mouse over – yes, no clicking required. The final chart is going to look like the one below.
This can be a very useful technique to show some important data points related to your main series. For example in the chart above I am showing the profit made for each year when the user points his/her mouse over the relevant series.
So, let’s see how this is done Continue reading
What !! you want me to change the name of a product in 120 charts because I missed to capitalize the first letter? And you want me to do this in the next 30 minutes? Do you have any idea that there is no inbuilt functionality in PowerPoint to do this?
This was the reaction my friend had when his manager asked him to change the product name in all the charts in the 150 slides PowerPoint deck that he just made. Since it was not the first time this had happened to him and he was pretty sure that it would happen many more times again, he reached out to me.
So, I made him a tool that allowed him to specify all the names that he wanted to change and their substitutes on an excel sheet and let the Excel tool take care of the rest. And for the greater good, I decided to share it with you guys. Download the tool here (Find Replace Within PowerPoint Charts tool; xlsm).
Using the tool is a simple two-step process and is explained in the image below.
I hope your manager never asks you to do so. But if he does, you will find this tool useful
In the last post I explained the basic logic of array formulas and how they work. In this post I will be focusing on the mathematics of arrays – so what happens when you subtract one array from another or multiply or divide two arrays? How exactly does it work?
For the purpose of this post I will just be multiplying various arrays but you can basically do any mathematical (subtract, divide, add, etc.) or non-mathematical (concatenation, find length, etc.) operation in the same fashion – literally anything and everything that you could think of.
Whenever I start with this concept, I get a common question – is this matrices multiplication that we learnt in school – so let me answer this, just in case – and the answer is NO. Array multiplication is a row wise multiplication. So each row of the first array gets multiplied with each row of the second array – in most of the cases.
Now, let’s look at all the possible cases of array multiplication:
Case 1: when both the arrays have the same number of rows and columns
In the image below, I multiply two arrays A & B (essentially range C6:C10 & E6:E10). When I do this in a cell, I get the result that is displayed in the Output Array below. So, each cell in each row of the array A gets multiplied with the corresponding cell in array B.
So why did I say “each cell in each row”? To understand this let’s look at the image below: Continue reading
In this series of posts, I’m going to illustrate the potential of Excel as a modeling tool. We are going to create simple models using the features that Excel offers.
Excel is a powerful data modeling tool, but one should be aware of the modeling features that it offers to be able to fully utilize its potential. In most cases, modeling involves creation and evaluation of multiple scenarios and comparison of their results.
In this post, I’m going to cover the ‘Scenario Manager’ feature in the ‘What-If Analysis’ option available in the ‘Data’ tab on the ribbon. Please download the tutorial file by clicking here.
The ‘Scenario manager’ becomes a very useful feature for analysts in any industry while modeling different scenarios. It becomes even more useful when a small number of independent variables control the results of the entire model.
For example, in estimating the future profits for three demand scenarios for a product—low, medium and high—the scenario manager allows the modeler to check the results of all scenarios by doing calculations only for a single scenario. So, I only have to perform calculations in Excel for the low demand scenario using some initial values. The scenario manager would then allow me to create the other two scenarios simply by substituting the relevant initial values for those scenarios.
I’ll now try to implement the above example in Excel with the information given below:
- I have forecasted demand values of a product in number of units—High (6000), Medium (4000) and Low (2000).
- I also know that under these scenarios the variable cost of producing the product would vary as—High ($ 90), Medium ($ 95) and Low ($ 100).
- Let us assume that the fixed cost of production in each scenario is $ 100,000 and the selling price is $ 150.
Using this information, I need to calculate the profits for all the three scenarios. Continue reading