Finally the wait is over. We’ve already heard a lot about ‘Pivot Table’ in the previous post. So, what is a Pivot Table?
Put simply, a Pivot table is simply a built in feature in MS Excel which takes in our raw data in a certain format and allows us to summarize it by different fields. This is a very simple definition. Going further we’ll see what more a Pivot Table can achieve for us. The best way to learn is by doing. So let’s just get to it.
For this series on Pivot Tables, we’ll be using a subset of the ‘Student Performance’ dataset taken from https://archive.ics.uci.edu/ml/datasets/Student+Performance . Click on Student_Performance to download the excel file for this series.
This dataset basically shows how the academic performance of 649 students from two schools is along with certain other parameters. We’ll use Pivot Tables to delve deeper into our data. Before we can do that, we need to create a Pivot Table. The section below details the steps of creating a Pivot Table.
- Preparing data for creating a Pivot Table-An important aspect of creating a Pivot Table is data preparation. Our entire analysis will depend upon how we structure and format our data. We need to take care of the following points while preparing data for our Pivot Table.
- Column Headers: All columns in our data should have column headers to make a Pivot Table. Without all column headers, we’ll not be able to make a Pivot Table.
- Data should not have Totals: Our raw data should not have Totals. Having Totals will cause our Pivot Table to output erroneous results.
- Avoid blank cells: While our Pivot Table can handle blank cells, it is better to not have blank cells for an effective analysis.
- Consistent number formats: Each of our columns should have the same number format i.e. values in one column should not be in multiple formats. Having multiple number formats can cause calculation errors in our Pivot Table and return wrong results.
- Arrange data by rows (not columns): This is extremely important. It means that it’s better to have repeated rows of data for certain categories instead of creating new columns for every category. By doing this, our Pivot Table will be able to slice and dice data in a more efficient way without limitations. This becomes even more important in case of date values. I’ll explain this better with an example in the next post. See image below for further explanation.
- Creating a Pivot Table-This is the easy part—just follow these steps:
Step 1: Select any cell from your data and from the Insert Tab, select Pivot Table
Step 2: On the dialog box that appears, check your data range and select ‘New Worksheet’ to place the Pivot Table
Step3: Your Pivot Table is created on a new sheet with some fancy panes on either side of the sheet
That’s it! We have created a Pivot Table on a new sheet.
In the next post we’ll do basic data exploration on our student performance dataset and answer a few interesting questions. Feel free to reach out to us in case you have any queries.