Excel GROUPBY & PIVOTBY Functions - All You Need to Know (do they BEAT Pivot Tables?)
Summary
TLDR: The video introduces two new Excel functions, GROUPBY and PIVOTBY, that provide powerful ways to aggregate, analyze, and present data. It demonstrates how these dynamic functions can automatically group data, calculate totals and subtotals, handle dates, exclude unwanted data, and pivot data from rows into columns. Key benefits over pivot tables include easier filtering, text support, and automatic updates. Overall, these functions enable faster, more flexible Excel reporting while avoiding tedious manual steps.
Takeaways
- The new GROUPBY and PIVOTBY functions allow for easier data aggregation and pivot table-like reporting directly in Excel formulas
- GROUPBY has 3 required arguments: Row fields (categories), Values (the data to aggregate), and an aggregation Function such as SUM
- Optional arguments in GROUPBY: show field headers, toggle totals, sort order, filter array
- PIVOTBY pivots data with row and column fields for layout; useful when a report gets too long
- GROUPBY can return text values, unlike pivot tables, using functions like ARRAYTOTEXT
- GROUPBY can exclude blank rows and stray calculations (such as embedded subtotals) through its Filter Array argument, cleaning up the data
- GROUPBY can easily handle date fields using the YEAR and TEXT functions
- HSTACK helps include multiple fields like dates and categories in GROUPBY rows
- GROUPBY and PIVOTBY formulas update dynamically when source data changes
- PIVOTBY has additional options to customize column totals, sorting and filtering
Q & A
What are the two new Excel functions introduced in the video?
-The two new Excel functions introduced are GROUPBY and PIVOTBY.
What are the mandatory arguments for the GROUPBY function?
-The mandatory arguments for the GROUPBY function are: Row fields, Values, and Function.
How can you exclude certain rows from aggregations in the GROUPBY function?
-You can use the Filter Array optional argument in GROUPBY to exclude rows based on criteria.
What is one advantage of using GROUPBY over pivot tables?
-One advantage is that GROUPBY results are dynamic - any changes to the source data will automatically update the GROUPBY output.
How can you handle dates in the GROUPBY function?
-Dates can be handled by wrapping them in functions like YEAR or TEXT to extract specific components like year and month.
When would you use PIVOTBY instead of GROUPBY?
-Use PIVOTBY when the report is getting too long and you would rather have one of the grouping categories in the columns instead of the rows, to avoid too much vertical scrolling.
What are the additional arguments available in PIVOTBY?
-PIVOTBY has a Columns argument to specify which fields should be pivot columns. It also has additional options like Column Total Depth.
Can GROUPBY and PIVOTBY output text fields?
-Yes, by using text aggregation functions like ARRAYTOTEXT or writing a custom Lambda function.
How can the GROUPBY output be sorted?
-The optional Sort Order argument allows sorting the GROUPBY output by a specified column.
What calculations can be used to aggregate values in GROUPBY?
-Options like SUM, AVERAGE, MEDIAN, MAX, MIN, PERCENTOF and custom Lambda functions can be used.
Outlines
Introducing New Excel Functions GROUPBY and PIVOTBY
Paragraph 1 introduces two new Excel functions called GROUPBY and PIVOTBY that will change how formulas are written. It provides an example scenario where the boss asks for sales data aggregated by division and sales manager. The GROUPBY function with the SUM operation can quickly accomplish this using the data from the TSALES table.
GROUPBY Function Advantages Over Pivot Tables
Paragraph 2 highlights 3 advantages of the GROUPBY function over pivot tables - it is dynamic and automatically updates if the data changes, it can return text values instead of just numbers, and it can easily exclude extraneous rows with calculations that pivot tables struggle with.
More GROUPBY Function Features for Text, Dates and Advanced Logic
Paragraph 3 demonstrates additional GROUPBY features - using text functions like CONCAT or ARRAYTOTEXT within it, handling dates by extracting parts like YEAR or formatting with TEXT, logically filtering with complex criteria using Filter Array, and calculations like AVERAGE and PERCENTOF.
Introducing PIVOTBY Function to Transform Data Layout
Paragraph 4 introduces the PIVOTBY function to pivot data layouts. It has a Column Fields argument to move a category into columns for better visibility. The function otherwise works similarly to GROUPBY. It covers PIVOTBY features like optional arguments and turning column totals on/off.
Keywords
- GROUPBY
- pivot table
- aggregation
- Lambda function
- dynamic
- HSTACK
- PIVOTBY
- column fields
- optional arguments
- Filter Array
Highlights
Introduction of GROUPBY and PIVOTBY functions in Excel that will change formula writing.
GROUPBY function simplifies generating reports such as total sales by sales manager and division.
GROUPBY requires three mandatory arguments: Row fields, Values, and the aggregation Function.
The function supports a variety of pre-built aggregation functions, including SUM, AVERAGE, and COUNT, and allows custom Lambda functions.
GROUPBY automatically provides a total as part of its default behavior.
Optional arguments of GROUPBY include Field Headers, Total Depth, and the ability to sort data.
GROUPBY's dynamic update feature ensures real-time report adjustments based on data changes.
GROUPBY can return text in the Values field, a capability not available with pivot tables.
Using GROUPBY to exclude unwanted data or calculations with the Filter Array argument.
PIVOTBY is recommended when reports get too lengthy and categories are better suited for columns.
PIVOTBY allows for specifying row fields, column fields, values, and the aggregation method.
GROUPBY and PIVOTBY can handle dates efficiently, including year and month combinations.
HSTACK function in GROUPBY can combine multiple categories, like date and sales manager, in the report.
PIVOTBY enhances reports by organizing data with additional dimensions in columns and offering optional arguments for customization.
Examples demonstrate the versatility and dynamic capabilities of GROUPBY and PIVOTBY for advanced Excel reporting.
Transcripts
They're coming, and they're here to stay! Two new functions called GROUPBY and PIVOTBY that are going to change the way you write your formulas. These are functions for everyone, so make sure you watch this video, no matter what your Excel level is, so you're prepared once they show up in your spreadsheet.
So, imagine you get to the office, and you start to enjoy your cup of coffee. Then, the boss comes along and says, "Hey, I have this Excel table. It's called TSALES. I want you to quickly give me the total sales for each sales manager by division right now." You take a look at this and you think, "Well, I could create a pivot table." That works if you know how to create a pivot table; if not, don't worry, I have lots of videos on the channel that can help you out. Your other option, starting now and into the future, is to use a function that gets this done. That function is the new GROUPBY function, and it's super easy to use. All you need to do is type in =GROUPBY. It only needs three mandatory arguments.
The first one is the Row fields. These are basically your categories, so the boss wanted Division and Sales Manager. I'm just going to select these columns. The next argument is the Values. In this case, it's my sales. If I had multiple values, so if I had Profit as well, for example, I would just select that as well. Next is the Function. How do we want to aggregate these numbers? Take a look at this; we have a lot of pre-built functions that we can choose from. One is the SUM function; this is probably going to be the most used one. Then we have AVERAGE, MEDIAN, COUNT, MAX, MIN, and so on, right? Lots of things to choose from, plus you get the ability to write your own Lambda function if you can't find a function here that gets the job done for you. But for that, you have to be a bit more advanced, and I'm going to show you an example in a little bit.
So in this case, all we need is the SUM. You can either type it in or select it, close the bracket, press Enter, and that's it. You can go back and enjoy your coffee now. Another great feature of this, though, is that we get a total automatically as part of this formula. That's the default behavior. The total right now comes with it. As you can see, the sum of these values is this number, which is our total value right here.
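As a rough sketch, the formula built in this step might look like the one below. The column names Division, Sales Manager, and Sales are assumptions about how the TSALES table is laid out, and the range reference assumes that Division and Sales Manager sit next to each other in the table.

```excel
=GROUPBY(TSales[[Division]:[Sales Manager]], TSales[Sales], SUM)
```

The three arguments map to Row fields, Values, and Function. The result spills into the neighboring cells, with a grand total row included by default.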
Now, let's go and take a look at our optional arguments. We know these arguments are optional because they're in square brackets. So, the first optional argument is the ability to show the Field Headers. In my case, I didn't select any headers, so there's nothing to show. But if I had selected it, I could select 3 here to show the field headers.
Now, just to demonstrate how that looks, if I had selected the headers as well, where... So, I'm just clicking and clicking again to make sure I select the headers. I have to be consistent, and if I select the headers for my Row fields, I have to also select it for my Values field, right? And I choose a 3 here to show the field header. This is what I get: Division and Sales Manager, right? But I could also just put in Division, Sales Manager. I could change this to Manager or Employee or whatever I want. You can just always hardcode this, but that's one of the optional arguments that we have.
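One way to write the header variant, sketched with assumed column names (Division, Sales Manager, Sales): the #All specifier pulls the header row into both the Row fields and the Values selection, and 3 in the fourth argument tells GROUPBY that the data has headers and that they should be shown.

```excel
=GROUPBY(TSales[[#All],[Division]:[Sales Manager]], TSales[[#All],[Sales]], SUM, 3)
```

Both selections must include the header row; mixing a selection with headers and one without would shift the data by a row.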
Personally, I would have preferred if this argument comes all the way to the end or somewhere later because I find the next two arguments much more useful than this one. But let's just move on to the next argument. This is Total Depth. This allows us to choose whether we want to show the grand totals on top or on the bottom and also if we want to show subtotals.
So in this case, I'm showing two different columns, right? I could select Show Grand Totals and Subtotals as well. So here, I get Health. I get a total for Health, total for Productivity, Utility, and then a grand total. You have the ability to reverse that as well. So if I put a minus 2, I show the grand total on top, the subtotals on top, and so on. Let's say boss changes their mind and they say, "Well, I also want to include Product in there as well." It's really easy to update this. All you have to do is just remove this and select these three columns, and that's it, right? We get the total, we get our subtotals, and so on.
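The Total Depth settings described here could be written as follows. This is a sketch with assumed column names, and it assumes Division, Sales Manager, and Product are adjacent columns in that order. The fourth argument (Field Headers) is left empty, so two commas follow SUM.

```excel
=GROUPBY(TSales[[Division]:[Sales Manager]], TSales[Sales], SUM, , 2)
=GROUPBY(TSales[[Division]:[Sales Manager]], TSales[Sales], SUM, , -2)
=GROUPBY(TSales[[Division]:[Product]], TSales[Sales], SUM, , 2)
```

The first formula shows subtotals and a grand total at the bottom, the second moves them to the top, and the third widens the Row fields to three columns after the boss asks for Product as well.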
If at any point in time you want to turn off the totals altogether, you can put a zero and don't show any totals. Okay, I'm just going to press Ctrl+Z to go back. Another optional argument that you have is the ability to decide the sort order. So let's say I want to sort everything by the fourth column, by my sales value. So I'm just going to put 4 here. When I press Enter, it sorts my sales values in ascending order, right? So that's the default; it's ascending order. Obviously, it excludes the totals and subtotals. If I want to sort everything in descending order, I have to put a minus in front of the column number. Now I get all my categories here sorted in descending order.
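A sketch of the totals and sorting options from this step, again with assumed column names and three adjacent Row fields, so that the sales value is the fourth column of the output. The total depth of 2 in the last two formulas is an assumption about the state the report was in after the undo.

```excel
=GROUPBY(TSales[[Division]:[Product]], TSales[Sales], SUM, , 0)
=GROUPBY(TSales[[Division]:[Product]], TSales[Sales], SUM, , 2, 4)
=GROUPBY(TSales[[Division]:[Product]], TSales[Sales], SUM, , 2, -4)
```

The first formula turns all totals off, the second sorts by the fourth column in ascending order, and the third sorts the same column in descending order.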
Now, the advantage of this over a pivot table is that everything is dynamic. The moment something changes in my data set, so for example, let's say Jesse Pinkman's FitTrack is right here in my report. If it really has explosive sales, everything updates automatically in my report. So that's one advantage. Let me show you the second advantage of this using GROUPBY over pivot tables. That advantage is the ability to return text in your Values field, which is something you can't do with pivot tables.
So, for example, let's say boss wants the names of the sales managers that work for each division. We could quickly use the GROUPBY function. The Row fields is my Division, the Values fields is going to be my Sales Manager field, and as for the Function, I have to select something that can work with text. So CONCAT is an option, but that's just going to stick everything together. ARRAYTOTEXT is better because it's going to separate everything nicely with a comma. So when I press Enter, I get my sales managers separated with a comma. But notice, because my data set is detailed and sales managers are repeated, I get the repetition here too. Optimally, in my example, I want to get a unique version of this ARRAYTOTEXT.
Now, there isn't anything inbuilt right now that you can use, but you have the ability to write your own function. So here, I could write a Lambda function. For the parameter that's going to be a representative of the sales manager, it's going to be any sales manager. I'm going to go with A for any. Then, the calculation is going to be the unique version of the sales manager should go into ARRAYTOTEXT. Inside ARRAYTOTEXT, I'm going to go with UNIQUE of A, right? Any sales manager. So let's just close these brackets, and when I press Enter, I get my report. That's another advantage over pivot tables.
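The two text aggregations described above might look like this. The column names are assumptions; the Lambda parameter a stands for the group of sales managers that GROUPBY passes in for each division.

```excel
=GROUPBY(TSales[Division], TSales[Sales Manager], ARRAYTOTEXT)
=GROUPBY(TSales[Division], TSales[Sales Manager], LAMBDA(a, ARRAYTOTEXT(UNIQUE(a))))
```

The first formula repeats a manager once per source row; wrapping the group in UNIQUE before ARRAYTOTEXT lists each manager only once per division.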
Let me show you the third and my favorite advantage over pivot tables. Over here, I have the same data set; it's just not formatted as an Excel table, and my colleague has gone in, and for some reason, they've added calculations in the middle of this data set. So here, I have Total Quarter 1, down here I have Total Quarter 2 with the sum of the cells above it, then I have Total Half Year, and so on. I can't create a pivot table based on this unless I clean this up, which I could do with Power Query. But you can also clean up your data using the GROUPBY function.
So let's say we want to get the total sales for each sales manager. My Row fields are going to be the sales manager. Now, this is going to be a hassle to select, so I'm just pressing Ctrl+Shift+Down Arrow until I get to the bottom. Now, Ctrl+Backspace to jump back up. Next, I want to get the Values fields, which is right here, but let's just type it in. Last, let's go with SUM, close bracket, press Enter. What do we get? Those empty rows; those calculations are all going to add up and mess up our data. We want to exclude these. We can do that by using the last argument of the GROUPBY function, which is Filter Array.
So I'm just pressing the commas or the Excel separator just to get to the last option here. Now, we want to exclude anything that is in the Sales Manager column that is a blank cell. To do that, you basically treat it similarly to the Include argument in the FILTER function. So we want to select this column, that's 313, and include anything that doesn't equal to blank, right? An empty cell. And when I press Enter, I get my proper report, excluding all these other calculations in the middle.
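A sketch of the Filter Array step. The cell ranges are hypothetical, since the actual addresses are not recoverable from the transcript: column C is assumed to hold the sales manager and column E the sales value, in rows 3 to 100. The three empty arguments between SUM and the filter are Field Headers, Total Depth, and Sort Order.

```excel
=GROUPBY(C3:C100, E3:E100, SUM, , , , C3:C100<>"")
```

The comparison returns TRUE for every row whose sales manager cell is not empty, so the embedded total rows, which have no sales manager, drop out of the aggregation.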
If, for some reason, you couldn't do this, if you had some random data in here, and you wanted to base your logic, let's say, on another column, for example, to take a look at Column A, and if anything starts with the word "Total," I want to exclude it. Can we do that? Yes, we can. All we have to do is just find the logic. So in this case, it would be LEFT of the value that we have in the A column. So we're going to take a look at the entire A column, and if the left five characters, right, so that's "Total," five characters, if they don't equal to "Total," then they should be included. If they are equal to "Total," then they should be excluded. And we get the same thing back; pretty neat way of excluding things you don't want to show up in your final report.
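The same report with the filter based on column A could be sketched like this. The ranges are hypothetical (labels in column A, sales manager in column C, sales in column E, rows 3 to 100).

```excel
=GROUPBY(C3:C100, E3:E100, SUM, , , , LEFT(A3:A100, 5)<>"Total")
```

LEFT takes the first five characters of each cell in column A, and only rows where those characters are not "Total" are included. The filter range needs to cover the same rows as the Row fields and Values ranges.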
Now we've been taking a look at the SUM a lot. You can, of course, change this and take a look at the AVERAGE sales for each person or the one that's new and pretty cool is the PERCENTOF. So when I press Enter, I get the percentage of each person as compared to the total. So let's just format these as a percentage, Ctrl+Shift+%. If I wanted to sort these so that I can easily tell who sold the most, I'm going to go to the Sort Order argument here, and let's sort the second column in descending order. I'm going to put a minus 2, press Enter, and we can see the salesperson that has the highest sales as compared to everyone else. And remember, all of this is fully dynamic. So if Christopher, for example, has a crazy sales number, he's going to jump all the way to the top. But if we decide to ignore it because this is the total of whatever quarter, then it's going to be ignored from this calculation.
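Swapping the aggregation and adding the sort could look like the sketch below, using the same hypothetical ranges as before (sales manager in column C, sales in column E, rows 3 to 100).

```excel
=GROUPBY(C3:C100, E3:E100, PERCENTOF, , , -2, C3:C100<>"")
```

PERCENTOF returns each manager's share of the total, -2 sorts the second output column in descending order, and the filter keeps the embedded total rows out of the calculation. The result is a decimal fraction until the cells are formatted as a percentage.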
Now it's time to take a look at how you can handle dates in the GROUPBY function. So let's say you want to take a look at total sales for each year and month combination or for each year. How can you do that? So I'm back at my original clean table called TSales. What you're going to do is start off with GROUPBY. Then we are going to select a date column. But let's say we want to get the year; you're going to put it inside the YEAR function. That's it. Then, you're going to select your Values. In this case, it's sales and your aggregation, SUM. Press Enter, and that's it.
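Grouping by year might be written like this, assuming the date column in the table is named Date and the sales column is named Sales.

```excel
=GROUPBY(YEAR(TSales[Date]), TSales[Sales], SUM)
```

YEAR is applied to the whole column at once, so the Row fields argument becomes an array of years and GROUPBY returns one row per year.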
What about year and month combination? Well, here you can just use anything that can give you that combination. So one thing that comes to mind is the TEXT function because a TEXT function allows you to format a date or any number or any text into the format that you want. So we want to format this date as, in quotation mark, "yyyy-mm", quotation, Enter, and then I have the year and month combination, and I get the total sales for each of these. And we can double-check quickly if it really works. So we have 87,400 for February 2024. This is February 2024; let's just sum these, 87,400.
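The year and month version could be sketched as follows, with the same assumed column names (Date, Sales).

```excel
=GROUPBY(TEXT(TSales[Date], "yyyy-mm"), TSales[Sales], SUM)
```

Because "yyyy-mm" puts the year first and pads the month to two digits, the resulting text labels also sort in chronological order.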
Now, what if I wanted to expand this and add another category in addition to period? So I want to have the year-month combination, and then I want to have sales manager and then total sales. How would I handle that? Well, let's copy and paste this over here, and let's see how we can update this argument. So until now, we've been highlighting, like selecting a column to include as the Row fields. For dates, we put it inside the TEXT function so that we could grab the year and month combination. But now, I also want to add the sales manager column. To be able to do that, I can use the HSTACK function, which is the horizontal stacking of ranges. So I'm going to put this inside the HSTACK function. My Array1 is the first column that I want to show; this is the year-month combination. Then, my Array2 is going to be my sales manager column. Close the bracket for HSTACK, and that's all I need to do.
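Put together, the HSTACK step might look like this. The column names Date, Sales Manager, and Sales are assumptions about the table.

```excel
=GROUPBY(HSTACK(TEXT(TSales[Date], "yyyy-mm"), TSales[Sales Manager]), TSales[Sales], SUM)
```

HSTACK places the computed year-month column and the sales manager column side by side, so GROUPBY sees a two-column Row fields array even though the first column does not exist in the table.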
At the very beginning of the video, I promised you that I'm going to show you PIVOTBY too. But until now, I've just been talking about GROUPBY. The good thing is that once you understand how GROUPBY works, you already know how PIVOTBY works. You only need to know when you would use PIVOTBY instead of GROUPBY. Whenever you have a case where you feel that your report, your data analysis is getting too long, you're scrolling a lot more than you would like, and you'd actually rather prefer to have one of these categories in the columns instead of the rows, that's when you're going to use PIVOTBY.
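To make the contrast concrete, here is a sketch of the same sales-by-manager-and-year report written both ways. The column names Sales Manager, Date, and Sales are assumptions about the table.

```excel
=GROUPBY(HSTACK(TSales[Sales Manager], YEAR(TSales[Date])), TSales[Sales], SUM)
=PIVOTBY(TSales[Sales Manager], YEAR(TSales[Date]), TSales[Sales], SUM)
```

The GROUPBY version returns one row per manager and year combination, so it grows longer with every extra year. The PIVOTBY version takes the year as its second argument (the column fields) and returns one row per manager with one column per year.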
So, for example, let's say we wanted to get total sales for each sales manager and have the year not here beside the sales manager, have the year in the columns. We would use PIVOTBY. So let's do that right here. You're going to start off with PIVOTBY. The row fields, that's what you want to have in the rows, that's going to be Sales Manager. This one, the column fields, this is the extra argument that comes with PIVOTBY because you can decide what is going to be in the column. So, we want to have the date, but not the date like this because otherwise, we're going to have to scroll a lot horizontally. We want to grab the year from the date. For Values, we're going to have sales, and we want to aggregate these using SUM. That's it. We get a nice report with the Sales Managers here, we have a total on the bottom, we have the Years in the columns, and we have a total on the side as well.
You can expand on this, so for example, let's say you want to have the Sales Manager and the product. You just include these in the rows, and you press Enter. We have Sales Manager, Product, and then the years on top. Because we have this additional column argument, you also get a lot of optional arguments, not just for the rows but also for the column. So, we have the Column Total Depth. Let's say we don't want to show the totals in the columns, we can turn them off by putting a 0 here. You can decide on the column sort order and the filter array. In this case, let's just hide the totals in the columns, and when I press Enter, that is gone.
So, these are pretty cool functions. Let me know your thoughts about them. Add your comments below this video. As usual, thank you for being here. Thank you for watching, and I'm going to catch you in the next video.
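For reference, the last PIVOTBY variation from the walkthrough could be sketched as below. The column names are assumptions, and the range reference assumes Sales Manager and Product are adjacent columns. The three empty arguments after SUM are Field Headers, Row Total Depth, and Row Sort Order; the 0 that follows is the Column Total Depth.

```excel
=PIVOTBY(TSales[[Sales Manager]:[Product]], YEAR(TSales[Date]), TSales[Sales], SUM, , , , 0)
```

Setting the Column Total Depth to 0 removes the total on the side of the report, while the row totals keep their default behavior.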