Skip to content

Aggregating Data by Month in Google Sheets

Tags: finance • Categories: Uncategorized

Table of Contents

Certain operations that are easy in SQL are hard in google sheets / excel. I ran into one of these: I had a simple problem: I had some time series data (dividend payouts from a financial institution) and I wanted to aggregate the data by month. In SQL, this is a simple GROUP BY, in sheets it’s not that easy.

Create a separate sheet to aggregate data, then group

  1. Filter based on the type of transaction
  2. Look at each date and normalize it to the last day of the month =ARRAYFORMULA(EOMONTH(B:B, 0)) (this is now column E)
  3. Get a list of unique months (i.e. end-of-month keys) =UNIQUE(E:E) (this is now column F)
  4. Filter the normalized list of months by the unique month keys and sum the values ARRAYFORMULA(SUMIFS(C:C, E:E, F1, A:A,$H$1))

Here’s all of the additional rows, tab separated, so you can copy paste.

=ARRAYFORMULA(EOMONTH(B:B, 0))  =UNIQUE(E:E)=   ARRAYFORMULA(SUMIFS(C:C, E:E, F1, A:A,$H$1))

This works, but it’s super messy (adds a bunch of garbage rows only used for data normalization). You could probably write a more complicated query to materialize a dataset with the EOMONTH data, but it would destroy readability.

There’s got to be a better way!

UseQUERY to run to SQL on google sheets

There’s a interesting QUERY function in google sheets that allows you to write SQL-like syntax to query your sheet. It’s pretty powerful and perfect for this use case.

For instance, here’s how to aggregate dividends by month in a single formula (with nice column labels!):

=query(A:C, "SELECT MONTH(B) + 1, SUM(C) WHERE A = 'Dividend Received' GROUP BY MONTH(B) LABEL MONTH(B) + 1 'Month', SUM(C) 'Dividend Total'")

Have multiple entries like this and want to sum them up, without drag-copying your formula across rows?

=ARRAYFORMULA(IF(E2:E<>"", E2:E + B2:B, ""))

This function applies the formula to each row, like copy-dragging a formula. Nice! The IF ensures that if no value exists, 0 s won’t be added to blank rows.

Other learnings on google sheets

Here some misc learnings that were new to me:

  • You cannot reference columns by name in the google sheet visualization language
  • You can’t set conditional formatting via a formula
  • SUBTOTAL is the only function which respects data filters, but you can’t apply any additional filters or transformation to the input into SUBTOTAL which makes it mostly useless.
  • Sometimes, double clicking on the bottom right corner doesn’t work to copy a formula across an entire column. Here’s what does.
    1. Click on the cell you would like to copy the formula from
    2. CMD+SHIFT+down array
    3. CMD+D to copy the formula