ben-kolde-365811-unsplash

Ultimate Beginner's Guide to the Query Function in Google Sheets

Search your data on Google Sheets as if it's an SQL database. Easily Analyze, transform and aggregate data with one simple function.
Emma Butler

 

If you’re like me, you have a lot of data sets you want to analyze, aggregate, filter, or transform with calculations. Pivot Tables, simple functions like =sum(), and the filtering feature might not be enough to get all this done neatly. Your tables can get messy with so many layers of filters and functions all over the place.

 

Wouldn’t it be nice to manipulate your data with just one easy to learn function?

 

Luckily, Google Sheet’s query function is a simple and powerful tool (even if you don’t know SQL). It’s intuitive to learn as it uses English words like “SELECT” “WHERE” “ORDER BY” “LIMIT” and others.



I’ll walk you through the layout/set up of the query function and SQL keywords to know. Then I’ll show you some beginner examples and then some more advanced functions (such as nested query functions).

 

You can see all the formulas I’ve used in action on this Google Sheet. The function is written above the table so you can see it clearly but it is active in each of the purple boxes.

 

Watch the 8-minute video tutorial here. It is also embedded at the bottom of this article.

query 1

The Query Function is set up in two main parts. The first being, DATA, meaning which data set are you querying. You can highlight the cells you are interested in or name the data set and type in the name of your set. You can do this by selecting the cells containing your data, clicking “Data” in menu above, then “Named Range” and name your range. 

Query 2

The second part of the query function is called “Query.” This basically translates to “what are you trying to query in the range of cells you just selected?” Using keywords in a structured phrase like “SELECT X Y Z WHERE xxx = xxx” you can tell the query function what data you want to be retrieved. By the way, this section is always wrapped in quotation marks.



There are a few keywords you should know.

 

“SELECT”  “WHERE” “ORDER BY” “LIMIT” “GROUP BY” and “DATE”

 

Every basic query starts off with SELECT. If you want to select all the data in the data set (meaning the table retrieved will have all the columns) then put an * after SELECT

=query(A1:G15, “SELECT *”)

But if you only want to retrieve data from certain columns within the dataset, put the column letter after SELECT

=query(A1:G15, “SELECT B, C, G”)


query 3



Once you’ve chosen which columns you are interested in, using SELECT, it’s now time to filter what type of data you want from those columns. In the example data set above that shows company names, ID, year founded, and revenue, maybe you only want to retrieve data from companies based in the United States. In the new table you will create in your query, you are only interested in having 3 columns: company name, founding year and revenue. (You don’t need to include the column you are filtering by in your selected columns.)

 

You could write

=query(Range1,"SELECT B,D,G WHERE C='United States'")

 

And Columns B (company name), D (year founded) and G (stripe revenue) would be returned but only the United States based companies would be included. (Remember to put single quotes around text, ‘United States’)

Query 4


 

You might need multiple WHERE clauses. Let’s say your data sheet has some missing information and looks something like this

query 5

You only want to retrieve data where there are filled cells but also want to retrieve data from only US companies. You would separate the two WHERE filters by just writing AND between the two. 

=query(Range1, "SELECT B, D, G WHERE A is not null and C = ‘United States’”)

query 6


Now the next step would be if you wanted to sort these companies. Maybe you’re interested in the company with the most monthly revenue. You would add the phrase “ORDER BY G DESC” at the end of the query.

DESC is short for ‘descending’ and would mean the lowest number to the highest number.

ASC is short for ‘ascending’ and would mean highest to lowest.

=query(Range1, "SELECT B, D, G WHERE A is not null and C = ‘United States’ ORDER BY G DESC”)


query 7

You next might only want to limit yourself to seeing the top 5 US companies with the highest monthly revenue. You would just add LIMIT 5 to the end of the query.

=query(Range1, “SELECT B, D, G WHERE A is not null and C = ‘United States’ ORDER BY G DESC LIMIT 5”)



query 8

It’s important to remember to structure your query criteria in this order. “SELECT, WHERE, ORDER BY LIMIT”

 

 

What happens if you want to query multiple sheets?

 

If you want to query multiple sheets/tabs) within the same spreadsheet, you can do that using the query function. You’ll need to make sure the data is consistent though; the data type needs to correspond throughout each sheet. For example, Col1 across all the sheets should all be company names, Col2 as Company ID, Col3 as Revenue etc. As soon as you are using more than one table in a query, column letterheads (A, B, C, D, etc.) become irrelevant in the query criteria (But they are still relevant when you indicate range). You now need to refer to the columns as Col1 for the first column in the selected range and Col2 for the second and so forth.

 

To select multiple tabs/sheets, refer to the tabs by their name, then follow the sheet name with ! and then the range of cells on that sheet. Then add a semicolon to stack the data on top of each other, and then Sheet name followed by ! and the range.

Here’s an example:

=query({Sheet1!A:D;Sheet2!B:E} “SELECT *”)

 

*The most important thing to remember is putting the curly brackets around the totality of your selected data range. Whenever you are querying data from different sheets or different spreadsheets.

 

Now continue as you normally would but refer to columns as Col1 or Col2 ect.

 

Let’s say I want to aggregate data from two sheets (Growth1 and Growth2) but exclude growth rates that are less than 0.

 

Here’s Growth 1:

advanced query 1

And here’s Growth 2:

advanced query 2


You can see that they each have four columns in the order, “name” “# of transactions this week” “# of transactions last week” and “growth rate”.

 

Column A from Growth1 lines up with Column B from Growth2 and so on. The fourth column in each table is the growth rate.

 

Now I’ll use the function:

 

=query({Growth1!A:D; Growth2!B:E} “Select * WHERE Col 4 >0”) 

advanced Query 3
advanced query 4


You can also query from a completely different spreadsheet. 

 

Let’s say Growth2 wasn’t on this spreadsheet and it was on a whole other spreadsheet and we still wanted to aggregate it. We would take the same function: 

=query({Growth1!A:D; Growth2!B:E} “Select * WHERE Col 4 >0”) 

 

But we would use the “import range” method to add 

=query({Growth1!A:D; ImportGrowth2!B:E} “Select * WHERE Col 4 >0”) 

 

 

The query function is great because as you change the data on you main data set, all other tables made from querying that table update too. Want to know more about keeping your data live on Google Sheets no matter where you export it from?

 

Check out this article on importing data to Google Sheets and keeping it live here.



I’m Emma from Actiondesk, the spreadsheet that lets you gather and control all your data in one place. Have data from all sorts of different sources you want to analyze and transform? You should check Actiondesk out. It might save you a whole lot of time and make you a workplace hero.

More Posts