Google Sheets is a spreadsheet application developed by Google. Spreadsheets are important because data isn’t always stored in a database. These can be opened in applications such as Google Sheets or Microsoft Excel, but these files don’t work with databases as they do with SQL. To solve the problem, Google Sheets provides a QUERY option that allows users to write SQL-like instructions and retrieve data in a way that’s similar to SQL. This way, users can use the power of SQL even if they don’t have a database to work with.
There are some advantages of using QUERY function in Google Sheet compared to using SQL:
- Accessibility: Many people may not have access to SQL databases or don’t know how to use them. Google Sheets is a popular spreadsheet program that is widely used, making it an accessible option for
- Simplicity: The QUERY function in Google Sheets is simple to use and doesn’t require extensive knowledge of SQL It can perform basic database functions like filtering, sorting, and aggregating data.
- Integration: The QUERY function is seamlessly integrated with Google Sheets, so users can work with their data directly in the spreadsheet without having to export it to a separate
- Flexibility: The QUERY function can work with a variety of data sources and formats, including CSV and TSV files, spreadsheets, and even HTML
- Cost: Google Sheets is a free program that does not require any licenses, installation, or maintenance fees, making it an affordable option for small
1. QUERY Function in Google Sheets
1.1. QUERY Function in Google Sheets
QUERY Syntax in Google Sheets:
=QUERY(data, query, [headers])
The parameters are:
- data – The range of cells to perform the query on.
- query – The query to perform.
- headers – [ OPTIONAL ] – The number of header rows at the top of data. If omitted or set to -1, the value is guessed based on the content of data.
To follow this section, please look at the examples shown in this spreadsheet: Employees
- Create a table
First of all, to access the data easily in this function, the data is needed to be joined into a single table.
Step 1: Select the whole data and choose Named ranges in the Data pane in Menu
Step 2 : A new window will pop on the right side of the spreadsheet. Type “EmployeesDemographics” in the first input box. This is the table name used for reference later. Click “Done”.
1.1.2 SELECT Statement.
To select all the data in the table, use the SELECT * query to retrieve all the columns of the table. After typing the following syntax into cell H1.
The result is this table:
To select specific columns, for example the EmployeeID and FirstName columns then type this syntax into the cell H2:
In a Google Spreadsheet, column identifiers are the one or two character column letters (A, B, C, …).
The result will be as follow:
1.1.3 WHERE Clause
The WHERE clause helps users specify a condition that must be met and it’s good for filtering data. It should come after the SELECT clause. In this case, I want to retrieve all of the employees who are more than 30 years old. Here are the syntax and its results.
1.1.4 ORDER BY Clause.
The ORDER BY can be used to sort data, ascending or descending. It should come after SELECT and WHERE clauses:
In the example, the query sorts employees whose names start with “M” by age, in descending order. Its syntax and result are:
1.1.5 LIMIT and OFFSET Clause
Users can use the LIMIT clause to restrict the number of results that are returned. It should come after the SELECT, WHERE, and ORDER BY clauses. The OFFSET clause is to remove a number of first rows.
Let’s modify the previous command to return the top 5 oldest male employees without the OFFSET clause in the syntax:
If OFFSET is in the syntax, the result will be as follow:
1.1.6 GROUP BY Clause and AGGREGATE Functions.
The Group By clause is used with aggregate functions to leverage data in groups. For example, if the user wants to count the number of female and male employees, and their average age, one can use the subsequent query:
The number of employees for each Gender is shown:
1.1.7 AND and OR Clauses
These two clauses can help add multiple search criteria to the formula. For example, I can use the AND clause to search for employees who are older than 30 years old and are male.
The OR clause can be used to select the employees whose names start with ‘M’ or ‘D’. The syntax and result table are:
1.2 Google Sheets Query for multiple sheets
In some situations, the data needed to be queried is spreaded across multiple sheets; however, the QUERY function can still be used despite its shortcomings.
For example, the first sheet is missing demographic information of employees having ID of 1005 and 1010, that information is in the second sheet. The QUERY function can connect these 2 sheets, but the syntax requires both tables to have the same number of rows and columns. Moreover, after adding more rows to the second table, 2 tables can only union in horizontal direction, not vertical as in SQL. This drawback can be shown in the the syntax and result table as follow: