In Part 1, we signed up for a Snowflake account and took a tour of the Snowsight interface. In this post, we’ll write our first real SQL queries against Snowflake’s built-in sample data, and we’ll pull back the curtain on how virtual warehouses work — including why they matter for your credit balance.
The Three-Tier Data Model
Before we write a single line of SQL, it’s worth spending a moment on how Snowflake organizes data. Everything lives inside a three-level hierarchy:
Database → Schema → Table
A database is the top-level container. Think of it as a folder for an entire project or domain — you might have a database called MARKETING, another called FINANCE, and so on.
A schema is a namespace inside a database. It groups related objects — tables, views, functions — that belong together. A common pattern is to have a RAW schema for unprocessed incoming data and an ANALYTICS schema for the cleaned, transformed version. Schemas also serve an access control purpose: you can grant users permission to one schema without exposing another.
A table is what you’d expect: rows and columns. Nothing exotic here.
When you reference an object in SQL, you can use its fully-qualified name in the form DATABASE.SCHEMA.TABLE, like this:
SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS;
Or, if you’ve already told Snowflake which database and schema you’re working in, you can refer to the table by name alone. You tell Snowflake your context using USE statements:
USE DATABASE SNOWFLAKE_SAMPLE_DATA;
USE SCHEMA TPCH_SF1;
-- Now this works without the full path:
SELECT * FROM ORDERS;
Both approaches work. Fully-qualified names are more explicit and less likely to surprise you; USE statements are more convenient for interactive exploration. For now, we’ll use fully-qualified names so it’s always obvious what we’re querying.
The Sample Data: What’s Already There
Every new Snowflake trial account comes pre-loaded with a shared database called SNOWFLAKE_SAMPLE_DATA. This database doesn’t consume any of your storage quota — it’s a live read-only connection to data hosted by Snowflake itself. You can run queries on it, but you can’t insert, update, or delete rows (which is fine for learning).
To explore what’s inside, navigate to Data → Databases in the left navigation bar. Click on SNOWFLAKE_SAMPLE_DATA to expand it, and you’ll see several schemas. The one we’ll use today is TPCH_SF1.
TPC-H is an industry-standard benchmark dataset that models a wholesale supplier’s business — orders, customers, products, suppliers, and shipping. It’s been used to benchmark databases for decades, which means it’s realistic enough to practice real analytical thinking on. The SF1 suffix stands for “scale factor 1,” meaning it contains a base-sized dataset with several million rows — large enough to be interesting, small enough to query without burning through credits.
The TPCH_SF1 schema contains eight tables:
| Table | What it represents |
| --- | --- |
| ORDERS | Order headers — one row per order |
| LINEITEM | Individual line items within orders — the largest table at ~6 million rows |
| CUSTOMER | Customer records |
| SUPPLIER | Supplier records |
| PART | Product/part catalog |
| PARTSUPP | Which suppliers carry which parts, and at what price |
| NATION | Countries (25 rows) |
| REGION | Geographic regions (5 rows) |
Opening a Worksheet
Navigate to Projects → Worksheets in the left nav and click the + button in the top-right corner to create a new worksheet. A blank SQL editor opens.
You’ll notice two dropdowns near the top of the editor: one for the active database and one for the active schema. These are the worksheet-level equivalents of USE DATABASE and USE SCHEMA. Setting them here means every query in this worksheet defaults to that context without needing fully-qualified names.
For now, leave them as-is and we’ll use fully-qualified names in our queries. This keeps things unambiguous.
There’s also a warehouse selector — a dropdown that shows which virtual warehouse will execute your queries. If your account came with a default warehouse (often named COMPUTE_WH), it will be pre-selected. If the dropdown is empty, we’ll address that in the warehouse section below.
Running Your First Query
Type the following into the worksheet editor:
SELECT COUNT(*) FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM;
To run it, look for the Run button — it appears as a blue play button (▶) in the top-right area of the editor. Click it.
Important note about running queries: The Run button (or its equivalent button in the toolbar) executes only the SQL statement your cursor is currently positioned within — not everything in the worksheet. If you have five statements in a worksheet and want to run all of them in sequence, look for a small dropdown arrow next to the Run button and choose Run All. This distinction trips up nearly everyone at least once, so it’s worth knowing before you spend time wondering why only one thing happened.
After a moment, the results panel will appear below the editor showing a single number: 6,001,215. That’s how many line items exist in this table. Not bad for a first query.
A Few More Queries to Try
Let’s do something a bit more interesting. Paste each of these in and run them one at a time — position your cursor inside the statement you want to run, then click Run.
How many orders were placed by customers in each region?
SELECT
    r.R_NAME AS region,
    COUNT(o.O_ORDERKEY) AS total_orders
FROM
    SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS o
    JOIN SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER c ON o.O_CUSTKEY = c.C_CUSTKEY
    JOIN SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION n ON c.C_NATIONKEY = n.N_NATIONKEY
    JOIN SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.REGION r ON n.N_REGIONKEY = r.R_REGIONKEY
GROUP BY
    r.R_NAME
ORDER BY
    total_orders DESC;
This query joins four tables together, walking from individual orders up through customers and countries to regions. You’ll get five rows back — one per region.
What’s the total revenue per order status?
SELECT
    O_ORDERSTATUS AS order_status,
    SUM(O_TOTALPRICE) AS total_revenue,
    COUNT(*) AS order_count,
    ROUND(AVG(O_TOTALPRICE), 2) AS avg_order_value
FROM
    SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS
GROUP BY
    O_ORDERSTATUS
ORDER BY
    total_revenue DESC;
Notice how fast these queries return. That’s not magic — it’s the combination of Snowflake’s columnar storage format and its micro-partitioning architecture, which keeps metadata (such as min/max values) about every partition so queries can skip data they don’t need to scan. You didn’t create a single index. Snowflake doesn’t use traditional indexes at all; it handles that pruning automatically.
Previewing a Table Without Writing SQL
One of Snowsight’s small conveniences: you don’t always need to write SQL to see what’s in a table. In the left navigation, go to Data → Databases, drill down to SNOWFLAKE_SAMPLE_DATA → TPCH_SF1, and click on any table name — say, CUSTOMER. A panel opens showing the table’s column definitions, their data types, and a Data Preview tab that shows you the first rows without needing a query at all.
This is useful when you’re exploring an unfamiliar schema and want to understand what a table contains before writing against it.
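The SQL equivalents of that preview panel are a DESCRIBE command for the column definitions and a LIMIT query for sample rows:

```sql
-- Show column names, types, and nullability
DESCRIBE TABLE SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER;

-- Peek at the first ten rows
SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER LIMIT 10;
```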
Understanding Virtual Warehouses
Now that you’ve run a few queries, let’s talk about what was actually doing the work — the virtual warehouse.
In Snowflake, a virtual warehouse is a cluster of compute resources (CPU, memory, and local SSD cache) that executes your SQL. It’s completely separate from storage. Storage holds your data permanently; the warehouse is a temporary, rentable engine that reads from that storage and processes your queries.
This separation is why Snowflake can do something traditional databases can’t: you can have multiple independent warehouses all querying the same data at the same time, with no interference between them. One warehouse running an ETL job doesn’t slow down another warehouse running your BI dashboards.
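You can see this isolation in action by creating a second warehouse alongside the default one. The name REPORTING_WH below is just an example:

```sql
-- A second, independent warehouse; its queries don't
-- compete with COMPUTE_WH for resources
CREATE WAREHOUSE IF NOT EXISTS REPORTING_WH
  WITH WAREHOUSE_SIZE = 'XSMALL'
       AUTO_SUSPEND = 60            -- seconds of idle time before suspending
       AUTO_RESUME = TRUE
       INITIALLY_SUSPENDED = TRUE;  -- don't start billing until the first query
```

Both warehouses read the same tables; neither slows the other down.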
Sizes and Credits
Warehouses come in T-shirt sizes. Each step up doubles the compute resources — and doubles the credit consumption rate:
| Size | Credits per hour |
| --- | --- |
| X-Small | 1 |
| Small | 2 |
| Medium | 4 |
| Large | 8 |
| X-Large | 16 |
| 2X-Large | 32 |
| … and up | … |
For the queries in this post — counting rows, joining the TPC-H tables — an X-Small warehouse is perfectly sufficient. You’d only need to scale up when you’re regularly querying billions of rows or running complex transformations on large datasets. Starting small and scaling up when queries feel slow is the right approach.
Credits themselves are Snowflake’s billing currency. On a trial account, you start with $400 worth. On a paid account, credits cost roughly $2–$4 each depending on your cloud provider and whether you’re on on-demand or pre-purchased pricing. An X-Small warehouse burning 1 credit per hour therefore costs about $2–$4 per hour of active use.
The 60-Second Minimum — and Why It Matters
Snowflake bills per second, but with a minimum charge of 60 seconds each time a warehouse resumes from a suspended state. This means if you run a query that takes 3 seconds and then the warehouse sits idle long enough to auto-suspend, you’ll be billed for a full 60 seconds of that size warehouse.
For an X-Small warehouse at 1 credit/hour, 60 seconds of use costs 1/60th of a credit — roughly 3 to 7 cents at on-demand pricing. Barely noticeable for learning. But this minimum applies at every size, so a 2X-Large warehouse that resumes, runs a 5-second query, and then suspends has still billed you for a full minute at 32 credits/hour. That’s roughly $1–$2 for a single query. Worth keeping in mind as you scale up.
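The arithmetic is easy to sketch. Assuming an illustrative credit price of $3 (the midpoint of the $2–$4 range), the cost of the 60-second minimum at each size works out like this:

```python
# Cost of the 60-second minimum charge at each warehouse size,
# assuming an illustrative credit price of $3.00 (actual prices vary).
CREDIT_PRICE_USD = 3.00

CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4,
    "Large": 8, "X-Large": 16, "2X-Large": 32,
}

def minimum_charge_usd(size: str) -> float:
    """Dollar cost of a resume that bills only the 60-second minimum."""
    credits = CREDITS_PER_HOUR[size] * (60 / 3600)  # one minute of the hourly rate
    return credits * CREDIT_PRICE_USD

for size in CREDITS_PER_HOUR:
    print(f"{size:>8}: ${minimum_charge_usd(size):.2f}")
```

At $3 per credit, an X-Small resume costs a nickel; a 2X-Large resume costs $1.60 before the query even finishes.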
Auto-Suspend and Auto-Resume
Two warehouse settings deserve your attention early:
Auto-Suspend tells Snowflake to automatically shut down the warehouse after a period of inactivity. The default is 10 minutes — meaning if you run a query and then walk away, the warehouse will run idle for 10 minutes before Snowflake turns it off. For a learning account, that’s fine. For production, it’s often set to 1–5 minutes to avoid paying for idle compute.
Auto-Resume is the opposite: when a query hits a suspended warehouse, Snowflake automatically starts it back up without any manual intervention. There’s typically a 2–5 second delay while the warehouse provisions — you’ll see a spinner in the worksheet during this time. Auto-resume is enabled by default and you should leave it on.
You can inspect and change these settings by going to Compute → Warehouses in the nav bar, finding your warehouse, and clicking into it. The settings panel will show you the current auto-suspend timeout, let you change the size, and show a credit usage chart for the warehouse.
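The same changes can be made in SQL. For example, to tighten the auto-suspend timeout on the default warehouse (assuming yours is named COMPUTE_WH):

```sql
-- Suspend after 60 seconds of inactivity instead of the 10-minute default
ALTER WAREHOUSE COMPUTE_WH SET AUTO_SUSPEND = 60;

-- Confirm the current settings
SHOW WAREHOUSES LIKE 'COMPUTE_WH';
```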
Checking Your Credit Usage
To see how many credits you’ve consumed so far, navigate to Admin → Cost Management (the exact label may vary depending on your account configuration, but it’s in the Admin section). You’ll find a breakdown of credit consumption by warehouse, date, and service type. On a fresh trial account after running the queries in this post, you should have used a fraction of a credit — less than a nickel.
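If you’d rather query your usage than browse it, the SNOWFLAKE.ACCOUNT_USAGE share exposes a metering history view. Note that it requires elevated privileges (typically the ACCOUNTADMIN role) and its data can lag real time by a few hours:

```sql
-- Credits consumed per warehouse over the last 7 days
SELECT
    WAREHOUSE_NAME,
    SUM(CREDITS_USED) AS credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE START_TIME >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME
ORDER BY credits_used DESC;
```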
A Note on the USE Commands
We used fully-qualified names throughout this post, but you’ll see USE DATABASE and USE SCHEMA constantly in Snowflake documentation and tutorials. For reference, here’s how they work:
-- Set the active database for this session
USE DATABASE SNOWFLAKE_SAMPLE_DATA;

-- Set the active schema within that database
USE SCHEMA TPCH_SF1;

-- Set the active warehouse
USE WAREHOUSE COMPUTE_WH;

-- Now this works without any path at all:
SELECT * FROM ORDERS LIMIT 10;
These context-setting commands are session-scoped — they apply only to your current connection. If you open a new worksheet tab, you’d need to run them again (or use the dropdowns in the worksheet header, which do the same thing). They don’t change anything permanent in your account.
What You Now Know
You’ve queried a multi-million-row dataset, joined four tables together, and watched Snowflake return results in seconds. You also understand the virtual warehouse — that it’s the engine doing the work, that it’s billed separately from storage, and that the auto-suspend and auto-resume settings govern both cost and availability.
A quick recap of the key mental models from this post:
Data lives in a Database → Schema → Table hierarchy. Reference objects with their full path or set context with USE statements.
The virtual warehouse is compute, separate from storage. You pay for it only while it’s running.
Warehouses bill per second with a 60-second minimum each time they resume.
Auto-suspend stops an idle warehouse. Auto-resume starts it back up when a query arrives. Leave both enabled.
When in doubt, start with an X-Small warehouse and resize only if queries are genuinely slow.
What’s Next
In Part 3, we’ll create our own database and tables, load some real data into Snowflake using the web interface’s file upload feature, and write queries against data we own. We’ll also look at Snowflake’s VARIANT column type — the gateway to querying JSON and other semi-structured data without needing to flatten it first.
This series is designed to be hands-on. If you ran into anything unexpected — a warehouse that wouldn’t start, an error message you didn’t recognize — drop it in the comments. Real errors make the best teaching moments.