AWS Hardening

Cloud Maturity Series — Posts 1–3 (LinkedIn Drafts)

Post 1: Start With the Account, Not the App

Most cloud problems don’t start in code.

They start in the AWS account itself.

 

Before VPCs.

Before CI/CD.

Before a single workload is deployed…

 

I lock down the account foundation:

• MFA on the root account

• Root access keys removed

• Break-glass strategy defined

• IAM users eliminated in favor of roles

• Clear ownership and access boundaries

 

Cloud maturity starts at the control plane, not the application layer.

 

#AWS #CloudArchitecture #IAM

#CloudSecurity #PlatformEngineering

#Infrastructure #OpenToWork

Post 2: Identity Is the Real Perimeter

Firewalls don’t protect cloud environments.

Identity does.

 

Once the account is secured, the next layer is IAM done right:

• Role-based access instead of long-lived users

• Least-privilege policies mapped to job function

• Explicit separation between human and workload identities

• MFA everywhere it makes sense

 

In AWS, IAM is the blast radius.

 

If identity is loose,

no amount of network segmentation will save you.

 

Get this layer right and everything downstream becomes safer:

deployments, automation, audits, and incident response.

 

Cloud security isn’t a tool problem.

It’s an identity design problem.

 

#AWSIAM #ZeroTrust #CloudSecurity

#DevSecOps #PlatformEngineering

#InfrastructureAsCode #OpenToWork

Post 3: Auditability Before Availability

Most cloud failures aren’t outages.

They’re untraceable changes.

 

Before scaling workloads or optimizing cost,

I establish control-plane observability.

 

That means always being able to answer:

• Who changed this?

• What changed?

• When did it happen?

• Can we prove it?

 

This layer is built with:

• CloudTrail — immutable API history

• AWS Config — resource state and drift detection

• GuardDuty — security signal, not just logs
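
One concrete payoff, as a sketch: with CloudTrail logs delivered to S3 and an Athena table defined over them ("cloudtrail_logs" is a placeholder name; the columns follow AWS's published CloudTrail schema), "who touched IAM this week?" becomes a single query:

-- IAM control-plane changes, last 7 days (Athena over CloudTrail logs)
SELECT
    eventtime,
    useridentity.arn AS actor,
    eventname        AS api_call
FROM cloudtrail_logs
WHERE eventsource = 'iam.amazonaws.com'
  AND eventtime >= to_iso8601(current_timestamp - interval '7' day)
ORDER BY eventtime DESC;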

 

Compliance benefits (SOC 2, ISO, PCI) are a side effect.

The real value is engineering control.

 

If you can’t reconstruct an incident,

you don’t truly control your environment.

 

Observability isn’t something you add later.

It’s something you start with.

 

#AWS #CloudEngineering #AuditLogging

#IncidentResponse #DevSecOps

#InfrastructureAsCode #OpenToWork

The Warehouse Sizing Paradox

The Warehouse Sizing Paradox: Why I Sometimes Choose XL Over Small
“Always use the smallest warehouse possible to save money.”
I heard this advice constantly when I started with Snowflake. It sounds logical—smaller warehouses cost less per hour, so naturally they should be cheaper, right?
Except the math doesn’t always work that way.
Here’s what I’ve observed:
Snowflake charges by the second, with a 60-second minimum. Cost scales linearly with warehouse size (each size up doubles both the compute and the credits per second), but performance often doesn’t scale the same way.
The actual formula is simple:
Total Cost = Credits per Second × Runtime in Seconds
A Small warehouse is 1/8 the cost per second of an XL, but if it takes 10x longer to complete the same query, you’re paying more overall.
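To make that concrete, a sketch using Snowflake’s standard credit rates (Small = 2 credits/hour, XL = 16 credits/hour) and hypothetical runtimes:
-- A 600-second query on Small vs. the same query finishing in 60 seconds on XL
SELECT
    2  / 3600.0 * 600 AS small_wh_credits,  -- ~0.33 credits
    16 / 3600.0 * 60  AS xl_wh_credits;     -- ~0.27 credits: bigger is cheaper here
If the XL only got it down to 120 seconds, it would cost ~0.53 credits and the Small would win. Finding that crossover is the whole game.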
When I’ve seen this matter most:
In my work with subscription and order data, certain query patterns consistently benefit from larger warehouses:
→ Customer lifetime value calculations across millions of subscribers
→ Daily cohort analysis with complex retention logic
→ Product affinity analysis joining order details with high SKU cardinality
→ Aggregating subscription events over multi-year periods
These workloads benefit dramatically from parallelization. An XL warehouse has 8x the compute resources of a Small (and 16x an XS), and for the right queries it can finish in close to 1/8th the time, sometimes even less when the smaller size was spilling to disk.
A simple experiment you can run:
-- Test with the Small warehouse
USE WAREHOUSE small_wh;

SELECT
    subscription_plan,
    customer_segment,
    COUNT(DISTINCT customer_id) as subscribers,
    SUM(order_value) as total_revenue,
    AVG(order_value) as avg_order_value,
    COUNT(DISTINCT order_id) as total_orders
FROM orders
WHERE order_date >= '2023-01-01'
GROUP BY subscription_plan, customer_segment
HAVING COUNT(DISTINCT customer_id) > 100;

-- Note the execution time and credits used in the query profile

-- Now test with the XL warehouse
USE WAREHOUSE xl_wh;
-- Run the same query
Check the query profile for each run:
→ Execution time
→ Credits consumed (execution time × the warehouse’s credits per second)
→ Total cost
The decision framework I use:
Size up when:
→ Query runtime exceeds 2 minutes on the current warehouse
→ The query profile shows high parallelization potential
→ You’re running the query repeatedly (daily pipelines)
→ Spillage to remote disk is occurring (see the query after this list)
Stay small when:
→ Queries are simple lookups or filters
→ Runtime is already under 30 seconds
→ The workload is highly sequential (limited parallelization)
→ It’s truly ad-hoc, one-time analysis
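One signal worth automating is that last “size up” trigger. A sketch for spotting spills, using columns that exist in ACCOUNT_USAGE.QUERY_HISTORY:
SELECT
    query_id,
    warehouse_name,
    warehouse_size,
    bytes_spilled_to_remote_storage,
    total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
    AND bytes_spilled_to_remote_storage > 0
ORDER BY bytes_spilled_to_remote_storage DESC
LIMIT 20;
Any query that shows up here is running out of memory for its size; moving it up a size often speeds it up by more than the size ratio.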
The nuance that surprised me:
It’s not just about individual query cost—it’s about total warehouse utilization. If your Small warehouse runs 10 queries in 100 minutes but an XL runs them in 20, you get 80 minutes of wall-clock time back, even if the XL burns somewhat more raw credits (the sketch below makes this concrete). That matters when you’re paying for auto-suspend delays, serving concurrent users, or counting the opportunity cost of slow results.
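A quick sanity check on those numbers (again using Small = 2 credits/hour and XL = 16 credits/hour):
-- Hypothetical batch: 10 queries; 100 minutes on Small vs. 20 minutes on XL
SELECT
    2  / 60.0 * 100 AS small_batch_credits,  -- ~3.3 credits, 100 min of wall clock
    16 / 60.0 * 20  AS xl_batch_credits;     -- ~5.3 credits, 20 min of wall clock
The XL batch actually costs more in raw credits; what you buy back is the 80 minutes. That’s why this is a utilization question, not just a price question.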
My practical approach:
I start with Medium for most workloads. Then I profile:
Queries consistently taking 3+ minutes → test on Large or XL
Queries under 1 minute → consider downsizing to Small
Monitor credit consumption patterns weekly
The goal isn’t to find the “right” size—it’s to match warehouse size to workload characteristics.
Want to test this yourself?
Here’s a quick query to see your warehouse credit consumption:
-- Per-warehouse credits live in WAREHOUSE_METERING_HISTORY
-- (QUERY_HISTORY has no per-query warehouse credit column)
WITH credits AS (
    SELECT warehouse_name, SUM(credits_used) AS total_credits
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
), queries AS (
    SELECT warehouse_name, COUNT(*) AS query_count,
           AVG(execution_time) / 1000 AS avg_seconds
    FROM snowflake.account_usage.query_history
    WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
        AND warehouse_name IS NOT NULL
    GROUP BY warehouse_name
)
SELECT c.warehouse_name, c.total_credits, q.query_count, q.avg_seconds,
       c.total_credits / NULLIF(q.query_count, 0) AS credits_per_query
FROM credits c
JOIN queries q USING (warehouse_name)
ORDER BY c.total_credits DESC;
This shows you which warehouses are consuming credits and whether you might benefit from right-sizing.
The counterintuitive truth:
The cheapest warehouse per hour isn’t always the cheapest warehouse per result. Sometimes spending more per second means spending less overall.
What’s been your experience with warehouse sizing? Have you found scenarios where bigger was actually cheaper?
#Snowflake #DataEngineering #CostOptimization #CloudDataWarehouse

**Temp tables, CTEs, or RESULT_SCAN? Here’s how I decide.**

Every time I’m building a data transformation in Snowflake, I ask myself the same three questions. Getting these right has become one of the most practical ways I optimize both performance and cost.

The problem? Most teams pick one pattern and use it everywhere. I did this too, early on—defaulting to temp tables for everything because they felt “safe.”

**Here’s the framework I’ve developed:**

**Question 1: How long do I actually need this data?**

→ Just iterating on analysis right now? **RESULT_SCAN**
– Reuse cached results from expensive queries
– Zero compute cost for subsequent filters/aggregations
– Perfect for exploration, but expires in 24 hours
– When: Refining a report, exploring segments, debugging

→ Need it for multiple operations in this session? **TEMP TABLES**
– Materialize once, reference many times
– Supports updates, joins, clustering
– Persists through the session
– When: Multi-step ETL, quality checks, complex workflows

→ Just organizing logic within one query? **CTEs**
– Maximum readability, zero storage
– Snowflake optimizes the entire plan together
– NOT materialized (this surprised me initially)
– When: Breaking down complex business logic

**Question 2: Is this ad-hoc or production?**

Ad-hoc analysis → Lean toward RESULT_SCAN and CTEs
Production pipeline → Temp tables for reliability and testability

**Question 3: Am I reusing this computation?**

If you’re filtering/joining the same expensive base query multiple ways, that’s your signal to materialize it somehow—either as a temp table or by leveraging RESULT_SCAN.

**What this looks like in practice:**

Imagine a daily reporting workflow:
- Expensive aggregation across billions of rows → TEMP TABLE (computed once)
- Logical transformations on that data → CTEs (readable, optimized)
- Stakeholders request variations → RESULT_SCAN (free iterations)

This hybrid approach combines reliability, readability, and cost efficiency.

**The shift in mindset:**

I stopped asking “which pattern should I use?” and started asking “what does this specific transformation actually need?”

Snowflake gives us different tools for different jobs. The art is knowing when each one fits.

**What decision-making frameworks have helped you optimize your Snowflake workflows?**

#Snowflake #DataEngineering #CloudDataWarehouse #CostOptimization
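
A minimal sketch of the RESULT_SCAN pattern from Question 1 (the orders table and its columns are hypothetical):

-- Run the expensive aggregation once
SELECT customer_id, SUM(order_value) AS lifetime_value
FROM orders
GROUP BY customer_id;

-- Re-slice the cached result as often as you like, with no extra compute
SELECT *
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE lifetime_value > 1000;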