In Part 2, we explored Snowflake’s built-in sample data and learned how virtual warehouses power our queries. But in the real world, you aren’t just querying someone else’s data—you’re bringing in your own.
Today, we move from “read-only” to “read-write.” We will create our own storage hierarchy, upload a CSV file directly through the browser, and then tackle Snowflake’s “superpower”: the ability to query JSON data as easily as a standard table.
1. Building the Foundation
Before we can load anything, we need a container. Run these in a new worksheet to create your first custom database and schema. Note that we are using the SYSADMIN role here, which is the standard role for creating objects.
USE ROLE SYSADMIN;
CREATE DATABASE MY_FIRST_DB;
CREATE SCHEMA MY_FIRST_DB.STAGING;
-- Verify it worked
SHOW SCHEMAS IN DATABASE MY_FIRST_DB;
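Once the schema exists, you can set it as your worksheet's session context so the statements that follow don't need fully qualified names (a small convenience sketch):

```sql
-- Point the session at the new schema so unqualified
-- table names resolve to MY_FIRST_DB.STAGING
USE DATABASE MY_FIRST_DB;
USE SCHEMA STAGING;
```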
2. Loading a CSV (The Easy Way)
For files under 250 MB, you don’t need a complex pipeline; the Snowsight UI can load them directly in the browser.
First, let’s create a destination table for some imaginary “App Users”:
CREATE OR REPLACE TABLE MY_FIRST_DB.STAGING.USERS (
USER_ID NUMBER,
FIRST_NAME STRING,
LAST_NAME STRING,
EMAIL STRING,
SIGNUP_DATE DATE
);
To load your data:
* Navigate to Data → Databases in the left sidebar.
* Drill down: MY_FIRST_DB → STAGING → Tables → USERS.
* In the top right, click Load Data.
* Drag and drop your CSV file.
* When prompted for a File Format, you can create a “Default CSV” format in the wizard—just ensure “Header lines to skip” is set to 1 if your CSV has column names.
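Under the hood, the wizard is doing roughly what you could script yourself. As a sketch of the SQL equivalent (the stage, format, and file names here are illustrative assumptions, not something the wizard requires):

```sql
-- A named file format matching the wizard's "Default CSV" settings
CREATE OR REPLACE FILE FORMAT MY_FIRST_DB.STAGING.CSV_FORMAT
  TYPE = 'CSV'
  SKIP_HEADER = 1;

-- An internal stage to hold the uploaded file
CREATE OR REPLACE STAGE MY_FIRST_DB.STAGING.CSV_STAGE
  FILE_FORMAT = MY_FIRST_DB.STAGING.CSV_FORMAT;

-- After uploading the file (e.g. via PUT in SnowSQL),
-- copy it into the target table
COPY INTO MY_FIRST_DB.STAGING.USERS
  FROM @MY_FIRST_DB.STAGING.CSV_STAGE/users.csv;
```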
3. The VARIANT Superpower: Querying JSON
Traditional databases often require you to “flatten” JSON into rows and columns before you can even query it. Snowflake handles this differently with the VARIANT data type, which stores JSON in its native nested form while still letting you query it with plain SQL.
Let’s create a table for raw JSON data:
CREATE OR REPLACE TABLE MY_FIRST_DB.STAGING.RAW_LOGS (
EVENT_DATA VARIANT
);
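To have something to query, you can insert a sample record. Note that Snowflake requires INSERT ... SELECT (rather than VALUES) when calling PARSE_JSON; the event shown here is just made-up sample data:

```sql
-- PARSE_JSON converts a JSON string into a VARIANT value
INSERT INTO MY_FIRST_DB.STAGING.RAW_LOGS (EVENT_DATA)
SELECT PARSE_JSON(
  '{"device": "mobile", "event": "login",
    "user_info": {"id": 101, "location": "NY"}}'
);
```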
If you have a JSON record like {"device": "mobile", "event": "login", "user_info": {"id": 101, "location": "NY"}}, you can query it directly using colon-and-dot path notation:
SELECT
EVENT_DATA:device::STRING AS device_type,
EVENT_DATA:user_info.location::STRING AS city
FROM MY_FIRST_DB.STAGING.RAW_LOGS;
The single colon (:) retrieves a top-level key, dot notation drills into nested objects, and the double colon (::) casts the result to a specific data type.
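Path notation also extends to arrays. As a sketch, suppose an event contained a hypothetical "tags" array like {"tags": ["new", "ios"]}; the LATERAL FLATTEN table function expands each element into its own row:

```sql
-- Turn each element of the (hypothetical) tags array into a row
SELECT f.value::STRING AS tag
FROM MY_FIRST_DB.STAGING.RAW_LOGS,
     LATERAL FLATTEN(INPUT => EVENT_DATA:tags) f;
```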
4. Zero-Copy Cloning: The “Save Game” Feature
One of the most useful features for beginners is Cloning. Suppose you are about to run a massive update on your USERS table and want a safety net. You can create an instant “Clone” of the entire table:
CREATE TABLE USERS_BACKUP CLONE USERS;
This doesn’t copy the data (so it doesn’t cost extra storage initially); it just creates a new pointer to the existing data. It’s a perfect “Save Game” point for your data.
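If the big update does go wrong, one common recovery pattern (a sketch; the clone must have been taken before the bad update) is to swap the backup back in:

```sql
-- Atomically exchange the contents of the two tables,
-- making the backup the live USERS table
ALTER TABLE USERS SWAP WITH USERS_BACKUP;
```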
What You Now Know
* Object Creation: You know how to build your own Databases and Schemas.
* Web Loading: You can move data from your local machine into the cloud without external ETL tools.
* Semi-Structured Power: You’ve seen how the VARIANT type allows Snowflake to act like a NoSQL database when needed.
* Cloning: You understand how to protect your work with instant, metadata-only clones.
What’s Next?
Now that you have data in Snowflake, how do you keep it clean automatically? In Part 4, we will dive into Tasks and Streams—Snowflake’s built-in automation tools that allow you to transform data the second it arrives.