❄️ Snowflake Data Warehouse — V2 Guide
Export your CRM data as Snowflake-ingestion-ready files. CSV or JSONL, date-range filtered, with starter DDL. No JDBC, no streaming sync, nothing to break during an API version bump. Same export-first pattern that made our QuickBooks V2 ship.
Overview
Snowflake Data Warehouse V2 emits clean files; you handle the upload to Snowflake on your side. Pick a type (customers, interactions, transactions), date range, format (CSV or JSONL), preview, download. Then PUT the file to your Snowflake stage and COPY INTO the target table.
Three operations on the Snowflake side, all standard:
-- 1. Stage your file
PUT file:///path/to/rubipro_customers_*.csv @your_stage AUTO_COMPRESS=TRUE;
-- 2. Load it
COPY INTO RUBIPRO_CUSTOMERS FROM @your_stage/rubipro_customers_*.csv
FILE_FORMAT = (TYPE='CSV' FIELD_OPTIONALLY_ENCLOSED_BY='"' SKIP_HEADER=1
EMPTY_FIELD_AS_NULL=TRUE NULL_IF=('NULL', ''));
-- 3. Verify
SELECT COUNT(*), MAX(created_at) FROM RUBIPRO_CUSTOMERS;
What's new in V2
Export-first architecture
Clean files instead of streaming sync. Tenants get reliability — exports are static artifacts, every download is auditable, and there's nothing in the path that could break when Snowflake bumps an API version. Your warehouse engineers handle the COPY INTO with whatever orchestration tooling they already use.
Three canonical export types
Customers, interactions, and transactions — the three highest-value data shapes for analytics. Each is tenant-scoped and date-range filterable. Up to 50k customers / 100k interactions / 100k transactions per export.
CSV and JSONL formats
CSV for column-typed Snowflake tables (RFC 4180 quoted, header row, NULL-handling-friendly). JSONL for VARIANT-typed loads (one JSON object per line, ready for STRIP_OUTER_ARRAY=FALSE).
Starter DDL download
One click → download a .sql file with CREATE TABLE IF NOT EXISTS for all three export types. Adjust column types if your downstream queries need different precision; otherwise paste-and-run in Snowsight.
Preview before download
The preview button hits the same endpoint with mode=preview, returns the first 10 rows as JSON, no file written. Sanity-check the export shape and row count before committing.
Audit-logged history
Every export attempt — success, preview, error — logs to snowflake_exports with row count, byte size, file name, error message. Visible in the dashboard's Recent Exports table and the sysadmin state viewer.
Export types
Customers
Full CRM customer list. One row per customer.
Columns: id, tenant_id, first_name, last_name, email, phone, status, address_line1, address_line2, city, state, zip_code, country, created_at, updated_at
Filter: by created_at (date range). Cap: 50,000 rows per export.
Interactions
Canonical event stream. Every customer interaction the CRM has logged.
Columns: id, tenant_id, customer_id, agent_id, channel, direction, status, notes, duration_seconds, start_time, end_time
Filter: by start_time. Cap: 100,000 rows per export.
Transactions
All payment_attempts (every status — succeeded, declined, failed, refunded).
Columns: id, tenant_id, customer_id, agent_user_id, amount, currency, status, payment_method, gateway, gateway_transaction_id, card_brand, card_last4, description, created_at
Filter: by created_at. Cap: 100,000 rows per export.
CSV vs JSONL
Both formats carry the same data. Pick based on your downstream Snowflake schema:
- CSV — RFC 4180-quoted, header row, UTF-8. Best for Snowflake tables with
NUMBER/VARCHAR/TIMESTAMP_NTZcolumns. Faster to query, stricter on types. UseFIELD_OPTIONALLY_ENCLOSED_BY='"'in the COPY INTO file_format. - JSONL — JSON Lines, one object per line. Best for tables with
VARIANTcolumns or when you want Snowflake's semi-structured query operators (flatten(),$col:field). More flexible, slightly slower to query. UseTYPE='JSON' STRIP_OUTER_ARRAY=FALSEin the file_format.
Starter DDL
The "Download starter DDL" button on the module dashboard hands you a .sql file with three CREATE TABLE statements. Run it in your target Snowflake database/schema once.
-- Excerpt — full file is generated per-tenant with proper header
CREATE TABLE IF NOT EXISTS RUBIPRO_CUSTOMERS (
id NUMBER(38, 0),
tenant_id NUMBER(38, 0),
first_name VARCHAR(100),
last_name VARCHAR(100),
email VARCHAR(255),
phone VARCHAR(50),
status VARCHAR(50),
address_line1 VARCHAR(255),
address_line2 VARCHAR(255),
city VARCHAR(100),
state VARCHAR(50),
zip_code VARCHAR(20),
country VARCHAR(100),
created_at TIMESTAMP_NTZ,
updated_at TIMESTAMP_NTZ
);
All three table DDLs are in the file plus recommended COPY INTO recipes as comments.
PUT + COPY INTO recipes
CSV
PUT file:///path/to/rubipro_customers_*.csv @your_stage AUTO_COMPRESS=TRUE;
COPY INTO RUBIPRO_CUSTOMERS FROM @your_stage/rubipro_customers_*.csv
FILE_FORMAT = (TYPE='CSV' FIELD_OPTIONALLY_ENCLOSED_BY='"' SKIP_HEADER=1
EMPTY_FIELD_AS_NULL=TRUE NULL_IF=('NULL', ''))
ON_ERROR = CONTINUE;
JSONL
PUT file:///path/to/rubipro_interactions_*.jsonl @your_stage AUTO_COMPRESS=TRUE; COPY INTO RUBIPRO_INTERACTIONS FROM @your_stage/rubipro_interactions_*.jsonl FILE_FORMAT = (TYPE='JSON' STRIP_OUTER_ARRAY=FALSE) MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE ON_ERROR = CONTINUE;
Scheduling exports
V2 ships with manual exports. For most tenants, a daily run is enough — set the date range to "yesterday's date" before downloading and you get clean incremental loads.
Common cadences:
- Daily — for high-volume contact centers. Set From/To to yesterday, run before BI dashboards refresh.
- Weekly — for most tenants. Set From to the day after last week's run.
- Monthly — for tenants whose analytics is monthly anyway.
V2.1 will expose a public API endpoint for scheduled exports (with API key auth) so you can wire this into your existing data-pipeline tooling.
Audit log
Every export logs to snowflake_exports — tenant_id, user_id, export_type, format, date range, row count, file size, status (success/preview/error), error message, created_at. Visible to tenant admins on the dashboard's Recent Exports table and to sysadmins on the state viewer at /admin/view-snowflake-state.php.
Known limitations (V2)
- Export size caps — 50k customers, 100k interactions/transactions per export. Larger exports need to be split via date-range filters.
- No live sync — by design. V1's streaming connector is gone. If you need real-time, use Snowflake's external functions to poll our REST API instead.
- No API endpoint for scheduled exports yet — V2.1 will add this. Today, exports are tenant-admin-UI driven only.
- No incremental detection — every export is full within its date range. Set the date range yourself for incremental loads.
- JSONL doesn't compress on the RubiPro side — Snowflake's
PUT AUTO_COMPRESS=TRUEhandles compression at upload time.
Frequently asked questions
Why export instead of live sync?
What's in each export type?
CSV or JSONL?
Starter DDL — what's in it?
How big can exports get?
Tenant isolation?
Where's the audit log?
Snowflake credentials?
Can I automate?
Date range — how does it work?
How much does this cost?
Ready to load your CRM data into Snowflake?
Activate, download starter DDL, run CREATE TABLE, export, COPY INTO. ~15 minutes to first table.