Snowflake Cloud Data Warehouse

A truly elastic, scalable cloud data warehouse

Snowflake is a cloud data warehouse offered as Software-as-a-Service (SaaS) on multiple clouds (AWS and Azure) for analytics workloads. It is a cloud service similar to Amazon Redshift and, to some extent, Google BigQuery. Lately I have been hearing a lot of good things about it, and a lot of companies are moving to it, so I decided to dig a little deeper.

Snowflake's architecture has three major components: database storage, query processing (virtual warehouses), and cloud services.

Your requirements may differ, but overall I found these to be a few of the many reasons why so many people are moving to Snowflake.

But wait, all of this is already available in Amazon Redshift, and I can get similar features in any other cloud data warehouse solution.

There is no dark magic involved in improving the efficiency of your queries. Depending on whom you ask, this can be considered a standout feature or a major hindrance. I am not a fan of tuning queries to fit my workload, because data evolves so quickly in organizations that it becomes tricky to keep up and turn all the necessary knobs to make a query faster.

Snowflake claims it tunes all queries “automagically” via a dynamic query optimization engine. There is no need for indexes, statistics updates, partition keys, or pre-sharding data for even distribution when you scale up. All of this is handled by its patent-pending dynamic optimization.

Still, I feel Snowflake could provide the necessary knobs for people who would like to tune their queries.

The underlying file system can be either Amazon S3 or Azure storage, which gives us the choice of hosting the data warehouse on AWS or Azure. It also means Snowflake benefits from the storage and I/O throughput guarantees these cloud vendors already provide. All data is encrypted, compressed, and distributed to optimize performance.

Virtual warehouses are clusters of compute resources in Snowflake. Since storage is decoupled from compute, a virtual warehouse is a set of stateless worker nodes: they keep a cache of some data to improve query performance, but the actual data is never stored on them. We can pick, size, and scale these worker nodes to fit the workload, and they simply execute queries on demand.

Another big advantage is that multiple virtual warehouses can run against the same data in cloud storage.
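
To make the storage/compute split concrete, here is a toy Python model. It is not Snowflake code: the class names, method names, and warehouse names are all invented for illustration, and the real system is far more involved. The sketch only shows the idea that two warehouses read one shared store, each keeps a disposable local cache, and throwing a warehouse away loses no data.

```python
class SharedStorage:
    """Stands in for the cloud object store (e.g. Amazon S3 or Azure storage)."""

    def __init__(self):
        self._tables = {}
        self.reads = 0  # how many times compute had to fetch from storage

    def write(self, table, rows):
        self._tables[table] = list(rows)

    def read(self, table):
        self.reads += 1
        return list(self._tables[table])


class VirtualWarehouse:
    """Hypothetical stateless compute cluster: holds a cache, never the data of record."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.cache = {}

    def count_rows(self, table):
        if table not in self.cache:  # cache miss: fetch from shared storage
            self.cache[table] = self.storage.read(table)
        return len(self.cache[table])


storage = SharedStorage()
storage.write("orders", [{"id": 1}, {"id": 2}, {"id": 3}])

# Two independent warehouses pointed at the same stored data.
etl_wh = VirtualWarehouse("etl_wh", storage)
bi_wh = VirtualWarehouse("bi_wh", storage)

etl_count = etl_wh.count_rows("orders")
bi_count = bi_wh.count_rows("orders")
bi_count_again = bi_wh.count_rows("orders")  # served from bi_wh's own cache

del etl_wh  # discarding compute does not touch the stored data
rows_still_there = len(storage.read("orders"))
```

Each warehouse pays for its own first read, the repeated query is answered from cache, and the table is still intact after one warehouse is gone.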

Snowflake automatically suspends a warehouse if it is inactive for a specified period of time. Auto-suspend is enabled by default and kicks in after 10 minutes of inactivity.

Auto-resume automatically resumes a warehouse when a statement that requires a warehouse is submitted and that warehouse is the current warehouse for the session. This is also enabled by default.
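
The suspend and resume behaviour described above can be pictured as a small state machine. The following is only a sketch with a simulated clock in seconds; `ToyWarehouse`, `tick`, and `submit` are made-up names, and Snowflake's real scheduling is not exposed like this.

```python
class ToyWarehouse:
    """Toy model of a warehouse with an idle timeout (names are hypothetical)."""

    def __init__(self, auto_suspend_seconds=600, auto_resume=True):
        self.auto_suspend_seconds = auto_suspend_seconds  # 600 s = 10 minutes
        self.auto_resume = auto_resume
        self.running = True
        self.last_activity = 0  # simulated time of the last statement

    def tick(self, now):
        """Suspend the warehouse if it has been idle for the whole timeout."""
        idle = now - self.last_activity
        if self.running and idle >= self.auto_suspend_seconds:
            self.running = False

    def submit(self, now):
        """Submit a statement at time `now`; return True if it could run."""
        self.tick(now)
        if not self.running:
            if not self.auto_resume:
                return False  # stays suspended until resumed by hand
            self.running = True  # auto-resume on demand
        self.last_activity = now
        return True


wh = ToyWarehouse()
wh.submit(0)
wh.tick(599)
running_before_timeout = wh.running   # still inside the 10-minute window
wh.tick(600)
running_after_timeout = wh.running    # idle for 600 s, so suspended
resumed = wh.submit(700)              # the next statement wakes it up

manual = ToyWarehouse(auto_resume=False)
manual.tick(600)
ran_without_resume = manual.submit(601)
```

With auto-resume switched off in this model, the statement submitted to the suspended warehouse is simply rejected, which is the situation the default setting avoids.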

You can control both settings when you create your virtual warehouse.
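
As a rough illustration, the helper below builds a `CREATE WAREHOUSE` statement that sets these options. `WAREHOUSE_SIZE`, `AUTO_SUSPEND` (in seconds), and `AUTO_RESUME` are Snowflake warehouse properties; the helper function itself, the warehouse name `reporting_wh`, and the chosen values are assumptions for the example, so check the generated SQL against the Snowflake documentation before running it.

```python
def create_warehouse_sql(name, size="XSMALL", auto_suspend_seconds=600, auto_resume=True):
    """Build a CREATE WAREHOUSE statement as a string (illustrative helper only)."""
    # Python's identifier check is a rough stand-in for "safe unquoted SQL name".
    if not name.isidentifier():
        raise ValueError(f"unsafe warehouse name: {name!r}")
    if auto_suspend_seconds < 0:
        raise ValueError("auto_suspend_seconds must not be negative")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH "
        f"WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_seconds} "
        f"AUTO_RESUME = {'TRUE' if auto_resume else 'FALSE'}"
    )


# Hypothetical warehouse that suspends after 5 idle minutes instead of 10.
sql = create_warehouse_sql("reporting_wh", auto_suspend_seconds=300)
```

The resulting string can be pasted into a worksheet or passed to whichever client you use to run statements against Snowflake.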

There are other interesting things like micro-partitions and data clustering, which allow users to tune some queries to work around the fact that there are no indexes or partitions. There is also Snowpipe, Snowflake’s continuous data ingestion service, which is an alternative to bulk loads.

We can delve into those aspects some other day.

There are a lot of other cool features in Snowflake that make it worth considering for your next cloud data warehouse. They also offer a $400 credit if you are planning to do a POC.

Thanks for reading! Please do share the article, if you liked it. Any comments or suggestions are welcome! Check out my other articles here.
