Creating a connection pool for Snowflake
Have you ever tried to query Snowflake from your web application? You might have noticed that your queries are taking longer than you expected from the app. The reason for this is you are creating a new snowflake connection every time there is a request from your web app. This adds an additional overhead of ~2s for each request. In this article, we will see how to create a connection pool for snowflake so that you can utilize the same connections to make the queries rather than creating new connections for every request.
I’ve implemented the connection pool on NodeJS, but you may use the language of your choice and the concept will remain the same. So let’s get started!
- Installing the necessary libraries
~ npm install --save snowflake-sdk
~ npm install --save generic-pool
2. Create a file
snowflake.js and add the following code
Let us understand what we have done here
- We have first created a factory object. Inside which we are initializing create, destroy and validate functions that will be responsible to create, destroy and validate the Snowflake connection
- Then we have created an opts object. Here we have configured the connections pool as per our needs. You can read more about the configuration options here
3. Add the following code to utilize the connection
We can utilize the pool and query using a connection from the pool by importing the
snowflake.js file and calling the query function.
- The query function first acquires a connection from the connection pool
- Then it executes the query which is passed to the query function
Some findings on connection pool :
- Does creating new connections start the warehouse?
According to our testing, the warehouse didn’t start when we create a new connection (It only starts when we query the warehouse). That means creating new connections will not add extra credits to the warehouse.
- Does keeping the connections alive will cost you?
The cost for creating and maintaining the Snowflake connections depends on the service layer which handles most of the metadata operations. The cost for the services layer depends on your subscription and it is very little or free.
- How long should you keep the connections alive?
There is an option called
clientSessionKeepAlivewhich probably could be used to keep the connection alive forever. But since it queries the warehouse after a certain interval this option could cost you extra to keep the connections alive.
That's all there is for creating a connection pool. Let me know your thoughts and if you have any questions in the comments below!