5 strategies for Snowflake cost management

5 strategies for practicing Snowflake cost management while still enjoying all the platform’s advantages

Article by Mick Wagner

When organizations make the decision to implement the Snowflake data warehouse as a service (DWaaS) solution, cost management usually ranks high among the chief reasons. Avoiding the high costs of building and maintaining an on-prem platform and the risks of over-paying for unused system resources can certainly be attractive incentives. Even so, we often hear of cases of “sticker shock” when the first Snowflake bill arrives, or when administrators notice their credits dwindling at alarmingly fast rates.

Yes, Snowflake can enable organizations to lower overall data warehousing costs — if the solution is used in a cost-effective manner. The pay-as-you-go pricing model is comparable to using Uber as opposed to owning your own vehicle: you avoid the expense of hardware and maintenance, but every use incurs a discrete cost. If those uses are not prudently executed, tightly controlled, and closely monitored, any hoped-for overall cost savings can quickly dissipate.

Fortunately, many of the keys to more effective Snowflake cost management are built into Snowflake itself, and knowing how to use these features can mean the difference between cost savings and a budgetary fiasco. The following five strategies can be particularly effective in preventing credits from slipping away and costs from spiraling out of control.

1. Use Resource Monitors and monitor queries with Snowflake’s Query Profile tool.

2. Set timeouts for workloads and accounts.

3. Scale your virtual warehouse appropriately and use auto-suspend/auto-resume.

Also, by creating virtual warehouses for specific use cases, you’ll be able to customize them for each case. For example, you can set up ELT warehouses to auto-suspend immediately after completion, since they’re no longer needed after you run them. Report warehouses, on the other hand, should have some time built in before suspending (5 to 10 minutes is a good starting point), as people are running reports and the cached data in both the result cache and warehouse cache allows for quicker data retrieval.

You may also want to set up a SQL script to change the warehouse size at certain times of day based on user consumption patterns. There may be a heavy use period in the morning, for example, when people get into the office and run reports, which could warrant a larger warehouse; in the afternoon, when demand decreases, a smaller warehouse may be all you need.

4. Model your data appropriately.

Remember to model your data to answer business questions. Star schemas are commonly used for this purpose.

You’ll also want to pre-aggregate data wherever possible, since storage is cheaper than compute. This will not necessarily work for near-real-time data, but if you are batching data several times a day, it will help keep your compute costs down, as your BI tool will not have to query as much data to present summary information.

Also review your SQL to avoid DISTINCT, GROUP BY, or ORDER BY if they are not necessary.

Create cluster keys on fields that are used for filtering (like dates and/or codes) or that are used in joins. Remember, you can create a cluster key on more than one field.

Cluster keys can help partition the data better in Snowflake micropartitions, which will provide better pruning of unnecessary data from a query.

Snowflake is one of the latest “as a service” innovations that enable businesses to take full advantage of all the cloud offers, including the potential for overall cost savings over on-prem solutions. By implementing these five strategies, your organization can enjoy flexibility, data sharing, availability, performance, speed, and other benefits of using Snowflake, while still keeping your budget in check.

Ready to put your data to work for you? We can help. Visit our page here: https://www.logic2020.com/services/advanced-analytics

Enabling clarity through business and technology solutions.