In a data world where AI is all the rage, AI readiness remains a real challenge because of fundamentals like data governance and data quality. dbt Labs' 2025 State of Analytics Engineering Report made the same point: fundamentals still matter most, with 56% of data practitioners citing poor data quality as a top challenge.
When data teams run into data quality challenges, the root cause is usually the same: the absence of a well-designed data model and transformation layer. In this blog post, I’m going to share how we helped a national pet‑insurance company build a centralized, scalable data transformation layer. As a result, the company had governed, scalable business logic, data analysts were free to focus on more interesting problems, and business teams became more self-serve in navigating their data.
Our client was at a real inflection point when we were introduced. More data was flowing into BigQuery via Airbyte, technical debt in the data pipelines kept growing, and the Data Analyst was drowning in ad-hoc requests that required very quick turnaround. The same business-critical questions were being asked month over month, and they should have been easy to answer.
Numbers were consistently unreliable, and it became difficult to report trusted figures both internally and externally to the board and potential investors. The data environment was quickly becoming a hot mess, and the business had growing concerns about its investment in data and why it was taking so long to get answers.
Peeling the onion one layer deeper, the data team had invested in a BI tool that they were using both to transform data and to build dashboards. The problem is that this approach does not scale: as the number of transformations and dashboards grew, the environment became unmanageable, and the Data Analyst had less and less time to deliver valuable business insights.
Performance issues began to emerge, making it difficult to load dashboards and make decisions quickly. Data was being flagged as incorrect, untrustworthy, and unusable by consumers rather than being caught beforehand (e.g., were insurance plan statuses captured correctly, was enrollment data fully synced for each customer?). This resulted in an influx of ad-hoc requests that were solved in a hasty, reactionary way. Duplicated datasets arose, inconsistent metrics were created, and time was wasted trying to manage an increasingly complex system.
With an unmanageable BI environment and a lack of best practices, the organizational problems grew alongside the technical ones. There was no documentation, which made onboarding new hires (on both the data and non-data teams) very difficult. Existing team members didn’t know where to find information or how metrics were defined. When numbers didn’t match expectations, there was no way of knowing why: what were the limitations of the data, what assumptions existed, was the dashboard just stale?
The business team and the data team were no longer on the same page. There was a huge communication gap around metric definitions, data lineage, and report ownership, because the teams had never worked together to define them. Stakeholders began hunting for answers across different platforms and building Excel spreadsheets that quietly fell out of date.
We see this problem all the time. At an early stage, spreadsheets work; then the company needs something slightly more sophisticated, which typically means investing in a BI tool and a data analyst. Because the company is still relatively small, logic and data models start to live in the BI tool. Data practitioners get locked into a vicious cycle with their business stakeholders, juggling dashboard maintenance, ad-hoc requests, and “data modeling” in BI. Once the company is no longer small, those processes simply stop working. The good news is there is a pathway forward! In this case, the leadership team had the courage to recognize the failing processes and an appetite for a better way, not only to scale as a company, but also to create a better experience when reporting to the board and potential investors. This is when Data Culture was brought in.
Data Culture proposed a simple yet powerful shift: move transformations out of the BI tool and into a centralized layer in BigQuery. Our tool of choice? dbt. Documentation was also introduced as a way to communicate better across teams and reduce the time needed for new-hire onboarding.
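To make the shift concrete, here is a minimal sketch of a dbt staging model, assuming the Airbyte-synced tables in BigQuery are declared as dbt sources. The source, table, and column names below are hypothetical, not the client's actual schema:

```sql
-- models/staging/stg_policies.sql (hypothetical names, for illustration)
-- Standardizes a raw, Airbyte-synced table once, so every downstream model
-- starts from the same cleaned columns.
with source as (

    select * from {{ source('airbyte_raw', 'policies') }}

),

renamed as (

    select
        cast(policy_id as string)        as policy_id,
        cast(customer_id as string)      as customer_id,
        lower(trim(plan_status))         as plan_status,
        cast(enrolled_at as timestamp)   as enrolled_at,
        cast(monthly_premium as numeric) as monthly_premium
    from source

)

select * from renamed
```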
To give stakeholders a more scalable, accessible, and trusted way to view reports in Sigma Computing, we took a layered approach: land raw data in BigQuery via Airbyte, model it centrally in dbt, and expose curated, documented datasets to Sigma.
The gains from leveraging a transformation layer were huge! The environment was simpler, the data was more reliable, and Sigma’s performance improved. As more repeatable use cases came up, it was easy to expand the models.
Simplicity - Reporting models built in dbt handled the cleaning and processing of data, creating datasets that were ready for analysis. Additionally, it became easier to trace the lineage of metrics and make business logic changes in one place.
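For illustration (model and column names are hypothetical), a reporting model like the one below keeps the business logic in one place, traceable through ref(), instead of scattered across dashboards:

```sql
-- models/marts/fct_enrollments.sql (illustrative)
-- Business logic lives here, once, instead of in each dashboard.
-- stg_customers is a second hypothetical staging model.
select
    p.policy_id,
    p.customer_id,
    c.customer_name,
    p.plan_status,
    p.enrolled_at,
    p.monthly_premium,
    p.plan_status = 'active' as is_active_enrollment
from {{ ref('stg_policies') }} as p
left join {{ ref('stg_customers') }} as c
    on p.customer_id = c.customer_id
```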
Reliability - Instead of analyzing raw data or manual spreadsheets directly in BI, dbt centralized the business logic, which made the data more reliable. Anomalies could be detected early, and metrics were defined in one place using agreed-upon business logic. With dbt testing in place, data quality issues were flagged before they ever reached the VP of Finance.
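A sketch of what that testing can look like with dbt's built-in generic tests; the column names and accepted status values here are assumptions for illustration, not the client's actual rules:

```yaml
# models/marts/schema.yml (excerpt; names and statuses are illustrative)
version: 2

models:
  - name: fct_enrollments
    columns:
      - name: policy_id
        tests:
          - not_null
          - unique
      - name: plan_status
        tests:
          - not_null
          - accepted_values:
              values: ['active', 'lapsed', 'cancelled']
```

Running `dbt test` (or `dbt build`) on every refresh means a bad sync or an unexpected status fails in the pipeline rather than surfacing in a dashboard.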
Speed - With dbt, the performance and load times of reports in Sigma improved significantly. Additionally, the data analyst no longer had to build datasets from scratch or rebuild existing ones every time the business changed, making it faster to answer common business questions.
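One common way to get that speed-up is materialization: reporting models are persisted as tables in BigQuery, so Sigma queries precomputed results instead of re-running transformation logic on every dashboard load. A minimal sketch, with a hypothetical project and folder layout:

```yaml
# dbt_project.yml (excerpt; project name and folders are hypothetical)
models:
  pet_insurance:
    staging:
      +materialized: view    # lightweight, rebuilt cheaply
    marts:
      +materialized: table   # Sigma reads precomputed tables
```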
Scalability & Transparency - with DBT in place, as the business continued to scale, it became easier to incorporate changes in business operations to the model and expand into new ones. This also avoided the old approach of creating new datasets for new analysis. This is because DBT provided a set of fundamental tables to build from, ensuring validation across all datasets and minimizing net new build time.
Better Communication - Documentation was available for both technical and non-technical stakeholders. We leveraged dbt docs, Confluence, and the dashboards themselves to add clarity on the data stakeholders were using. This made a huge difference in onboarding team members, trust in reporting, and change management.
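On the dbt side, descriptions live right next to the models, and `dbt docs generate` turns them into a browsable site with lineage; Confluence covered the non-technical context. A short, illustrative sketch (the descriptions are examples, not the client's actual definitions):

```yaml
# models/marts/schema.yml (same illustrative file, now with descriptions dbt docs can render)
version: 2

models:
  - name: fct_enrollments
    description: >
      One row per policy. Source of truth for enrollment reporting in Sigma;
      refreshed with each dbt run.
    columns:
      - name: plan_status
        description: "Normalized plan status (active, lapsed, cancelled)."
      - name: is_active_enrollment
        description: "True when the plan is currently active; used for active-customer counts."
```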
For this pet insurance company, implementing a data strategy and a transformation layer gave the organization quality up front and a clear understanding of its business processes. It allowed the team to take a proactive approach to data: optimizing operations, deepening customer and plan insights, and driving sustainable growth. The company is now set up to scale over time, including expanding its modeling capabilities and incorporating AI technology.