If you work in data, this may sound familiar: monthly reviews are around the corner, a key teammate is OOO, and a BI dashboard contains logic that needs updating for an upcoming board meeting. The process is undocumented, pipelines are broken, and when things do get fixed, the numbers suddenly “look wrong” with no explanation. The business scrambles to get these problems solved, and fast. In this scenario, you can either keep slapping on duct tape to save the day, or you can stop everything and take the time to build the scalable data model you know is desperately needed. So, how do you decide what to do?
Data Landscape Going In
We recently worked with a healthcare organization that was early in its data journey. The team was leveraging manual data files from vendors to report on operational metrics, and the metric logic and transformations lived in the BI layer. Everyone knew this was not best practice, but it was effective for the current state. It was clear to the Head of Data and the VP of Business Operations that the organization needed to invest in a more sophisticated data layer, especially with upcoming clinic migrations to Athena. The team was looking to build foundational, standardized data models that could scale with the company.
The Fork In The Road
Our team at Data Culture came in as a strategic partner to build data capabilities. Our goal was to build foundational and standardized data models, revamp reporting, and share actionable insights. It was important to move fast in our development, parsing out the minimum requirements for now versus features that could be added later. We worked with business stakeholders and the Head of Data to build a 12+ week roadmap, sharing a game plan, milestones, and outcomes for the course of the engagement.
However, we immediately hit a roadblock when one of our primary stakeholders went on leave and a monthly report needed to be updated. Processes were fragile and not fully documented. It became clear that the data in the manual files changed month to month, metric definitions were unclear, historical data could not be mapped to new data feeds, and the pipeline was not refreshing as expected.
The team decided to pause the data roadmap and focus solely on stabilizing the current process (patchwork). This was a defensible choice, as board meetings don’t wait. However, our team also knew that if we only stabilized reports and never progressed on the model, we would be stuck in an endless loop of ad-hoc work, and the challenges with manual processes would never be solved for. So, how did we decide which path to take?
Stabilizing Current Processes While Building For The Long Term
Our team emphasized the importance of splitting our time 85/15 (patchwork/long-term). It was important to stabilize the immediate pain points by replicating processes outside of BI and centralizing them in the warehouse. In parallel, it was also important to continue incremental work on the long-term model and reduce the friction that comes with manual data. We also needed to document metrics, assumptions, and limitations to provide clarity for anyone consuming the reports. This looked like:
- Taking a lift-and-shift approach, moving BI logic into Snowflake. Reproducing reports with identical numbers reduced friction with stakeholders and allowed for data auditing (see the reconciliation sketch after this list).
- Capturing history by creating a manual-file union table. New manual files were replacing old files in the code, which meant data could be added, removed, or updated with no way to track file changes month over month. A union table gave us explainability for the data (see the union-table sketch after this list).
- Documenting metrics. To provide clarity for the business, and free up time for the Head of Data, each metric was clearly defined in its current state. This included the date the metric was anchored to, the filters applied to it, and how data was bucketed (for example, which CPT codes constituted new versus returning visits).
- Building a lightweight, iterative identity-resolution plan. Rather than creating a full dimensional model for patients up front, we iterated on the features available to tie patients across the new and historical systems (see the matching sketch after this list). The most important thing at this step was to document where the gaps were, what was coming next, and when.
- Prioritizing foundational modeling for the highest-value metrics. Working on the long-term model even a little allowed us to stay ahead of incoming requests that required more data modeling work.
- Maintaining communication every step of the way. Given the time-sensitive nature of the reports, we provided updates twice a week to show progress, communicate timelines, and explain gaps and next steps.
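To make the lift-and-shift concrete, here is a minimal reconciliation sketch in Python with pandas. The file names and columns (report_month, clinic_id, metric_name, value) are hypothetical stand-ins, not the client’s actual schema; the point is simply to prove the warehouse version matches the BI version row for row:

```python
import pandas as pd

# Hypothetical inputs: a CSV exported from the BI tool and the
# rebuilt metric pulled from the warehouse model.
bi_report = pd.read_csv("bi_export_monthly_visits.csv")
warehouse = pd.read_csv("warehouse_monthly_visits.csv")

# Align both outputs on the same grain before comparing.
keys = ["report_month", "clinic_id", "metric_name"]
merged = bi_report.merge(
    warehouse, on=keys, how="outer",
    suffixes=("_bi", "_wh"), indicator=True,
)

# Flag rows where the two systems disagree, or where one side is missing.
mismatches = merged[
    (merged["_merge"] != "both")
    | (merged["value_bi"].round(2) != merged["value_wh"].round(2))
]
print(f"{len(mismatches)} of {len(merged)} rows disagree")
print(mismatches.head())
```

Showing stakeholders a zero-mismatch report like this is what let us retire the BI logic without a fight.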
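The union-table idea, sketched under the same caveat: instead of letting each month’s vendor file overwrite the last, every file is appended with its provenance so a shifting number can be traced back to a specific upload. The folder layout and the visit_id key are illustrative assumptions:

```python
from datetime import date
from pathlib import Path
import pandas as pd

# Hypothetical landing folder holding one vendor CSV per month.
landing = Path("landing/vendor_visits")

frames = []
for path in sorted(landing.glob("*.csv")):
    df = pd.read_csv(path)
    # Tag every row with where it came from and when it was loaded,
    # instead of replacing last month's file in place.
    df["source_file"] = path.name
    df["loaded_at"] = date.today().isoformat()
    frames.append(df)

# The union table: every version of every row ever received.
history = pd.concat(frames, ignore_index=True)

# Month-over-month explainability: which records appeared or disappeared?
prior, latest = sorted(history["source_file"].unique())[-2:]
new_ids = set(history.loc[history["source_file"] == latest, "visit_id"])
old_ids = set(history.loc[history["source_file"] == prior, "visit_id"])
print(f"added: {len(new_ids - old_ids)}, removed: {len(old_ids - new_ids)}")
```

In the actual engagement this table lived in the warehouse rather than in pandas, but the shape of the idea is the same: never destroy an old file’s rows; append and tag them.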
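Finally, a matching sketch showing what “iterative” identity resolution means in practice: start with the strongest deterministic rule, record which rule produced each match, and measure the remaining gap. The extracts, columns, and rules here are illustrative assumptions, not the actual patient-matching logic:

```python
import pandas as pd

# Hypothetical patient extracts from the legacy system and from Athena.
legacy = pd.read_csv("legacy_patients.csv")
athena = pd.read_csv("athena_patients.csv")

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize the features used for matching."""
    out = df.copy()
    out["first_name"] = out["first_name"].str.strip().str.lower()
    out["last_name"] = out["last_name"].str.strip().str.lower()
    out["dob"] = pd.to_datetime(out["dob"]).dt.date
    return out

legacy, athena = normalize(legacy), normalize(athena)

# Pass 1: exact match on the strongest available features.
pass1 = legacy.merge(athena, on=["first_name", "last_name", "dob"],
                     suffixes=("_legacy", "_athena"))
pass1["match_rule"] = "name_dob_exact"

# Pass 2 (a later iteration): relax to last name + DOB for the remainder.
unmatched = legacy[~legacy["patient_id"].isin(pass1["patient_id_legacy"])]
pass2 = unmatched.merge(athena, on=["last_name", "dob"],
                        suffixes=("_legacy", "_athena"))
pass2["match_rule"] = "lastname_dob"

cols = ["patient_id_legacy", "patient_id_athena", "match_rule"]
crosswalk = pd.concat([pass1[cols], pass2[cols]], ignore_index=True)

# Documenting the gap is as important as the matches themselves.
print(f"matched {crosswalk['patient_id_legacy'].nunique()} "
      f"of {len(legacy)} legacy patients")
```

Keeping a match_rule column on the crosswalk is what makes the plan iterative: each new pass is additive, and the unmatched remainder is the documented gap.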
Outcomes For The Team
- Process Improvements: With logic centralized in the warehouse, it was easy to update code with new files and let the changes flow downstream without breaking pipelines or dashboards.
- Traceability: Keeping historical copies of manual uploads made shifting numbers explainable, easing concerns about data integrity.
- Momentum For Long-Term Modeling: Incremental modeling meant we still made progress on the foundational model, giving us a clear path beyond ad-hoc firefighting for monthly reviews.
- Shared Knowledge: Documentation shared with the business gave the Head of Data and the GM more time for broader business initiatives.
- We Were Ready: As always in data, stakeholders will continue to ask for more. For us, this meant questions about how to combine live data with historical data, among others. Because we had allocated time to data modeling, we were ready to go when the business needed us, rather than having to catch up later.
A Decision Framework For When To Stabilize vs Invest
So, now you may be wondering how you can apply this to your own organization. My recommendation would be to ask yourself these questions to help pick the right path:
- Is there an upcoming, immovable deadline (board meeting, external monthly review)?
- If yes → stabilize enough to ensure consistent numbers for that deadline
- If no → stick to the long-term data model plan. However, I would challenge you to think about how your data roadmap could be broken into bite-sized pieces that reduce bottlenecks and let stakeholders start consuming data sooner
- Will a temporary fix create repeated monthly work?
- If yes → prioritize a short fix plus a plan to automate or model in parallel
- If no → allocate resources to implementing the short-term solution; you can pick your long-term roadmap back up shortly after
- Can logic be replicated quickly?
- If yes → replicate logic and processes now, and archive the manual steps; this buys breathing room for long-term work since stakeholders can run with what they have
- If no → communicate to your team why this will take time and what it involves. Share how the replicated logic may still leave challenges going forward, compare that with your long-term solution and how it fills the gaps, and consider how your data model can take an agile development approach to reduce bottlenecks
- Does the team have the bandwidth and resources for incremental modeling?
- If yes → allocate some time (even if just 10%) toward long-term modeling. Because the short-term solution won’t solve all of the problems, the team will likely soon start asking for things that only the long-term solution can deliver. In those instances, you will be ready!
- If no → consider allocating all resources to the short-term request to resolve it as quickly as possible. However, make it clear that the long-term work will still be needed to fill the gaps shortly after, and explain why and what the outcomes will be for the team.
Defending Your Roadmap and Communicating Process
Once you have decided which path to take, it’s important to replan your data roadmap and communicate it to the team. Let the team know how you plan to allocate resources, what this looks like week over week, and the outcomes that will be achieved for stakeholders.
Let your team know what the problems are today, how they show up in workflows, how the proposed solution fixes those problems, and where there may still be challenges with each solution (both short and long term).
In the end, the most important things are clarity, explainability, and confidence in your plan for the team. Short-term fixes don’t necessarily mean failure; however, you want to make sure there is a clear path back to the long-term model.