Business intelligence (BI) reports can help companies make better decisions. But even as digital transformation has democratized analytics, IT and data engineering teams find themselves grappling with "report stacking": data cannot be reused, metrics are inconsistent, and development efficiency is low. The result is growing "technical debt," with costs measured in dollars and lost productivity. Data engineers can spend a third of their time on tasks caused by technical debt, and it can delay product development by months. That's why it's important to know what technical debt is and how to avoid it.
What is technical debt?
Technical debt is the future cost created by short-term development solutions that solve today's problem without enough rigor to avoid extra work later. Here’s a story illustrating how technical debt occurs.
Jack is Fiction Internet’s new growth team data analyst. After joining Fiction, his first task was to develop new email marketing strategies targeting users with an interest in Fiction’s web content. To gain insights for his campaign, Jack analyzed historical data and found the clean dataset edm_activities from past campaigns in an S3 data lake. But when he needed other information, such as website access data, he had to ask other analysts specific questions: where the data was stored, how reliable it was, what processing logic had been applied, and how long a data permission application would take.
Although the process was difficult, the task was urgent, so Jack kept his requests specific and got what he needed to complete the project. Because everything went smoothly, Jack wrote a project report, satisfied that his extra work would pay dividends and serve as a template for future projects. However, three months later, Jack's manager told him that his application to expand BI report capacity had been rejected and asked him to clean up the system.
Jack was stunned by the comments he read:
- Your reports are never reused.
- Data analyst Rose wrote a similar report last month.
- Rose calculated the campaign’s open rate differently, including bounced emails in the denominator.
Jack approached Rose to discuss whether their reports could be combined, but it was difficult to reach a consensus. They agreed to address the issue… later. Because Jack and Rose put off reconciliation of their inconsistencies, Fiction Internet began accruing technical debt.
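To make the discrepancy concrete, here is a minimal Python sketch of the two open-rate calculations. The story only says that Rose included bounced emails in the denominator; that Jack divided by delivered emails instead is an assumption, as are the function names and the campaign numbers, which are invented for illustration.

```python
def open_rate_delivered(sent: int, bounced: int, opened: int) -> float:
    """Opens divided by delivered emails (bounces excluded).

    Assumed to be Jack's definition; the story does not spell it out.
    """
    delivered = sent - bounced
    return opened / delivered if delivered else 0.0


def open_rate_sent(sent: int, bounced: int, opened: int) -> float:
    """Opens divided by all sent emails (bounces stay in the denominator).

    This matches the description of Rose's report.
    """
    return opened / sent if sent else 0.0


# Hypothetical campaign numbers, not taken from the article.
sent, bounced, opened = 10_000, 500, 1_900

jack_rate = open_rate_delivered(sent, bounced, opened)
rose_rate = open_rate_sent(sent, bounced, opened)

print(f"Jack's report: {jack_rate:.1%}")
print(f"Rose's report: {rose_rate:.1%}")
```

With these made-up numbers the two reports differ by a full percentage point for the same campaign, even though neither calculation is wrong on its own terms. Multiply that by every metric and every report, and the cost of reconciling them later grows quickly.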
Where does technical debt come from?
Fiction Internet uses a traditional data development process: data flows through ETL into a data lake or warehouse and is then visualized in reports. Each report requires ETL development, scheduling management, storage for result and temporary data, and computing resources to run the ETL tasks. Each redundant report duplicates that work, adds confusion, and inevitably accrues technical debt.
Every time a new report is needed, a new ETL and a new report are added to the existing product pool. Once a pattern is formed, it is repeated; and as more users need access to data, the result is more reports and more data silos, compounding the problem.
How to avoid technical debt?
Management guru Peter Drucker is often quoted as saying, "You can't improve what you don't measure." That maxim holds true for technical debt. Without metrics to understand the scope and cause of the issue, technical debt will only increase. But a different development approach can reduce it: put metrics at the core, and use a metrics store to decouple upstream data from downstream business use. Reports can then reuse the same metrics instead of redefining them.
By employing a metrics store, users can enter source data, self-define business metrics, and collaborate with others to achieve greater alignment and reuse of business metrics, solving the pain points of enterprise metrics management, application, and analysis through:
- Efficient collaborative management: use metrics as the common "management language," align business and management operations, and improve organizational capabilities.
- Business agility improvement: by eliminating inefficient processes, developers and users can respond to data demands faster.
- Consistent metric definitions: centrally manage metrics to ensure consistency, improving the reuse of metric data between business units.
- Reduced development costs: business personnel create and reuse metrics; data teams eliminate heavy ETL work, focusing on metrics management; team productivity improves significantly.
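The core idea behind these benefits, defining a metric once and letting every report reuse it, can be sketched in a few lines of Python. This is an illustrative toy, not the API of any particular metrics store product; the `Metric` and `MetricsStore` classes, the `email_open_rate` metric, and the campaign figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Metric:
    """A named business metric with a human-readable definition."""
    name: str
    description: str
    compute: Callable[[dict], float]


class MetricsStore:
    """Hypothetical in-memory registry: one definition per metric name."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        # Refuse a second definition under the same name, so two analysts
        # cannot silently publish conflicting versions of one metric.
        if metric.name in self._metrics:
            raise ValueError(f"metric '{metric.name}' is already defined")
        self._metrics[metric.name] = metric

    def describe(self, name: str) -> str:
        # Anyone can inspect the definition before trusting the number.
        return self._metrics[name].description

    def evaluate(self, name: str, row: dict) -> float:
        return self._metrics[name].compute(row)


store = MetricsStore()
store.register(Metric(
    name="email_open_rate",
    description="opened / (sent - bounced)",  # assumed definition
    compute=lambda c: c["opened"] / (c["sent"] - c["bounced"]),
))

# Hypothetical campaign data shared by two different reports.
campaign = {"sent": 10_000, "bounced": 500, "opened": 1_900}
report_a = store.evaluate("email_open_rate", campaign)
report_b = store.evaluate("email_open_rate", campaign)

print(store.describe("email_open_rate"))
print(report_a == report_b)
```

Because both reports call the same registered definition, they cannot drift apart, and the duplicate check in `register` forces a conversation about the definition at publish time rather than months later. A real metrics store would add persistence, permissions, and query pushdown, none of which this sketch attempts.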
Back to Jack and Rose
Fiction Internet adopted a metrics store. Now, whenever Jack needs to view email marketing data, he can see the required metrics on the platform. The metrics are published on the platform for other users to reuse. Because of Jack's initiative, technical debt was eliminated, system utilization improved, and there was no longer a need to apply for costly capacity expansion.
Now Rose can fully trust Jack’s metrics because she can clearly see their definitions and logic on the platform. That means Rose can deliver business insights faster and spend more time finding and developing data, which is what a data analyst should do.
Dong Li is the Technical Founding Member of Kyligence.