Surprisingly see, though there are tons of resources on Power BI report development, it’s hard to find a resource on how to validate a Power BI report. To me, validation is a key step in the whole development process, which shares the same importance as data modelling and data visualization. Without validation, we don’t have confidence in the data quality since we don’t know it ourselves.
For the whole week, I have been caught up in a report validation. Though it requires lots of effort and energy, I basically concentratedly worked for more than 6 hours per day, I gradually achieve the goal and also understood why it was so tough because there was some misunderstanding in the business logic. I found it’s a rewarding journey, from finding issues to digging out causes to figuring out solutions. Here is an introduction to the process of report data validation. This blog is only about data quality, not evolving in other aspects.
Validation phases
Power BI report validation consists of two phases, unit testing and health check. The first phase is to make sure you connect to ideal data, quality-wise and type-wise. The second phase is to check if the data returns correct results, after all of the transformation, modelling and visualization, function-wise.
Unit testing against the data source
After initially connecting to the data source, either via Import mode or DirectQuery mode, the first thing is always to check the overall data quality. In my experience, I will go to check the number of rows of each table and data type to see if they match the data source. If there is no issue and the appropriate relationships have been established, I’ll create the base measures for the report, which usually are a count or sum of the metrics. Then apply filters, break them down by different dimensions such as by months, markets, products, and so on, and compare them to the data source.
This step is the basis of the future development of the report, just like a solid foundation for a building.
In this step, we need to pay attention to data type. Data type matters. If we don’t have a proper data type, you may find your calculations in Power BI won’t match the data source, which will put you in frustration. To learn more about data type, please visit my other blog here.
Cross-report testing/Health check
The power BI development process is fun and enjoyable. You transform data, build data models, create measures, columns, tables, and relationships, then put them into beautiful visuals. It’s rewarding when a state-of-art report is in front of you. Wait, are you confident in the report? Without a thorough health check, we don’t have confidence.
After the first phase of validation, why do we need to validate the data again? The reason is, in the process of development, anything you’ve done with the data model can cause data issues. For example, the relationship is a big one. Joining tables on different fields can result in different results. The filters on a visual are another big deal, which sometimes invisibly and secretly changes your desired results. The fields you put in a visual also may cause duplicates, which your users generally don’t like. The major one, DAX functions you chose to use, may cause huge issues.
The solution is to do a thorough health check to see if all the functions work expectedly.
Firstly, apply filters on visuals, then export the data from the visuals to compare to the original data source, to see if there is nuances that happen.
Secondly, apply interactions across visuals, then do the export-and-compare step again.
Usually, in this phase, I’ll do many modifications to the data model, changing relationships, trying different DAX codes, adding or removing fields to or from a visual, etc. Do you remember, anything we’ve done on the data model will cause changes in the calculations? So please be prepared to do validation thoroughly and carefully. In my experience, this phase may take half of the full length of the development time. It is worth it. After this step, everything that we see on the report now is crystal clear to you, and every question that users raise in the future will be well answered by you.
This is how we build confidence in our work.