Case Study

Synthetic Data for Systems Integration Testing at a Legal Tech Customer


A Brewdata customer in the Legal Tech domain needed to test integrated enterprise systems. This required a very close representation of the Production Data set so that business rules and exceptions are triggered in the workflow as they would be in production. The Brewdata suite of tools was instrumental in meeting the customer's Quality Assurance standards.

Use Case

Brewdata’s Customer has complex, integrated systems that share data with one another. Any software release in one system has the potential to impact workflow and functionality across the larger enterprise ecosystem. Unique combinations of data can trigger the workflow exceptions, escalations, and notifications that enterprise systems rely on to provide business services. GDPR and derivative privacy regulations, such as POPI, in the countries where the Brewdata Customer operates restrict the use of Production Data in non-production systems. As such, there was a need to create Synthetic Data sets in large volumes, with different permutations and combinations, to ensure that testing was comprehensive. With sprint releases roughly two weeks apart, each release needed a complete regression across the system. This rigorous testing required Quality Assurance teams to focus on test results and timely remediation rather than on generating test data to support the numerous test cases.


The Brewdata Customer first engaged Brewdata in a free discovery workshop to share more information about its specific use cases and to learn more about Synthetic Data generation using Brewdata tools. An initial services engagement helped Brewdata fine-tune the tools to the specific Customer Use Case and generate the first sets of Synthetic Data. Subsequently, the customer took over the process of using Brewdata Studio and the Brewdata Suite to generate Synthetic Data. Brewdata followed up at frequent intervals to understand any challenges and assisted as necessary to augment the tools or streamline the Synthetic Data generation process.


Brewdata’s Customer was able to generate large amounts of test data in record time by leveraging the Brewdata product suite, with Production Data as the inspiration and input source for the Synthetic Data. Datasets for different use cases were generated with ease, first through the Brewdata Services Engagement and later with the Brewdata Studio product used independently by the customer. The Quality Assurance team spent no time manually generating data to support test cases; that time went instead to identifying and documenting quality feedback in the systems integration use cases. In fact, many of the unique permutations and combinations of Synthetic Data generated by Brewdata Studio exposed functionality gaps in the overall enterprise architecture that required intervention by the Quality Engineering team, issues that would not have been uncovered had the Test Data been produced manually.


As the Brewdata customer expands its products and services in the marketplace, it will introduce new systems capabilities over time, and the need for new Synthetic Data will continue. Brewdata’s customer also has ambitions of venturing into AI capabilities and intends to use the Synthetic Data generated for Systems Testing in other areas of value generation, including as training data. The value generated by those data sets will therefore be amplified many times over.
