Our test suite implementation follows the methodology outlined in CheckList[1] and uses the code open-sourced in marcotcr/checklist.

Test Suite Curation and Coverage

Each test case is made up of several utterances, which are either generated from a template or drawn directly from anonymized customer conversations. Test cases can be curated in two ways: bottom-up and top-down. In the bottom-up approach, we first gather failed utterances, group them, and then either add them to an existing test case or create a new one. In the top-down approach, a stakeholder first proposes a test case, and supporting utterances are then created from templates or taken from actual utterances.
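As a rough illustration of the bottom-up route, the sketch below groups failed utterances into test cases. The `CuratedTestCase` class, the `curate_bottom_up` function, the test case names, and the sample utterances are all hypothetical; none of them come from the CheckList library or from our production suite.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedTestCase:
    """A named test case and the utterances that exercise it (hypothetical structure)."""
    name: str
    utterances: list = field(default_factory=list)


def curate_bottom_up(failed, suite=None):
    """Group (test_case_name, utterance) pairs into test cases.

    The test case name is assumed to have been assigned while triaging the
    failures. An existing test case is reused when the name matches;
    otherwise a new one is created. Duplicate utterances are skipped.
    """
    suite = dict(suite or {})
    for name, utterance in failed:
        case = suite.setdefault(name, CuratedTestCase(name))
        if utterance not in case.utterances:
            case.utterances.append(utterance)
    return suite


# Made-up failures, for illustration only.
failed_utterances = [
    ("date_change", "can i move my flight to friday"),
    ("refund_status", "where is my refund"),
    ("date_change", "push my hotel booking by a day"),
]
suite = curate_bottom_up(failed_utterances)
```

A top-down test case would start from the other end: the name is proposed first, and the utterance list is filled in afterwards.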

Coverage for each test case depends on how easily its examples can be permuted and on how much real data is available. For instance, it is far easier to provide substantial coverage for a test case that only modifies a location or vendor name, as these values are easily permutable. It is also straightforward to provide good coverage for a test case built by grouping failed production data. On the other hand, it takes much more effort and creativity to create examples that are novel paraphrases.
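To make the permutability point concrete, here is a minimal sketch of template expansion using only the Python standard library. The `expand_template` helper, the template text, and the vendor and location values are assumptions made for illustration; this is not the templating code from the CheckList library.

```python
from itertools import product


def expand_template(template, **slots):
    """Return every utterance obtained by filling each slot with each of its values."""
    names = list(slots)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(slots[name] for name in names))
    ]


# Hypothetical slot values; two vendors and three locations give six utterances.
utterances = expand_template(
    "Book a car from {vendor} in {location}",
    vendor=["Acme", "Globex"],
    location=["Seattle", "Austin", "Denver"],
)
print(len(utterances))
```

Because the number of utterances is the product of the slot sizes, adding a few more values per slot grows coverage quickly. A test case that depends on novel paraphrases gets no such multiplier.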

When a template is used to permute an example, the resulting data can lack diversity. The CheckList implementation provides a way to alleviate this by using language models to suggest words, so that templated examples are not limited by the creativity of the person authoring the test case. Because the quality of the test suite is contingent upon its coverage, test case curation is the most labor-intensive step of this implementation.
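The idea of model-suggested words can be sketched as follows. This is a simplified stand-in, not the CheckList API: `fill_mask` and the `suggest` hook are hypothetical, and `canned_suggest` returns fixed words in place of a real masked language model so that the example runs offline.

```python
def fill_mask(template, suggest, top_k=3, mask="{mask}"):
    """Expand a template by replacing its mask slot with suggested words.

    `suggest(template, top_k)` is a hypothetical hook; in practice it would
    wrap a masked language model that ranks candidate words for the slot.
    """
    if mask not in template:
        raise ValueError("template has no mask slot")
    return [template.replace(mask, word) for word in suggest(template, top_k)]


def canned_suggest(template, top_k):
    """Stand-in for a language model: returns fixed words, ignoring the template."""
    return ["cancel", "change", "extend"][:top_k]


examples = fill_mask("I want to {mask} my hotel booking", canned_suggest, top_k=2)
```

Suggestions from a real model would still need a human to review them, since not every proposed word yields an utterance that belongs in the test case.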

Scalability Concerns

It is natural to be concerned about the scalability and maintainability of such a test suite, especially if test cases are designed mostly top-down. However, this is no different from test-driven software development, where test case creation is factored into the development effort. What is really needed is a change in mindset: effort estimates in the model development lifecycle should account for test case creation.

Disclaimer

Expedia Group Inc. published this content on 24 August 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 24 August 2021 14:13:01 UTC.