1907.09029 Code-Aware Combinatorial Interaction Testing
An empirical comparison of combinatorial testing, random testing, and adaptive random testing. TTR was implemented in Java and C (TTR 1.2), and we developed three versions of our algorithm. In this paper, we focus on the description of versions 1.1 and 1.2, since version 1.0 was detailed elsewhere (Balera and Santiago Júnior 2015). The Feedback-Driven Adaptive Combinatorial Testing Process (FDA-CIT) algorithm is presented in (Yilmaz et al. 2014). At each iteration, the algorithm checks for masking of potential defects, isolates their probable causes, and then generates a new configuration that omits those causes. The idea is that masked defects exist and that the proposed algorithm provides an efficient way of dealing with them before test execution.
As previously stated, the order in which the parameters are presented to the algorithms alters the number of test cases generated; likewise, the order in which the t-tuples are evaluated can also produce a certain difference in the final result. Conclusion validity concerns how confident we are that the treatment used in an experiment is really related to the observed outcome (Wohlin et al. 2012). One threat to conclusion validity is the reliability of the measures (Campanha et al. 2010). We obtained the measures automatically via the implementations of the algorithms, and hence we believe that replication of this study by other researchers will produce similar results. Even if other researchers obtain different absolute results, especially for the time to generate the test suites, simply because such results depend on the computer configuration (processor, memory, operating system), we do not expect a different conclusion validity. Moreover, we relied on adequate statistical methods to reason about data normality and about whether we really found a statistical difference between TTR 1.1 and TTR 1.2.
Combinatorial methods can be applied to all types of software, but they are especially effective where interactions between parameters are significant. The primary industry applications for ACTS are in database and e-commerce, aerospace, finance, telecommunications, industrial controls, and video game software, but there are users in probably every industry. One group applied combinatorial testing to industrial control systems using mixed-strength covering arrays, "resulting in requiring fewer tests for higher strength coverage". In general, we can say that IPOG-F presented the best performance compared with TTR 1.2: IPOG-F was better for all strengths, including lower and medium strengths.
One possibility is to use the Compute Unified Device Architecture/Graphics Processing Unit (CUDA/GPU) platform (Ploskas and Samaras 2016). We must also develop another multi-objective controlled experiment addressing the effectiveness (ability to detect defects) of our solution compared with the other five greedy approaches. The conclusion of the two evaluations of this second experiment is that our solution is better and quite attractive for the generation of test cases at higher strengths (5 and 6), where it was superior to essentially all other algorithms/tools. The main fact contributing to this result is certainly that the matrix of t-tuples is not created at the beginning, which allows our solution to be more scalable (to higher strengths) in terms of cost-efficiency or cost compared with the other strategies. However, for low strengths, other greedy approaches, like IPOG-F, may be better alternatives. In computer science, all-pairs testing or pairwise testing is a combinatorial method of software testing that, for each pair of input parameters to a system (typically a software algorithm), tests all possible discrete combinations of those parameters.
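The all-pairs idea above can be made concrete with a small checker. This is a minimal sketch, not any of the cited tools: the function names, the parameter model (a dict of value lists), and the example suite are illustrative assumptions.

```python
from itertools import combinations, product

def uncovered_pairs(parameters, suite):
    """Return the set of value pairs not yet covered by the suite.

    `parameters` maps a parameter name to its list of values; `suite` is a
    list of test cases, each a dict from parameter name to a chosen value.
    """
    names = sorted(parameters)
    # Every pair of parameters, combined with every pair of their values.
    required = {
        (a, va, b, vb)
        for a, b in combinations(names, 2)
        for va, vb in product(parameters[a], parameters[b])
    }
    covered = {
        (a, t[a], b, t[b])
        for t in suite
        for a, b in combinations(names, 2)
    }
    return required - covered

# Example: 3 boolean parameters; 4 tests suffice for pairwise coverage,
# even though exhaustive testing would need 2**3 = 8 tests.
params = {"x": [0, 1], "y": [0, 1], "z": [0, 1]}
suite = [
    {"x": 0, "y": 0, "z": 0},
    {"x": 0, "y": 1, "z": 1},
    {"x": 1, "y": 0, "z": 1},
    {"x": 1, "y": 1, "z": 0},
]
print(len(uncovered_pairs(params, suite)))  # 0: every value pair is covered
```

The same structure extends to t-way coverage by replacing the pairwise `combinations(..., 2)` with `combinations(..., t)`.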
Controlled experiment 1: TTR 1.1 × TTR 1.2
A model of dependencies between input parameters of NEWTRNX is created. Results of NEWTRNX model analysis and test case generation are evaluated. Therefore, considering the metrics we defined in this work and based on both controlled experiments, TTR 1.2 is a better option if we need to consider higher strengths (5, 6). For lower strengths, other solutions, like IPOG-F, may be better alternatives. IPO-TConfig is an implementation of IPO in the TConfig tool (Williams 2000).
It is a Java-based, completely free tool with a GUI, which makes it even easier to use. Unlike other tools, Pairwiser provides a wide range of functionalities and features that one can explore in combinatorial testing. We develop test-case selection techniques in which test strings are synthesized using characters or string fragments that may lead to system failure. This approach demonstrated the discovery of a number of "corner cases" that had not been identified previously.
Such algorithms accomplish exhaustive comparisons within each horizontal extension, which may penalize efficiency. Thus, it is interesting to think about a new greedy solution for CIT that does not need to enumerate all t-tuples at the beginning (as PICT does) and does not demand many auxiliary matrices to operate (as some IPO-based approaches do). In the context of CIT, meta-heuristics such as simulated annealing (Garvin et al. 2011), genetic algorithms (Shiba et al. 2004), and the Tabu Search Approach (TSA) (Hernandez et al. 2010) have been used. Recent empirical studies show that meta-heuristic and greedy algorithms have similar performance (Petke et al. 2015).
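To see why enumerating all t-tuples up front becomes expensive, one can count them: with n parameters of v values each, there are C(n, t) parameter choices times v^t value combinations. A small sketch (the function name and the 20-parameter example are assumptions for illustration):

```python
from math import comb

def num_t_tuples(num_params, values_per_param, t):
    """Count the t-tuples a PICT-style algorithm would enumerate up front:
    choose t of the n parameters, then every combination of their values."""
    return comb(num_params, t) * values_per_param ** t

# The enumeration grows quickly with strength t (here: 20 parameters, 4 values each).
for t in (2, 3, 4, 5, 6):
    print(t, num_t_tuples(20, 4, t))
```

At t = 6 this example already exceeds 150 million tuples, which motivates approaches that avoid materialising the full set.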
- Thus we can refer to this type of testing as "effectively exhaustive" (within reason).
- The goal of this second analysis is to provide an empirical evaluation of the time performance of the algorithms.
- In Section 3, we show the main definitions and procedures of versions 1.1 and 1.2 of our algorithm.
- Although the size of the test suite is used as an indicator of cost, it does not necessarily mean that test execution cost is always less for smaller test suites.
However, their method was worse than such greedy solutions for unconstrained problems. Considering the metrics we defined in this work and based on both controlled experiments, TTR 1.2 is a better option if we need to consider higher strengths (5, 6). This tool is the simplest to use because we just have to write the test factors and constraints (if any) and the test configurations are generated.
Combinatorial Interaction Testing (CIT) approaches have drawn the attention of the software testing community as a way to generate smaller, efficient, and effective sets of test cases, and they have been successful in detecting faults due to the interaction of several input parameters. Recent empirical studies show that greedy algorithms are still competitive for CIT. It is thus interesting to investigate new approaches to address CIT test case generation via greedy solutions and to perform rigorous evaluations within the greedy context. As in controlled experiment 1, TTR 1.2 did not demonstrate good performance for low strengths.
The question below on combinatorial coverage explains why this heuristic is important. To measure cost, we simply counted the number of generated test cases, i.e., the number of rows of the final matrix M, for each instance/sample. To measure efficiency, we instrumented each of the implemented versions of TTR and recorded the computer's current time before and after the execution of each algorithm. In all cases, we used a computer with an Intel Core(TM) i CPU @ 3.60 GHz processor, 8 GB of RAM, running the Ubuntu 14.04 LTS (Trusty Tahr) 64-bit operating system. The goal of this second analysis is to provide an empirical evaluation of the time performance of the algorithms.
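The before/after timing described above can be sketched with a small wrapper. This is a minimal illustration, not the authors' instrumentation; the `timed` helper and the toy generator are assumptions.

```python
import time

def timed(generator, *args):
    """Measure the wall-clock time of one test-generation call, mirroring
    the instrument-before/after approach: read the clock, run the
    algorithm, read the clock again."""
    start = time.perf_counter()
    suite = generator(*args)
    elapsed = time.perf_counter() - start
    return suite, elapsed

# Toy stand-in for a real generation algorithm: one "test case" per row.
suite, seconds = timed(lambda n: [[i] for i in range(n)], 1000)
print(len(suite), seconds)
```

`time.perf_counter()` is preferable to `time.time()` for this purpose because it is monotonic and has the highest available resolution for short intervals.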
This insertion is done in the same way as the initial solution for M is constructed, as described in the section above. However, compared with version 1.0 (Balera and Santiago Júnior 2015), in version 1.1 we do not order the parameters and values submitted to our algorithm. As a result, test suites of different sizes may be derived if we submit the parameters and values in a different order. The motivation for this change is that we realized that, in some cases, fewer test cases were created when parameters and values were not ordered. In the context of software systems, robustness testing aims to verify whether the Software Under Test (SUT) behaves correctly in the presence of invalid inputs.
This is explained by the fact that, in TTR 1.2, we no longer generate the matrix of t-tuples (Θ); rather, the algorithm creates and reallocates into M one t-tuple at a time. This benefits version 1.2 so that it can properly handle higher strengths. We performed two controlled experiments, one addressing cost-efficiency and one addressing only cost.
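The tuple-by-tuple idea can be illustrated with a generator that yields t-tuples on demand instead of materialising a full matrix Θ. This is a sketch in the spirit of that design, not TTR 1.2 itself; the function and parameter names are assumptions.

```python
from itertools import combinations, product

def t_tuples(parameters, t):
    """Yield t-tuples one at a time rather than building the complete
    matrix of all of them up front. Each yielded item is a tuple of
    (parameter, value) pairs."""
    names = sorted(parameters)
    for chosen in combinations(names, t):
        for values in product(*(parameters[p] for p in chosen)):
            yield tuple(zip(chosen, values))

params = {"a": [0, 1], "b": [0, 1], "c": [0, 1, 2]}
gen = t_tuples(params, 2)
print(next(gen))            # first 2-tuple, computed only when requested
print(sum(1 for _ in gen))  # the remaining tuples, streamed one by one
```

Because the generator keeps only its loop state in memory, peak memory stays small even when the total number of t-tuples is huge, which is the scalability argument made above for higher strengths.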
In all the other comparisons, the Null Hypothesis was rejected and TTR 1.2 was worse than the other solutions. This can be attributed to the fact that the algorithm focuses on test cases whose parameter interactions generate a large number of t-tuples, which is usually seen in test cases with larger strengths. This can be verified by the fact that the algorithm gives priority to covering the interactions of parameters with the greatest number of t-tuples.
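The prioritization just described is a classic greedy step: among candidate tests, pick the one covering the most still-uncovered tuples. A minimal sketch for t = 2 (the helper names and the tiny example are assumptions, not TTR's actual code):

```python
from itertools import combinations

def pairs_of(test):
    """All parameter-value pairs appearing in one test (a dict)."""
    items = sorted(test.items())
    return {frozenset(p) for p in combinations(items, 2)}

def pick_best(candidates, uncovered):
    """Greedy step: choose the candidate test that covers the most
    still-uncovered pairs -- the 'cover the interactions with the most
    t-tuples first' heuristic, shown here for strength t = 2."""
    return max(candidates, key=lambda c: len(pairs_of(c) & uncovered))

# Two pairs remain uncovered; the first candidate covers both of them.
uncovered = {frozenset({("x", 0), ("y", 1)}), frozenset({("x", 0), ("z", 1)})}
cands = [{"x": 0, "y": 1, "z": 1}, {"x": 1, "y": 1, "z": 0}]
print(pick_best(cands, uncovered))
```

A real generator would loop this step, removing the newly covered pairs from `uncovered` after each pick until the set is empty.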
An early researcher in this area created a short one-hour Combinatorial Testing course that covers the theory of combinatorial testing (of which pairwise testing is a special case) and shows learners how to use a free tool from NIST to generate their own combinatorial test suites quickly. The number of tests produced by ACTS or other covering array generators is proportional to v^t log n, for v values per parameter, n parameters, and a t-way covering array. Note that this is not the exact number of tests produced; the test set size is only proportional to this value, i.e., the number of tests grows exponentially with the interaction strength t, but only logarithmically with the number of parameters. This size is a characteristic of covering arrays and holds for all covering array generating tools, not just ACTS. For the tester, this means that it is best to keep the number of values per parameter under about 10, but it is not a problem to have hundreds of parameters.
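The v^t log n growth term can be explored numerically. This is only the proportionality quoted above, not a predictor of exact ACTS output sizes; the function name is an assumption.

```python
from math import log2

def covering_array_estimate(v, t, n):
    """Growth term for a t-way covering array, proportional to v**t * log(n).
    Only the proportionality matters, not the absolute numbers."""
    return v ** t * log2(n)

# Doubling the number of parameters barely moves the estimate...
print(covering_array_estimate(10, 2, 100), covering_array_estimate(10, 2, 200))
# ...while raising the strength t multiplies it by another factor of v.
print(covering_array_estimate(10, 3, 100))
```

This is exactly the practical guidance above: hundreds of parameters are cheap (logarithmic), but each unit of strength t multiplies the size by roughly v.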
Software on this site is free of charge and will remain free in the future. It is public domain; no license is required and there are no restrictions on use. You are free to include it and redistribute it in commercial products if desired. NIST is an agency of the United States Government, conducting research in advanced measurement and test methods. In one reported case study, the tests achieved 80% module and branch coverage and 88% statement coverage, and found 15 faults; 2-way tests found as many faults as 3-way tests.