PhD Defence: Next Generation Mutation Testing: continuous, predictive and ML-enabled

Speaker: Wei Ma (SerVal group)
Event date: Tuesday, 17 May 2022 03:30 pm - 05:30 pm

We're happy to welcome you to the PhD Defence of Wei Ma (SerVal group) on 17 May 2022 at 15:30.

The event will take place digitally on WebEx.

Members of the defence committee:

  • Prof. Dr Yves LE TRAON, University of Luxembourg, Chairman
  • Dr Ezekiel SOREMEKUN, University of Luxembourg, Deputy Chairman
  • Prof. Dr Michail PAPADAKIS, University of Luxembourg, Supervisor
  • Prof. Dr Paolo TONELLA, Università della Svizzera Italiana (USI), Member
  • Prof. Dr Lingming ZHANG, University of Illinois at Urbana-Champaign, Member
  • Dr Thierry TITCHEU CHEKAM, SES Satellites, Expert with advisory capacity


Software has become an essential part of human life: it substantially improves productivity and enriches our lives. However, software flaws can lead to tragedies, e.g., the failure of the Mariner 1 spacecraft in 1962. The issue is becoming more severe as software systems grow more complex and Artificial Intelligence (AI) models are integrated into them (e.g., Tesla Deaths Report). Software systems evolve and change frequently in response to new requirements, and the outputs of AI models are non-deterministic. For these reasons, testing such modern AI-enabled software systems is challenging. Many new testing techniques have emerged to help ensure the trustworthiness of these systems. Coverage-based software testing is one early technique for testing Deep Learning (DL) systems by statistically analysing neuron values, e.g., Neuron Coverage (NC).
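As a rough illustration of the Neuron Coverage idea mentioned above, the following Python sketch computes the fraction of neurons that exceed an activation threshold for at least one test input. The function name, the threshold value and the activation numbers are assumptions made for this example; they are not taken from the dissertation.

```python
def neuron_coverage(activations, threshold=0.5):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations` is a list with one activation vector per test input;
    every vector holds one float per neuron (hypothetical data layout).
    """
    if not activations:
        return 0.0
    n_neurons = len(activations[0])
    covered = [False] * n_neurons
    for vector in activations:
        for i, value in enumerate(vector):
            if value > threshold:
                covered[i] = True
    return sum(covered) / n_neurons


# Two test inputs observed on a toy layer with four neurons.
toy_activations = [
    [0.9, 0.1, 0.0, 0.2],
    [0.3, 0.7, 0.0, 0.1],
]
coverage = neuron_coverage(toy_activations)  # neurons 0 and 1 are covered
```

In this toy case two of the four neurons are covered, so the coverage is 0.5.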

In recent years, Mutation Testing has drawn much attention. Coverage metrics can easily be fooled: tests can be generated to satisfy coverage requirements even though they may be meaningless. In contrast, Mutation Testing is a robust approach to approximating the quality of a test suite. It has been generalized to test software systems integrated with DL systems, e.g., image classification systems and autonomous driving systems. Mutation Testing assesses and improves the quality of a test suite by checking whether it detects artificial defects introduced through many crafted code perturbations (i.e., mutants). Because each code perturbation is usually tiny, the behaviour of a mutant is likely to lie on the border between correct and incorrect behaviour. Mutation Testing therefore explores the border behaviour of the subject under test well, which leads to a higher-quality program. However, applying Mutation Testing encounters obstacles; one main challenge is that it is resource-intensive. This high resource consumption makes it ill-suited to modern software development, where code evolves every day. This dissertation studies how to apply Mutation Testing in combination with Artificial Intelligence, exploring and exploiting the uses and innovations that arise when Mutation Testing meets AI algorithms, i.e., how to employ Mutation Testing for modern software systems under test.
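The basic mechanics described above can be sketched in a few lines of Python. The example is only illustrative: the program under test, the three hand-written mutants and the helper names (passes, mutation_score) are all hypothetical, and real tools generate mutants automatically.

```python
def original_max(a, b):
    """Program under test: return the larger of two numbers."""
    return a if a >= b else b


# Each mutant is a tiny perturbation of the original program.
def mutant_flip_comparison(a, b):
    return a if a <= b else b   # '>=' replaced by '<='


def mutant_strict_comparison(a, b):
    return a if a > b else b    # '>=' replaced by '>' (behaves like the original)


def mutant_return_first(a, b):
    return a                    # else-branch dropped


MUTANTS = [mutant_flip_comparison, mutant_strict_comparison, mutant_return_first]


def passes(program, tests):
    """True if the program returns the expected output for every test."""
    return all(program(*args) == expected for args, expected in tests)


def mutation_score(mutants, tests):
    """Fraction of mutants that fail at least one test (are 'killed')."""
    killed = sum(1 for mutant in mutants if not passes(mutant, tests))
    return killed / len(mutants)


weak_tests = [((2, 2), 2)]
strong_tests = [((2, 2), 2), ((1, 3), 3), ((3, 1), 3)]

weak_score = mutation_score(MUTANTS, weak_tests)
strong_score = mutation_score(MUTANTS, strong_tests)
```

The weak suite executes every line of original_max yet kills no mutant, while the stronger suite kills two of the three. The surviving mutant behaves exactly like the original on every input, which hints at why analysing mutants is costly: every mutant must be executed, and some can never be killed.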

First, this dissertation adapts Mutation Testing to a modern software development practice, Continuous Integration (CI). Most development teams currently employ CI as a pipeline in which changes happen frequently. Adopting Mutation Testing in CI is problematic because of its high cost. At the same time, traditional Mutation Testing is not a good test metric for code changes, as it is designed for the whole program. We adapt Mutation Testing to test program changes by proposing commit-relevant mutants. This type of mutant affects the changed program behaviours and represents the commit-relevant test requirements. We use C and Java projects to validate our proposal. The experimental results indicate that commit-relevant mutants can effectively test code changes.
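To make the notion of a mutant that interacts with a change more concrete, here is a small Python sketch. The relevance criterion it uses is an assumption chosen for illustration and may differ from the dissertation's definition: a mutant in unchanged code is treated as commit-relevant if, on some input, the mutated post-commit program differs both from the unmutated post-commit program and from the mutated pre-commit program. The pricing function and all names are hypothetical.

```python
import operator


def make_version(with_discount, min_qty=1, combine=operator.mul):
    """Build a pre-commit (with_discount=False) or post-commit version.

    `min_qty` and `combine` stand for two unchanged statements that a
    mutation tool could perturb (hypothetical mutation sites).
    """
    def price(qty, unit):
        if qty < min_qty:                 # unchanged guard (mutation site 1)
            return -1
        total = combine(qty, unit)        # unchanged statement (mutation site 2)
        if with_discount and qty >= 10:   # statement added by the commit
            total = total // 2
        return total
    return price


def is_commit_relevant(mutation, inputs):
    """Apply the assumed criterion over a set of inputs."""
    post = make_version(True)
    mutant_pre = make_version(False, **mutation)
    mutant_post = make_version(True, **mutation)
    for args in inputs:
        out = mutant_post(*args)
        if out != post(*args) and out != mutant_pre(*args):
            return True
    return False


inputs = [(1, 4), (5, 3), (10, 3), (12, 2)]
relevant = is_commit_relevant({"combine": operator.add}, inputs)   # '*' -> '+'
irrelevant = is_commit_relevant({"min_qty": 2}, inputs)            # '< 1' -> '< 2'
```

Under this criterion the arithmetic mutant is relevant, because its effect flows through the newly added discount, whereas the guard mutant is not, because its effect never reaches the changed code.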

Second, building on the previous work, we introduce MuDelta, an AI approach that identifies commit-relevant mutants, i.e., specific mutants that interact with the program code change. MuDelta uses a combination of static code characteristics as its data features. Our experiments show that commit-based mutation testing is suitable and promising for evolving software systems. However, MuDelta relies on manually designed features that require expert knowledge.

Third, the dissertation proposes a new approach, GraphCode2Vec, to learn a general code representation. Recent works use natural language models to embed code into a vector representation. Code embedding is a keystone in applying machine learning to several Software Engineering (SE) tasks; its goal is to extract universal features automatically. GraphCode2Vec considers program syntax and semantics simultaneously by combining code analysis with Graph Neural Networks (GNNs). We evaluate our approach on the mutation testing task and three other tasks (method name prediction, solution classification, and overfitted patch classification). GraphCode2Vec is better than or comparable to state-of-the-art code embedding models. We also perform an ablation study and a probing analysis to give insights into GraphCode2Vec.

Finally, the dissertation studies Mutation Testing for selecting test data for deep learning systems. Since Deep Learning systems play an essential role in many fields, their safety takes centre stage. Such AI systems differ considerably from traditional software systems, and existing testing techniques are not sufficient to guarantee their reliability. It is well known that DL systems usually require extensive data for learning, so selecting data for training and testing them is important. Several metrics exist to guide the choice of data for testing DL systems. We compare a set of test selection metrics for DL systems. The comparison shows that uncertainty-based metrics are effective at identifying misclassified data. These metrics also improve classification accuracy faster when retraining DL systems.
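One common way to express an uncertainty-based selection metric is to rank inputs by the entropy of the model's predicted class probabilities and to pick the most uncertain ones first. The Python sketch below shows this idea only; entropy is used as one example of such a metric, and the function names and probability values are assumptions for illustration rather than the metrics or data studied in the dissertation.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, budget):
    """Return the indices of the `budget` inputs with the highest entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical softmax outputs of a 3-class model for four unlabelled inputs.
predictions = [
    [0.98, 0.01, 0.01],   # confident prediction
    [0.40, 0.35, 0.25],   # uncertain prediction
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],   # almost uniform, most uncertain
]
selected = select_most_uncertain(predictions, budget=2)
```

With a budget of two, the inputs at indices 3 and 1 are selected, since their predicted distributions are closest to uniform. Inputs chosen this way would then be labelled and used for testing or retraining.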

In summary, this dissertation shows how Mutation Testing can be used in the era of Artificial Intelligence. Mutation Testing is an excellent technique for testing both traditional software and AI-enabled software systems, while AI algorithms alleviate the main difficulty of Mutation Testing in practice by reducing its resource cost.