Generative Artificial Intelligence for Software Engineering: Research Approach

3 May 2024

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Anh Nguyen-Duc, University of South Eastern Norway, Bø i Telemark, Norway 3800 and Norwegian University of Science and Technology, Trondheim, Norway 7012;

(2) Beatriz Cabrero-Daniel, University of Gothenburg, Gothenburg, Sweden;

(3) Adam Przybylek, Gdansk University of Technology, Gdansk, Poland;

(4) Chetan Arora, Monash University, Melbourne, Australia;

(5) Dron Khanna, Free University of Bozen-Bolzano, Bolzano, Italy;

(6) Tomas Herda, Austrian Post - Konzern IT, Vienna, Austria;

(7) Usman Rafiq, Free University of Bozen-Bolzano, Bolzano, Italy;

(8) Jorge Melegati, Free University of Bozen-Bolzano, Bolzano, Italy;

(9) Eduardo Guerra, Free University of Bozen-Bolzano, Bolzano, Italy;

(10) Kai-Kristian Kemell, University of Helsinki, Helsinki, Finland;

(11) Mika Saari, Tampere University, Tampere, Finland;

(12) Zheying Zhang, Tampere University, Tampere, Finland;

(13) Huy Le, Vietnam National University Ho Chi Minh City, Hochiminh City, Vietnam and Ho Chi Minh City University of Technology, Hochiminh City, Vietnam;

(14) Tho Quan, Vietnam National University Ho Chi Minh City, Hochiminh City, Vietnam and Ho Chi Minh City University of Technology, Hochiminh City, Vietnam;

(15) Pekka Abrahamsson, Tampere University, Tampere, Finland.

3. Research Approach

As indicated earlier, this paper aims to present a research agenda listing open Research Questions regarding GenAI in SE. To achieve this outcome, we conducted a literature review (Section 3.1) and a series of focus groups (Section 3.2 and Section 3.3). The overall research process is presented in Figure 2.

3.1. Literature review

A comprehensive and systematic review might not be suitable for research on GenAI in software development at the time this research was conducted. Firstly, research on the topic is being conducted and published rapidly, so the findings of a comprehensive review would probably be outdated shortly after publication. Secondly, we found a lot of relevant work as gray literature from non-traditional sources such as preprints, technical reports, and online forums. These sources may not be as rigorously reviewed or validated as peer-reviewed academic papers, making it difficult to assess their quality and reliability. Thirdly, we would like to publish the agenda as soon as possible to provide a reference for future research. A systematic literature review would consume extensive effort and time, and its results might be obsolete by the time the review is complete.

Our strategy is to conduct focused, periodic reviews to capture the most current and relevant information without the extensive resource commitment of a comprehensive review. This approach allows for agility in keeping up with the latest developments, without claiming comprehensiveness or repeatability. Our search approach involves three channels:

• Online portals: we used Google Scholar and Scopus to search with a formulated search string.

• Gray literature sources: we searched for papers on arXiv and Papers with Code.

• Forward and backward snowballing: we scanned citations forward and backward from the articles included in our review.
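The snowballing channel can be pictured as a bounded traversal of a citation graph. The following is only a minimal sketch under stated assumptions, not the procedure the authors actually ran: the function name `snowball`, the `references` mapping, the `is_relevant` screening callback, the `max_rounds` limit, and the toy paper identifiers are all hypothetical. In practice, the citation data would come from a bibliographic database and the relevance check would be a manual screening step.

```python
from collections import deque


def snowball(seeds, references, is_relevant, max_rounds=2):
    """Return the set of included papers after forward and backward snowballing.

    `references` maps a paper id to the ids it cites (backward direction).
    The forward direction (papers citing a given paper) is derived from it.
    """
    # Build the reverse index: paper -> set of papers that cite it.
    cited_by = {}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)

    included = set(seeds)
    queue = deque((paper, 0) for paper in seeds)
    while queue:
        paper, depth = queue.popleft()
        if depth >= max_rounds:
            continue  # do not expand papers found in the last round
        backward = set(references.get(paper, ()))
        forward = cited_by.get(paper, set())
        for candidate in backward | forward:
            # Screening step: stands in for a manual relevance check.
            if candidate not in included and is_relevant(candidate):
                included.add(candidate)
                queue.append((candidate, depth + 1))
    return included


# Toy citation graph with hypothetical paper ids.
toy_references = {
    "seed": ["old_genai", "old_unrelated"],
    "new_genai": ["seed"],
    "newer_genai": ["new_genai"],
}


def toy_screen(paper_id):
    # Hypothetical screening rule: keep ids that mention "genai".
    return "genai" in paper_id


result = snowball(["seed"], toy_references, toy_screen)
print(sorted(result))
```

Bounding the number of rounds keeps the search focused and cheap, which matches the stated goal of periodic, non-exhaustive reviews. The trade-off is that relevant papers further away in the citation graph can be missed.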

We ran the search terms in Google Scholar and obtained the results (latest search date: October 2023). Google Scholar has the advantages of comprehensiveness and free access. It also covers gray literature, where we found a significant amount of research on GenAI at the time of searching.

Figure 2: Research Agenda on GenAI for Software Engineering

3.2. Focus groups

We conducted four structured working sessions as focus groups to identify, refine, and prioritize Research Questions on GenAI for SE. Focus groups have been used as a means to gather data in SE research [35, 36, 37, 38]. Focus group sessions produce mainly qualitative information about the objects of study. The benefits of focus groups are that they produce insightful information and that the method is fairly inexpensive and fast to perform. A focus group differs from a brainstorming session in that participants are selected and guided by a moderator who follows a structured protocol so that the discussion stays focused [36, 39]. Kontio et al. also suggest online focus groups, which offer many advantages, such as group thinking, no travel costs, anonymous contribution, and support for large groups [36]. The value of the method is sensitive to the experience and insight of the participants. Hence, we present information about the participants in Table 1.

The focus groups took place from April 2023 to September 2023 (as shown in Figure 2). All participants are SE researchers who have experience or interest in the topic, as shown in Table 1. For each focus group, we developed a plan, which included the agenda of the session and a set of exercises for the participants. Each focus group lasted between 2 and 3 hours. To capture data from the focus groups, we used moderator’s notes, Miro boards, and recordings in the case of online sessions.

Each focus group had a different focus:

• Focus group 1 (Exploratory brainstorming): The brainstorming aimed to explore ideas for potential opportunities and challenges when adopting GenAI in software development activities. We discussed the question “Which SE areas will benefit from GenAI tools, such as ChatGPT and Copilot?” We used SWEBOK areas to initialize the conversations. At the end of the session, 11 categories had been created. Each category was assigned a section leader, a co-author responsible for synthesizing the content for that section.

• Focus group 2 (Exploratory brainstorming): We discussed “What would be an interesting research topic for GenAI and Software Development and management?” We initiated the discussion with the question “What could be interesting Research Questions for an empirical study on GenAI in SE?” Some questions were generated during the working session.

• Focus group 3 (Validating brainstorming): Prior to the meeting, a list of 121 possible RQs was prepared. In this meeting, we aimed to validate the questions, that is, to check whether they were correctly and consistently interpreted and whether they were reasonable and meaningful to all participants. Participants had sufficient time to review the RQ list thoroughly and critically. Each person left a comment on every question regarding its meaningfulness and feasibility. After this step, the RQs were revised and restructured.

• Focus group 4 (Validating brainstorming): The discussion was conducted in subgroups, each focused on one particular SE category. Prior to this session, the question list was finalized. In total, there are 78 RQs in 11 categories. RQs that participants agreed were not practically important or meaningful were excluded. We also discussed and ranked the RQs according to their novelty and timeliness. However, we did not reach consensus or a complete ranking for all RQs. Therefore, we decided not to present the ranking in this work.

Figure 3: An outcome from Focus Group 2

3.3. Threats to validity

According to Kontio et al. [37, 36], threats to the focus group method in SE studies include group dynamics, social acceptability, hidden agenda, secrecy, and limited comprehension. To minimize the group dynamics threat, i.e., an uncontrollable flow of discussion, all sessions were coordinated by the main author, strictly following the pre-determined agenda with time control. Social acceptability can influence discussion results, i.e., participants may contribute biased opinions because of positive or negative perceptions from others. We introduced clear rules for contribution at the beginning of the focus groups. For online sessions, Miro was used so that participants could provide their opinions anonymously. As time is relatively limited in an online focus group, some complex issues or points might not be understood by all participants (limited comprehension). This threat was mitigated by organizing the focus groups so that each followed up on the previous one. Between the sessions, participants were required to do their “homework” so that discussion points could be studied or reflected on further, beyond the scope of the focus groups. Four focus groups occurring in four months gave sufficient time for individual and group reflection on the topic.

Figure 4: Research Agenda on GenAI for Software Engineering
