Community Expectations for Research Artifacts and Evaluation Processes (Additional Material)

This repository contains the material used in and produced during the study Community Expectations for Research Artifacts and Evaluation Processes accepted at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2020.

Paper Abstract

Background. Artifact evaluation has been introduced into the software engineering and programming languages research community with a pilot at ESEC/FSE 2011 and has since then enjoyed a healthy adoption throughout the conference landscape.

Objective. In this qualitative study, we examine the expectations of the community toward research artifacts and their evaluation processes.

Method. We conducted a survey including all members of artifact evaluation committees of major conferences in the software engineering and programming language field since the first pilot and compared the answers to expectations set by calls for artifacts and reviewing guidelines.

Results. While we find that some expectations exceed the ones expressed in calls and reviewing guidelines, there is no consensus on quality thresholds for artifacts in general. We observe very specific quality expectations for specific artifact types for review and later usage, but also a lack of their communication in calls. We also find problematic inconsistencies in the terminology used to express artifact evaluation's most important purpose -- replicability.

Conclusion. We derive several actionable suggestions which can help to mature artifact evaluation in the inspected community and also to aid its introduction into other communities in computer science.

Paper Preprint

The paper preprint is available here:

Artifact Availability

The artifact is maintained as a GitHub repository.
It is versioned according to the state of the publication.
Released versions are automatically archived to Zenodo.

Version History

Version	Archive (for previous versions)	Description
1		Blinded version available to the paper reviewers.
2		Version available for Artifact Evaluation
3		Version including the artifact reviewer's comments
4		Paper camera-ready
5		Added 2020 calls for artifacts, Added 2019/2020 committee data

Calls for Artifacts
- Process Call Analysis Process and Datafile Description
- Data Raw Collected Calls for Artifacts (Collected Calls for Artifacts (CfA) of the inspected conferences.)
- Data Derived Artifact Evaluation Commitees (collected from CfAs) (Excel) - e-mail addresses removed
- Data Derived Text analysis results from calls of artifacts
- R Script Analysis script for CfA tags
Survey
- Process Survey Analysis Process and Datafile Description
- Survey (Our survey questions in printable form. Survey was filled out online.)
- Survey Questions (Excel) - used in scripts for relations to the survey
- Data Raw Raw result data file (Excel) - anonymized at F3
- Data Derived Card sorting results (Labels obtained with the open card sorting method.)
- Plots
  - R Script Committee sizes and responses (Figure 1)
  - R Script Histogram of individuals by number of AECs served in (Figure 2)
  - Output Figures (Figures derived from the collected data.)
- Analyses
  - R Script A helper script to distinguish responses by community
  - R Script Analysis of questions with numeric answers
  - Output Results from the analysis of numeric answers
  - R Script Analysis of full-text answers using the tags from open card sorting
  - Output Results from the analysis of open card sorting tags
- R Script Script to run both plots and both analysis scripts

Legend

Data: Data artifact
Raw: Raw data collected
Derived: Data derived from the collected data
R Script: Analysis script written in R
Output: Script output
Process: Process Description and Data Formats

Data File Formats

Data files are provided as CSV or Excel files. They can be read and processed with R or other capable environments (e.g. pandas, Excel itself, ...). The process description for the call analysis and the survey analysis contain format descriptions for the data files.

Requirements

The analysis script provided in this artifact require R to run. We were using R version 3.6.1 (2019-07-05) at the time of the study. The scripts make use of the following packages and were executed with the following versions of these packages:

Package name	Version used
dplyr	0.8.3
ggplot2	3.2.0
gsubfn	0.7
readxl	1.3.1
stringr	1.4.0
svglite	1.2.3
tibble	2.1.3
tidyr	0.8.3

Executing scripts

All scripts will automatically install the necessary packages if not present in the current environment. However, there were cases in the past when functions were deprecated by the package authors. During the time of the study we did not experience such a case, but they might appear in the future.

To create the scripts and inspect results, we used RStudio. Thus, the scripts may either be executed in RStudio or directly on the command-line. For the latter, please use the following format: R < scriptname.R --vanilla to avoid interference with other projects. The analysis/survey folder contains a convenient runall.R script to run all scripts for the analysis of survey results. Their output will be placed in the analysis/survey/output folder. Warnings may occur, but were checked during development.

Sections of the Paper and their Relationship to the Artifacts

In the following table we show the relationship between the sections of the paper and the data and analysis artifacts provided here.

Section	Involved Analysis	Involved Data
3.2.2	R Script Analysis script for CfA tags	Data Raw Collected Calls for Artifacts Data Derived Text analysis results from calls of artifacts
4.1	R Script Committee sizes and responses R Script Histogram of individuals by number of AECs served in R Script A helper script to distinguish responses by community	Data Derived Artifact Evaluation Commitees (collected from CfAs) (Excel) Data Raw Raw result data file (Excel)
4.2	R Script Analysis of full-text answers using the tags from open card sorting	Data Derived Card sorting results for question G4 (Excel)
4.3.1	R Script Analysis of full-text answers using the tags from open card sorting R Script A helper script to distinguish responses by community	Data Derived Card sorting results for question AE1 (Excel) Data Derived Card sorting results for question AE2 (Excel) Data Derived Card sorting results for question AE3 (Excel) Data Derived Card sorting results for question AE4 (Excel) Data Derived Card sorting results for question AE5 (Excel) Data Derived Card sorting results for question AE8 (Excel) Data Derived Card sorting results for question AE9 (Excel)
4.3.2	R Script Analysis of full-text answers using the tags from open card sorting R Script A helper script to distinguish responses by community	Data Derived Card sorting results for question AU2 (Excel) Data Derived Card sorting results for question AU5 (Excel) Data Derived Card sorting results for question AU8 (Excel) Data Derived Card sorting results for question AU11 (Excel) Data Derived Card sorting results for question AU12 (Excel)
5	R Script Analysis of full-text answers using the tags from open card sorting R Script Analysis of questions with numeric answers	Data Derived Card sorting results for question F01 (Excel) Data Derived Card sorting results for question F02 (Excel) Data Derived Card sorting results for question F03 (Excel) Data Derived Card sorting results for question F99 (Excel) Data Raw Raw result data file (Excel)