Creating Excel Reports with R: NSSE Student Comments Report

Creating Excel Reports with R: NSSE Student Comments Report

This post was adapted from a poster presentation for AIR 2020.

Brendan J. Dugan-In 2019, NSSE staff upgraded the Student Comments Report to include a bar chart like the one below, to illustrate satisfaction among students who did and did not leave a comment at the end of the survey. We also changed how we craft the report, processing and populating all the data elements with R instead of Visual Basic and SPSS.

Like SPSS, R is a statistical software, but it is also a programming language like Python, meaning it can clean and model data as well as perform more general programming operations.

The general process for making all our reports involves VBA macros feeding tables of summary statistics (or in this case, tables of comments) generated with SPSS into the reports, where cell references populate different fields with data elements. Excel has been our primary method for creating reports because of its ubiquity, gentle learning curve, and ease of formatting templates for printing. Likewise, SPSS has been widely taught in the social sciences and higher education and is relatively easy to use, although the macros developed to create different report features are more complicated.

Using R to handle everything from data import and cleaning to populating and formatting the Excel reports sped up report production from about a few hours to a few minutes, and let us move away from triangulating between SPSS and Visual Basic macros, which could be slow and buggy. While R can also be used to create reports in a variety of formats through R Markdown, using Excel as the report template allowed others to make text or formatting edits and maintained continuity in our reporting format, both for users and internally.

One advantage of using R and the openxlsx (Alexander Walker, 2019) package is that data can be written directly into cells rather than being referenced (say, from a hidden data sheet) or (programmatically) copied and pasted from SPSS files. Data for populating dynamic fields, like institution name, IPEDS number, or comment prompt, are read into memory in R, cleaned, and then written directly into the appropriate cells. Cells can be formatted with custom styles as well; doing so programmatically is more reliable than manually editing cell styles, and pretty readable.

# add institutional info to cover sheet ####
# a given institution's data
inst_info <- tibble(NAME_REPORT = "NSSEville State University",
                    MOD4_CIV = TRUE,
                    CIV05comment_n_fy = 35,
                    CIV05comment_n_sr = 42,
                    civ_prompt_text = "Think about the experiences you may have had..."

writeData(target_wb, # our template workbook
          sheet = "Cover",
          x = instl_info$NAME_REPORT, # institution full name
          startCol = 1,
          startRow = 9,
          colNames = FALSE)

cover_instname_style <- createStyle(
  fontName = "Calibri",
  fontSize = 18,
  fontColour = "#7A1A57",
  valign = "top",
  halign = "center",
  borderColour = "#EFAA22",
  border = "bottom")

         sheet = "Cover",
         style = cover_instname_style,
         cols = 1,
         rows = 9)

           sheet = "Cover",
           cols = 1:11,
           rows = 9)

NSSE features plenty of survey customization options, which necessarily entail making changes to reports. For example, starting in 2017, institutions could choose from four different end-of-survey comment prompts. Students at Canadian institutions receive slightly different demographics questions than do U.S. students. An institution that uses certain Topical Modules, like the Civic Engagement Module, receives comments from them in addition to end-of-survey comments. All these customizations need to be reflected in our reports.

If an institution used a module, those comments get fed into the report, along with the counts of comments by class level; if not, the extraneous sheet is deleted from the template.

if (instl_info$MOD4_CIV == TRUE) {
  x <- filter(nsse, ! %>%
    select(IRclass, IRenrollment, IRsex19, CIV05_19txt) %>%
    arrange(IRclass, desc(IRenrollment), IRsex19)
  # add comments
            sheet = "Civic Eng. Module Comments",
            x = x,
            startRow = 7,
            colNames = FALSE)
  # add counts info & intro text
            sheet = "Civic Eng. Module Comments",
            x = paste(instl_info$CIV05comment_n_fy, "first-year students and",
                      instl_info$CIV05comment_n_sr, "seniors responded to the following question in the Civic Engagement module:"),
            startCol = 1, startRow = 4)
  # add prompt text
            sheet = "Civic Eng. Module Comments",
            x = instl_info$civ_prompt_text)
} else {removeWorksheet(target_wb, "Civic Eng. Module Comments")}

All NSSE reports undergo a data quality check to ensure that facts and figures align between two sources – usually, summary tables from two analysts. These checks can also be incorporated in this process by reading data from one set of summary tables and comparing them to the other, generated during the report creation process.

In summary, revising the Student Comments Report provided an opportunity to test change how we produce reports without throwing out the baby with the bathwater. Keeping virtually all aspects of report creation (data manipulation, summarizing, reporting) in one environment allows for better troubleshooting and speed, while maintaining Excel as the template allows non-R users to help shape and edit reports.