Marcel’s Lab - Part B - Data Transfer from Qualtrics to R

What this sub-lab will teach you

This lab covers four things:

  • Exporting network data from Qualtrics
  • Importing the data into R
  • Cleaning the data in R
  • Getting the data into network format

Sub-Lab’s Objective

This lab is divided into three sub-labs (for easier orientation). In the separate sub-modules you will:

  • Task 1: Implement your research plan (and hypothesis) in a Qualtrics survey - Lab 20-2-A
  • Task 2: Network Data - Export from Qualtrics and Import into R - Lab 20-3-B
  • Task 3: Do a first exploratory analysis of your network data in R with the igraph package - Lab 20-4-C

(NB: Ideally you have collected your own data. In case that was not possible, or too difficult for some reason, we will also provide a sample dataset before the lecture / lab.)

Instructions

Exercise 1 - Getting your R/R-Studio environment ready

We will be working with the tidyverse (especially dplyr) in this data import and data wrangling sub-session. These packages need to be installed and loaded in R first.

Should you prefer working directly in R with the RMarkdown file, please access the file here

    # if you have not installed the packages below in your R / RStudio environment before, please run the install commands
    # otherwise just go to the next code block

    install.packages("tidyverse")
    install.packages("dplyr")
    install.packages("tidyselect")
    install.packages("stringr")  
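
If you are unsure which of these packages are already installed, a small check can save you from reinstalling everything. The following is a minimal sketch in base R; the variable names `needed` and `not_installed` are made up for this illustration.

```r
# Packages this sub-lab relies on (same list as in the install commands above)
needed <- c("tidyverse", "dplyr", "tidyselect", "stringr")

# Compare against the packages already present in your R library
not_installed <- setdiff(needed, rownames(installed.packages()))

# Show which packages (if any) are still missing
print(not_installed)

# Uncomment the next line to install only the missing ones
# if (length(not_installed) > 0) install.packages(not_installed)
```

If `not_installed` is empty, you can go straight to loading the packages.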
 

Loading important packages into R.

    # load / import the packages you will need for the data import and cleaning / wrangling in this session

    library (tidyverse)
    library (tidyselect)
    library (stringr)
    library (dplyr)
 

Check what your working directory is, and either put the survey data / network .csv file in there, or set your working directory to the folder where the .csv file is stored on your computer.

    # check your current working directory and change it if necessary

    getwd() # tells you what your working directory is

    # setwd() lets you set a new working directory; it requires a path as its argument
    # set this to your working directory
    setwd("~/yourRProjectName/yourWorkingDir")

    responsesFile <- "xyzxWave3raw.csv" # set this to the filename of your downloaded Qualtrics .csv file

    # alternatively, you can use our class dataset (which I uploaded here to GitHub)
    # note: running this line replaces the filename set above, so skip it if you use your own data
    responsesFile <- "https://raw.github.com/MarMall/NetworkApproaches/main/Instructions/Labs/data/sampleNetworkSurvey.csv"

Exercise 2 - Loading the data into R


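# Simple approach: read everything, then remove the row(s) with the question text
# responses <- responses [-c (1), ] 
# head(responses)

# Better approach: read the header and the data separately.
# This keeps the row with the question text from turning every column into a character variable,
# so we don't have to change them all back again later.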

#setwd("~/surveyDataImport/waves")

# readr::

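# first pass: read only the top of the file to get the variable names
responses <- read.csv (responsesFile, header = TRUE, nrows = 1)
variables <- colnames (responses)

#variables
#head(responses)

# second pass: skip the header line and the question-text line, then read the data
# variables[1] <- "ID"
responses <- read.csv (responsesFile, header = FALSE, skip = 2)
colnames (responses) <- variables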

## Here you have your dataframe of all the responses
# have a look inside
responses 
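
To see why the two-pass import helps, it can be useful to try it on a tiny file first. The sketch below writes a made-up export to a temporary file and reads it the same way as above. The file contents and the names `toy_lines`, `toy_file`, `toy_header` and `toy` are invented for this illustration, and it assumes the export has exactly two header rows (variable names, then question text), as the `skip = 2` above implies.

```r
# A made-up mini export: row 1 = variable names, row 2 = question text, then the data
toy_lines <- c(
  "ResponseId,QID149_1,QID149_2",
  "Response ID,Who are your friends? - Anna,Who are your friends? - Ben",
  "R_001,,1",
  "R_002,1,"
)
toy_file <- tempfile(fileext = ".csv")
writeLines(toy_lines, toy_file)

# First pass: only pick up the variable names
toy_header <- read.csv(toy_file, header = TRUE, nrows = 1)

# Second pass: skip both header rows so the columns keep their numeric type
toy <- read.csv(toy_file, header = FALSE, skip = 2)
colnames(toy) <- colnames(toy_header)

# Inspect the column types: the QID columns should be numeric, not character
str(toy)
```

Empty cells in the numeric columns are read as NA, which is what the recoding steps later in this lab expect.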

Exercise 3 - First network relation - Friendship

## Now let's recode the results of the first question into network format

# responses to the first question, selected by question number (QID149)

## exclude the column ending in 99, as this is the answer option "Nobody"
QID149 <- responses %>%
  select (c (
            #"ID",
             #"recipientLastName",
             # "QID169_1",
             vars_select (names (responses),
                          starts_with ("QID149", ignore.case = TRUE),
                          -starts_with ("QID149_99", ignore.case = TRUE) #filter out NOBODY
                          )))


QID149 <- QID149[1:14] # keep the first 14 columns: we only have 14 responses, so we work only with those.
#QID149
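
The column selection above can also be tried out on a small made-up data frame. This sketch passes the tidyselect helpers directly to `select()`, which is an alternative to wrapping them in `vars_select()` and not the form used in this lab; the data frame `toy` and its values are invented for illustration.

```r
library(dplyr)

# Made-up survey data with two friendship items, a "Nobody" item (99) and one other question
toy <- data.frame(
  ResponseId = c("R_001", "R_002"),
  QID149_1   = c(NA, 1),
  QID149_2   = c(1, NA),
  QID149_99  = c(NA, NA),
  QID128_1   = c(1, 1)
)

# Keep all QID149 columns, then drop the "Nobody" column
friendship_toy <- toy %>%
  select(starts_with("QID149"), -starts_with("QID149_99"))

colnames(friendship_toy)
```

Only the two friendship items remain; the respondent ID, the "Nobody" column and the other question are dropped.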

Exercise 4 - Second network relation - Influence

## exclude 99 as this is "Nobody"
QID128 <- responses %>%
  select (c (
            #"ID",
             #"recipientLastName",
             # "QID169_1",
             vars_select (names (responses),
                          starts_with ("QID128", ignore.case = TRUE),
                          -starts_with ("QID128_99", ignore.case = TRUE)
                          )))


# QID128 <- QID128[2:7]

QID128 <- QID128[1:14] # again, keep the first 14 columns, as we only had 14 responses to the survey
#QID128


QID128 <- QID128 %>%
  # Recoding NAs to 0
  mutate (across (vars_select (names (QID128),
                               contains ("QID128", ignore.case = TRUE)),
                  ~ replace (., is.na (.), 0))) %>%
  ## AND making the data binary
  mutate (across (vars_select (names (QID128),
                               contains ("QID128", ignore.case = TRUE)),
                  ~ replace (., . != 0, 1))) # this ensures the data is binary



dim(QID128) # nice, this is square: 14x14

#QID128
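
The recoding above (NA becomes 0, every other non-zero value becomes 1) can be checked on a small made-up example. The sketch below does both steps in a single expression, which is a compact alternative and not the code used in this lab; `toy_influence` and `toy_binary` are names invented for illustration.

```r
library(dplyr)

# Made-up answers: NA = not selected, other numbers = selected
toy_influence <- data.frame(
  QID128_1 = c(NA, 2, 1),
  QID128_2 = c(1, NA, NA),
  QID128_3 = c(NA, 3, 0)
)

# TRUE only where a value is present and non-zero, then converted to 0 / 1
toy_binary <- toy_influence %>%
  mutate(across(everything(), ~ as.integer(!is.na(.x) & .x != 0)))

# Sanity checks before turning the data into a matrix
all(as.matrix(toy_binary) %in% c(0, 1)) # only 0s and 1s left?
nrow(toy_binary) == ncol(toy_binary)    # same number of rows and columns?
```

The friendship columns from Exercise 3 are not recoded in the code above; if they contain NAs as well, the same step could presumably be applied to them.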

Exercise 5 - Turning the data into a matrix

# 
# QID149 # friendship relation
# QID128 # influence

QID149_mat <-  as.matrix(QID149)

#QID149_mat
is.matrix(QID149_mat)

# checking the dimensions of the matrix
#dim(QID149_mat) # should be square (same number of rows and columns)

##same for the influence relation

QID128_mat <-  as.matrix(QID128)

# # checking the dimensions of the matrix
# dim(QID128) # should be square (same number of rows and columns)

#QID128_mat
is.matrix(QID128_mat)
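
Note that a square matrix is not the same as a symmetric one: an adjacency matrix needs as many rows as columns, but a directed relation (A names B, while B does not name A) will usually not be symmetric. The following base R sketch shows the difference on a made-up 3x3 matrix; `toy_mat`, `is_square` and `is_symmetric` are names chosen for this illustration.

```r
# Made-up directed relation between three people (rows = senders, columns = receivers)
toy_mat <- as.matrix(data.frame(
  a = c(0, 1, 0),
  b = c(0, 0, 1),
  c = c(1, 0, 0)
))

# Square: the number of rows equals the number of columns
is_square <- nrow(toy_mat) == ncol(toy_mat)

# Symmetric: the matrix equals its transpose (row / column names removed for the comparison)
is_symmetric <- isSymmetric(unname(toy_mat))

is_square
is_symmetric
```

Here the matrix is square but not symmetric, which is perfectly fine for a directed network.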

Saving your network relation matrices in R


# save the matrices as .RData files

save (QID149_mat,
        file = paste0 ("Friendship_q149"  ,".Rdata"))

save (QID128_mat,
        file = paste0 ("Influence_q128"  ,".Rdata"))
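
To use a saved matrix in a later session, `load()` restores the object under the name it had when it was saved. A minimal round-trip sketch, using a made-up matrix and a temporary file (the names `toy_relation`, `toy_path` and `toy_copy` are invented for illustration):

```r
# Made-up 2x2 relation matrix
toy_relation <- matrix(c(0, 1, 1, 0), nrow = 2)
toy_copy <- toy_relation # keep a copy to compare against later

# Save it to a temporary .Rdata file
toy_path <- file.path(tempdir(), "toy_relation.Rdata")
save(toy_relation, file = toy_path)

# Remove the object, then restore it from the file
rm(toy_relation)
load(toy_path)

# The restored object has the original name and content
identical(toy_relation, toy_copy)
```

For the files written above, the equivalent call would be `load("Friendship_q149.Rdata")`, run from the same working directory.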

Achievements

Great, you brought your first network relations into (adjacency) matrix format, which makes working with them in several network analysis packages in R much easier.

Moving on to sub-lab 20-4-C

Please move on to the next sub-lab 20-4-C, where you will do a first exploratory analysis of your network data in R with the igraph package.