
A Data Carpentry Workshop

Stellenbosch University

6 - 8 November, 2017

Nov 6 - 7 from 08:30 to 16:30, Nov 8 from 08:30 - 12:00

Instructors: Angelique van Rensburg, Amy Hodge, Ian van der Linde, Juan Steyn

Helpers: Chrismarie Enslin, Samar Elsheikh

General Information

Thank you for your interest in this Data Carpentry workshop.

The ever-increasing digital nature of research requires researchers, postgraduate students, and research-support staff to equip themselves with the skills to create, manipulate and manage data in digital format. This can involve complex research data management techniques. However, researchers and students can very often perform simple to complex data management by mastering tools and techniques that require neither pricey software licenses nor highly specialist skills.

It is one of the aims of the growing e-research support community at Stellenbosch University to assist researchers, postgraduate students, and research-support staff to master such tools and techniques. In cooperation with the South African Center for Digital Language Resources (SADiLaR) and the Digital Humanities Organisation of Southern Africa (DHASA), the SU Library and Information Service and the SU Information Technology Division aim to build local communities to facilitate collaboration and skills transfer across all academic disciplines in support of scientific data management.

Workshop Aims

This workshop aims to provide a broad introduction to the following concepts and tools:

  • Introduction to Digital Scholarship
  • Research Data Management
  • Data formatting, cleaning, and manipulation using OpenRefine
  • Introduction to R for analysing data

The workshop is funded by the South African Center for Digital Language Resources.

The workshop is organised by the Stellenbosch University Library and Information Service and the Stellenbosch University Information Technology Division in collaboration with SADiLaR and DHASA.

Registration and other information

Please register for the workshop through the online registration form by 1 November 2017. Registration is free of charge, although a penalty fee applies for a no-show. Space is limited. If you have any questions, please contact the workshop organisers.

Data Carpentry workshops are for any graduate student, staff member or researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.

We will cover Digital Scholarship, Data organisation in spreadsheets, Data cleaning with OpenRefine and Data analysis and visualisation in R. Participants should bring their laptops and plan to participate actively. By the end of the workshop learners should be able to more effectively manage and analyze data and be able to apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at postgraduate students, staff members and researchers.

Where: E-classroom, Stellenbosch University Library (J.S. Gericke Library), Stellenbosch University, JS Marais Square, c/o Victoria and Ryneveld Streets, Stellenbosch, South Africa. Get directions with Google Maps.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Contact: Please mail the workshop organisers for more information or if you have any trouble registering for the workshop.

Preliminary Schedule


Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey

Day 1

6 November 2017

Morning Digital Scholarship & Data organisation in spreadsheets
Afternoon Data organisation in spreadsheets & Data cleaning with OpenRefine

Day 2

7 November 2017

Morning Data cleaning with OpenRefine & Data analysis and visualisation in R
Afternoon Data analysis and visualisation in R

Day 3

8 November 2017

Morning Data analysis and visualisation in R


Lunch will be catered for.

Tea/Coffee will be provided according to schedule.

We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.

Detailed Programme

Day 1 - 6 November 2017

08:30 Coffee, tea and installations
09:00 Digital Scholarship
10:00 Coffee and tea break
10:20 Introduction to data organisation in spreadsheets
12:00 Lunch
13:00 Quality control and data manipulation in spreadsheets
14:20 Coffee and tea break
14:40 Introduction to OpenRefine
16:15 Wrap-up

Day 2 - 7 November 2017

09:00 Data cleaning in OpenRefine
10:30 Coffee and tea break
10:50 Introduction to R
12:15 Lunch
13:15 Starting with data in R
14:45 Coffee and tea break
15:00 Manipulating data in R
16:15 Wrap-up

Day 3 - 8 November 2017

09:00 Visualising data in R
10:30 Coffee and tea break
10:50 Visualising data in R
12:00 Finish


Using spreadsheet programs for scientific data

  • Formatting data & problems
  • Dates as data
  • Quality control
  • Exporting data
  • Data format caveats

Cleaning data with OpenRefine

OpenRefine (previously Google Refine) is a data-cleaning tool that runs in a web browser; any browser (Safari, Firefox, Chrome, Explorer) should work fine. You will need to download and install OpenRefine. Although it opens in the browser, it does not need an internet connection, and all of your data stays on your computer.

R for data analysis and visualisation

  • Introduction to R
  • Starting with data
  • Aggregating and analyzing data with dplyr
  • Data visualisation with ggplot2
  • R and Databases


To participate in a Data Carpentry based workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of the workshop. You should bring and use your own laptop to ensure that your tools are properly set up for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is intended as a reference for instructors, but it may also be useful if you run into trouble during setup.