Final student showcase of Reproducibility and Data Science workshop series

April 29, 2022
One of the CCT Data Science Team’s goals is to help University of Arizona researchers better manage their code and data. Recently, we held a successful workshop series called Reproducibility and Data Science Skills, which was geared toward the needs of CALS graduate students and postdocs. Based on a similar series we ran for ESA SEEDS grad students, this workshop series covered version control, project management, intermediate R skills, and documentation. Instructors Jessica Guo and Kristina Riemer added two follow-up sessions in which participants applied their new skills to their own research.

A successful new feature was asking participants to showcase the work they had done on their research projects. This session was intended to be casual and collegial, an opportunity for participants to celebrate their progress and demonstrate how they had applied their new skills. Participants walked us through new GitHub repos, R scripts, and organized projects (Fig. 1). Every topic we covered in the workshop was applied by at least one participant in their own work, which suggests that these are useful skills for CALS researchers. A huge thank you to Lia Ossana, Richie Thaxton, and Priyanka Kushwaha for showcasing the before-and-after of their projects!

Figure 1. Before and after project re-organization from local files in a single folder to a hierarchical folder structure in a public GitHub repository. Courtesy of Lia Ossana.

The post-workshop survey provided both positive and constructive feedback. On 13 specific tasks, participants indicated that their abilities improved after the workshop (Fig. 2). We credit the perseverance of the students as well as the scaffolded nature of the series for these improvements. In particular, many students changed the way they work by becoming confident users of git and GitHub.

Figure 2. Three skills in which students rated their ability higher after participating in the workshop series.

In anticipation of the fall 2022 series, we plan a number of further improvements. The series will now run for twelve sessions in total, which leaves adequate time for branching in git/GitHub and for using R Markdown for documentation. We had good engagement on Slack and some visits to our office hours, so we’ll continue to develop a community of practice, especially to support folks whose immediate lab groups are less interested in data science. Lastly, we are going to shorten the interval between the ten content sessions and the two follow-up sessions, because retention dropped, likely as a result of too much time elapsing.

We had so much fun teaching this workshop series during spring 2022 and had great interactions with all of the participants! We’re teaching it again this fall, with modifications based on lessons learned, so if you’re in CALS at the University of Arizona, please check out the website for more information and the application link!