Teaching Quarto in Intro to Data Science
Dr. Elijah Meyer
Duke University
STA 199 - Summer 2023
Aug
– What
– Why
– How
– Benefits
… of Quarto in Introductory Data Science
– National Science Foundation (2014)
An Intro Data Science course is setting the foundation for students to engage in the research process
Since the growing use of computational workflows…
– In the 1990s, Jon Claerbout launched the “reproducible research movement”
– In the 2000s and 2010s, several high-profile journal and general media publications focused on concerns about reproducibility…
– Reproducibility tool <==> Research Process
Quarto <==> Data Science
… serious errors in the interpretation of reported results
Crisis: Lack of Trust in Science
Teaching reproducibility tools (such as Quarto) should not be viewed as an additional topic, but as the foundation of any introductory course, instilling good habits in future researchers
Make reproducible research the norm rather than the exception
From the perspective of …
150 students with little to no statistics / data science / coding experience
Teaching using the dsbox curricula (cite)
Students are expected to use R
Topics include data visualization; data cleaning; data modeling
High-level suggestions
– Teach Quarto code side by side with the other coding languages
– Make reproducibility an expectation
– Lean into what Quarto has to offer
Informative Syntax Completion
Visual and Source Mode
Informative Error Messages
More…
Informative syntax completion aligns with what you want to accomplish: “How do I turn these messages off?”
Setting fig-width and fig-height is as important as making the figure itself; it is all learned together: “How do I adjust my figure width and figure height?”
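Both student questions above map directly onto chunk options. A minimal sketch of what that might look like in an R chunk, assuming the ggplot2 package and its built-in mpg data (the label, plot, and sizes are illustrative choices, not taken from the course materials):

````markdown
```{r}
#| label: mpg-scatter
#| message: false
#| fig-width: 6
#| fig-height: 4

# message: false hides the note geom_smooth() prints about its default method.
# fig-width and fig-height set the figure size in inches.
library(ggplot2)

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()
```
````

Typing `#| ` at the top of a chunk in RStudio should bring up completion for these option names, which is where the "informative syntax completion" pays off for beginners.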
Highlight how coding can be more approachable in the visual tab
Trains students to think about and visualize professional documents throughout the process, making this the expectation and, again, not something additional
Highlight the approachability of Quarto error messages and how this can help alleviate some tension in an intro class
Examples of common errors and the error messages (Ask Mine)
In-class exercise to teach Quarto
Minimal YAML
Minimal chunk options
Use well-scaffolded Quarto documents
Render early and often!
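As one possible starting point for such an exercise, a scaffolded document with a minimal YAML header and minimal chunk options might look like the sketch below. The title, author placeholder, section heading, and chunk labels are hypothetical, and the tidyverse package is an assumption about the course setup:

````markdown
---
title: "Lab 1: Hello, Quarto"
author: "Your name here"
format: html
---

## Exercise 1

```{r}
#| label: load-packages
#| message: false

library(tidyverse)
```

```{r}
#| label: first-plot

# Replace this comment with your code, then render the document.
```
````

Students can render with the Render button in RStudio, or from a terminal with `quarto render lab-01.qmd` (the file name is hypothetical). Rendering after each exercise keeps errors small and easy to locate.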
– Useful tool: Good habit to build for future coursework / career
– Work they can be proud of
– Example
Efficiency: Consistent formatting -> easier grading
Invest time into lessons / activities now -> adapt later
Extendability: Use with Python, Julia, and more
– Reproducibility is not optional
– And has a place in any introductory class with coding
– Set the expectation
– Start minimal and build up!
– More about YAML
– More about Errors
– What am I missing?