Teaching (with) Quarto

Teaching Quarto in Intro to Data Science

Dr. Elijah Meyer

Duke University
STA 199 - Summer 2023

Aug

Outline

– What

– Why

– How

– Benefits

… of Quarto in Introductory Data Science

What Is Data Science

– National Science Foundation (2014)

An Intro Data Science course is setting the foundation for students to engage in the research process

What is Quarto

Figure 1: Code (a) to Document (b)

Research and Reproducibility

Since the growing use of computational workflows…

– In the 1990s, Jon Claerbout launched the “reproducible research movement”

– In the 2000s and 2010s, several high-profile journal and general media publications focused on concerns about reproducibility…

Research and Reproducibility

– Reproducibility tool <==> Research Process

  • Quarto <==> Data Science

  • … serious errors in the interpretation of reported results

Reproducibility: Why This Matters

Crisis: Lack of Trust in Science

  • Thus, Quarto isn’t something “extra” in a data science course; it’s the foundation for it!

Argument for Quarto

Teaching reproducibility tools (such as Quarto) should not be viewed as an additional topic; it should be the foundation of any introductory course, instilling good habits in future researchers

Make reproducible research the norm rather than the exception

How

Quarto in Introductory Data Science

From the perspective of …

  • 150 students with little to no statistics / data science / coding experience

  • Teaching using the dsbox curricula (cite)

  • Students are expected to use R

  • Topics include data visualization; data cleaning; data modeling

How

High-Level Suggestions

– Teach Quarto code side by side with the other coding languages

– Make reproducibility an expectation

– Lean into what Quarto has to offer

  • Informative Syntax Completion

  • Visual and Source Mode

  • Informative Error Messages

  • More…

Quarto Tour

Informative Syntax and Completion

Informative syntax aligns with what you want to accomplish: “How do I turn these messages off?”
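
A minimal sketch of how that question maps onto chunk options, assuming an R chunk that loads the tidyverse (picked here only for illustration, because it prints startup messages). The `message` and `warning` options read like the thing the student is asking for:

````markdown
```{r}
#| message: false
#| warning: false
# Illustrative assumption: any package that prints startup messages would do
library(tidyverse)
```
````

Setting an option back to `true`, or deleting the line, brings the messages back, so students can see the effect by rendering twice.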

Informative Syntax and Completion

Setting fig-height and fig-width is as important as making the figure itself; it is all learned together. “How do I adjust my figure width and figure height?”
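
As a sketch, figure sizing can sit directly above the plotting code in the same chunk. The label, caption, and the built-in `mtcars` data below are illustrative assumptions, not taken from the course materials; `fig-width` and `fig-height` are given in inches:

````markdown
```{r}
#| label: fig-mpg-weight
#| fig-cap: "Fuel efficiency versus weight"
#| fig-width: 6
#| fig-height: 4
library(ggplot2)

# The plot and its size are written, and learned, in one place
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```
````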

Visual Tab

Figure 2: Source (a) to Visual (b)

Visual Tab

Highlight how coding can be more approachable in the visual tab

Trains students to think about and visualize professional documents throughout the process, making this the expectation and, again, not something additional

Errors

Highlight the approachability of Quarto error messages and how this can help alleviate some tension in an intro class

Examples of common errors and the error messages (Ask Mine)

Example AE

In-class exercise to teach Quarto

  • Minimal YAML

  • Minimal chunk options

  • Use well-scaffolded Quarto documents

  • Render early and often!
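
One way such a starter document might look, as a rough sketch: two YAML fields, one labeled chunk, and a prompt telling students what to change. The title, task wording, and the built-in `mtcars` data are hypothetical placeholders, not the actual course exercise:

````markdown
---
title: "AE 01: Meet Quarto"
format: html
---

## Task 1

Render this document, then change the title above and render again.

```{r}
#| label: summarize-data
# Your turn: replace mtcars with the data set for today's exercise
summary(mtcars)
```
````

Because the file renders as handed out, students can render before editing anything and again after each small change.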

Benefits

From the student’s perspective

– Useful tool: Good habit to build for future coursework / career

– Work they can be proud of

Example

From the instructor’s perspective

Efficiency: Consistent formatting -> easier grading

Invest time into lessons / activities now -> adapt later

If you don’t use R

Extendability: Use with Python, Julia, and more
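
For instance, the same document structure carries over when the chunk language changes. This is a sketch under the assumption that Python and Jupyter are installed; the title and values are made up for illustration:

````markdown
---
title: "Hypothetical Python exercise"
format: html
jupyter: python3
---

```{python}
#| echo: true
# Same hashpipe chunk options as in an R chunk; only the language changes
values = [2, 4, 6]
sum(values) / len(values)
```
````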

In Summary

– Reproducibility is not optional

– It has a place in any introductory class that involves coding

– Set the expectation

– Start minimal and build up!

Questions

Outstanding Questions

– More about YAML

– More about Errors

– What am I missing?