Lab 4
Dr. Elijah Meyer
Duke University
STA 199 - Summer 2023
June 1st
– Your assigned group can be found on the website here
– Your group has a summer-project repo with your team name on GitHub. Each of you should clone this repo. Each component of the project will be completed here.
– Today, you will work in proposal.qmd.
– Find two data sets that meet the criteria. The Resources for datasets section in your project instructions is a great place to start.
– Next, upload each dataset into the data folder in your summer-project repo.
There are a few ways to upload data into this data folder. I suggest the following:
On the GitHub repo website, have one group member click the data folder; click Add file in the top right; click Upload files; drag the file into the repository; click Commit changes.
Next, have EVERY team member pull.
Repeat these steps when you are ready to upload your second data set.
For this class, we have worked with csv files. These are plain-text files of comma-separated values, which Excel can open and save. If you have an Excel file that is not a csv file, you can convert it in Excel by going to File -> Save As -> CSV UTF-8 (Comma delimited).
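To make the csv format concrete, here is a minimal sketch in base R. The file name, column names, and values are made up for illustration and are not part of the project. The sketch writes a tiny csv to a temporary folder, shows that it is just lines of text, and reads it back as a data frame.

```r
# Hypothetical example: the file name and contents are invented for illustration.
csv_path <- file.path(tempdir(), "example.csv")

# Write three lines of plain text: a header row, then one row per observation.
writeLines(c("name,age", "Ada,36", "Grace,45"), csv_path)

# A csv file is plain text; each line is a row and commas separate the values.
raw_lines <- readLines(csv_path)
print(raw_lines)

# Reading the same file as a data frame turns the header into column names.
example <- read.csv(csv_path)
print(example)
```

If a file saved from Excel looks like this when opened in a text editor, it should read into R the same way.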
– Identify the source of the data.
– State when and how it was originally collected (by the original data curator, not necessarily how you found the data).
– Write a brief description of the observations.
– Address ethical concerns about the data, if any.
Reminder: this is done in proposal.qmd.
– There is no recommended number of variables to include when writing your research question. It can involve two variables, or more than two.
– Your research question can (and probably will) change as you get feedback / we move through the Summer session. That’s okay for this class project!
– You will answer this question using statistical procedures we will learn after Exam 1.
– What is the relationship between baldness and age?
– Is it possible that VEGF has an effect in plant photosynthesis?
– How are children affected by exposure to social media?
Unclear what social media means in this context. Unclear how children are defined.
Better: What is the effect of Instagram Likes on the self-esteem of young children under the age of 12?
– Find one published, credible article on the topic you are interested in researching. Typically, people use Google Scholar.
– Provide a one paragraph summary about the article.
– In 1-2 sentences, explain how your research question builds on or differs from the article you have cited.
Literature reviews are often exhaustive. This is meant to get us initial experience with what literature reviews are and why they are important!
Lastly, take a glimpse of your data in proposal.qmd.
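One way this might look in a code chunk is sketched below, assuming the readr and dplyr packages are installed. The data here is a made-up stand-in written to a temporary folder so the sketch runs on its own. In proposal.qmd you would instead read your own file from the data folder, for example `read_csv("data/your-file.csv")`, where `your-file.csv` is a placeholder name.

```r
library(readr)
library(dplyr)

# Hypothetical stand-in for a dataset in the data folder (values are invented).
data_path <- file.path(tempdir(), "example-data.csv")
write_csv(
  data.frame(id = 1:3, group = c("a", "b", "a"), score = c(2.5, 3.1, 4.0)),
  data_path
)

# Read the csv; show_col_types = FALSE silences the column type message.
example_data <- read_csv(data_path, show_col_types = FALSE)

# glimpse() prints the number of rows and columns, then each variable
# with its type and first few values.
glimpse(example_data)
```

The row and column counts that `glimpse()` reports are a quick check that the file was read as expected.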
– Is everyone contributing? Does everyone have a meaningful commit?
– Does it render?
– Is the repo organized, with no unnecessary documents added?
Sometimes, we create multiple merge conflicts that become “too far gone” to fix. In these circumstances, for this class, we resort to the following:
– If you do not have any work that you would like saved (all your group’s current work is on GitHub), delete your local repo by selecting the repo folder in the Files tab of RStudio and clicking Delete.
– Next, re-clone the project repo.