Exercises with yrbss Dataset
Exercises based on the yrbss dataset, focusing on dplyr functions.
For each question, write the appropriate dplyr code and provide a brief explanation of your approach.
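As a warm-up, here is a minimal sketch of the general shape of a dplyr pipeline. It runs on a small made-up tibble called `toy`, which is a hypothetical stand-in for yrbss. Its values and the way grade and sex are coded are invented for illustration and may not match the real data, so check the actual coding (for example with `glimpse(yrbss)`) before adapting it.

```r
library(dplyr)

# Hypothetical stand-in for yrbss; all values are invented for illustration.
toy <- tibble(
  age   = c(15, 17, 16, 17),
  sex   = c("Female", "Male", "Male", "Female"),
  grade = c("9th", "12th", "10th", "12th"),
  bmi   = c(21.3, 27.8, 18.1, 31.0)
)

# A pipeline chains verbs: keep some rows, keep some columns, then sort.
result <- toy %>%
  filter(age >= 16) %>%   # rows where age is at least 16
  select(sex, bmi) %>%    # only these two columns
  arrange(bmi)            # ascending order of bmi

result
```

Each verb takes a data frame and returns a data frame, which is why the steps can be chained with the pipe.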
Q1: Select only the age, sex, grade, and bmi columns from the yrbss dataset.
Q2: Rename the column stweight to self_reported_weight.
Q3: Filter the dataset to include only students in the 12th grade.
Q4: Filter the dataset for male students who are 17 years old.
Q5: Filter the dataset to keep only students with BMI greater than 25.
Q6: Arrange the dataset by bmi in descending order.
Q7: Find the average BMI for each grade.
Q8: Count how many students belong to each race4 category.
Q9: Create a new column bmi_category using the following rules:
- BMI < 18.5 → "Underweight"
- BMI 18.5 - 24.9 → "Normal weight"
- BMI 25 - 29.9 → "Overweight"
- BMI ≥ 30 → "Obese"
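One possible way to encode rules like these is `case_when()` inside `mutate()`. The sketch below uses a hypothetical tibble `toy` with invented BMI values rather than yrbss itself. It also makes one assumption the rules leave open: values between 24.9 and 25 (and between 29.9 and 30) are assigned to the lower category by using half-open intervals.

```r
library(dplyr)

# Hypothetical BMI values chosen to hit each boundary, plus a missing value.
toy <- tibble(bmi = c(17.0, 18.5, 24.95, 25.0, 29.9, 30.0, NA))

toy_labeled <- toy %>%
  mutate(
    # Conditions are checked in order; the first TRUE one wins.
    # Assumption: gaps such as 24.9 to 25 go to the lower category.
    bmi_category = case_when(
      bmi < 18.5 ~ "Underweight",
      bmi < 25   ~ "Normal weight",
      bmi < 30   ~ "Overweight",
      bmi >= 30  ~ "Obese"
    )
  )

toy_labeled
```

Rows with a missing BMI match none of the conditions, so `bmi_category` is `NA` for them.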
Q10: Calculate the average bmi for each combination of sex and race4.
Q11: Find the grade with the highest average BMI.
Q12: Create a new dataset that contains only students in 11th and 12th grade with a BMI above the average BMI of the entire dataset.
Q13: Create a Summary Table
Generate a summary table that shows:
- The total number of students
- The average BMI
- The proportion of male and female students
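A one-row summary of this kind can be sketched with `summarise()`. The example uses a hypothetical tibble `toy` with invented values, and it assumes sex is coded as "Male" and "Female", which may differ in the real yrbss data. The mean of a logical vector gives the proportion of TRUE values, which is one way to get the proportions.

```r
library(dplyr)

# Hypothetical data; the sex coding ("Male"/"Female") is an assumption.
toy <- tibble(
  sex = c("Male", "Female", "Female", "Male", "Female"),
  bmi = c(22, 24, NA, 30, 20)
)

summary_tbl <- toy %>%
  summarise(
    n_students  = n(),                              # total number of rows
    mean_bmi    = mean(bmi, na.rm = TRUE),          # ignore missing BMI
    prop_male   = mean(sex == "Male", na.rm = TRUE),
    prop_female = mean(sex == "Female", na.rm = TRUE)
  )

summary_tbl
```

Note that `na.rm = TRUE` drops rows with missing sex from the proportions, so decide whether that is the denominator you want.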
Q14: Find the Most Common Grade by Race
For each race4 category, determine the most frequent grade among students.
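A common pattern for "most frequent value within each group" is to count the combinations and then keep the top row per group. The sketch below shows that pattern on a hypothetical tibble with generic columns `group` and `item`, which stand in for race4 and grade. It assumes dplyr 1.0.0 or later for `slice_max()`, and it breaks ties arbitrarily with `with_ties = FALSE`.

```r
library(dplyr)

# Hypothetical data: `group` and `item` stand in for race4 and grade.
toy <- tibble(
  group = c("A", "A", "A", "B", "B", "B", "B"),
  item  = c("x", "x", "y", "y", "z", "z", "z")
)

most_common <- toy %>%
  count(group, item) %>%                      # one row per combination, with n
  group_by(group) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%  # keep the largest count per group
  ungroup()

most_common
```

With `with_ties = TRUE` (the default), groups that have several equally frequent values would return more than one row.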