Exercises with the yrbss Dataset

Exercises based on the yrbss dataset, focusing on dplyr functions.

For each question, write the appropriate dplyr code and provide a brief explanation of your approach.

Q1: Select only the age, sex, grade, and bmi columns from the yrbss dataset.

Q2: Rename the column stweight to self_reported_weight.

Q3: Filter the dataset to include only students in the 12th grade.

Q4: Filter the dataset for male students who are 17 years old.

Q5: Filter the dataset to keep only students with BMI greater than 25.

Q6: Arrange the dataset by bmi in descending order.

Q7: Find the average BMI for each grade.

Q8: Count how many students belong to each race4 category.

Q9: Create a new column bmi_category using the following rules:

  • BMI < 18.5 → "Underweight"
  • 18.5 ≤ BMI < 25 → "Normal weight"
  • 25 ≤ BMI < 30 → "Overweight"
  • BMI ≥ 30 → "Obese"
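
As a starting point for Q9, the rules above can be sketched with case_when() on a small made-up data frame. This is one possible approach, not a reference solution: the toy data frame and its values are invented for illustration, and the sketch assumes each range is meant as a half-open interval so that values such as 24.95 still fall into a category.

```r
library(dplyr)

# Toy stand-in for yrbss (hypothetical values); only a bmi column is assumed.
toy <- data.frame(bmi = c(17.2, 18.5, 24.95, 25, 29.99, 30, NA))

toy_categorised <- toy %>%
  mutate(
    # case_when() checks conditions in order and uses the first TRUE one,
    # so each later branch only sees values not matched above it.
    bmi_category = case_when(
      bmi < 18.5 ~ "Underweight",
      bmi < 25   ~ "Normal weight",
      bmi < 30   ~ "Overweight",
      bmi >= 30  ~ "Obese"
    )
  )
```

Rows with a missing bmi match no condition and therefore receive NA in bmi_category.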

Q10: Calculate the average bmi for each combination of sex and race4.

Q11: Find the grade with the highest average BMI.

Q12: Create a new dataset that contains only students in 11th and 12th grade with a BMI above the average BMI of the entire dataset.
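
A point worth checking in Q12 is that the average is taken over the entire dataset, not only over the 11th and 12th graders. The sketch below shows one way to do that on a made-up data frame. The grade labels "11" and "12" are an assumption about how grades are coded; inspect the real column before reusing them.

```r
library(dplyr)

# Toy stand-in for yrbss (hypothetical values and grade labels).
toy <- data.frame(
  grade = c("9", "11", "11", "12", "12", "12"),
  bmi   = c(20, 22, 28, 30, NA, 20),
  stringsAsFactors = FALSE
)

upper_grades_high_bmi <- toy %>%
  filter(
    grade %in% c("11", "12"),
    # mean() is evaluated on all rows of the ungrouped input,
    # so this is the overall average BMI; na.rm = TRUE skips missing values.
    bmi > mean(bmi, na.rm = TRUE)
  )
```

filter() drops rows where the condition evaluates to NA, so students with a missing bmi are excluded from the result.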

Q13: Create a Summary Table

Generate a summary table that shows:

  • The total number of students
  • The average BMI
  • The proportion of male and female students
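
One possible shape for such a table is a single summarise() call, sketched here on a made-up data frame. The sex labels "Male" and "Female" are assumptions about the coding of the sex column, and the proportions are computed among students with a recorded sex.

```r
library(dplyr)

# Toy stand-in for yrbss (hypothetical values and sex labels).
toy <- data.frame(
  sex = c("Male", "Female", "Female", "Male", NA),
  bmi = c(22, 24, NA, 30, 26),
  stringsAsFactors = FALSE
)

summary_table <- toy %>%
  summarise(
    n_students  = n(),                                # all rows, including NAs
    mean_bmi    = mean(bmi, na.rm = TRUE),            # ignores missing BMI
    # The mean of a logical vector is the share of TRUE values.
    prop_male   = mean(sex == "Male", na.rm = TRUE),
    prop_female = mean(sex == "Female", na.rm = TRUE)
  )
```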

Q14: Find the Most Common Grade by Race

For each race4 category, determine the most frequent grade among students.
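
A common pattern for "most frequent value per group" is to count the combinations first and then keep the top row in each group. The sketch below illustrates it on a made-up data frame; the race4 and grade values are placeholders, not the real category labels.

```r
library(dplyr)

# Toy stand-in for yrbss (placeholder race4 and grade values).
toy <- data.frame(
  race4 = c("A", "A", "A", "B", "B", "B"),
  grade = c("9", "9", "10", "11", "12", "12"),
  stringsAsFactors = FALSE
)

most_common_grade <- toy %>%
  count(race4, grade) %>%     # one row per race4/grade pair, with its count n
  group_by(race4) %>%
  slice_max(n, n = 1) %>%     # keep the highest count within each race4
  ungroup()
```

By default slice_max() keeps ties, so a race4 category with two equally frequent grades would appear twice in the result.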