This week, fellow student and data analysis enthusiast Herb Susmann released student-reported SOFI data on courses at SUNY Geneseo, welcoming people to see what interesting relationships -- or lack thereof -- they could find in the data.
To that end, I downloaded the data, fired up R, and decided to compare how challenging students rated their classes. The individual course data was too narrow a data set, so I examined the data for classes in the natural sciences, the social sciences, and the fine arts.
The data collected includes self-reported ratings of how challenging a course is, from 1 (not challenging) to 5 (very challenging). Because the ratings are self-reported, students may over-report the difficulty of their classes. With that in mind, let's examine some distributions.
Shapiro-Wilk tests on the three data sets indicate that none of them is normally distributed. Because the data are not normally distributed, any comparative statistical tests must be non-parametric.
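For readers who want to try the normality check themselves, here is a minimal sketch using `shapiro.test()` from base R. The ratings below are simulated with `sample()` and are not the SOFI data; the variable names and the probabilities are assumptions made purely for illustration.

```r
# Simulated 1-5 challenge ratings (hypothetical; NOT the SOFI data).
set.seed(42)
natural_sci <- sample(1:5, 200, replace = TRUE,
                      prob = c(0.05, 0.10, 0.25, 0.35, 0.25))
fine_arts   <- sample(1:5, 200, replace = TRUE,
                      prob = c(0.15, 0.25, 0.30, 0.20, 0.10))

# shapiro.test() accepts between 3 and 5000 non-missing values.
sw_natural <- shapiro.test(natural_sci)
sw_arts    <- shapiro.test(fine_arts)

# A small p-value is evidence against the data being normally distributed.
print(sw_natural)
print(sw_arts)
```

Ratings on a five-point scale are discrete, so a rejection of normality here is not surprising, which is part of why a rank-based test is a reasonable next step.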
To that end, I ran a series of Wilcoxon rank sum tests: one between the natural sciences and the social sciences, and one between the natural sciences and the fine arts.
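The two comparisons can be sketched with `wilcox.test()` from base R. Again, the ratings are simulated and are not the SOFI data, and the variable names are hypothetical. Using a one-sided alternative for the fine arts comparison is an assumption on my part for this sketch; a two-sided test would work the same way with the default `alternative`.

```r
# Simulated 1-5 challenge ratings (hypothetical; NOT the SOFI data).
set.seed(42)
probs_sci  <- c(0.05, 0.10, 0.25, 0.35, 0.25)
probs_arts <- c(0.15, 0.25, 0.30, 0.20, 0.10)
natural_sci <- sample(1:5, 200, replace = TRUE, prob = probs_sci)
social_sci  <- sample(1:5, 200, replace = TRUE, prob = probs_sci)
fine_arts   <- sample(1:5, 200, replace = TRUE, prob = probs_arts)

# Ratings contain many ties, so request the normal approximation explicitly
# rather than an exact p-value.
ns_vs_ss <- wilcox.test(natural_sci, social_sci, exact = FALSE)

# Assumption: one-sided test of whether natural science ratings tend to be
# higher than fine arts ratings.
ns_vs_fa <- wilcox.test(natural_sci, fine_arts,
                        alternative = "greater", exact = FALSE)

print(ns_vs_ss)
print(ns_vs_fa)
```

Each result is an `htest` object, so the p-value is available as `ns_vs_ss$p.value` or `ns_vs_fa$p.value` for comparison against whatever significance level you choose.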
The Wilcoxon test between the natural and social sciences suggests that the difference is not statistically significant, as shown in Figure 2. Given the comparable academic and scientific rigor of the two disciplines, it is understandable that students would rate their courses as similarly challenging.
The difference in challenge ratings between the natural sciences and the fine arts is a completely different story. As shown in Figure 3, the natural sciences have significantly higher challenge ratings than the fine arts.
Although these results would suggest that the sciences are more difficult than the fine arts, it should be noted that the data are self-reported. If students think that their major is more difficult, they may rate a course as a 5 when it is really a 4. Regardless of this bias, the data do provide useful insight into how these students view their courses.
My R code and analysis are open source, and can be downloaded from here.