This requires approval by the department first. After that, write a Python file in the same directory as the dataset, substituting the dataset's actual file name:
from edmlib import gradeData, classCorrelationData
df = gradeData("fordhams_dataset_fileName.csv")
df.defineWorkingColumns('OTCM_FinalGradeN', 'SID', 'REG_term', 'REG_CourseCrn', 'REG_Programcode', 'REG_Numbercode', 'GRA_MajorAtGraduation')
For majors, define a Python list of major or department names, making sure they are spelled exactly as they appear in the defined ‘classDept’ column (this is the ‘REG_Programcode’ column in the Fordham dataset).
For classes, define a Python list of class names that match the defined ‘classCode’ column. If ‘classCode’ was not defined but both ‘classDept’ and ‘classNumber’ were (as in the Fordham dataset), the two columns are joined automatically to create the ‘classCode’ column, so use the concatenation of the two values (e.g. ‘Psych1000’ from ‘Psych’ and ‘1000’).
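The naming rule above can be sketched in plain Python. This is only an illustration of the concatenation, not edmlib code: the `corePairs` list and the `makeClassCode` helper are hypothetical names introduced here, and the sketch assumes the department and number are joined with no separator, as in the ‘Psych1000’ example.

```python
# Hypothetical (department, number) pairs, as they might appear in the
# 'classDept' and 'classNumber' columns of a dataset.
corePairs = [("Psych", "1000"), ("Theology", 1000), ("English", "1102")]


def makeClassCode(dept, number):
    """Join a department name and a class number with no separator (assumed rule)."""
    return str(dept) + str(number)


# Build the list of class names to pass to a class filter.
classNames = [makeClassCode(dept, number) for dept, number in corePairs]
print(classNames)
```

Building the list this way avoids spelling mismatches between the hand-typed class names and the values in the dataset.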
This example filters to Computer Science and Psychology classes plus the “core” classes, as defined by Fordham.
from edmlib import gradeData, classCorrelationData
df = gradeData("fordhams_dataset_fileName.csv")
df.defineWorkingColumns('OTCM_FinalGradeN', 'SID', 'REG_term', 'REG_CourseCrn', 'REG_Programcode', 'REG_Numbercode', 'GRA_MajorAtGraduation')
majorsToFilterTo = ['Computer and Info Science', 'Psychology']
coreClasses = [
    'Philosophy1000', 'Theology1000', 'English1102',
    'English1101', 'History1000', 'Theology3200',
    'VisualArts1101', 'Physics1201', 'Chemistry1101']
df.filterToMultipleMajorsOrClasses(majorsToFilterTo, coreClasses)
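To make the intent of this filter concrete, here is a minimal sketch of the selection logic on a few made-up rows. It assumes, based only on the function name, that a row is kept when its department is in the majors list or its class code is in the classes list; the rows and the `keepRow` helper are hypothetical and this is not edmlib's implementation.

```python
# Hypothetical rows: (classDept, classNumber)
rows = [
    ("Computer and Info Science", "1600"),
    ("Philosophy", "1000"),
    ("Philosophy", "3000"),
    ("Psychology", "1200"),
    ("Biology", "1403"),
]
majors = ["Computer and Info Science", "Psychology"]
classes = ["Philosophy1000", "Theology1000"]


def keepRow(dept, number):
    """Keep a row if its department OR its class code matches (assumed semantics)."""
    return dept in majors or (dept + number) in classes


filtered = [row for row in rows if keepRow(*row)]
for dept, number in filtered:
    print(dept, number)
```

Under this assumption, ‘Philosophy1000’ survives because it is a listed class, while ‘Philosophy3000’ and the Biology class are dropped.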
A similar filter is available for the correlational data class:
# Reuses majorsToFilterTo and coreClasses from the example above
data = classCorrelationData('outputCorrelation.csv')
data.filterToMultipleMajorsOrClasses(majorsToFilterTo, coreClasses)
To filter the data to students who have declared certain majors, the ‘studentMajor’ column must have been defined with ‘defineWorkingColumns’. Note that a student is included if they ever declared one of these majors. The syntax is very similar:
from edmlib import gradeData, classCorrelationData
df = gradeData("fordhams_dataset_fileName.csv")
df.defineWorkingColumns('OTCM_FinalGradeN', 'SID', 'REG_term', 'REG_CourseCrn', 'REG_Programcode', 'REG_Numbercode', 'GRA_MajorAtGraduation')
# function takes a 'list'
df.filterStudentsByMajors(['Psychology', 'Economics'])
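The “ever declared” rule can be illustrated with a small standalone sketch. The records below are invented, and the assumption that every record of a matching student is kept (not just the rows listing the matching major) is an interpretation, not a confirmed description of edmlib's behavior.

```python
# Hypothetical records: (studentId, term, declaredMajor, classCode)
records = [
    (1, "F18", "Undeclared", "English1101"),
    (1, "S19", "Psychology", "Psych1000"),
    (2, "F18", "Biology", "Biology1403"),
    (3, "F18", "Economics", "Econ1100"),
    (3, "S19", "History", "History1000"),
]
majors = ["Psychology", "Economics"]

# Students who declared one of the target majors on at least one record.
matchingStudents = {sid for sid, _, major, _ in records if major in majors}

# Assumed semantics: keep every record of those students, whatever major
# happens to be listed on that particular row.
kept = [record for record in records if record[0] in matchingStudents]
print(sorted(matchingStudents), len(kept))
```

Here student 3 is kept even though their later record lists History, because they declared Economics at one point.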
Correlations can be obtained with the following example, which filters by GPA deviation first:
from edmlib import gradeData, classCorrelationData
df = gradeData('yourDataSet.csv')
df.defineWorkingColumns('OTCM_FinalGradeN', 'SID', 'REG_term', 'REG_CourseCrn', 'REG_Programcode', 'REG_Numbercode', 'GRA_MajorAtGraduation')
df.filterByGpaDeviationMoreThan(0.2)
# The first parameter is the output file name, the second is the minimum
# number of students classes must share to calculate a correlation
df.exportCorrelationsWithMinNSharedStudents('outputCorrelation.csv', 20)
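The role of the minimum shared students parameter can be sketched without the library. The example below is a rough illustration on invented grades: it assumes a Pearson correlation computed over the students two classes have in common, and skips any pair with fewer shared students than the threshold. The `pearson` and `correlationsWithMinShared` helpers are hypothetical and may differ from what edmlib actually computes.

```python
from math import sqrt

# Hypothetical grades: classCode -> {studentId: grade}
grades = {
    "Psych1000": {1: 4.0, 2: 3.0, 3: 2.0, 4: 3.7},
    "English1101": {1: 3.7, 2: 3.0, 3: 2.3, 5: 4.0},
    "History1000": {4: 3.0, 5: 3.3},
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(xs)
    meanX, meanY = sum(xs) / n, sum(ys) / n
    cov = sum((x - meanX) * (y - meanY) for x, y in zip(xs, ys))
    varX = sum((x - meanX) ** 2 for x in xs)
    varY = sum((y - meanY) ** 2 for y in ys)
    if varX == 0 or varY == 0:
        return None
    return cov / sqrt(varX * varY)


def correlationsWithMinShared(grades, minShared):
    """Correlate each pair of classes over shared students, if there are enough."""
    results = {}
    classes = sorted(grades)
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            shared = sorted(set(grades[a]) & set(grades[b]))
            if len(shared) < minShared:
                continue  # too few students in common to trust a correlation
            r = pearson([grades[a][s] for s in shared],
                        [grades[b][s] for s in shared])
            if r is not None:
                results[(a, b)] = (r, len(shared))
    return results


result = correlationsWithMinShared(grades, minShared=3)
for pair, (r, n) in result.items():
    print(pair, round(r, 3), n)
```

With a threshold of 3, only the English1101 and Psych1000 pair is reported, since the other pairs share a single student each. A higher threshold, such as the 20 used above, trades coverage for more reliable correlations.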
Furthermore, this correlational data can be made into a chord graph like the one on the front page (EDMLib) by using the correlational data class, which outputs HTML and PNG files to the same directory:
data = classCorrelationData('outputCorrelation.csv')
data.chordGraphByMajor()
This is currently limited to averaging data by major. Options for this function can be found here: Export / Graphs.