Hands-on Workshop Session in EC-TEL 2019, Delft, Netherlands
Please use the form at the end of this page to tell me about yourself. This helps me tune the tutorial content. See you in Delft!
Prediction research in MOOCs has typically relied on data from a single past course to build and test predictive models with post-hoc approaches (e.g., cross-validation). However, these approaches are not valid for real-world use, since they require the true training labels, which cannot be known until the target event (e.g., dropout) takes place (Bote-Lorenzo & Gómez-Sánchez, 2018; Gardner & Brooks, 2018). To overcome this limitation, Boyer & Veeramachaneni (2015) proposed the in-situ learning approach, which allows a model to be trained on proxy labels. With in-situ learning, predictions can be produced before the target activity takes place, so they can be used to drive real-world interventions. Nonetheless, its use is still very limited in MOOC prediction research. Wider adoption by researchers and practitioners would promote educational interventions based on predictive models in real-world practice.
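The idea of training on proxy labels can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the exact procedure of the cited papers: the weekly activity counts, the helper functions `features_up_to` and `label_for`, and the definition of disengagement as "no activity in a week" are all assumptions made for the example. The model is trained on labels from the week that has just ended (already observable) and then applied to predict the following week, whose true labels do not exist yet.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

n_students, n_weeks = 200, 4
# Synthetic data (assumption): a latent engagement level per student drives
# the weekly activity counts, so behaviour is correlated across weeks.
engagement = rng.gamma(shape=2.0, scale=2.0, size=n_students)
activity = rng.poisson(lam=engagement[:, None], size=(n_students, n_weeks))


def features_up_to(week):
    """Features available at the end of `week` (0-based index):
    activity in that week and mean weekly activity so far."""
    return np.column_stack(
        [activity[:, week], activity[:, : week + 1].mean(axis=1)]
    )


def label_for(week):
    """1 if the student showed no activity in `week`, else 0
    (a hypothetical definition of disengagement)."""
    return (activity[:, week] == 0).astype(int)


current_week = 2  # the week that has just ended; the target is week 3

# Training uses only information that already exists in the ongoing course:
# features up to the previous week, proxy labels from the current week.
X_train = features_up_to(current_week - 1)
y_proxy = label_for(current_week)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_proxy)

# Prediction uses features up to the current week; the target week has not
# happened yet, so these risk scores are available in time to intervene.
X_now = features_up_to(current_week)
risk = model.predict_proba(X_now)[:, 1]

# Evaluation is only possible after the target week ends. The synthetic
# data already contains that week, so it can be checked here.
auc = roc_auc_score(label_for(current_week + 1), risk)
print(f"AUC on the target week: {auc:.2f}")
```

In contrast, a cross-validated model for week 3 would need the week-3 labels at training time, which is only possible once the course has moved past that week.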
Another main limitation in the literature is that predictive models are designed independently of the contexts for which they are intended. Yet the learning design and its pedagogical intentions can substantially affect the way students engage in the course, and can therefore help determine which engagement indicators are critical to student success. For example, most studies include the total number of video views as a predictive feature. However, such a cumulative feature may not capture the importance of student engagement with a particular video that is more critical to success in the target assignment or exam. Thus, research is needed to create predictive models that align more closely with the context for which they are intended.
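The difference between a cumulative feature and a learning-design-informed feature can be illustrated with a small sketch. The clickstream table, its column names, and the choice of "v3" as the video that prepares students for the target assignment are hypothetical; in practice this mapping would come from the course's learning design.

```python
import pandas as pd

# Hypothetical clickstream: one row per video view event.
views = pd.DataFrame({
    "student": ["s1", "s1", "s1", "s2", "s2", "s3"],
    "video":   ["v1", "v2", "v2", "v1", "v3", "v3"],
})

# Assumption for illustration: according to the learning design, video "v3"
# is the one that prepares students for the target assignment.
key_videos = ["v3"]

# Context-independent cumulative feature: total video views per student.
total_views = views.groupby("student").size().rename("total_views")

# Learning-design-informed features: views of each key video per student.
key_views = (
    views[views["video"].isin(key_videos)]
    .groupby(["student", "video"])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=key_videos, fill_value=0)
    .add_prefix("views_")
)

# Students who never watched a key video get 0 for that feature.
features = pd.concat([total_views, key_views], axis=1).fillna(0).astype(int)
print(features)
```

In this toy data, student s1 has the highest total number of views but never watched the key video, which is exactly the kind of signal a cumulative count hides.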
The first objective of this session is to teach participants how to generate actionable predictions of student engagement using the in-situ learning approach. Participants will be guided to compute relevant features, build a predictive model, and train it with the in-situ learning approach in Python. Scikit-Learn, one of the most widely used machine learning libraries in the field, will be used in the tutorial.
The second objective is to show participants how to inform feature selection with Learning Design, which is largely neglected in the literature. Features informed by the learning design of the context studied can lead to more powerful models. In the session, participants will experience how to interpret the learning design to create features that are more relevant to the context.
Theory (30 mins.)
- Introduction to Machine Learning
- Training Paradigms: Cross-Validation and In-Situ Learning
Hands-On Exercise - PART 1 (45 mins.)
- Getting Familiar with Jupyter Notebook Environment
- Introduction to Python, Pandas, and Scikit-Learn
- Understanding the Data
Hands-On Exercise - PART 2 (90 mins.)
- Building Machine Learning Models in Scikit-Learn with:
  - Cross-Validation
  - Transferring Across Courses
  - In-Situ Learning
Concluding Remarks (15 mins.)
- Reflecting on the session: Ideas to Put into Practice What Was Learned
WHAT YOU NEED: A fully charged laptop with Anaconda (Python 3.7 version) installed. Feel free to contact me (firstname.lastname@example.org) if you need any help before the session.
Erkan Er received his PhD degree in Learning, Design, and Technology from the University of Georgia, USA, in 2016. He is currently working as a postdoctoral researcher in the GSIC-EMIC research group in the Department of Telecommunications Engineering at the University of Valladolid, Spain. His recent research interests include using machine learning and educational data mining techniques to understand and support student learning in massive contexts.
He currently works on the WeLearnAtScale project.
Alejandro Ortega-Arranz received his BSc and MSc in telecommunications engineering from the University of Valladolid, Spain, in 2014 and 2015, respectively. He is currently a PhD candidate in the GSIC-EMIC Research Group at the University of Valladolid. His main research interests include game-based learning, gamification, and the technologies supporting the implementation of these strategies in online environments at different scales.
Tell Me About Yourself
This form helps me get to know you. Any information you provide will help me fine-tune the content of the session so that you can get the most out of it.
See you in Delft!
Bote-Lorenzo, M. L., & Gómez-Sánchez, E. (2017). Predicting the decrease of engagement indicators in a MOOC. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (pp. 143–147). Vancouver, Canada. https://doi.org/10.1145/3027385.3027387
Bote-Lorenzo, M. L., & Gómez-Sánchez, E. (2018). An approach to build in situ models for the prediction of the decrease of academic engagement indicators in Massive Open Online Courses. Journal of Universal Computer Science, 1. Accepted.
Boyer, S., Gelman, B. U., Schreck, B., & Veeramachaneni, K. (2015). Data science foundry for MOOCs. In Proceedings of the IEEE International Conference on Data Science and Advanced Analytics (pp. 1–10). Paris, France. https://doi.org/10.1109/DSAA.2015.7344825
Gardner, J., & Brooks, C. (2018). Student success prediction in MOOCs. User Modeling and User-Adapted Interaction, 28(2), 127–203. https://doi.org/10.1007/s11257-018-9203-z