Just got back from EDUCAUSE. I'll have more on the conference in future posts, but wanted to quickly share a couple of thoughts on learning analytics and transparency, prompted both by what I learned at EDUCAUSE and by an EdTech demo session I did this morning with an LMS vendor.
I went to EDUCAUSE with a few goals, one of which was to learn more about learning analytics. Specifically: what (if any) are the compelling use cases and examples of faculty and institutions effectively using analytics to solve problems? What are the ethical issues around data collection, and how are institutions informing their students and faculty about them? And what technologies are being used to collect and analyze analytics data? While I didn't find complete answers to these questions, I did come away with a better 10,000-foot view of learning analytics.
The primary use cases still seem to be predictive analytics to identify academically at-risk students and to help institutions improve student retention. I get the sense that, while student retention is important in Canada, it is not as critical for Canadian institutions as it appears to be for U.S. institutions. There are likely more use cases out there, but these two seem to be the big drivers of learning analytics at the moment.
Earlier today, I attended an LMS demo session on learning analytics, where I had a chance to see some of the analytics engine built into the LMS. The demo included a predictive analytics engine that could be used to identify an at-risk student in a course. Data is collected, crunched by an algorithm, and out comes a ranking of whether that student is at risk of not completing the course, or of failing it. When I asked what was going on within the algorithm making the prediction about future student behavior, I got a bit of an answer on what data was being collected, but not much on how that data was being crunched by the system; that is, what was happening inside the algorithm that was making the call about the student's future behavior.
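To make the black box a little more concrete, here is a minimal sketch (in Python, using scikit-learn) of the kind of pipeline such a predictive engine might run: activity data in, risk score out. The features, training data, and choice of a logistic regression are all my own assumptions for illustration; this is not the vendor's actual model.

```python
# A hypothetical sketch of an "at-risk" prediction pipeline.
# Feature names and data are invented; no vendor's real model is shown.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical LMS activity features per student:
# [logins per week, assignments submitted, forum posts, avg quiz score]
X_train = np.array([
    [5, 4, 10, 0.85],
    [1, 1,  0, 0.40],
    [4, 3,  6, 0.75],
    [0, 0,  1, 0.30],
    [6, 4, 12, 0.90],
    [2, 1,  2, 0.50],
])
# Labels from past cohorts: 1 = did not complete or failed, 0 = passed
y_train = np.array([0, 1, 0, 1, 0, 1])

model = LogisticRegression()
model.fit(X_train, y_train)

# Score a current student: the output is a probability that gets
# translated into an "at-risk" flag on a dashboard.
current_student = np.array([[2, 1, 3, 0.55]])
risk = model.predict_proba(current_student)[0, 1]
print(f"Predicted risk of not completing the course: {risk:.2f}")
```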
This is not to single out a specific company; this kind of algorithmic opacity is extremely common not only with learning technologies, but with almost all technologies we use today. Not only are we unaware of what data is being collected about us, but we don't know how it is being used, what kind of black box it is being fed into, or how it is being mathemagically wrangled.
Now, it's one thing to have something fairly innocuous like Netflix recommend movies to you based on... well, we don't really know what those recommendations are based on, do we? It is likely that what we have viewed before is factored in, but it is also likely that Netflix's recommendations pull data about us from services we have connected to Netflix. Mention on Facebook that you want to see the new Wes Anderson movie, and suddenly that becomes a data point for fine-tuning your film recommendations; the next time you log into Netflix, you get a recommendation for The Royal Tenenbaums. I don't know for sure that it works that way, but I am pretty certain that this kind of information from around the web is being pulled into my recommendations. Search for a movie on IMDB: does that information get shared back to Netflix the next time you log in? Probably.
As I said, the decisions coming out of that Netflix black box are fairly innocuous ones for an algorithm to make: what movie to recommend to you. But when it comes to predicting something like your risk of failure or success as a student, that is another scale entirely. The stakes are quite a bit higher (and higher still when the data and algorithms keep you from landing a job, or get you fired, as happened with teachers in New York State). Which is why, as educators, we need to be asking the right questions about learning analytics and what is happening within that black box. Like most technologies, this one has both positives and negatives, and we need to understand how to tell the difference if we want to take advantage of the positives and adequately address the negatives. We can't leave how the black box works up to others.
We need transparency
Which brings me to my point: in order for us to fully understand the benefits and the risks associated with learning analytics, we need to have some transparency measures in place.
First, when it comes to predictive analytics, we need to know what is happening inside the black box. Companies need to be very explicit about what information is being gathered, and how that data is being processed and interpreted by the algorithms to come up with scores that say a student is "at risk". What are the models being used? What is the logic of the algorithm? Why were those metrics and ratios within the algorithm decided upon? Are they based in empirical research? What is the research? Or are they someone's best guess? If you are an edtech company using algorithms and predictive analytics, these are the questions I would want you to have answers to. You need to let educators see and fully understand how the black box works, and why it was designed the way it was.
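As an illustration of the kind of transparency I am asking for, here is a hedged sketch along the lines of the hypothetical logistic regression above: when the model itself is open to inspection, educators can at least see which inputs push an "at-risk" score up or down, and in which direction. Again, the features and data are invented; a real vendor model would be more complex, and inspecting coefficients is only a starting point.

```python
# Hypothetical illustration: with access to the model, the weights
# behind an "at-risk" score can be inspected rather than taken on
# faith. Features and data are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["logins/week", "assignments submitted",
                 "forum posts", "avg quiz score"]
X = np.array([[5, 4, 10, 0.85], [1, 1, 0, 0.40], [4, 3, 6, 0.75],
              [0, 0, 1, 0.30], [6, 4, 12, 0.90], [2, 1, 2, 0.50]])
y = np.array([0, 1, 0, 1, 0, 1])  # 1 = did not complete or failed

model = LogisticRegression().fit(X, y)

# Expose the model's logic: which inputs raise or lower predicted risk?
for name, coef in zip(feature_names, model.coef_[0]):
    direction = "raises" if coef > 0 else "lowers"
    print(f"{name}: weight {coef:+.2f} ({direction} the predicted risk)")
```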
Second, students should have exactly the same view of their data within our systems that their faculty and institution have. Students have the right to know what data is being collected about them, why it is being collected, how it will be used, what decisions are being made with it, and how the black box that is analyzing them works. The algorithms need to be transparent to them as well. In short, we need to be developing ways to empower and educate our students to take control of their own data and to understand how their data is being used for (and against) them. And if you can't articulate the "for" part, then perhaps you shouldn't be collecting the data.
Finally, we need to ensure that we have real live human beings in the mix: that the data being analyzed is further inspected and interpreted by people who have the contextual knowledge to make sense of the information presented on a data dashboard. Not only does that person need to know how the data ended up on the dashboard and why, but also how to use it to make decisions. In short, faculty need to know how to make sense of the data they are being given. (I'll touch on this more in a future post when I write about Charles Darwin University Teaching & Learning Director Deborah West's analytics presentation, which centered on the question "what do teachers want?")
One approach from UC Berkeley
At EDUCAUSE, I saw a really good example of how one institution is making its data processes more transparent. In a presentation from Jenn Stringer, Associate CIO of UC Berkeley, one slide highlighted the data policies they have put in place around the ethical collection and use of learning analytics data.
These principles are reminiscent of the 10 learning data principles set out by the Data Quality Campaign and the Consortium for School Networking.
UC Berkeley also makes a student analytics dashboard available, so that students get the same view of the analytics data that their faculty get. I think both of these are excellent starts to working ethically and transparently with learning analytics data.
But for me the big question remains: what are the compelling use cases for learning analytics, and are those use cases leading to improvements in teaching & learning? So far, I am not sure I came away from EDUCAUSE with a better understanding of how analytics are being used effectively, especially by faculty in the classroom. If you have some interesting use cases about how analytics are being used, I'd love to hear them.
Photo: Learning Analytics #oucel15 keynote by Giulia Forsythe CC-BY-NC-SA