October 25, 2013

Understanding Education through Big Data

Category: Edtech

The seduction of ‘Big Data’ lies in its promise of greater knowledge. The large amounts of data created as a by-product of our digital interactions, together with the increased computing capacity to analyse them, offer the possibility of knowing more about ourselves and the world around us. Big data promises to make the world less mysterious and more predictable.

This is not the first time that new technologies of data have changed our view of the world. In the nineteenth century, statistical ‘objective knowledge’ supplanted the personal knowledge of upper-class educated gentlemen as the main way in which governments came to know about those they governed. In our own time we seem to be facing a new revolution in which the basis of how we come to ‘know’ something – our epistemological foundations – is becoming reliant on big data analysis. From the perspective of this new epistemological turn, our knowledge – from the performance of healthcare staff to how we choose a romantic partner – rests on the extent to which it is known through big data analysis. But what does it mean for education if the way that we know about it is governed by big data? Here, I sketch out some of the questions raised by the turn to a ‘big data epistemology’ in education.

Learning Analytics

It is not new that educational institutions collect and analyse data for predicting and intervening in children’s educational performance. But this data is often limited and disconnected, kept in separate repositories, in different formats, or never formally recorded at all. What is new is digitising, meta-tagging and aggregating that data with many other data sets, making possible new connections, predictions and diagnoses. This is the field of ‘learning analytics’ – described as the collection, analysis and use of data patterns to optimize conditions for improving learning.

This is what is being attempted by services like inBloom, Knewton, and other new start-ups announced at SXSW in March 2013, indicating a potentially lucrative new market tapping into students’ data. These services draw together existing data from a wide range of sources, as well as data produced as a by-product of children’s use of technology. By including so much data about individual children, and comparing it to the data from hundreds of thousands of other children, these services can create a learning profile for each individual child, diagnosing their strengths, weaknesses and challenges. After diagnosing problems, they then prescribe solutions, in the form of more educational technology software from their partners. Bill Gates sees the use of data as the next technological revolution in education, and the Gates Foundation, Carnegie Foundation and others have provided $100 million of support for inBloom. While the partners of these services do not necessarily have direct access to student data (unless the school district or state already has a relationship with them), they benefit by being able to target specific new software directly to individual students and teachers and by having better access to aggregated student data to drive future product development.

Learners and their Profiles

This application of learning analytics reframes teachers’ knowledge of an individual child away from an interpersonal relationship that recognises the uniqueness and difference of the other person, towards a knowledge determined by analysis of a child’s data trails. Big data analysis might then come to be the way that we ‘know’ a child educationally – how they learn, where their strengths and weaknesses lie, what kinds of teaching they might respond to, in short, who they are.

Because most teachers do not have the time, resources, skills, or access to large aggregated data sets needed to undertake such complex analysis, organisations like inBloom and Knewton provide neatly packaged results direct to teachers – taking the process by which judgements are made out of teachers’ control. Teachers and schools become ‘end users’ of data, positioned as unable to engage with, question or unpick the algorithmic processes by which diagnoses and prescriptions are made.

How does this emphasis on data-driven knowledge shape what it is possible to know about learners? Big data sets mean that an averaged ‘norm’ can be identified for certain characteristics (age, location, socio-economic status, previous educational performance, etc.), creating an idealised ‘other’ to which an individual is compared. Attention is thereby focused on the gap between a child’s observed data patterns and where the data says they ‘could’ or ‘should’ be. Efforts are consequently focused on closing the gap, and individualised ‘catch-up’ work becomes the norm, while other possible responses – such as looking at how the classroom or curriculum could be organised differently, starting from the learner’s strengths and interests, or understanding the underlying reasons behind a learner’s development – are made less visible.
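To make the idea of a ‘norm’ and a ‘gap’ concrete, here is a minimal sketch in Python. It is purely illustrative and does not describe how inBloom, Knewton or any other service actually works: the records, the grouping by age and prior attainment band, and the function name `gaps_from_norm` are all assumptions invented for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (invented for illustration):
# (learner, age, prior attainment band, test score)
records = [
    ("A", 9, "low", 42.0),
    ("B", 9, "low", 50.0),
    ("C", 9, "low", 58.0),
    ("D", 10, "high", 70.0),
    ("E", 10, "high", 90.0),
]


def gaps_from_norm(records):
    """Return each learner's score minus the mean score of their peer group.

    The peer group is defined here (an assumption) as learners sharing
    the same age and prior attainment band.
    """
    # Collect the scores of every learner who shares the same characteristics.
    groups = defaultdict(list)
    for _, age, band, score in records:
        groups[(age, band)].append(score)

    # The averaged 'norm' for each group of characteristics.
    norms = {key: mean(scores) for key, scores in groups.items()}

    # The 'gap': how far each learner sits from the norm of their group.
    return {
        name: score - norms[(age, band)]
        for name, age, band, score in records
    }


gaps = gaps_from_norm(records)
```

Even in a toy version like this, the choice of which characteristics define a peer group determines what counts as the norm, and the only thing the calculation can report is a distance from it. A negative gap says nothing about why a learner scored as they did.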

Know Thyself

Learning analytics also potentially changes how learners come to think of themselves. As our digital interactions are tracked, tagged, organised and presented back to ourselves, Rob Horning argues that it is becoming impossible to separate our own subjective sense of who we are from our ‘data-self’. Our digitised data, and how it is represented back to us, becomes “a new dimension of what makes our experiences ‘real’.” So it may be that children’s sense of themselves as learners comes to be more dominated by visualisations of their educational data through apps, web profiles and infographics than through processes of reflection and dialogue. The ancient maxim to “know thyself” becomes instead: “measure thyself.” If the reliability of our knowledge rests on the extent to which it can be backed up by big data, our learning profiles may be seen – both by others and ourselves – as more robust and objective descriptions of who we ‘really’ are, supplanting and dismissing our own messy, subjective self-knowledge.

But the question of our ‘real’ identity is a slippery topic. Many researchers now see identity not as a pre-existing fact to be discovered, but as something that we continually make and re-make using a range of resources – including our relationships with other people and technologies. While none of us wakes up in the morning with a total personality transplant, we do make choices about what kind of person we want to be and how we want to present ourselves to others. We make choices about which aspects of ourselves to share and which to keep private. Any description of our learning identity is therefore necessarily always partial – it cannot encompass the totality of who we are because who we are is in flux and depends on the context we are in. It is not just that we do not have enough information – a problem that could be solved by big data – but that no amount of information can pin down our inherently fluid learning identities.

Who Decides Our Learner Identities?

If it is not possible to be completely sure that a learning profile created from data is true or fixed, the important question is: who gets to decide what the data means? When a learner’s identity is something they define in their relationships with teachers and peers, they have an element of choice in determining what kind of learner they are, and what kind of learner they might want to become. They can provide the context that makes sense of their data. They can challenge or resist others’ interpretations of their actions and motives. In short, they have some control and voice over who they want to be as a learner.

If we do not like our data-driven profiles, we can try to adapt our behaviour to produce more favourable data. Just like Facebook or Twitter, these systems encourage users to continually produce more data. This data is where the real value lies for the providers, allowing more precise targeting, advertising and development of educational software. The considerable commercial value of these kinds of data is evidenced by the World Economic Forum’s decision in 2011 to designate “biological, ambient, environmental data gathered about the person” as a new economic asset class that will “open up […] new possibilities for targeted delivery of services and goods.”

But not everyone wants, or is able, to adapt themselves to fit the particular educational values and assumptions about what it means to be a successful learner that are built into learning analytics algorithms. For every individual able to adapt and fit in, there are those who can only adapt so far, because to do otherwise would be to deny alternative cultural educational values, aims and notions of success. A refusal to play the game comes at the cost of becoming invisible to powerful networks, and the absence of a learning profile backed up by big data may seem suspicious in itself.

The use of big data analytics in education is not necessarily useless or insidious. For one thing, it can provide a useful additional perspective, ‘from the outside in’, on learners’ development. But we need to consider the implications and consequences of using big data analytics as our main way of knowing about education. It tends to reduce big social and political questions, about what kinds of learners we are and want to be, or how education should respond to major social and economic challenges, to a simple process of prescribing the next piece of educational software to download. These big questions do not have simple, single answers. Different traditions, different approaches and different people will come up with different answers. Rather than locking ourselves into one perspective, we need to be open to multiple ways of understanding education and learners, opening up the possibility of a range of different responses. Crucially, learners, at the heart of the process, should be part of the debate about their own learning and education.

Banner image credit: infocux Technologies http://www.flickr.com/photos/infocux/8450190120/