Tuesday 15 April 2014

Why are students so hard to count?

A common phenomenon in universities is the argument about data, and in particular differences between the number of students a department thinks it has, and the number of students that ‘the university’ thinks the department has. Let’s set aside for a moment the unworthy suspicion that such arguments are a smokescreen to disguise other issues. Why is it so hard to get student data right?

One reason is the specificity with which student data is defined. If you’re counting students to work out what size classroom to put a course in, then you need a headcount of everyone who is following that course. If you’re counting for budgeting purposes you might prefer full-time-equivalent (fte), and only for those who are enrolled and paying fees. (And, by the way, there are students following courses who haven’t enrolled or haven’t paid fees.) What counts as the right data differs depending on the need. And if you think about students who might be re-sitting a module or a year; or who might be undertaking placement work for all or part of their study; or who might be part-time at the moment but in a broadly full-time pattern of study; or many other possibilities, then you can see that there’s a lot of detail to be argued over.
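
To make that concrete, here is a minimal sketch in Python of how one small, invented group of people gives different answers to ‘how many students?’ depending on the question being asked. The names, fields and figures are illustrative only, not drawn from any real student record system.

from dataclasses import dataclass

@dataclass
class Student:
    name: str
    fte_fraction: float     # 1.0 full-time, 0.5 half-time, and so on
    enrolled: bool          # formally enrolled and paying fees?
    following_course: bool  # actually turning up to the course

students = [
    Student("A", 1.0, True, True),
    Student("B", 0.5, True, True),   # part-time
    Student("C", 1.0, False, True),  # following the course, not yet enrolled
    Student("D", 1.0, True, False),  # enrolled, but not taking this module
]

# Timetabling question: how big a room do we need?
classroom_headcount = sum(1 for s in students if s.following_course)

# Budgeting question: how much enrolled, fee-paying activity is there?
budget_fte = sum(s.fte_fraction for s in students
                 if s.enrolled and s.following_course)

print(classroom_headcount)  # 3 people need a seat
print(budget_fte)           # 1.5 fte attracts funding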

Another, related, reason is that data is often collected by a university to satisfy the demands of an external return – from HESA, for instance, or from a funding council. The definitions used in such collections can be abstruse, to say the least. For instance, a few years ago there was a change in the way that taught postgraduate students were counted for funding purposes, meaning that students might be very present in a university – enrolled, having tutorials, using library facilities – but would be counted as zero fte for funding purposes. They weren’t being ignored – they’d have been accounted for in a previous year’s return – but a data set built for an external return is not directly comparable with the reality of the institution. Sometimes an almost theological attention to the detail of definitions and rules is needed.

A third, big, reason is life itself. The model whereby a cohort of students enrols in September and pursues study diligently through the year is just that: a model. In reality people come and go – because of funding, because of family reasons, because they themselves are not sure if the course they are following is right for them. Universities ask students to inform them when they have a change in circumstances or attendance, and so students sometimes do this. It makes the record a fluid thing. A count of students taken in the morning may not be the same as a count taken that afternoon – it isn’t a data problem, it’s life.

A fourth reason is that data systems are complex things. In any reasonably large university there will be many people who interact with students and who record the transactions on the student record system. These record systems have a lot of fields (check out the HESA list of fields for the student return if you don’t believe me), so there’s a lot of scope for errors. Nowadays systems do have checks within them, but they aren’t foolproof – the human capacity to find new ways to input data is truly wonderful. (For instance, I was once supported by a temporary PA who was ordering stationery for me. The finance system required cost codes and account codes, and as this person didn’t have access to the manual, the approach was to put in random numbers ‘til it worked. I got the stationery, for sure, but it probably didn’t help the management accountants ...)
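
To illustrate why such checks aren’t foolproof, here is a toy sketch of a cost-code check along the lines of that anecdote: the system can confirm that a code exists on a list, but not that it is the right code for the transaction. The codes are made up.

# Made-up cost codes, for illustration only.
VALID_COST_CODES = {"1001", "1002", "2750", "4310"}

def validate_cost_code(code: str) -> bool:
    # The system can confirm the code exists on the list...
    return code in VALID_COST_CODES

# ...so guessing until something "works" gets the order through,
# even though 2750 may belong to an entirely different budget.
for guess in ("9999", "1234", "2750"):
    if validate_cost_code(guess):
        print(f"Order accepted with cost code {guess}")
        break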

So what to do about this? Here are three approaches which can help.

Firstly, get into the habit of specifying precisely what data you need. Planning and data teams can help by providing menus of the data available, so users know exactly what to ask for. You’ll reduce apparent data errors this way, but more importantly you’ll promote the idea that precision of specification matters. More sophisticated data users can then have more sophisticated arguments.
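
For illustration, here is one way such a precise specification might be written down; the field names are invented rather than taken from any real planning team’s menu.

from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class StudentCountRequest:
    measure: str            # "headcount" or "fte"
    population: str         # e.g. "enrolled", "following", "fee-paying"
    level: str              # e.g. "undergraduate", "taught postgraduate"
    census_date: date       # counts change daily, so say when
    include_resits: bool
    include_placements: bool

# Two requests that both sound like "how many students do we have?"
# but will, quite correctly, return different numbers.
timetabling = StudentCountRequest("headcount", "following", "undergraduate",
                                  date(2014, 4, 15), True, True)
budgeting = StudentCountRequest("fte", "fee-paying", "undergraduate",
                                date(2013, 12, 1), False, True)
print(timetabling)
print(budgeting)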

Secondly, and relatedly, don’t make data collection and submission the business only of a few people. It can be easy for those who make external returns to appear as the guardians of secret and arcane knowledge. (Is a countable year one of my three score and ten?) Not everyone will want to engage with the detail of the HESA return, but if more people know that there is a specific coding, and that it can be found (HESA are very good and transparent), then more people might recognise that what they input does matter.

Thirdly, help the people who collect and own the data in your university to work together. Data quality isn’t about doing a hard sum; it’s more like weeding a vegetable patch. Unless you check regularly and are willing to get your hands dirty, errors will creep in. Give someone the role of overseeing data quality (often the planning function will do this) and ensure that they bring the data owners together regularly. The more a sense of team develops here, the better your data quality will be, and the fewer arguments you will have.
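
As a small sketch of the weeding idea, here is a routine check that compares a department’s own count with the central record and lists the discrepancies for the data owners to resolve together. The departments and figures are invented.

# Invented departments and figures, for illustration only.
department_count = {"History": 412, "Physics": 388, "Law": 520}
central_count = {"History": 405, "Physics": 388, "Law": 531}

def reconcile(local, central):
    # List every department where the two sources disagree.
    issues = []
    for dept in sorted(set(local) | set(central)):
        if local.get(dept) != central.get(dept):
            issues.append(f"{dept}: department says {local.get(dept)}, "
                          f"central record says {central.get(dept)}")
    return issues

for issue in reconcile(department_count, central_count):
    print(issue)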
