Thursday, August 6, 2015

Research Evaluation – an argument for a ‘census’ driven collection of publications

In this age of accountability, no one questions that data on university research publications should be collected and reported. Research publications are no longer only a mechanism for disseminating research findings; they are now also a measure of research performance.

It is hardly surprising, then, that discussions arising from a recent review by PhillipsKPA are not about whether we should collect research publication data but about how we can collect it more efficiently. Australian universities currently report research publications data through the Higher Education Research Data Collection (HERDC) and Excellence in Research for Australia (ERA). One of the 27 recommendations from the PhillipsKPA Review of University Reporting Requirements is to streamline the collection of research data into a single collection. Combining the two collections will only be worthwhile if it improves the efficiency, integrity, transparency and utility of the data being collected. Any consultation document will hopefully clarify for the sector how the combined collection will achieve this.

While both mechanisms currently collect research publication and research income data, that is really where the similarities stop. The type of data collected, the level of detail collected and, importantly, the purpose of the collections are quite different.

The purpose of the HERDC is to collect research income and publications data to inform the distribution of research block grants to universities based on their relative performance in each measure. The HERDC only reports on publication volume and does not consider the field of research or the quality of the research. At best it provides a proxy for the volume of research activity across Australian universities.

According to the ARC’s ERA documentation, the objectives of ERA are much broader than those of the HERDC. They are to:

·         establish an evaluation framework that gives government, industry, business and the wider community assurance of the excellence of research conducted in Australian higher education institutions;

·         provide a national stocktake of discipline level areas of research strength and areas where there is opportunity for development in Australian higher education institutions;

·         identify excellence across the full spectrum of research performance;

·         identify emerging research areas and opportunities for further development; and

·         allow for comparisons of research in Australia, nationally and internationally, for all discipline areas.

If we focus on the collection of research publications data in each collection, there is one main difference. In the HERDC, publications data is collected for all publications that acknowledge the university with which the author is affiliated on the publication itself (for example through an author ‘byline’), regardless of whether the author is currently employed at the university. I will refer to this as an ‘address’ based collection. In the ERA collection, publications data is collected for all publications authored by researchers employed by a university at a census date (usually 31 March of the year preceding the ERA collection), regardless of whether the university is acknowledged within the publication. I will refer to this as a ‘census’ based collection. The difference between an ‘address’ based collection and a ‘census’ based collection may not at first seem significant, but it is.

Consider the case of the HERDC. Data are collected on publications only where the university has been acknowledged on the publication, regardless of whether the researcher or research group still works at the university. Once the data have been collected and reported to the Department of Education, the numbers are used to distribute block grant funding (approximately $1700 per publication point). This effectively rewards universities for the volume of publications they can report that list the university in the byline. No consideration is given to the quality or the focus of the research in the publications, just the volume. If the researchers who produced the publications have since left the university, the university is still credited with their research activity.
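
As a rough illustration of the address-based funding logic described above, the sketch below counts the publications whose byline names a university and multiplies that count by the approximate per-point rate. The record layout, the university names and the assumption that every publication is worth exactly one point are hypothetical simplifications, not the HERDC specification.

```python
from dataclasses import dataclass

# Approximate rate quoted in the text; treated here as a flat constant (assumption).
RATE_PER_POINT = 1700


@dataclass(frozen=True)
class Publication:
    title: str
    byline_universities: frozenset  # universities named on the publication itself
    author_still_employed: bool     # recorded, but ignored by an address-based count


def address_based_points(pubs, university):
    """Count one point per publication whose byline names the university."""
    return sum(1 for p in pubs if university in p.byline_universities)


def estimated_funding(pubs, university, rate=RATE_PER_POINT):
    """Points multiplied by the per-point rate (simplified, hypothetical model)."""
    return address_based_points(pubs, university) * rate


# Invented example data: the author of Paper B has since left Uni X.
pubs = [
    Publication("Paper A", frozenset({"Uni X"}), author_still_employed=True),
    Publication("Paper B", frozenset({"Uni X"}), author_still_employed=False),
    Publication("Paper C", frozenset({"Uni Y"}), author_still_employed=True),
]

print(estimated_funding(pubs, "Uni X"))  # 3400
```

Paper B still earns Uni X its share even though the author has left, which is the retrospective behaviour described above.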

A big advantage of the address based collection is that it is easy to determine which outputs are eligible for collection and which university they belong to. An address based collection could be conducted by a third party (for example through a citation data provider like Scopus or Web of Science). The disadvantage is that the collection is retrospective: researchers may have left the university, but their output still contributes to the block grant allocation. This is fine if you are rewarding past performance but problematic if you are trying to profile the current research strengths of Australian universities or to fund for future research success.

Now consider the ERA, which also collects publications data. These publications are the ones produced by the current cohort of staff at the university, not just the ones with a byline that lists the university. When the publications are reported they are assigned to fields of research and subsequently given a quality score by a national evaluation panel. This allows the Department (and the public) to see where research excellence exists in Australian universities and where research strengths may be emerging. The main disadvantage of a census based collection is that it requires more administrative work, as publications are not readily identifiable by a university byline within the publication. A census based collection cannot easily be performed by a third party. The advantages of the census based collection are that it represents the current research profile of the university and encourages universities to strategically recruit researchers to contribute to that profile.
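
The difference between the two selection rules can be sketched in a few lines. This is a minimal illustration under stated assumptions: the data structures, researcher names and universities are invented, and real collections apply many more eligibility rules than shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorship:
    researcher: str
    byline: tuple  # universities the author listed on this publication


@dataclass(frozen=True)
class Publication:
    title: str
    authorships: tuple


def address_based(pubs, university):
    """Publications that name the university in any author's byline."""
    return [p.title for p in pubs
            if any(university in a.byline for a in p.authorships)]


def census_based(pubs, university, staff_at_census):
    """Publications with at least one author employed by the university
    at the census date, whatever the byline says."""
    return [p.title for p in pubs
            if any(staff_at_census.get(a.researcher) == university
                   for a in p.authorships)]


# Hypothetical case: Lee wrote Paper 1 at Uni X, then moved to Uni Y
# before the census date and co-wrote Paper 2 with Kaur of Uni X.
pubs = [
    Publication("Paper 1", (Authorship("Lee", ("Uni X",)),)),
    Publication("Paper 2", (Authorship("Lee", ("Uni Y",)),
                            Authorship("Kaur", ("Uni X",)))),
]
staff_at_census = {"Lee": "Uni Y", "Kaur": "Uni X"}

print(address_based(pubs, "Uni X"))                  # ['Paper 1', 'Paper 2']
print(census_based(pubs, "Uni X", staff_at_census))  # ['Paper 2']
print(census_based(pubs, "Uni Y", staff_at_census))  # ['Paper 1', 'Paper 2']
```

Note that the census rule needs a staff list (`staff_at_census`) that only the university can reliably supply, which is why a third party cannot easily run this kind of collection.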

In the case of a new “combined collection” for research publications data, it is not immediately clear whether it would be based on the ‘address’ or the ‘census’ approach. Each has its pros and cons and each is useful for a different purpose. However, I would argue that the ‘census’ based approach is more appropriate in this case for the following reasons:

1.       It allows universities to demonstrate a current research profile based on researchers who are actually working at the university rather than a retrospective profile where researchers may well have left the university since the evaluation

2.       It allows universities to respond strategically to changes in the research landscape and funding environment, for example by recruiting researchers to complement or strengthen their existing research profile

3.       It trusts universities with the responsibility of presenting their research in a meaningful way based on their own knowledge of their researchers and their research rather than leaving it up to a third party or a generic business process relying on accurately recorded address data

4.       It aligns with the national uptake of a universal researcher ID in Australia, such as ORCID

Conversely, I would argue that using an ‘address’ based collection for a research evaluation may result in what I call ‘phantom’ units of evaluation. A publication can have multiple authors and therefore multiple universities listed in the byline, and each author may also have multiple bylines. As a result, each author of each publication can potentially contribute to multiple university research evaluations. A ‘phantom’ evaluation occurs where a university appears to meet the minimum volume of publication output (for ERA this is 50 publications to trigger an evaluation) based only on the fact that the university’s name appears at least once on 50 publications. However, while the byline appears on the publication, the author may not actually work at the university: for example, the author may have since left the university, or may have multiple bylines that include other university affiliations in addition to the one they work at. The ‘address’ based collection would also potentially disadvantage universities that have strategically invested in recruiting new researchers. In this case, while the new university is paying the salary of the researchers, those researchers’ publications would be contributing to another university’s research evaluation based on their previous bylines.
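
The ‘phantom’ condition can be stated as a simple check, sketched below. The threshold of 50 comes from the text above; the function names and the example counts are hypothetical and only illustrate the comparison.

```python
# Minimum publication volume that triggers an ERA evaluation, as quoted in the text.
ERA_THRESHOLD = 50


def triggers_evaluation(publication_count, threshold=ERA_THRESHOLD):
    """True when the publication count reaches the minimum volume."""
    return publication_count >= threshold


def is_phantom(address_count, census_count, threshold=ERA_THRESHOLD):
    """A unit is a 'phantom' when bylines alone reach the threshold
    but the publications of currently employed staff do not."""
    return (triggers_evaluation(address_count, threshold)
            and not triggers_evaluation(census_count, threshold))


# Invented counts: 52 publications carry the university's byline,
# but only 31 were written by researchers still employed there.
print(is_phantom(address_count=52, census_count=31))  # True
print(is_phantom(address_count=52, census_count=50))  # False
```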


  1. A good analysis, but only on publications. There's so much more on which effort and impact could be based. I think the universal adoption of ORCID should take place to cope with name variations and changes. When that is done, why not make the researcher the unit of analysis and build from that to an organisational profile? Yes, it's a much smaller unit of analysis, but when you hear the plaints of an organisation being unfairly ranked would that be so much worse?

  2. Hi John - thanks for the comment!

    I agree with you. Once ORCID is adopted it will increase the utility of the data. It should decrease the number of times a researcher has to list their publication record - for example in grant applications, resumes and home pages. It should make the research more visible and easier to discover.

    The researcher as a 'unit of analysis' (what a way to refer to a researcher!) is an interesting idea too - it would allow for aggregation at the organisation level but also at the discipline level locally, nationally and internationally. I do wonder whether it would create an unhelpful competition between individual researchers though when we are trying to encourage cooperation?

    And yes, you are right, publications are only one part of the puzzle - let's see what happens if they start evaluating research impact only on patents and commercialisation income.