Knowledge traditions and the use of evaluations
Written by Øyvind Eggen, head of the Rainforest Foundation Norway. Eggen has previously worked as Policy Director in the Evaluation Department at Norad.
When I started working as Policy Director in the Evaluation Department in 2013, I had a very simple theory of change: My job was to ensure the best possible evaluations, which would in turn lead to better utilisation of evaluations, and eventually better aid. When I left Norad a few years later, I had learned that the equation is more complicated. Here are some of the lessons.
Quality not enough
I soon learned that high-quality evaluations alone do not ensure better aid. The Evaluation Department knows this well and works systematically to ensure that evaluations are integrated into knowledge management and quality assurance systems in aid management. This has undoubtedly helped increase the use of evaluations.
Still, investing in better use without parallel investments in high quality is futile. The expression ‘garbage in, garbage out’ also applies to aid evaluations: Poor evaluations are at best useless, at worst destructive.
I also learned that ensuring high quality was surprisingly demanding, even with highly qualified evaluators. One reason was that some had not understood that the days are over when one could get away with evaluation reports based on general impressions from document reviews and field trips combined with personal judgment, rather than established evaluation methods. Another was that communication about the quality of evaluations is very limited when it takes place primarily via formal, written channels at formal checkpoints.
But I also learned that it is not enough that an evaluation is good. In my experience, there was little correlation between the quality of evaluations and their uptake in aid management. Even among high-quality evaluations, some influence aid management while others seem to bounce off.
Immune to evaluations
In some cases, this can be explained by some aid programs being almost immune to evaluations. They remain unchanged regardless of evaluation results, often for political (‘we cannot change that, it has been politically decided’) or organisational (‘this is the way we do it here’) reasons. Paradoxically, it sometimes seemed that politically prioritized areas within aid were less open to new knowledge, perhaps because those in charge were more keen to adapt to political dynamics than to improve aid, while aid programs below the radar of the political leadership had more space to learn and improve.
In other cases, I met real willingness to change, yet making use of new knowledge was still difficult. There are several reasons for that, but my own experience disseminating academic research to aid managers pointed me to one explanation.
Interesting, but useless?
I came to Norad from a position as an academic researcher, where I had specialized in the effects of aid. Giving a short lecture to my new colleagues, I realized that although they were very enthusiastic about my research, they did not seem to find it useful.
One problem was that I could not give an answer to their question number one: Does aid work? Even after three years of research on one particular case – the effects of aid on formal and informal state institutions (governance) in Malawi – I did not have a conclusive answer. Not because my research was unsuccessful, but because social science simply does not have methods that can be used to provide clear answers about causal relationships in the complex social and political dynamics that ‘governance’ entails.
Societal change never goes as planned
I could, however, tell a great deal about the different ways aid is likely to affect governance, both positively and negatively. This is insight that can help increase the chances of aid doing more good and less harm, and I gave some practical advice on how that can be done. But even that advice appeared to be of little use.
One of my main messages was that aid brings with it a whole range of effects, positive and negative, that are impossible to predict during the planning phase. In order to improve aid, I claimed, donors must free themselves from their plans, pay particular attention to anything that happens that they had not foreseen, and respond quickly to it.
This is of course not new to anybody with experience from development cooperation. The problem is that the aid management apparatus is not configured to use such knowledge. Its mission is, in an idealized version, to develop strategies, theories of change and results frameworks that enable the transformation of policy intentions (in this case: better governance) into aid programs, which in turn lead to intended results (impact), preferably within a period of five years or so. This is managed through established multi-annual programs with pre-determined indicators.
Those indicators pose one of the practical problems in applying knowledge, as they limit the ability to observe and respond to developments that were not foreseen from the start. Governing through pre-determined indicators makes aid management well able to manage according to plan, but poorly equipped to respond to what actually happens.
At a deeper level, aid management meets a less tangible, but no less important, obstacle to responding to change that does not go as planned. For aid management as we know it to be meaningful, it must assume a logic that makes it possible to say that “if we do A, it will cause effect B”, and afterwards to verify whether that has happened. If it is verified – through documentation that it has worked – the intervention is repeated and preferably scaled up and replicated elsewhere. Continuous testing of which interventions work will then gradually increase the effectiveness of aid.
No pool table
This is a logical recipe for more effective aid, but it is not necessarily in line with the insights from academic research – or practical experience. Social change, which is basically what development is about, does not necessarily work in a way that makes such equations useful. Society does not work like some kind of advanced pool table where you can calculate in advance the effect of a specific ‘push’ (aid) on societal change (development). Moreover, even if you do succeed once with a specific ‘push’, that tells you little about whether it will have the same effect next time, or somewhere else.
Of course, nobody in aid management believes that societal change works like a pool table. Neither does anyone believe that development is so random that it makes no sense at all to think about social change in this way. Everyone knows that real life lies somewhere in between. My point here is not how society works, but that aid management is configured in a way that makes it better able to accommodate some ways of looking at societal change than others.
Knowledge that builds on the implicit presumption that social change (development) can be predicted and influenced in a planned manner, so that the effects of aid can be foreseen – knowledge where the pool table may be a relevant metaphor – stands a better chance of being accommodated in aid management than knowledge that rejects such logic.
This is of course not because aid managers tend to believe in one particular way of looking at societal change, but because their most basic tools – such as the logframe and its successors, most versions of theories of change, and predetermined indicators as the dominant monitoring tool – become meaningless if that premise does not hold.
This may explain why my colleagues in Norad had mixed reactions to my academic research: Even if they believed I was right about what I said about aid and governance, they could not use that insight, as it could not be incorporated into the tools they had available as aid managers. It is not a question of which knowledge tradition is right about development, but of which can be used in aid management.
Evaluations can accept or reject the underlying logic
Here, evaluations play out in two diametrically different ways. Many evaluations build on a logic that sees aid as part of predictable, planned processes of change with identifiable causal relationships. Such evaluations primarily aim at testing the original assumptions (theory of change) by verifying whether intervention A led to effect B, and suggesting adjustments if not. This will over time lead to improvement not only of those interventions but – assuming that the aid management apparatus is able to incorporate evaluation results and learn – of aid effectiveness in general.
Other evaluations do not relate to this logic, or even challenge it. They are often less focused on verifying whether aid works and more on understanding how it works. The underlying logic is that learning whether one particular intervention has worked is of little use if processes of change are so complex that the same recipe cannot necessarily be replicated elsewhere. Learning more about how aid works may, however, help practitioners respond better to what actually happens, even when they had planned for something else.
The well-known tension between the more rigorous, often quantitatively oriented impact evaluations and the more explorative, normally qualitative evaluations can be understood in this light. It is not primarily a question of disciplines or evaluation methods, but of radically different ways of understanding societal change.
Those tensions reflect real dilemmas and not just a clash of paradigms: Both approaches are valid, useful and necessary ways of understanding change, but in radically different and seemingly incompatible ways.
Plan for better use of both knowledge traditions
The challenge when managing evaluations is to make use of both knowledge traditions, as there is no doubt that both are necessary and can be useful, if planned for well.
I think the key is to reflect earlier on the choice between accepting or rejecting the underlying logic behind an aid strategy, programme or intervention. By the time a Terms of Reference is being drafted, it is already late – and it is certainly too late if the choice is made when processing proposals from different evaluation teams representing radically different knowledge traditions.
One key issue to consider is the scope for utilizing an evaluation that challenges the underlying logic. If the general approach and design of a program are already given, with low likelihood of major changes, an evaluation that challenges the underlying premises may lead nowhere and be wasted. But one that serves to test and improve the interventions applied, within the given parameters, may be enormously useful by improving effectiveness.
If, on the other hand, there is a real possibility of making more radical changes to a program or a strategy, an evaluation that rejects the underlying premises may be the most useful. Not necessarily because it will make those in charge adopt the evaluators’ recommendations: even if they do not, they will be pushed to reflect once more on the basic, often implicit postulates behind a strategy and, in the end, make a more informed decision.
What is most useful also depends heavily on the nature of the programme and the context. For example, work in stable countries requires a completely different way of thinking about change than work in fragile states. And the possibility of replicating a successful intervention is very different depending on whether you work with biomedical health interventions or with civil society advocacy.
Such considerations, made in the early stages of planning evaluations, may lead to more constructive use of the existing tension between different knowledge traditions in aid management – in evaluations, most directly manifested in the choice between the more explorative and the more rigorous designs.
However, many evaluation agencies do not consider this when planning each evaluation, some because they have already positioned themselves with regard to a preferred type of evaluation, others because they want to remain ‘neutral’ when it comes to disciplines and methods. In practice, they delegate one of the most important questions regarding the use of evaluations more or less to chance (by leaving it open in a tender), or perhaps to the individual preference (and educational background) of the officer in charge. That means a lost opportunity to make that choice subject to strategic considerations with a view to maximizing the usefulness of an evaluation.