Right-Fit Data for Your Ministry: A Synopsis of The Goldilocks Challenge

Nathan Mayo
Vice President of Operations & Programs

The Purpose of the Book

The authors of The Goldilocks Challenge observe that many nonprofits measure too much, too little, or the wrong things. Thus, they want to help nonprofit leaders find the amount of data and analysis that is “just right” for their organization.

The Perspective

This book is written from a distinctly Christian perspective and is directed at the local church, with concepts easily applied by nonprofits, as well. This book also takes a global perspective on poverty but does not neglect the manifestations of poverty at home. The authors assert that it is unacceptable to do nothing and equally unacceptable to just do anything. “We do not necessarily need to feel guilty about our wealth. But we do need to get up every morning with a deep sense that something is terribly wrong with the world and strive and yearn to do something about it.”

The Key Points

Monitoring is distinct from evaluation.

“Monitoring and Evaluation” (M&E) is the common phrase for the duties of professional nonprofit data analysts, but the two words aren’t synonyms. To understand the distinction, it helps to understand the problem each one solves.

Let’s imagine you launch a community mentorship program to help keep at-risk youth in school. After one year, the rate of school dropouts remains unchanged. It seems something hasn’t worked as planned. But did it fail because the concept was flawed—or was it poorly implemented?

“Monitoring” addresses execution. It asks questions like, “Did we do what we think we did? Did kids really get paired with a mentor? Did the mentors meet the required qualifications? Did they meet with the young people as planned?”

“Evaluation” addresses whether the idea worked and to what degree. Even though it seems like the mentorship program failed, there’s still a lot to learn from closer evaluation of the outcomes. Questions to consider include: “What positive or negative impact did young people report in the surveys? Did they perceive the mentorship as helpful? Were there other measurable changes in indicators we care about? Do we have a counterfactual from kids who weren’t paired with mentors (i.e., a reliable assessment of what would have happened if the program didn’t exist)? Did they do worse?”

For new programs, a focus on monitoring should come first, with impact evaluation happening later in their development.
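
The monitoring questions above can be turned into a few simple counts. The following is a minimal sketch in Python, not something taken from the book: the Participant record, its field names, and the sample roster are all hypothetical, invented only to show how "did we do what we think we did?" might be tallied for the mentorship example.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """One young person enrolled in the hypothetical mentorship program."""
    name: str
    has_mentor: bool         # Did the pairing actually happen?
    mentor_qualified: bool   # Did the mentor meet the required qualifications?
    sessions_planned: int    # Meetings the program intended to hold
    sessions_held: int       # Meetings that actually took place


def monitoring_summary(participants):
    """Return simple execution indicators as fractions between 0 and 1."""
    total = len(participants)
    paired = [p for p in participants if p.has_mentor]
    qualified = [p for p in paired if p.mentor_qualified]
    planned = sum(p.sessions_planned for p in paired)
    held = sum(p.sessions_held for p in paired)
    return {
        "paired_rate": len(paired) / total if total else 0.0,
        "qualified_rate": len(qualified) / len(paired) if paired else 0.0,
        "session_completion": held / planned if planned else 0.0,
    }


# Hypothetical roster, for illustration only.
roster = [
    Participant("A", True, True, 10, 8),
    Participant("B", True, False, 10, 4),
    Participant("C", False, False, 10, 0),
    Participant("D", True, True, 10, 9),
]

summary = monitoring_summary(roster)
for indicator, value in summary.items():
    print(f"{indicator}: {value:.0%}")
```

If numbers like these come back low, the first question is about execution, not about whether mentorship works as a concept.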

Collect the right data using CART principles.

For both monitoring and evaluation, not all data is created equal. The following set of principles, identified by the acronym “CART,” will help you collect the right information.

Credible: Can you trust the data?

Not every number accurately reflects reality. Data must be valid, meaning it measures what we think we’re measuring. It must also be reliable, meaning it measures the same information in the same way with consistent results. Threats to both include unclear survey questions, double-barreled questions (e.g., “Rate quality and convenience on a scale of 1-5”), a long recall period, and social desirability bias (i.e., respondents shading their answers on issues or beliefs they would rather not disclose).

Once you have good data, accurate analysis is essential to ensure trustworthy results.

Actionable: Will you make any practical changes based on the data?

Gugerty and Karlan challenge their readers to “collect the data you can commit to use.” That means everything collected must lead to clear actions we have the will and resources to implement. In other words, we should intend to do something specific with the information we’re collecting. Otherwise, there is no point in the effort required to collect it, which leads to the next principle…

Responsible: Is the benefit of the data worth the time, effort, and expense involved in collecting it?

An important way to ensure collection of the right kind of data is to map the logic model of the program in substantial detail. The next step is to identify decision points where data collection can help determine needed action. Not only is more data not always better, it’s often worse: it can obscure more useful information and eat into the limited time we have for data entry and processing.

Transportable: Can the data help answer questions outside the immediate program or organization it was collected for?

While this principle is most applicable to large programs in a position to publish results from academic impact studies, it also serves as a useful consideration for smaller programs. Is our information easy to share within our organization—and with those outside it?

Outcomes do not equal impact.

In the M&E framing, “outcomes” and “impact” are sequential but distinct steps of “evaluation.”

Outcomes are measured results that follow our programs. Measuring them is far more significant than measuring a program’s “outputs.” However, as inconvenient as this fact is, the authors unflinchingly assert that a change in participants’ lives doesn’t necessarily mean we can attribute that change to our program.

Outcomes measurement can prove nothing changed, but it cannot prove our program caused a change. In support of that claim, the authors make a compelling case for why factors like self-selection and “regression to the mean” bring about recipients’ life change in ways not truly caused by the program.
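
Regression to the mean is easier to see in a small simulation. The sketch below is an illustration written for this synopsis, not an example from the book, and every number in it is an assumption: each simulated person has a stable underlying level plus day-to-day noise, and we "enroll" only those who score low at baseline. No program effect exists anywhere in the code.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def simulate(n_people=10_000, cutoff=40.0):
    """Measure everyone twice, with NO program effect in between.

    Only people who score below the cutoff at baseline are "enrolled",
    mimicking a program that recruits people at a low point.
    Returns the enrolled group's average score at baseline and follow-up.
    """
    baseline_scores, followup_scores = [], []
    for _ in range(n_people):
        true_level = rng.gauss(50, 10)             # stable underlying level
        baseline = true_level + rng.gauss(0, 10)   # noisy first measurement
        followup = true_level + rng.gauss(0, 10)   # noisy second measurement
        if baseline < cutoff:
            baseline_scores.append(baseline)
            followup_scores.append(followup)
    return mean(baseline_scores), mean(followup_scores)


baseline_mean, followup_mean = simulate()
apparent_gain = followup_mean - baseline_mean

print(f"Enrolled group at baseline:  {baseline_mean:.1f}")
print(f"Enrolled group at follow-up: {followup_mean:.1f}")
print(f"Apparent gain with no program at all: {apparent_gain:.1f}")
```

The enrolled group improves on average simply because people selected on an unusually bad day tend to look more typical the next time they are measured. A before-and-after comparison alone would credit that gain to the program.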

Impact is the fraction of our outcomes actually caused by our programs (e.g., if graduates of a program boost their income by $500 a month on average, it may be that only $250 of that boost is a direct impact of the program). The only way to reliably determine impact is with some kind of “counterfactual.” To accomplish this, the authors discuss the “Randomized Controlled Trial,” popular in medicine but with many limitations in the context of small nonprofits. Beyond this approach, they also detail a number of simpler “quasi-experimental” ways to establish a counterfactual.
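
The arithmetic behind the $500 versus $250 example can be sketched in a few lines of Python. This is one simple quasi-experimental approach (comparing the change among graduates with the change in a comparison group), chosen here for illustration and not presented as the authors' preferred method. The income figures and the comparison group are hypothetical, and the estimate is only as good as the assumption that the comparison group resembles the graduates.

```python
from statistics import mean


def average_change(before, after):
    """Average per-person change between two measurements."""
    return mean(a - b for b, a in zip(before, after))


def estimated_impact(part_before, part_after, comp_before, comp_after):
    """Return (outcome, counterfactual, impact).

    outcome:        average change among program graduates
    counterfactual: average change in the comparison group, standing in
                    for what would have happened without the program
    impact:         the part of the outcome not explained by the counterfactual
    """
    outcome = average_change(part_before, part_after)
    counterfactual = average_change(comp_before, comp_after)
    return outcome, counterfactual, outcome - counterfactual


# Hypothetical monthly incomes in dollars, for illustration only.
graduates_before = [1500, 1800, 2100]
graduates_after = [2000, 2300, 2600]     # each graduate gains $500
comparison_before = [1600, 1700, 2000]
comparison_after = [1850, 1950, 2250]    # similar non-participants gain $250

outcome, counterfactual, impact = estimated_impact(
    graduates_before, graduates_after, comparison_before, comparison_after
)
print(f"Outcome (graduates' average gain): ${outcome}")
print(f"Counterfactual (comparison gain):  ${counterfactual}")
print(f"Estimated impact:                  ${impact}")
```

Here the graduates' $500 average gain is the outcome, while the estimated impact is the $250 left over once the comparison group's gain is subtracted.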

Details We Love

This book dives deeper into measurement and evaluation than most readers likely have. In doing so, it emphasizes the value of a good theory of change and of thoughtful execution. It is a positive and much-appreciated affirmation that limited data, appropriately collected and thoughtfully analyzed, will lead to better results than clawing one’s way through reams of indiscriminately assembled spreadsheets.

It also demolishes the common notion that measuring the quantity of stuff given away is a good measure of impact. But rather than dwell on that simple notion, it takes us boldly into a realm of substantive measurement that few nonprofits ever venture into.

Considerations

Although the authors maintain an encouraging tone, the book can leave readers with the impression that measuring meaningful impact is extremely difficult. While their comprehensive advice is best implemented by those solely dedicated to measurement, nonprofits with small staffs can still extract valuable insights.

Also, readers from typical US nonprofits (which run small, highly intensive interventions like daily after-school or months-long residential programs) may feel disconnected from many of the book’s case studies, which highlight large overseas organizations with a platoon of community health workers visiting far-flung villages to provide mosquito nets.

Still, there are nuggets to be mined. Being small makes monitoring much simpler: the executive director is separated from the case manager by a hallway, not an ocean (and may even be the case manager). Furthermore, a causal link between an intensive program and an outcome is far more likely than one between a superficial program and an outcome.

The authors defend the hardline stance that without a counterfactual, any attempts to measure impact are a waste of time. While their scientific rationale is ironclad, practical alternatives are worth consideration, specifically that lower standards of evidence can still provide useful information. By analogy, in law, criminal convictions must be established “beyond a reasonable doubt,” whereas a civil judgment requires only a “preponderance of evidence” (often described as “more likely than not”).

Thus, perhaps we should strive for the higher standard of evidence for our program efficacy, but be willing to start with the lesser. Funders will prefer “preponderance of evidence” to none, and we’re more likely to learn useful things in the process of measuring at least some harbingers of impact. That said, the authors are right that cause-effect is difficult to document conclusively, and unless we’ve gone the distance to build a counterfactual, we should maintain appropriate humility about what our data can prove.

Who Should Read This?

This book is accessible to program leaders and philanthropists who want to understand nonprofit monitoring and evaluation more deeply. It will especially benefit those who already have a basic understanding of logic models and outcomes measurement but want to ensure the programs they support operate at the next level of effectiveness.

The Goldilocks Challenge can be purchased on Amazon. If you purchase the book through this link, True Charity will earn a small amount as an Amazon Associate.

This post Right-Fit Data for Your Ministry: A Synopsis of The Goldilocks Challenge first appeared on True Charity’s public site.
