Making sense of 'self-describe' option in a race/ethnicity question

Hi there, I work for a non-profit and we sometimes gather demographic data to understand who is showing up in programs and whether certain groups experience programs in different ways.

When we ask community members about their race/ethnicity, we offer the following categories, including a ‘prefer to self-describe’ option.

How do you describe your racial or ethnic group? Check all that apply:

  • Arab or Middle Eastern
  • Black
  • East Asian
  • Hispanic or Latinx
  • South Asian
  • Southeast Asian
  • West Asian
  • White
  • Do you prefer to self-identify? If so, how would you describe yourself? _________________________
  • Prefer not to answer

If the sample of certain races/ethnicities is very small (and we have little confidence in its generalizability), we combine categories to explore the experiences of Racialized, Indigenous and White participants.

The problem: sometimes respondents self-identify in ways that make it impossible to know if they are racialized, Indigenous, or white (e.g. if they state they are South African or Canadian).
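To make the combining step and its failure point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `COMBINED_GROUP` mapping, the `combine` function, and the "Unclassified" and "No answer" labels are hypothetical and not part of the survey above. The list also has no Indigenous option as written, so none appears in the mapping.

```python
# Hypothetical mapping from survey options to combined reporting groups.
# These assignments are placeholders an organization would set for itself.
COMBINED_GROUP = {
    "Arab or Middle Eastern": "Racialized",
    "Black": "Racialized",
    "East Asian": "Racialized",
    "Hispanic or Latinx": "Racialized",
    "South Asian": "Racialized",
    "Southeast Asian": "Racialized",
    "West Asian": "Racialized",
    "White": "White",
    "Prefer not to answer": "No answer",
}


def combine(selected, self_description=""):
    """Return the set of combined groups for one check-all-that-apply response."""
    groups = {COMBINED_GROUP[option] for option in selected}
    if self_description.strip():
        # A write-in such as "Canadian" cannot be mapped without guessing,
        # so it is kept apart instead of being forced into a group.
        groups.add("Unclassified")
    if not groups:
        groups.add("No answer")
    return groups
```

In this sketch, a respondent who only writes in "South African" ends up as "Unclassified", which is exactly the case the question is about.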

Does anyone have a suggestion for overcoming this challenge?

We contemplated asking a follow-up question: Do you identify as a person of colour/a racialized person (e.g., someone who is a race other than white or who is multiracial/mixed race)?

  • Yes
  • No
  • Prefer not to answer

Where I get stuck is that respondents sometimes select a race/ethnicity that I would assume is racialized, but they don’t identify as racialized themselves, and I’m left unsure whether to count them as racialized or not.

I’d LOVE to hear your thoughts!


Thanks for sharing this challenge! I’m really interested in how you all have chosen to combine categories in those instances.

I’m wondering if you could keep those who prefer to self-identify as a separate category, since it seems like they don’t want to be grouped in the ways the question already offers. And/or you could modify the follow-up question to be more explicit about why you’re asking, for example: "When we only have a few responses in some identity groups, we combine the above responses into Racialized, Indigenous, and White (maybe offering definitions of these?) to better understand who our programs are reaching. Do you identify with any of those categories?

  • Indigenous
  • Racialized
  • White
  • None of these options"

I’m curious to hear how others respond too!


Great discussion on a huge and very common issue @melyule :grinning:

I think along the same lines as what @adellemcd was outlining.

For me, how we ask the question and which categories we use always comes down to the specific question we’re trying to answer. What are we actually trying to measure? From there, we choose the categories and response options that will maximize our understanding of that.

It sounds like you’re collecting this data to disaggregate program feedback and see whether certain groups of people experience your program in similar ways. (Is that right?) Given that, and the reality that you have small numbers in many groups, I’d consider collecting the data using three separate binary questions. As mentioned above, you’ll need a brief sentence or two explaining why you’re asking. Using three separate questions means you’re not aggregating folks into categories they don’t want to be in. It also gives you more accurate data and larger sample sizes, since the details you’re interested in are not mutually exclusive.

Q1: Do you consider yourself racialized? Y/N
Q2: Are you Indigenous? Y/N (or a variation, depending on what aspect of Indigeneity you’re trying to capture: identity, tribal enrollment, etc.)
Q3: Do you consider yourself white, or white and at least one other race? Y/N
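
As a rough illustration of why non-exclusive questions give larger groups, here is a small Python sketch that tallies "Yes" answers per question. The `responses` data and the `count_yes` helper are made up for the example; they are not real survey results.

```python
from collections import Counter

# Hypothetical answers to Q1-Q3: one dict per respondent, True means "Yes".
responses = [
    {"racialized": True, "indigenous": False, "white": False},
    {"racialized": True, "indigenous": True, "white": False},
    {"racialized": False, "indigenous": False, "white": True},
    {"racialized": True, "indigenous": False, "white": True},
]


def count_yes(rows):
    """Count "Yes" answers per question; one respondent can count in several."""
    counts = Counter()
    for row in rows:
        for question, answer in row.items():
            if answer:
                counts[question] += 1
    return counts


counts = count_yes(responses)
print(dict(counts))
```

Because a respondent can answer "Yes" to more than one question, the per-question counts can add up to more than the number of respondents. Each group is analyzed on its own and nobody is forced into a single box.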

Keep us updated with your decisions, I’d love to hear.


Thanks, @adellemcd and @Heather for your thoughts on this topic.

Heather, you are SO right. I should have specified how I hope to use the data because otherwise, what’s the point?

Your hunch is correct: I use the race/ethnicity data (and often gender, age, and other demographic data) to disaggregate program outcomes or organizational impacts and see whether different groups experience different outcomes from their participation. This helps program staff with their program design, outreach, and equity and inclusion efforts.

For that purpose, I like your two suggestions: 1) being clear about why we’re asking, and 2) using those three closed-ended questions that will get us the information we need.

In some cases, the partner organizations I support with evaluation tool design may want to include other questions to ask whether participants identify as a specific race/ethnicity they see in larger numbers in their program, but that feels easy enough.

Thanks again for your thoughts! I’ll reflect some more and let you know what we choose in the end and how it goes :slight_smile:


I love the triple-binary response option that was proposed. While it isn’t quite right for most of my needs, I wanted to share many thanks for Adelle’s so-obvious-I-missed-it reply!

I’m moving my org away from a completely open-ended comment box for race/ethnicity, since I’m very much not comfortable putting people in boxes when they share challenging answers like nationalities. Instead, the write-in responses will be tallied as “identified differently” (or something like that) and shared in an appendix or other summary method.

I am also interested in hearing how the binary format (and other options) works for Melissa’s and others’ teams!
Cheers, Chris