BIG PICTURE QUESTIONS
What’s the big idea?
In English, there are approximately one million words. If everyone uses the same word, the value of that particular word goes down. Why? Because the human brain is wired to pay attention to new information and ignore the old. We stop noticing the same, tired word. So, if you use the same word a lot, or a word that is used by a lot of other organizations, people will notice it less than one they don’t see very often.
Translation: If you feel like you’re not being noticed as much as you’d like, make sure you’re using words that other organizations aren’t using. That way, more people will notice you, engage with you and support you.
But how is a person like you to know if a word is used a lot by nonprofits or not? We wondered the exact same thing and decided to do some research to find out.
What did we research, exactly?
We selected 2,503 nonprofit organizations at random. They represent a wide variety of sub-sectors, budget sizes, and geographic regions.
Those organizations used a grand total of 15,469,368 (ish) words on their websites. We analyzed all those words. We took out function words like ‘a’, ‘the’, and ‘but’ since we have to use those words for a sentence to (you guessed it) function. That way we could look exclusively at content words and see what the most frequently used adjectives, adverbs, nouns and verbs were. What we learned is that nonprofits are using only a fraction of the words available to them.
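For the curious, here is a minimal sketch of what "take out the function words and count what's left" can look like in Python. The function-word list and the sample sentence are made up for illustration; the actual list and tooling used in the research are not described here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# this sketch; a real list would be much longer).
FUNCTION_WORDS = {"a", "an", "the", "but", "and", "or", "of", "to", "in", "for"}

def content_word_counts(text):
    """Count the words in `text` after dropping function words."""
    # Lowercase first so "The" and "the" are treated as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in FUNCTION_WORDS)

# Made-up sample sentence.
counts = content_word_counts(
    "We empower the community, and the community empowers us."
)
print(counts.most_common(3))
```

Running the same count over every page of every website, then adding the counters together, would give an overall frequency list of content words.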
What do these findings mean for nonprofits?
That there are a whole lot of unused words out there just waiting for you to use them!
Out of the one million words, only a tiny fraction are actually being used. Think of the opportunity that presents for you. Our brains love novelty. It makes them light up and think happy thoughts. It triggers a chemical reward that makes someone want to learn more about whatever the neat, new thing is. That thing could be you, your work, your mission. And it all starts with a word.
MORE DETAILED/RESEARCH-Y QUESTIONS
What is that pie chart saying exactly?
Ah yes, the pie chart. It shows the contribution each sub-sector is making to the prevalence of this word.
Drilling down a bit, it’s a weighted percentage. Some sub-sectors have more organizations than others, and we didn’t want their size alone to make it look like they were using a word more. If every sub-sector is using a word about as much as every other, our chart will show all the pieces as equal. If we didn’t weight the percentages, the chart would simply mimic the share of websites in each sub-sector.
Another way of thinking about it is to ask: “percent of what?” It is the percent impact that sub-sector had on the total frequency for that word.
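If it helps to see the arithmetic, here is one way the weighting could work, sketched in Python. It assumes that "weighting" means dividing each sub-sector's count for a word by the number of websites in that sub-sector before taking percentages. That is our reading for illustration, not necessarily the exact formula behind the chart, and the sub-sector names and numbers are invented.

```python
def weighted_shares(word_counts, site_counts):
    """Percent share of a word per sub-sector, adjusted for sub-sector size.

    word_counts: times the word appeared in each sub-sector
    site_counts: number of websites sampled in each sub-sector
    """
    # Per-website rate, so a big sub-sector does not win on size alone.
    rates = {s: word_counts[s] / site_counts[s] for s in word_counts}
    total = sum(rates.values())
    if total == 0:
        # The word never appeared, so every slice is zero.
        return {s: 0.0 for s in rates}
    return {s: 100 * r / total for s, r in rates.items()}

# Invented numbers: every sub-sector uses the word at the same rate
# (0.5 times per website), but the sub-sectors differ in size.
word_counts = {"arts": 50, "health": 200, "education": 100}
site_counts = {"arts": 100, "health": 400, "education": 200}

shares = weighted_shares(word_counts, site_counts)
print(shares)
```

With these numbers each sub-sector gets an equal third of the pie, even though "health" accounts for more than half of the raw count. Without the weighting, the slices would just mirror the number of websites in each sub-sector.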
What should I do with the information in the pie chart?
It depends on whether the word is one The Wordifier tells you to stop using, use with caution, or use all you want.
If a word is used so much that The Wordifier tells you to “Stop!”, it probably doesn’t make much difference what sub-sector you are in.
The pie chart will be most helpful in dealing with words that should be used with caution. If a word in this category is also used predominantly by your sub-sector, you should treat it as a “Stop!” word. If it isn’t used much in your sub-sector, you can feel more comfortable using it. If it is used heavily by a different sub-sector, it may even be intriguing if you use it. Consider, for example, how novel the word choice “spiritual” feels to you when describing a program for a church vs. for an art museum.
If a word shows up in the green, go ahead and use it (with common sense; again, see caveats). Even if your sub-sector uses it more than others do, it isn’t used heavily overall. Also, you should know that the percentages are less reliable for green words. Since we didn’t find those words as often, the sample size for those words is smaller. This means that the particular breakdown in our sample may be due to a fluke rather than a real trend throughout the sector. Mostly, we just find the sub-sector breakdowns for green words interesting and amusing. If you find one that makes you chuckle, let us know.
Why did you decide to look at 2,503 websites? What’s magic about that number?
We currently have 2,503 URLs, but some of them resisted our web crawler. We are working on a protocol for cleaning the list to make sure the total represents websites that actually handed over their text. (For example, a website that returns “forbidden” isn’t actually handing over its text, but the response isn’t empty, which makes excluding it less than straightforward.) We will also be adding to the URL list before publishing the paper we are working on, to ensure we achieve our target confidence interval.
How did you pull the sample? Is it really, truly random?
We started with a truly random sample from the IRS list at http://apps.irs.gov/app/eos/forwardToPub78Download.do (downloaded and sorted using Excel’s RAND() function, for those of you who are curious).
From there, it’s no longer truly random. We looked for each organization’s website by searching for its name. Obviously, any nonprofit without a website isn’t included, but neither is any whose website can’t be readily found using a search engine. We used town and state names to distinguish between multiple organizations with the same name, but not every nonprofit uses the same address as its registered agent. We also excluded foreign-language websites (those with no English text; some foreign language was OK). All of this biases our sample in favor of organizations that are trying harder to get their message out to a broad US audience.
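For anyone who wants to try the random-ordering step outside of Excel, here is a rough Python equivalent of "give every row a random number, sort by it, and take the top of the list." The organization names are placeholders, and the seed is only there so the example can be repeated; none of this is the actual spreadsheet we used.

```python
import random

def random_sample(names, k, seed=None):
    """Pick k names by sorting on a random key, like a RAND() column."""
    rng = random.Random(seed)
    # Pair each name with its own random number, as RAND() does per row.
    keyed = [(rng.random(), name) for name in names]
    keyed.sort()
    return [name for _, name in keyed[:k]]

# Placeholder names standing in for rows of the downloaded list.
orgs = [f"Org {i}" for i in range(1, 11)]

sample = random_sample(orgs, 3, seed=42)
print(sample)
```

Every organization on the list has the same chance of landing in the first k rows, which is what makes this first step random. The later steps, such as finding each website by name, are where the sample stops being truly random.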
What if I can’t avoid a red word?
It can be unavoidable, and in a few cases we even recommend it. (See caveats.) Just make sure you surround any red, overused words with lots of interesting, green words.