A blog for fans of Bananagrams, word games, puzzles, and amazing things
Showing posts with label dictionaries. Show all posts

Wednesday, June 11, 2014

More reasons that your dictionary is inadequate

James Somers, writer of many fine, stimulating blog posts and a nice recent profile of Douglas Hofstadter (titled "The Man Who Would Teach Machines to Think"), has published a long blog post on what makes a good dictionary. He addresses the importance of a proper dictionary to a writer who wants to write well, and he closes with instructions on how to install such a dictionary on all your electronic devices.


Here is his "You’re probably using the wrong dictionary".

Sunday, December 15, 2013

The amazing ENABLE word-list project

While looking for a good word list to use for a project that I am working on, I discovered ENABLE (which stands for Enhanced North American Benchmark LExicon), a word list that seems to have been compiled mainly by Alan Beale (with some help from Mendel Cooper) in order to create a reference that can be used when playing word games. Since it is an open and freely available list, it has served as the basis for the word lists used in many games, such as Words with Friends. What distinguishes this word list from the many others out there is how thoroughly its creation has been documented in the many files in the ENABLE package and its supplemental archive.

For this reason, many of the disadvantages of the Scrabble Tournament Word List can be eliminated. For instance, as the compilers themselves note:

In contrast to other word lists, the ENABLE list has not been crippled by being limited to words under an arbitrary length. The ENABLE list is eminently suitable for most word games, such as Anagrams and Clabbers, and for crossword puzzle solving, rather than just for Scrabble. A great deal of research has gone into removing this limitation, however the list is much the better for it.
Another critique of the Scrabble Word Lists and Dictionaries is that they are carrying around many words that were in dictionaries back in the 1970s but have long since disappeared from both usage and lexicons. The ENABLE supplement includes a list of 9,768 stale words (which it defines as words that appear in the Scrabble Tournament Word List but not in modern dictionaries).

Most of these stale words (like AXAL (an obsolete form of "axial") and WHERVE ("a round piece of wood put on a spindle to receive the thread")) were words I had never heard of and therefore had no problem eliminating from the word list for my project. There were also some words that I thought needed to be retained based on being in common usage including SPELUNK/SPELUNKED/SPELUNKING (which, according to the Google Books Ngram Viewer, has been used with increasing frequency since about the 1940s) and UPSTANDING (which peaked in popularity in the 1920s, reached a local minimum around 1970, but has been on the upswing since 1990).
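The pruning I describe above boils down to a set difference with a short list of exceptions. Here is a minimal sketch in Python. The tiny word lists are made-up stand-ins for illustration (not the actual ENABLE or tournament files), and `filter_stale` is a hypothetical helper name of my own.

```python
def filter_stale(word_list, stale_words, keep=()):
    """Return word_list minus stale_words, except for words listed in keep."""
    stale = {w.upper() for w in stale_words}
    rescued = {w.upper() for w in keep}
    to_drop = stale - rescued  # stale words that nobody asked to rescue
    kept = {w.upper() for w in word_list} - to_drop
    return sorted(kept)


# Illustrative stand-in data, not the real lists.
tournament_words = ["AXAL", "AXIAL", "SPELUNK", "UPSTANDING", "WHERVE"]
stale_words = ["AXAL", "WHERVE", "SPELUNK", "UPSTANDING"]
keep = ["SPELUNK", "UPSTANDING"]  # words judged to be in common usage

project_words = filter_stale(tournament_words, stale_words, keep)
print(project_words)
```

With real data, each list would be read from a file, one word per line. The judgment call about which stale words to rescue still has to be made by a human.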

This is only a sampling of what makes ENABLE so useful. Amateur lexicographers and other interested parties can find and download the whole ENABLE package through this page.

Monday, November 7, 2011

How Scrabble dictionaries are made

A web site called Word Buff has an interview with Darryl Francis on the making of the British Scrabble tournament word lists. From his self-description, Francis sounds like a cool guy with interests in wordplay and language. He writes articles for Word Ways, a magazine of recreational linguistics (which I can recommend if you enjoy wordplay). And he cites Martin Gardner's Scientific American columns as a major influence.

Darryl Francis and Allan Simmons are the Dictionary Committee for WESPA, the World English Scrabble Players Association. They've basically been in charge of the British Scrabble word list since its inception.

The British Scrabble tournament word list (previously called SOWPODS and now apparently "CSW" as an abbreviation for Collins Official Scrabble Words) is formed by taking all the words in the most current American Scrabble tournament word list and adding in any valid words from the Chambers Dictionary and the Collins English Dictionary. Francis's interpretation of what constitutes a valid word is given in his response to a question about whether he would ever exclude words that satisfy all the rules:
Let's go back to a group of dictionary entries I mentioned earlier - the internet domain names for countries. There's around 200 of these, running from AC (Ascension Island) to ZW (Zimbabwe). They appear to satisfy the criteria for acceptability of words.

They're not dictionary-listed with an initial capital letter, nor a hyphen nor an apostrophe. They're not marked as abbreviations, they're not marked as foreign. On what basis should they not be allowed as two-letter words?

My answer to this question is that a) these two-letter abbreviations (called "country code top-level domains") are proper nouns, and b) they are in fact abbreviations, whether the dictionary says so or not.

Francis goes on to say:
Yet to allow a sudden influx of two-letter words, most of which are unpronounceable and not recognisable to the man in the street, would be to upset the fine balance that already exists with two-letter words.

Two-letter words are so key to the game that to double their number overnight would almost certainly provoke an outcry from Scrabble players - and probably the media, too.

I could portray this as a question of how to balance strict rule-following with common sense. Ultimately the Dictionary Committee chose not to include all those country codes, so they do use some common sense in their decisions. And they do have to make many difficult judgment calls. But it seems like they only rejected these country codes because there are so many of them and because Scrabble players would be upset by their inclusion.

To me, this demonstrates the subtle biases that have crept into the system to make official Scrabble dictionaries (unsurprisingly) give tournament Scrabble players what they want. As I understand it, what a plurality of them want is a word list that retains the words they have spent so much time memorizing, while occasionally adding handfuls of new words that increase Scrabble scores and make the game easier and more fun for them.

And this is perfectly fine, so long as these Scrabble word lists aren't misappropriated as authoritative sources for other games...


Monday, June 20, 2011

Words deleted from the new British Scrabble dictionary

One point in favor of the British approach to Scrabble dictionaries is that they appear to actually delete words from the list once they stop appearing in their current source dictionaries. Some of the deleted words are words that the sources corrected, either by capitalization (Freon), splitting into two words ("jet plane", not "jetplane"), or elimination of abbreviations ("arccos" is not a word; it's an abbreviation for "arccosine").

When I last checked in on the UK Scrabble dictionary committee, they were talking about doing away with some obscure or erroneous words in the Collins Scrabble Words list. Frequently singled out were "smoyle" (an obsolete form of the verb "smile") and "Pernod" (a brand name for a French liqueur which also appears in the American Scrabble dictionary).

While nearly 400 words have been deleted, somehow both "smoyle" and "Pernod" survived the cuts. Here are some that did not:

APFELSTRUDEL

[The Anglicized form, "apple strudel", appears to have taken over for the original German form.]

ARCCOSES

[This is supposed to be the plural of "arccos", itself a deleted word since it is merely an abbreviation for the arccosine function in trigonometry. Including "arccos" as a word is a somewhat understandable mistake, but pluralizing it as "arccoses" is fairly egregious, as no one ever writes such a thing. This is probably one of the glaring problems in the previous edition of the Collins Official Scrabble Words that caused the world Scrabble tournament people to reject it and retain their old list.]

AWESTRIKE
AWESTRIKING

[AWESTRUCK and AWESTRICKEN are apparently still fine. It turns out that no one awestrikes. The Chambers Dictionary has switched to a hyphenated form: "awe-strikes". From surveying the Internet, I'd say it's more popular to "strike awe".]

BARRACOOTA

[This is an obsolete spelling for "barracuda".]

BELLPUSH    a button used in ringing a bell

["Bellpull" is still fine.]

BRICKSHAPED

BROADMINDED    incapable of being shocked. Opposite of shockable.

CARDCASTLE

["Cardcastle" is apparently an obsolete synonym for a house of cards. The last three words have all switched to hyphenated forms.]

CARPARK    a space for parking cars

[I was a bit disappointed by this deletion until I looked up the one instance I know this phrase from (The Restaurant at the End of the Universe), and discovered that Douglas Adams also preferred writing it as two words:
"I'm in the car park," said Marvin.
"The car park?" said Zaphod, "what are you doing there?"
"Parking cars, what else does one do in a car park?"
"OK, hang in there, we'll be right down."
In one movement Zaphod leapt to his feet, threw down the phone and wrote "Hotblack Desiato" on the bill.
"Come on guys," he said, "Marvin's in the car park. Let's get on down."
"What's he doing in the car park?" asked Arthur.
"Parking cars, what else? Dum dum."
]

CHILIOI    one thousand

[Greek word meaning "thousand"; it can be singular or plural; seems to come up most often because it appears in the Book of Revelation]

CORNRENT    rent paid in corn

[Naturally.]

DEPENDACIE

[This was already a very rarely used word, meaning "submissiveness". Shakespeare used it in Antony and Cleopatra, but modern printings have substituted the word "dependency".]

EUROPEANISE
EUROPEANIZE    To cause to become like the Europeans in manners or character; to habituate or accustom to European usages.

[Someone realized that these words are almost always capitalized. On the other hand "Francization" and "Francisation", the noun forms of Francize and Francise (meaning to make something French), have just been added to the CSW.]

FLASHFORWARD

[I have a feeling that LOST fans will have something to say about this. FLASHBACK is still on the list.]

GRENZ    as in grenz rays, X-rays of long wavelength produced in a device when electrons are accelerated through 25 kilovolts or less [adj]

[Grenz rays (ultrasoft X-rays with wavelengths between 0.07 nanometers and 0.4 nanometers) were discovered by German physician Gustav Bucky. Bucky noted that the effects of this radiation on biological tissue were somewhat like ultraviolet light and somewhat like the adjacent X-ray part of the spectrum, so he called them "Grenz rays" from the German word Grenze, meaning "boundary". The term seems to have been confined to medicine and is now falling out of usage as Grenz ray therapy is giving way to other techniques.]

HAMBURGHER    a patty of ground beef

HEROE    a man revered for his bravery, courage etc, also HERO

HOWSOMEVER

[Apparently this is an archaic form of "however".]

PARAMAECIUM
PARAMOECIUM    Any of various freshwater ciliate protozoans of the genus Paramecium, usually oval and having an oral groove for feeding.

PLAYBUS    a bus with activities for children

[This seems to be a British concept. As far as I can tell, it's a bit like a bookmobile, except that rather than being a mobile library, it's more like a mobile playground with possibly some educational elements or facilities. From photos I've found, I'd define a playbus as a double-decker bus with ball pits, slides, tunnels, all with lots of padding and primary colors. "Playbus" has apparently transitioned to a capitalized form.]

POCKETPHONE

SHOTPUT

SIDESTREET

["Shot put" and "side street" are now standard.]

STOCKHORN

[This now extinct musical instrument is similar to the better known "hornpipe" and the less well known "pibgorn". It was a single-reed woodwind constructed from a sheep's shin bone and used a cow's horn for the flared part at the end that amplifies the sound. The stockhorn is the Scottish version of this instrument. It also goes by the name "stock-and-horn".]

SWONE    a fainting fit

UPSWARM    to send up in a swarm

[Now you "up-swarm" something (e.g., bees). Shakespeare used this one too, but he wasn't up-swarming bees.]

WASM    an outmoded policy

[This is apparently a portmanteau word, resulting from the combination of WAS and ISM. Or looked at differently, an outdated ISM becomes a WASM. This is one of the words that was removed because it was dropped from the Chambers dictionary (the other UK source dictionary) due to lack of usage.]

WYSIWYG    what you see is what you get, matching computer display with what will be printed (adj)

YOS

[The shortest deleted word was deemed to be an incorrect pluralization of the noun "yo", where "yo" is defined as
an expression of calling for attention
]

I find this list intriguing. It's like a graveyard for forgotten words. (Here lies "Grenz rays".)

Remember, if you want to keep your favorite words alive, you have to use them. Write books about them! Insert them gratuitously into blog comments! Or the ideas they represent may become wasms.

Wednesday, June 8, 2011

The new Scrabble words (if you use the British Scrabble dictionary)


A couple of people asked for my opinion on the "new Scrabble words", so I looked them over. The first and most important thing to point out is that these new words have been added to Collins Official Scrabble Words (CSW), effectively the Scrabble tournament dictionary for most of the world, but not for the United States, Canada, or Thailand. Since Collins Official Scrabble Words (equivalent to the "SOWPODS" word list) automatically includes the American Scrabble tournament word list, new words are only added from the British side when they are absent from the most current American list.

There were about 2800 words added to this list. I've picked out the most interesting ones to discuss:

The good

One category of additions that I found most welcome is the many new computer and Internet terms: autosave, blogosphere, inbox, linkrot, metadata, overclock, permalink, timestamp, and whitelist.

(Less welcome is the inclusion of "readme" as an adjective (as in referring to files named "README.TXT" as "readme files").)

Other terms that I have heard frequently and seem appropriate for such a word list are: afterparty, arthouse, beestung, breadstick, buzzkill, edamame, fanboy, nunchucks, regift, ribeye, spork, and upsell.

The absence of "spork" and "nunchucks" from the American Scrabble dictionary had bothered me, so I am glad to see these additions.

The not-so-good

Other new words I am more skeptical about. "VoIP", which is clearly an acronym (Voice over Internet Protocol), is listed as a new word. Apparently some pronounce it like /voyp/ rather than spelling it out (/vee oh eye pea/), but as long as it is spelt with any capital letters, it seems clear that it can't be played in a word game without risking fisticuffs.

And they've added "XRAY", even though any sensible spelling would be "X-ray" even when it's used in a phonetic alphabet (Alpha, Bravo, Charlie...). Adding this entire phonetic alphabet has also resulted in the inclusion of "India", "Juliet", "November", "Quebec", and "Yankee".

The word grok comes from Robert Heinlein's Stranger in a Strange Land where he defines it as "to understand so thoroughly that the observer becomes a part of the observed — to merge, blend, intermarry, lose identity in group experience." Grokking is a profound, transformative understanding of something. Unfortunately, the new Collins Scrabble Words list has added to the past and present participles ("grokked" and "grokking") some alternate spellings which are clearly wrong ("grocked", "groked", "grocking", and "groking"). Maybe these will get fixed in a future version.

Also, the CSW indicates that the word "quantum" has grown a second pluralization ("quantums" as opposed to the standard "quanta"). This may turn out to be one of the words that the Collins word list editors wind up eating (the traditional punishment for any quickly recalled words).

The improper verbs

The new words FACEBOOK and MYSPACE are both listed as verbs, meaning (depending on who you ask) 1) to search for someone's profile on the respective web sites, 2) to post something on these sites, or 3) just to generally use these sites. Since the words in the word list are written in all capital letters, it's not possible to tell whether these words are supposed to be retaining or dropping their capitalization in verb form. There is a history of brand names becoming lowercase verbs: Hoover ⇒ hoover (to clean with a vacuum cleaner; also, to suck up like a vacuum cleaner), Xerox ⇒ xerox (to photocopy), Velcro ⇒ velcro (to fasten together the two fabric pieces of a hook-and-loop fastener). If I had to extract or propose a rule of thumb, I'd say that a brand name can become a lowercase verb when it has been generalized beyond the original brand. The definitions of FACEBOOK and MYSPACE as verbs seem both overly broad (in the range of actions they can denote) and overly narrow (in each only referring to one particular site). In contrast, another brand name that has just appeared on this word list as a verb is PHOTOSHOP. This seems totally appropriate to me since it's been around long enough that "to photoshop" means (in my mind) to manipulate an image using graphics software, without being restricted to Adobe's Photoshop.

The rest

The definitions below are quotations from the Zyzzyva word study program, and my comments are in brackets.

BEATBOXING    a form of hip-hop music in which the voice is used to simulate percussion instruments

BETCHA    a spelling of 'bet you' representing colloquial pronunciation

BLOKART    a land vehicle with a sail

[While we can welcome this word for the whimsicality it embodies, a different spelling ("BLOWKART") has left the building. Most likely, heading in the direction of the prevailing wind.]

BOBBLEHEAD    a type of collectible doll, with head often oversized compared to its body

CATFLAP    a small opening in a door to let a cat through

CHEESESTEAK    a sandwich filled with grilled beef and cheese

CHESSBOXING    a hybrid sport which combines the sport of boxing with games of chess in alternating rounds

[While this was originally a made-up sport, there are now regular international chessboxing tournaments in London.]

CATAPHOR    a word that has the same reference as another word used later

[A cataphor is a phrase for which the meaning only becomes clear later in the sentence. Example: "Although he worked very hard at his wall-balancing lessons, Humpty Dumpty ultimately had to contend with the fact that he was still egg-shaped." The first "he" cataphorically refers to Humpty Dumpty.]

CROWDSOURCE    to outsource work to an unspecified group of people, typically by making an appeal to the general public on the Internet

CRIA    the offspring of a llama

[Apparently this word is often used in crossword puzzles.]

CUSPY    of a computer program, well written and easy to use

DISEMVOWEL    to remove the vowels from (a word in a text message, email,etc) in order to abbreviate it

EMERSED    (Of leaves) rising above the surface of water

[Plants that grow out of the water are said to be emersed. Contrast with "immersed".]

ENURN    to put into an urn, also INURN

EXERGY    a measure of the maximum amount of work that can theoretically be obtained from a system

GLAMPING    a form of camping in which participants enjoy physical comforts associated with more luxurious types of holiday

MONOTASKING    the act of performing one task at a time

MWAH    a representation of the sound of a kiss (interj)

PAREIDOLIA    a psychological phenomenon involving a vague and random stimulus being perceived as significant e.g. seeing faces in clouds

PORLOCK    to hinder by an irksome intrusion or interruption

[This term comes from Samuel Taylor Coleridge's story of how he emerged from a dream with the poem that would have been Kubla Khan fully formed in his mind. He claims to have written down the first 54 lines (the only ones that were eventually published) before being interrupted by a visitor from Porlock. Some scholars doubt this story, but "to Porlock" makes for a great new verb. It seems to be mainly used in British English, but I vote that everyone start using it.]

RISORIUS    a facial muscle situated at the corner of the mouth

[The risorius is the muscle people use when they fake a smile (smiling with upturned lips, but not with their eyes). An authentic smile uses the zygomaticus major and zygomaticus minor muscles to pull up the corners of the mouth and also uses the orbicularis oculi muscles to raise the cheeks and form crow's feet around the eyes. Other primates (like lemurs, macaques, orangutans, gibbons, and chimpanzees) do not even have a well-defined risorius muscle.]

SKYLESS    without a sky

[It is listed in the 1911 version of the Century Dictionary with the definition: "Without sky; cloudy; dark; thick." I first thought that this word meant literally without a sky (as in a planet that has no atmosphere), and while it is occasionally used that way in science fiction, it's more generally used figuratively, in melancholy descriptions.]

SPARTICLE    a shadow particle such as a SQUARK believed to have been produced at the time of the Big Bang

[There is a theory called "supersymmetry" which would tidy up a lot of little mathematical problems with the current physics theories of how fundamental particles and forces work. Supersymmetry says that every fundamental particle has a supersymmetric partner. This scheme of adding an S to the beginning of the names of some fundamental particles to denote their hypothetical supersymmetric partners has produced such words as "sfermion", "stau sneutrino", "smuon", and "sstrange squark". These must be fun to pronounce! Sparticles are the sorts of things that physicists would love to find evidence for in particle accelerators like the Large Hadron Collider.]

SPLISH    to splash

[The Wiktionary currently has two definitions:
splish, the noun: "(onomatopoeia, humorous) splash"

and

splish, the verb: "(intransitive) To make a light splashing sound."

The "light splashing" definition rings true to me.]

STOOZE    to borrow money at an interest rate of 0%, a rate typically offered by credit card companies as an incentive for new customers

[The Wikipedia entry indicates that "stoozing" money includes, not just borrowing money at a 0% interest rate, but then investing it (for instance, in a high interest savings account), and then paying it back. This is a sneaky technique for earning money, apparently named for Stooz, a user of the Motley Fool's Credit Card discussion board in the UK, who used and posted about this technique often. While it was originally referred to as "doing a Stooz", a variant spelling has developed that drops the capitalization and adds a silent E.]

STORMSTAYED    isolated or unable to travel because of adverse weather conditions, esp a snowstorm

[This is useful as a more general term than "snowed in". I suggest we import this as "stormstuck".]

SUNGAZING    the practice of staring directly at the sun at sunset or sunrise, esp in the belief that doing so allows one to survive without eating food

[Bananagrammer.com recommends stargazing or moongazing, if you value your retina. Also, eating food occasionally is a good idea, unless you can photosynthesize.]

TRUTHINESS    the quality of being considered to be true because of what the believer wishes or feels, regardless of the facts

[Coined by Stephen Colbert, this is a word more loaded with connotation than a line of text can easily convey.]

TURDUCKEN    a dish consisting of a partially deboned turkey stuffed with a deboned duck, which itself is stuffed with a small deboned chicken

VELLUS    as in vellus hair, short fine unpigmented hair covering the human body

[The opposite of "terminal hair" (dark, thicker body hair).]

WHOLPHIN    a hybrid of a whale and a dolphin

[At Sea Life Park in Hawaii, a bottlenose dolphin and a false killer whale that were being kept together, unexpectedly produced offspring. The false killer whale is actually another species of dolphin, but the discrepancy in sizes (the false killer whale mother was 14 feet long and weighed 2000 pounds while the father was 6 feet long and massed 400 pounds) and the fact that such a combination had never before been seen made the world's first known false-killer-whale/dolphin hybrid a surprise. The fully grown wholphin is 10 feet long and weighs 600 pounds. She is also midway between her parents in shape and color and number of teeth. (Bottlenose dolphins have 88 teeth, false killer whales, 44, and the wholphin has 66.) She, in turn, has mated with a dolphin and given birth to another wholphin. This is another surprise, as hybrid animals (like the mule) are usually infertile.

"Wholphin" is sometimes also spelt "wolphin", although this variant did not make it into the Collins Scrabble Words list.]

Two words that are not new additions, but that I learned from looking through these word lists are: SCOPA (the hair on the legs of many bees, which transport pollen from flower to flower) and UPTALKING:
UPTALKING the practice of speaking with a rising intonation at the end of each statement, as if one were asking a question


It makes sense that a word list that aspires to represent a more international flavor of English be larger. And I have heard on more than one occasion that British English is actually less conservative and is changing more rapidly than American English, so a faster-growing British Scrabble word list is not unexpected. Reading through all these words has been educational and, at times, fascinating, but I'm sure glad that I don't have to memorize them all!


Monday, February 21, 2011

"Zen" and the art of Google N-gram Viewing

Over on the WordSquared blog, WordSquarers are pondering what should be a legal word in a word game. In particular, they are asking whether ZEN should be an allowed word in their game. "zen" is probably the most frequently asked about word because many people (myself included) initially expect that the Word2 game will accept it, but it never does...

The argument in favor of admitting "zen" to the dictionary is that usage suggests that there are two kinds of "Zen": capital-Z "Zen", which refers to Zen Buddhism and lowercase-Z "zen" which refers to a state of extreme calm and centeredness. Of course, the idea of this calm state is a reference to what is considered to be a result of the practice of Zen meditation.

It turns out that "Zen" is sometimes capitalized even in phrases like "a Zen outlook on life" or when something is said to be or feel "so Zen". This usage is consistent with "Zen" being a proper adjective (like "British").

To pursue this question further, I used Google's Ngram Viewer (which really ought to be spelt "N-gram Viewer") to compare the frequency of usage of the words "Zen" and "zen" in books over the last 200 years. The capitalized version completely dominates. (The oscillations in the appearance of "Zen" in English language books seem to reflect periodic variations in Western interest in Eastern mysticism. Roughly similar oscillations can be seen in the usage of "Tao".)



If you look at just the usage of "zen" over time,


you see that back in the 1800s, long before the concept of Zen was even popularized in Western society, instances of "zen" are present in print like some kind of background noise. And indeed, closer examination reveals that these "zen"s have nothing to do with Zen. They are frequently word fragments (like cases where the word "citizen" has been broken between pages and the OCR failed to transmit the dash in "-zen" to Google N-gram Viewer) or abbreviations of names in plays ("Zen." = Zenobia in some plays).

The same search, done on the American English corpus (rather than the overall English corpus, as above),


also fails to show a decisive increase in the usage of "zen" in English books in the U.S.

In contrast, many words admitted into the dictionary show usage patterns that clearly surpass their background noise levels. The first Google N-grams image above shows a good example of this behavior for the word "Zen". And consider "supersize",

a word accepted into the Merriam-Webster Collegiate Dictionary in 2006. Words that have such an abrupt exponential gain in usage must be the easiest for lexicographers to deal with.

I thought I had found a good argument in favor of making "zen" a word when I realized that there is another word which also has a nearly identical usage pattern. Both "Zen" and "Christian" refer to specific religions, and both are also used in a more relaxed fashion as an adjective (roughly meaning "placid" and "humane, altruistic", respectively). But the question of the correct case is the same with "christian": It is not frequently found in an uncapitalized form, and dictionaries nearly universally include only the capitalized form of the word.

It's possible that editors are keeping the uncapitalized "zen" out of books because it does not appear in dictionaries. And then, to the extent that dictionary inclusion reflects usage in print, "zen" is doomed to be perceived as a common misspelling and locked out of dictionaries forever. Of course, these days lexicographers search for new words in lots of other media, including the less rigorously edited Internet, so "zen" may yet be recognized as a legitimate word.


If you want to express your opinion about "zen", the comments on that Word2 blog post are still open, and you can always post here in the shiny new Bananagrammer comments area.

Sunday, March 28, 2010

Non-words in dictionaries

Not all words in dictionaries are real words. Some are bogus. Sometimes this is due to errors. In 1931, during the preparation of the second edition of Webster's New International Dictionary, someone wrote "D or d, cont / density" on a notecard, indicating that "D" and "d" should be added to the dictionary as abbreviations for "density". Through some misunderstanding, this was misinterpreted to mean that a new word should be added to the dictionary: "dord". And it was. Five years after the dictionary was published, an editor uncovered the error and had the definition deleted from future printings.

The second way that a non-word can show up in a dictionary is through someone copying erroneous words from another source. The history of dictionaries contains many incidents of unscrupulous dictionary-compilers stealing entire word entries from other dictionaries.

And this brings us to the third way that fictitious words can show up in a dictionary: They can be deliberately put there. It is common practice among dictionary publishers to insert completely made-up words so that when someone plagiarizes their work, they can catch them red-handed.

In 2005, one of the fake words in the New Oxford American Dictionary was revealed to be "esquivalience" which supposedly meant "the willful avoidance of one's official responsibilities".

In 1903, a dictionary of musical terms was published, and at the very end was this definition:
zzxjoanw (shaw). Maori. 1. Drum. 2. Fife. 3. Conclusion.
This definition was reused in Mrs. Byrne's Dictionary of Unusual, Obscure, and Preposterous Words: Gathered from Numerous and Diverse Authoritative Sources which was published in 1974, but with an alternative (and seemingly, completely fabricated) pronunciation.

If the definition "conclusion" and the incongruous pronunciation weren't evidence enough, a few other facts should have tipped people off that this word is a hoax: a) The Maori language does not contain the letters "j", "x", and "z". b) All Maori words end in a vowel. c) In traditional Maori culture, there are no drums!
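The first two of those checks are mechanical enough to sketch in a few lines of Python. This is only an illustration: the function name is mine, it tests just the two spelling rules stated above (no "j", "x", or "z", and a final vowel), and treating macron vowels as vowels is my own assumption. Passing these checks does not make a string a real Maori word.

```python
FORBIDDEN_LETTERS = set("jxz")
VOWELS = set("aeiouāēīōū")  # long (macron) vowels included as an assumption


def maori_spelling_problems(word):
    """List which of the two spelling rules a candidate word breaks."""
    word = word.lower()
    problems = []
    found = sorted(FORBIDDEN_LETTERS & set(word))
    if found:
        problems.append("contains " + ", ".join(found))
    if not word or word[-1] not in VOWELS:
        problems.append("does not end in a vowel")
    return problems


print(maori_spelling_problems("zzxjoanw"))  # breaks both rules
print(maori_spelling_problems("aroha"))     # breaks neither
```

The drum question, alas, cannot be settled by a spell-checker.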

Other compilers of information introduce false information for the same reason. Phone book publishers have included fake names and phone numbers. Encyclopedias are published with false entries (like the New Columbia Encyclopedia's entry on Lillian Virginia Mountweazel, a fountain designer and photographer (famous for photographing mailboxes) who died "in an explosion while on assignment for Combustibles magazine"). And map-makers insert fake streets (called "trap streets") or draw their streams as if they were slightly squigglier than they actually are.

What lesson can we draw from all this? Be skeptical. Don't just accept everything you read. Not even in dictionaries. Not even on this blog! And certainly don't believe words when they start with "zzx". That would be dordish.

Saturday, February 6, 2010

Some more of the Official Scrabble Dictionary's greatest mistakes

After posting my essay describing the many problems with the Official Scrabble Players Dictionary (OSPD), I discovered that I am not the only one who has issues with the official Scrabble word lists. An article from the Times of London reports that the official Scrabble word list has come under increased criticism, and argues that the popularity of online Scrabble games has brought players with different positions on this issue into conflict.

The article points out that there is a Facebook group called "The Official Scrabble Dictionary: Winner or Whack?" which appears to be a venue for people to debate the merits of the OSPD. The group has 20 members at present, so the remarkable thing is not that there are some people who have complaints, but that those complaints are being heard. In the UK, the analog of the OSPD is called "Official Scrabble Words" (OSW). The first version of this British Scrabble word list was compiled by Allan Simmons (Scrabble columnist for the Times) and Darryl Francis in 1988. They still maintain the list. Simmons was interviewed in the article:
Mr Simmons is in favour of a wide variety of words, but he believes that archaic words should be removed from the list. "There are lots of archaic, obsolescent words that came from Chambers dictionary. That's not good for trying to promote Scrabble in schools. One of the words that annoys me is 'smoyle', an old form of 'smile'. Nobody is going to spell 'smile' that way now."
In a separate opinion piece, Simmons gives an overview of the situation, concluding that:
We, in the driving seat of the Scrabble community, should be letting go of the archaic word baggage in the interest of a more publicly acceptable word list. We should have a cleanout of all the spellings of ye olde literary works that are no longer in use.

So it sounds like word list reform may be coming to Britain. Whether this will mean changes to the word lists used in America remains to be seen. If dictionary reform is not imminent, we could always organize a demonstration. After all, who could pass up the opportunity to be part of a Million Banana March?

Sunday, December 27, 2009

There is no official Bananagrams dictionary

Many players like to cite the Official Scrabble Players Dictionary to validate words they use when playing Bananagrams. But as discussed in the last post, the OSPD has serious flaws and may not be the best choice for a reference. The official Bananagrams rules say that "any available dictionary may be used" to decide whether words are acceptable. Some better dictionaries to use with Bananagrams are listed here.

As an alternative, when playing Bananagrams among friends, the group can decide how strict it is going to be about word acceptability. A strict reading of the rules would say that if a word is in the dictionary, it is acceptable, whether or not it is slang. In the groups that I play in, we are pretty lax, and frequently we do not even have a dictionary on hand. When playing without a readily accessible dictionary, the validity of words is sometimes debated, but the de facto standard is one of the following (I'm not sure which): 1) The word is only considered a rotten banana if it is misspelt or clearly not a word. 2) The word is considered a rotten banana if most people think it is wrong. 3) The word is considered a rotten banana if the person who played it relents and agrees that it is wrong (or questionable).



Acknowledgments: My thanks to Chuck, who started an e-mail discussion with me which led to this post and a deeper consideration of dictionaries.

Sunday, December 20, 2009

Well, that about wraps it up for the Official Scrabble Players Dictionary

I found a nice critique of the Official Scrabble Players Dictionary in one of the reviews on Amazon. It's by Daniel Pratt, a lexicographer and mathematician who placed second in the first national Scrabble tournament in 1978. He also developed the rating system used for Scrabble players in tournaments. It was around the same time that the first Official Scrabble Players Dictionary was created. Pratt is therefore well qualified to make such a critique.

One of his strongest points is something that I had begun to suspect: The OSPD adds new words when they appear in dictionaries, but does not delete old words once they become obscure enough to not appear in modern dictionaries:
The Official Scrabble Players Dictionary (OSPD) is a compilation of words from twelve U.S. college dictionaries from the last four decades. Four are still in print and, as the descendants of seven of the others, contain most but by no means all of their contributions. As a result, pronunciations, etymologies, and full definitions are no longer available for many entries, especially those found only in the source that has been out of print for a quarter century. NSA members like to twitter that they don't play your grandmother's Scrabble, but in many respects they're using her dictionary. It's a shame they can't bring the game into the 21st century.
He also points out some of the most ridiculous entries, but the word that sticks out most in my mind is one that was pointed out elsewhere: "AARRGHH" is listed in the OSPD, as if it were an actual word.


Ammon Shea takes a different position. After reading the entire Oxford English Dictionary (which feat he documented in his book, Reading the OED: One Man, One Year, 21,730 Pages), he describes returning to Scrabble for the first time in many years and how he finds that the list of allowed words no longer makes sense to him:
Why, for instance, does the Scrabble Players Dictionary list both howf and howff (Scottish words defined as ‘a place frequently visited’), but neglects to include many of the variants that Joseph Wright has in his English Dialect Dictionary, such as houf, hauf, hofe, hoff, houf, houck? I tried to play swad (a bumpkin, or fat person), which appears in most of the dictionaries I own, on three different occasions before I remembered that the Scrabble game wouldn’t recognize it.
He then goes on to observe:
Some complain that the Scrabble Players Dictionary is too inclusive; I find the opposite to be true. Rather than only let in some of the strange words, they should have opened the floodgates and allowed them all. It has become a game of memory, rather than a game of language.
Clearly, you can't please everybody.



In response to comments on his review, Pratt recommends several replacement dictionaries - the American Heritage College Dictionary, the American Heritage High School Dictionary, the New Oxford American Dictionary, and the Canadian Oxford Dictionary - with a discussion of the advantages of each.

Saturday, October 31, 2009

Another reason Scrabble dictionaries are inadequate for Bananagrams

The Scrabble board is a 15-by-15 grid, so naturally, Scrabble dictionaries have no need for words longer than 15 letters. Oddly, even "Super Scrabble", which is Scrabble played on a larger board and with more tiles, still uses the 15-letter limit.

Some 16-letter words for your consideration:

ambidextrousness
anagrammatically
anthropomorphize
autobiographical
biodegradability
bureaucratically
counterclockwise
crystallographer
cryptozoological
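
For anyone curious which words a board-sized limit would cut from their own word list, here is a minimal sketch in Python. The function name `too_long_for_board` and the sample list are made up for illustration. The default of 15 is the Scrabble board width mentioned above.

```python
def too_long_for_board(words, board_size=15):
    """Return the words that could never fit in one row of the board."""
    return [w for w in words if len(w) > board_size]


# A tiny sample list; swap in any word list you like.
sample = ["clocks", "clockwise", "counterclockwise", "anagrammatically"]
print(too_long_for_board(sample))
```

Since Bananagrams has no board, nothing forces it to inherit this cutoff.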

I can totally see someone starting off with "clocks" in their grid, turning it into "clockwise", and then in a brilliant burst of insight, combining it with "count" and some stray tiles to make "counterclockwise". If I ever pulled that off, I would be so proud that I wouldn't even care if I won the game. (OK, I'd still care, but either way it would be awesome.)

What is the longest word that you have ever used in a Bananagrams match? For those who like games with rounds that escalate, a must-contain-an-N-letter-word rule would be an interesting and challenging variation. How high can you go?

Sunday, June 14, 2009

"Za" and dictionaries

A recent article on the Wall Street Journal site discussed the effect of adding new words to the list of legal Scrabble words. The controversial words are things like "za" (a rare slang term for pizza), "qi" (a Chinese word, meaning life force... a new-fangled spelling of "chi"), and "zzz" (onomatopoeic word for the sound of snoring). Some argue that they make it too easy to use the letters "Z" and "Q", and that their point values should be reduced from 10. Others argue that some of these words are just lame. (It's a good article that also discusses the issues of rule changes in a more general sense.)

Apparently, the way the Official Scrabble Players Dictionary works is that if a word is added to just one of a set of five dictionaries, it is eligible to be considered for the "Scrabble-legal" list. My understanding is that it will be approved as long as it doesn't violate any obvious criteria (hyphenation, proper nounness, foreignness). The virtue of this is that it is an objective approach. I do wonder though how different the list would look if a word had to appear in four of the dictionaries before becoming a Scrabble word.
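
To make that "one of five" versus "four of five" question concrete, here is a rough sketch in Python. It is based only on my understanding described above, not on the OSPD's actual procedure. The function name `eligible_words` is hypothetical, and the five toy "dictionaries" are invented for the example. They are not the contents of any real dictionary.

```python
from collections import Counter


def eligible_words(dictionaries, min_sources=1):
    """Return the words that appear in at least min_sources dictionaries."""
    counts = Counter()
    for words in dictionaries:
        # set() ensures a word counts only once per dictionary.
        counts.update(set(words))
    return {word for word, n in counts.items() if n >= min_sources}


# Invented toy data: five pretend dictionaries.
toy = [
    {"pizza", "za", "qi"},
    {"pizza", "qi"},
    {"pizza", "qi"},
    {"pizza", "qi", "zzz"},
    {"pizza"},
]

print(sorted(eligible_words(toy, min_sources=1)))  # ['pizza', 'qi', 'za', 'zzz']
print(sorted(eligible_words(toy, min_sources=4)))  # ['pizza', 'qi']
```

In this toy setup, raising the threshold from one to four drops "za" and "zzz". What would happen with the real source dictionaries is exactly the thing I am wondering about.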

Should Bananagrams use the Official Scrabble Players Dictionary? Or should it come up with its own dictionary? Is "za" pronounced to rhyme with "baa", or does it retain the schwa sound from the end of the word "pizza"? Do I take every opportunity to use the word "schwa"? (Answer: Yes.)