A blog for fans of Bananagrams, word games, puzzles, and amazing things

Saturday, December 24, 2011

Friday, December 16, 2011

Impossible objects. Impossible words.


Impossible objects are drawings of apparently three-dimensional objects which look correct when their individual parts are examined, but when you look at the object as a whole, it turns out to be not realizable. One of the most famous examples was created by D. H. Schuster and published in a psychology journal in 1964. The paper was titled "A New Ambiguous Figure: A Three-Stick Clevis", and said figure looks like this:


As emphasized by the colored background, the top of the object resembles the upper part of a stirrup (which has the basic form of what is known as a "clevis") and the bottom of the object looks like three parallel rods. Somewhere in between lies the ambiguity that destroys the three-dimensionality. Martin Gardner referred to such drawings as "undecidable figures".

The three-stick clevis has since gone by many other names: blivet, devil's tuning fork, widget, and poiuyt.

Mad Magazine used "poiuyt" as the name for the above blivet when they featured it on their March 1965 cover. The difficulty you may be having in deciding how to pronounce "poiuyt" is due to its unusual origin. The word "QWERTY" was formed by starting from the left side of the top row of a typewriter and taking the first six letters. Applying the same technique to the other end of the keyboard you get the looking glass version of "QWERTY"... "poiuyt".
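The keyboard trick is easy to check with a couple of string slices. This is only an illustrative sketch; the variable names are made up for the example.

```python
# Top letter row of a QWERTY keyboard, read left to right.
TOP_ROW = "qwertyuiop"

# The first six letters starting from the left end of the row.
from_left = TOP_ROW[:6]

# The first six letters starting from the right end of the row.
from_right = TOP_ROW[::-1][:6]

print(from_left.upper())  # QWERTY
print(from_right)         # poiuyt
```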

Just as perspective drawing was invented to allow us to make two-dimensional depictions of three-dimensional things, spelling was invented to allow transcription of spoken language. And just as we can draw objects that are logically inconsistent, so can we write combinations of letters that correspond to no spoken word.

"Poiuyt" has no apparent, standard, or authoritative pronunciation. Dictionaries ignore it. If the reader will indulge me, I will nominate it as our first impossible word.


Another candidate for impossibility is "balge" (as in "balge yellow"), a term that is listed in Merriam-Webster's Third New International Dictionary as having no known pronunciation and no known origin (as if it spontaneously generated on a piece of paper on some lexicographer's desk). "Balge yellow" has been defined as "a brilliant yellow color" and "sunflower yellow", so at least that part of its wordhood is known.

A 1976 survey of color names by the National Bureau of Standards identified balge yellow as the color pictured here, which also goes by names such as "jonquil" and "Naples yellow". These redundant names may explain why use of "balge" ended.

Even though no one seems to know how to pronounce "balge", it doesn't feel undecidable in the way that "poiuyt" does, probably due to the latter's discombobulating four consecutive vowels.

My third nomination for impossible word is YHWH, which is the English version of the Hebrew word יהוה. Controversy surrounds this word. It is used throughout the original Hebrew texts of the Old Testament as the primary name for God. Some pronounce it as "Jehovah" or "Yahweh". Since ancient Hebrew was written without vowels, the correct pronunciation of יהוה is not known. There is a strong taboo against speaking this name in Judaism, so it may be that whatever correct pronunciation might have existed has disappeared due to lack of use. Some believe that the pronunciation is a secret preserved by only a few people in each generation. What I like best about this word is that there is another name you can use when talking about it: "the Tetragrammaton" (from the Greek for "having four letters"). The undecidability of YHWH's pronunciation is in an entirely different class than that of poiuyt, but maybe a property of impossible words is that they are all impossible in their own ways. This one seems to be more of an arms-crossed, exasperated "Tetragrammaton, you're impossible!" way.

I feel obligated to mention one word that I thought would be impossible but has turned out not to be: Mxyzptlk. Mister Mxyzptlk is a mischievous prank-playing imp from the fifth dimension who occasionally visits Earth to wreak havoc until Superman deals with him. The gimmick was that the only way to send Mister Mxyzptlk back home was to trick him into saying his name backwards.

While Mxyzptlk has been pronounced in a variety of ways throughout the years, allegedly the DC Comics editor gave an authoritative pronunciation early on: "mix-yez-PIT-elick". But I suppose that one could claim that it is in Mxyzptlk's trickster nature that the pronunciation of his name refuses to be nailed down.


Then there are heteronyms which are words that are spelled the same way but pronounced differently:
"bass" can rhyme with "glass" or "space".

"wind" can be pronounced with a short I (like the thing that blows) or a long I (the verb that describes forming a ball of yarn).
They're better classified as ambiguous than outright impossible.

As heteronyms change pronunciation based on the context they are used in, they are analogous to the Necker Cube:


Rather than representing a figure that has no sensible three-dimensional realization, the Necker cube confounds the viewer because it has more than one realization. Most people initially see it as a wire-frame cube, viewed from above, with the lower-left square as the front face. After you study the figure for some time, it may seem to suddenly shift to a cube seen from below with the upper-right square as the front. I find that I can switch between the viewpoints by focussing on a face that appears to be at the back of the cube, which seems to cause it to pull forward.


The impossible word is an exceedingly rare thing because we tend to make up pronunciations for words, even if we have to break the laws of phonics. (Doing so yields ultraphonic words (words outside the range of normal phonics), such as Big Bird's pronunciation of ABCDEFGHIJKLMNOPQRSTUVWXYZ as a single long word by sneaking vowel sounds into strings of consonants like JKLMN.)


Spelling a word is a reductive, lossy process. Accents, tones, and sarcasm are all generally omitted. English orthography, in particular, requires collapsing the full spoken word into a few characters, introducing considerable ambiguity, but from this ambiguity are born many good things, like puns and poiuyts.



Further reading:
  • On balge: According to an 1875 Bulletin from the National Association of Wool Manufacturers, balge yellow was "generally employed on cassimere for vestings". Google Books also has the recipe for dyeing wool balge yellow.

A word about the very cool font used for the IMPOSSIBLE graphic above: “ISOSIBILIA Typography Designed by Rodrigo Fuenzalida for Neo2.”

Monday, November 7, 2011

How Scrabble dictionaries are made

A web site called Word Buff has an interview with Darryl Francis on the making of the British Scrabble tournament word lists. From his self-description, Francis sounds like a cool guy with interests in wordplay and language. He writes articles for Word Ways, a magazine of recreational linguistics (which I can recommend if you enjoy wordplay). And he cites Martin Gardner's Scientific American columns as a major influence.

Darryl Francis and Allan Simmons are the Dictionary Committee for WESPA, the World English Scrabble Players Association. They've basically been in charge of the British Scrabble word list since its inception.

The British Scrabble tournament word list (previously called SOWPODS and now apparently "CSW", an abbreviation for Collins Scrabble Words) is formed by taking all the words in the most current American Scrabble tournament word list and adding in any valid words from the Chambers Dictionary and the Collins English Dictionary. Francis's interpretation of what constitutes a valid word is given in his response to a question about whether he would ever exclude words that satisfy all the rules:
Let's go back to a group of dictionary entries I mentioned earlier - the internet domain names for countries. There's around 200 of these, running from AC (Ascension Island) to ZW (Zimbabwe). They appear to satisfy the criteria for acceptability of words.

They're not dictionary-listed with an initial capital letter, nor a hyphen nor an apostrophe. They're not marked as abbreviations, they're not marked as foreign. On what basis should they not be allowed as two-letter words?

My answer to this question is that a) these two-letter abbreviations (called "country code top-level domains") are proper nouns, and b) they are in fact abbreviations, whether the dictionary says so or not.

Francis goes on to say:
Yet to allow a sudden influx of two-letter words, most of which are unpronounceable and not recognisable to the man in the street, would be to upset the fine balance that already exists with two-letter words.

Two-letter words are so key to the game that to double their number overnight would almost certainly provoke an outcry from Scrabble players - and probably the media, too.

I could portray this as a question of how to balance strict rule-following with common sense. Ultimately the Dictionary Committee chose not to include all those country codes, so they do use some common sense in their decisions. And they do have to make many difficult judgment calls. But it seems like they only rejected these country codes because there are so many of them and because Scrabble players would be upset by their inclusion.

To me, this demonstrates the subtle biases that have crept into the system to make official Scrabble dictionaries (unsurprisingly) give tournament Scrabble players what they want. As I understand it, what a plurality of them want is a word list that retains the words they have spent so much time memorizing, while occasionally adding handfuls of new words that increase Scrabble scores and make the game easier and more fun for them.

And this is perfectly fine, so long as these Scrabble word lists aren't misappropriated as authoritative sources for other games...


Saturday, October 22, 2011

The 27th Letter of the Alphabet

Rogues, to speak thus irreverently of the alphabet, I shall live to see you glad to serve old Q — to curl the wig of great S — adjust the dot of little i — stand behind the chair of X. Y. Z. — wear the livery of Etcetera — and ride behind the sulky of And-by-itself-and.

From Act I of Charles Lamb's Mr. H

If you were a schoolchild in the 19th century, the alphabet that you learned would have had 27 letters: all 26 letters of our current alphabet, plus the ampersand symbol.
ABCDE
FGHIJ
KLMNO
PQRST
UVWXY
Z &

Due to the awkwardness of ending a recitation of the alphabet with "W X Y Z and", it was traditional to instead say "W X Y and Z, and per se and", where per se, Latin for "by itself", means that &, standing by itself, represents "and". (Words with one-letter spellings, like A or I, were often orally spelt as "A per se" or "A per se A".) It was this process of alphabet recitation, and hurried enunciations of "and per se and" which spawned a variety of names for the & symbol which ultimately converged on "ampersand". & had been part of the alphabet going back to the days of Old English.

Other than standing for the conjunction "and", the ampersand also sometimes appears in the abbreviation &c, for et cetera. This is due to the origins of the & symbol in the first century A.D., when the Romans would write et (Latin for "and") in cursive in a run-together fashion which became a stand-alone written symbol.

So why do we no longer consider & to be part of the alphabet?

The leading theory is that it's because of that alphabet song, the one that goes
A B C D E F G
H I J K LMNOP
Q R S, T U V
W X, Y and Z.
Now I know my A B Cs.
Next time won't you sing with me?
Many incorrectly believe that this is based on a tune by Mozart. While Mozart wrote variations on this theme at the age of 25 [see Köchel listing K. 265], the original melody that inspired him was a French folk song called "Ah! vous dirai-je, Maman" which eventually served as the music for Twinkle, Twinkle, Little Star. In 1835, the alphabet song was copyrighted under the name "The A.B.C., a German air with variations for the flute with an easy accompaniment for the piano forte", so it does seem like Mozart was responsible for popularizing the melody.

It turns out that that song has influence beyond the ousting of the ampersand. Historically, it has been mainly in the U.S. that Z has been pronounced zee; pretty much everywhere else they say zed. But a quick look at the rhyming scheme of the alphabet song shows that the zee pronunciation works better. And apparently a lot of children who learn English outside of the U.S. are still exposed to this alphabet song through American children's programming, like Sesame Street. Teachers in England reportedly have to correct kindergarteners who enter school singing the alphabet song in an American accent, right down to the zee.

I leave you with this quote from Steven Wright:
Why is the alphabet in that order? Is it because of that song? The guy who wrote that song wrote everything.

Sunday, October 9, 2011

PAX, the Omegathon, and novelty in video games

I love books and documentaries that examine quirky subcultures. The book Word Freak provides a fascinating look inside the world of tournament Scrabble. Murderball was a great film about the players of wheelchair rugby. My favorite quirky documentary though is The King of Kong: A Fistful of Quarters which dramatizes the competition between two players for the high score in the classic arcade game Donkey Kong.

I've just found an amazing article online called PAX Primer which is the perfect introduction to the quirky subculture that is the Penny Arcade Expo. It covers the origins of the Expo (in case you ever wanted to know how a web comic can spawn a convention dedicated to video games and board games), the growth of video games, and their transition from fringe to mainstream culture.

Earlier this year, I posted about PAX because Bananagrams was an event in the PAX East Omegathon. It turns out that the Omegathon organizers decided to feature Bananagrams in the west coast PAX Omegathon as well. The article says that in the convention program, Bananagrams is described as "like Scrabble, only not boring and for old people".

It later goes on to discuss a few computer games which I might opine are "like video games, only not boring and for old people". The new wave of computer games does not suck you into endless repetition.

Portal is a game where you solve puzzles by shooting two holes on different walls, ceilings, or other surfaces in your environment. These "portals" are connected (as if by a wormhole), so whatever goes in one comes out the other, with the same momentum. I recently started playing this game and cannot get enough of it.

Braid is an even stranger game in which the player gets to control the flow of time. The selling points of the game are listed on the game's web site:
  • Every puzzle in Braid is unique. There is no filler.
  • Braid treats your time and attention as precious.
  • Braid does everything it can to give you a mind-expanding experience.
Braid's programmer, Jonathan Blow, self-financed the game as he coded it over three years as a statement about how video games could and should be different.

Braid does not look like any other computer game. The artwork is great. It was done by the artist behind the surreal web comic A Lesson Is Learned But The Damage Is Irreversible. Braid also does not sound like any other computer game. Its atmospheric music helped to win me over.

With the success of these games, even more ambitious games are in the works, on topics such as non-Euclidean geometry (Antichamber) and four-dimensional space (Miegakure).


In a world where video games have become mainstream, it makes sense for a niche to develop for games that emphasize originality. I am glad that quirky subcultures exist to sustain this kind of bold experimentation.

Tuesday, September 20, 2011

Why words are the lengths they are

Some words are long and others are short. What determines how long a particular word should be? If you look at some long words (like "serendipity", "pandemonium", and "hypothesis") and some short words ("my", "in", and "of"), you might come to the conclusion that short words are short because they are used frequently while long words can afford to be long because they come up rarely. This idea was first proposed by a Harvard linguist named George Zipf in 1936.

Researchers at the MIT Department of Brain and Cognitive Sciences took a fresh look at this question and came up with a new theory. They present their results in a paper titled (spoilers!) "Word lengths are optimized for efficient communication".

How much information is conveyed by a word? Consider the sentence that starts "After I got home, I walked the...". If I finish the sentence as "I walked the dog", the extra word "dog" doesn't convey much information because it's probably one of the words your brain was expecting. More surprising would have been "I walked the cat" or "I walked the bulldozer", "I walked the quasar" or "I walked the plank". It is the amount of surprise that researchers are equating with the information contained in a word. Consequently the information content of a word depends on the context that it appears in. [For those who want a more quantitative explanation, the information contribution from a particular context (like "I walked the...") is -log(p), where p is the probability that the word appears at the end of that phrase and where log() is the natural logarithm function. To get the information for a word like "dog", you average -log(p) over all the occurrences of "dog", so that each context counts in proportion to how often "dog" appears in it.]
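Here is a small Python sketch of that calculation. The toy corpus and the function names are invented for illustration, and the sketch assumes that a word's information is the per-context value -log(p) averaged over each occurrence of the word; the real study worked from N-gram counts, not a hand-written list.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (context, next word) observations.
observations = [
    ("walked the", "dog"), ("walked the", "dog"), ("walked the", "dog"),
    ("walked the", "plank"),
    ("fed the", "dog"), ("fed the", "cat"),
]

pair_counts = Counter(observations)
context_counts = Counter(context for context, _ in observations)

def surprisal(context, word):
    """-log(p), where p is the probability of `word` following `context`."""
    p = pair_counts[(context, word)] / context_counts[context]
    return -math.log(p)

def average_information(word):
    """Mean surprisal of `word`, taken over every occurrence in the corpus."""
    contexts = [c for c, w in observations if w == word]
    return sum(surprisal(c, word) for c in contexts) / len(contexts)

# "dog" is expected in both contexts, so it carries less information than "plank".
print(average_information("dog") < average_information("plank"))  # True
```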

Ideally what the researchers would have liked to examine is the relationship between how long it takes to say words and how much information they convey, but it was easier (and, they argue, an adequate approximation) to use the number of letters in a word in place of its utterance duration. But later, they went back and ran the same tests (for a few languages) using number of syllables instead of number of letters, and the results were the same.

To calculate the relationship between word length and frequency, the researchers used the same N-gram data set that Google used in its N-gram viewer. This figure from the paper summarizes their findings:
The plot on the left shows word length versus word use frequency, with frequency decreasing from left to right. (Here the data has been divided into large groups of words ("bins") and the average lengths and frequency have been used.) For the first few points (high-frequency words like "the"), the slope of the line is steep, but then it quickly flattens out, indicating that for low-frequency words, the frequency of the word doesn't change the length very much.

The plot on the right shows average word length versus the information content of the word. Here, the line starts off jagged but then becomes strongly-sloped and very straight. This tells us that how much information a word carries is indeed a good predictor of how long the word will be.

The researchers also cite other work that has shown that, when speaking, people will speak more information-dense syllables more slowly than less information-dense syllables. (If you've ever listened to the synthesized voice of something like a GPS, you'll be familiar with the jerkiness of the pronunciation that sounds like it is speaking some syllables too slowly and others too quickly.)

It would seem that a corollary to this principle is that as a word becomes more common (or more precisely, loses information density), it experiences a linguistic force, pushing it toward a shorter form. This shortening process is called phonetic erosion. Examples of the resulting shortenings (also called clippings) are "refrigerator" becoming "fridge", "going to" becoming "gonna", and "cabriolet" being completely replaced by "cab". Here are a few other terms that have evolved much shorter forms:
  • advertisement → ad
  • caravan → van
  • examination → exam
  • gasoline → gas
  • gymnasium → gym
  • influenza → flu
  • public house → pub
So, essentially, the researchers found that the old idea that word length is based mainly on frequency of word usage (short words are used often while long words are used rarely) does a poor job of explaining why words are the lengths they are. The amount of information in a word (averaged over the various contexts that it is used in) is a far better predictor for how long the word will be. The only exception to this is the 5% to 20% of words that are the least informative (generally short, high-frequency words like "the" and "and").

This result holds, not just for English, but also for the other ten languages that they examined (Czech, Dutch, French, German, Italian, Polish, Portuguese, Romanian, Spanish, and Swedish).

The basic idea that I take away from this work is that there is some maximum rate at which our brains can understand incoming speech, and that our speech patterns reformulate what we are saying to evenly distribute information over time. It makes me wonder whether pausing for effect is taking advantage of this fact. Similarly, when I say a word slowly to emphasize it, maybe I am just slowing it down to suggest that it contains a lot of information.

Epilogue: In case you were wondering, the actual ending to the sentence that started "After I got home, I walked the..." was "...tightrope."


Saturday, August 6, 2011

The Bananagrammer Equation

A warning to regular readers: This post is not about games nor about words. It is about math and bananas.

Recently the "Batman Equation" has been memetically propagating around the Internet.

The equation represents the outline of the Batman logo. It is apparently the work of a user on Reddit.

I liked it enough to try to make my own. There are a few tricks to this process. First break the shape up into curves that you can easily write equations for, of the form f(x,y)=0.
Then, to make the curves stop at the desired end points, add in terms like the ones you see under the square roots. They evaluate to either 1 or -1, depending upon the grid position; when this value is negative, the square root is no longer real, and the plotting program will not plot anything. Finally, multiply all the equations together, and you get one big long equation:

(This is a cleaned-up and slightly approximated version of the equation I used for plotting.)
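To make the trick concrete, here is a minimal Python sketch (it is not the equation used for the banana plot). It draws a "V" out of two line segments. The helper names and the particular way the square-root term is folded into each factor are assumptions made for illustration. Each factor is real inside its own range of x and picks up an imaginary part outside it, so it can only equal zero where its curve is supposed to appear, and the product equals zero wherever any one factor does.

```python
import cmath

def gate(t):
    """sqrt(t/|t|): 1 where t > 0, the imaginary unit where t < 0.

    Like the formula itself, this is undefined at t == 0
    (Python raises ZeroDivisionError there).
    """
    return cmath.sqrt(t / abs(t))

def right_arm(x, y):
    # The line y = x, kept only for 0 < x < 1.
    return (y - x) - 1 + gate(x * (1 - x))

def left_arm(x, y):
    # The line y = -x, kept only for -1 < x < 0.
    return (y + x) - 1 + gate(-x * (1 + x))

def shape(x, y):
    # One big equation: the product of all the gated factors.
    return right_arm(x, y) * left_arm(x, y)

def on_shape(x, y, tol=1e-9):
    # A plotting program would draw the points where shape(x, y) == 0.
    return abs(shape(x, y)) < tol

print(on_shape(0.5, 0.5))   # True: on the right arm
print(on_shape(-0.5, 0.5))  # True: on the left arm
print(on_shape(2.0, 2.0))   # False: on the line y = x, but past the end point
```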

The final plot looks like this:

If I stay there can be no party. I must be out there in the night, staying vigilant. Wherever a party needs to be saved, I'm there. Wherever there are words that need anagramming, I'm there. But sometimes I'm not because I'm out there in the night staying vigilant, watching, lurking, running, jumping, hurdling, sleeping. No, I can't sleep. You sleep. I'm awake. I don't sleep. I don't blink. Am I a bird? No. I'm a banana. I am Bananagrammer. Or am I? Yes, I am Bananagrammer. [applies chapstick]

It is remarkable how well a single ellipse traces out the outer edge of a banana silhouette. I checked a couple of other bananas, and they also have this property. I finally went to a grocery store and sifted through all their bananas to find the least elliptical one I could:


From the red sample points along the edge, I found that even this banana came close to being well approximated by an ellipse.


It is the first three data points that make this an exception to Bananagrammer's First Law of Bananas:
The outer edge of the longitudinal section of a banana follows an elliptical path, with the banana's stem being roughly on the end of the ellipse's long axis.

The question to ask at this point is, "Why are bananas shaped the way they are?" The simple answer is that when a bunch of bananas start growing on a tree, they are initially pointing more down than up. As they become larger, they curve up toward the sun. A banana's exact shape will therefore depend on where it is with respect to its neighbors.

A full explanation of why bananas are so elliptical will require more investigation. People who want to give me research grants are welcome to do so. Actually, everybody is welcome to do so. To everyone else, tune in next week. Same Banana-time, same Banana-channel!

Friday, July 22, 2011

August 2011 Bananagrams events

A couple of notable Bananagrams-related events are scheduled to take place next month.

1) Bananagrams is sponsoring the August 13th instance of WaterFire, a spectacular event that takes place along the river in Providence, Rhode Island. One hundred fires blaze along two-thirds of a mile of the river, illuminating the art and performances that accompany the festivities. WaterFire happens several times each summer, but on August 13th, there will be special Bananagrams-related events.

waterfire.org/bananas is the official page for the Bananaganza. Also available is a schedule for the evening's events.

Providence is home to Brown University and a strong arts scene. A graduate of Brown, Barnaby Evans, created the WaterFire concept and has been running it since 1994. Evans was a friend of Abe Nathanson (the inventor of Bananagrams) and wrote a tribute to Abe.

It's very cool that Bananagrams is sponsoring this event. The Bananagrams components of the evening have not yet been revealed. I think they're going to be surprises, but this idea has been cooking for over a year, so I expect it will be a great event. If you are in the area, I recommend checking it out.

Also, if you happen to go to Providence, keep an eye out for the Bananagrams headquarters sign while driving around:

(This photo was sent in by a personal acquaintance and Bananagrams fan. In case you were wondering, that is not a sign for the "Bananagrams Archives Gallery". I believe the "Archives Gallery" is a separate business in the same building.)


2) On August 14th, large-scale Bananagrams will be played in Prospect Park in Brooklyn. They are going to use 1-foot square tiles made of Masonite (a type of processed wood, sometimes used for house siding and interior doors). The game will look something like this:

Further details on the event are here.


I suppose very large-scale Bananagrams would be played by moving around human-sized tiles, like "human chess" (those games of chess where people act as the chess pieces) except it would be much faster. Played with a full set of tiles, you'd need 144 people. Watching people run around and try to figure out where to stand to form words while other players are peeling off the bunch would be awesome. Human Bananagrams is really something that has to be played.

Thursday, July 21, 2011

Hebrew Bananagrams

NEWS FLASH: Hebrew Bananagrams is now available in the U.S. from Amazon!

As posted in the forum, the Hebrew version of Bananagrams is now out. It looks like this:

The Israeli distributor has a web site dedicated to this version of the game, http://www.bananagrams.co.il which is in Hebrew. There is also an English translation of the site.

If there were an award for the language that Bananagrams would play most differently in, Hebrew would be a contender. Hebrew doesn't have an alphabet; it has an abjad - an alphabet without vowels. In written Hebrew, vowels may optionally be indicated by a system of diacritic marks placed above or below the consonants. In the image above, the tiles are all Hebrew consonants. My suspicion is that this will make Bananagrams matches in Hebrew markedly faster than in other languages.

UPDATE: This article gives more information on how the Hebrew translation of Bananagrams came about:
For the Israeli version, the Dalfens [the family that owns and runs the Israeli distributor] did not only translate the game’s instructions and special terms − such as “split,” “dump,” and “peel” − but also consulted a Hebrew linguist regarding the frequency of each letter. [...] the allocation of letters would vary depending on whether Biblical or Modern Hebrew is used, according to the linguist. The Dalfens opted to allocate letters corresponding to the spoken language.

Friday, July 1, 2011

Good three-letter words for Bananagrams, sorted by rareness

When playing Bananagrams, 2-letter words are great for rapidly adding letters to a grid, like when you are running off a string of "PEEL"s, but it's hard to make an entire grid out of 2-letter words. In this post, I'm going to examine some of the 3-letter words and show you which are most commonly used and which you might want to add to your active vocabulary.

First, consider this list of some 3-letter words you can make with the letter V:
eve, ivy, ova, rev, van, vat, veg, vet, vex, via, vie, vim, vow

We don't have a good set of data of what words are most commonly used in playing Bananagrams, and Scrabble word choice would likely be a poor substitute since players' priorities in Scrabble are very different than in Bananagrams. WordSquared offers a good compromise: its gameplay shares score maximization with Scrabble, but also includes the rapid score-aloof word-building (such as when evading or outflanking opponents) that epitomizes Bananagrams.

If you sort the above V words by word count (as obtained from WordSquared word pages), you get
van, eve, vet, vat, vie, via, vex, rev, vow, ivy, ova, veg, vim

But sorting by raw word count is not the most useful ordering since the Scrabble tile distribution is going to skew the results. There are lots of As, Es, Ns, and Ts, so of course VAN, EVE, VET, and VAT will be popular words. I wanted to subtract out this bias to see what words would be most and least used if all letters were equally likely to be available. I accomplished this by just dividing each word count by the number of times that each of its letters occurs in a standard Scrabble tile set [which corresponded to the Word2 distribution until the recent Word2 redesign, which fortunately happened after I finished this post]. (For example, the word VAN had been used in WordSquared 24198 times (over about 2 months). In a 100-tile Scrabble set, there are 2 Vs, 9 As, and 6 Ns, so I divided 24198 by 2 and by 9 and by 6 to get a normalized count of 224.1.) Sorting the words by the normalized word count gives:
vex, vow, ivy, van, vat, vet, vim, vie, rev, eve, veg, via, ova

When people have a V, they are more likely to make VEX than any other three-letter V word. (In this case, 70% more likely than even the closest word (VOW).) On the other end, VIA is used fairly often in terms of raw word counts, but sparingly when considering how often it could be made.
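The normalization can be sketched in a few lines of Python. The tile counts below are the standard 100-tile English Scrabble distribution (blanks left out). The function name is made up for this example, and dividing once per letter of the word (so that a repeated letter, like the E in EVE, divides twice) is an assumption about how repeats were handled.

```python
# Standard English Scrabble letter counts; the two blank tiles are omitted.
TILE_COUNTS = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1,
}

def normalized_count(word, raw_count):
    """Divide a raw usage count by the tile count of each letter in the word."""
    result = raw_count
    for letter in word.upper():
        result /= TILE_COUNTS[letter]
    return result

# VAN: 24198 uses, divided by 2 Vs, 9 As, and 6 Ns.
print(round(normalized_count("van", 24198), 1))  # 224.1
```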

Below are more word lists, all sorted by this effective word usage rate, starting from the most common words and ending with the most neglected:

3-letter words that contain the hardest letters, sorted from common to rare:

J words:
joy, jug, job, jam, jaw, jog, jay, jab, jig, jar, jet, jot, jut, jag, jib

K words:
key, kid, sky, kit, yak, kin, keg, wok, ink, ark, ask, oak, irk, ski, ken, ilk, koi, uke, eke, ska, auk

V words:
vex, vow, ivy, van, vat, vet, vim, vie, rev, eve, veg, via, ova

X words:
box, fox, wax, fix, mix, vex, fax, hex, tax, pox, six, sex, tux, lax, axe, lox, sax

Z words:
zip, zit, zoo, zap, zig, zag, fez


3-letter words that begin with vowels, sorted from common to rare:

Words that begin with A:
axe, ark, ask, any, awe, ace, arc, and, age, aim, arm, act, ash, ape, ago, ant, aye, art, air, all, aft, aid, ate, add, are, ail, ale, ado, apt, ass, asp

Words that begin with E:
elk, egg, elf, eve, eye, end, ego, elm, emu, ewe, eat, ebb, ear, eel, eke, era, eon, ere, eta

Words that begin with I:
ivy, ink, icy, ice, irk, ilk, ill, imp, inn, ire, its, ion

Words that begin with O:
off, owl, orb, oak, own, oil, old, owe, ova, one, oft, out, orc, our, odd, ode, oaf, opt, oat, oar, obi

Words that begin with U:
use, urn, ump, uke

Words that begin with Y:
you, yak, yam, yes, yet, yew, yap, yip, yin, yep, yon, yea


3-letter words that end with vowels, sorted from common to rare:

Words that end in A:
via, ova, yea, boa, pea, tea, bra, sea, spa, ska, era, goa, baa, eta

Words that end in E:
axe, the, bye, cue, ice, hue, she, pie, vie, eve, ace, awe, foe, eye, age, dye, bee, owe, hoe, rye, due, ape, woe, die, sue, aye, toe, tie, ewe, ore, rue, use, ate, lye, are, doe, ale, ode, pee, wee, see, ire, fie, uke, tee, eke, lee, ere

Words that end in I:
chi, ski, koi, phi, poi, obi, psi

Words that end in O:
zoo, who, ego, ago, boo, goo, moo, two, woo, too, coo, ado, pro, loo, bro, fro, rho, tao

Words that end in U:
you, flu, emu, gnu, tau

Words that end in Y:
joy, jay, why, key, guy, boy, buy, coy, way, toy, fly, cry, day, gay, sky, hay, bay, ivy, pay, shy, may, soy, hey, fry, lay, dry, say, ray, try, ply, sly, spy, icy, pry, nay, thy, fey, any, sty, ley


It's interesting to look at how different words fare. Auks and asps are nearly forgotten, but foxes and owls are quite popular. Greek letters (eta, phi, psi, rho, tau) and abbreviations for musical instruments (sax, uke) and for formal wear (tux) do not see much grid time. I am glad to see that everybody loves joy.

If you want to learn useful new words to add to your active Bananagrams vocabulary, the ends of those lists might be a good place to start.




Monday, June 20, 2011

Words deleted from the new British Scrabble dictionary

One point in favor of the British approach to Scrabble dictionaries is that they appear to actually delete words from the list once they stop appearing in their current source dictionaries. Some of the deleted words are words that the sources corrected, either by capitalization (Freon), splitting into two words ("jet plane", not "jetplane"), or elimination of abbreviations ("arccos" is not a word; it's an abbreviation for "arccosine").

When I last checked in on the UK Scrabble dictionary committee, they were talking about doing away with some obscure or erroneous words in the Collins Scrabble Words list. Frequently singled out were "smoyle" (an obsolete form of the verb "smile") and "Pernod" (a brand name for a French liqueur which also appears in the American Scrabble dictionary).

While nearly 400 words have been deleted, somehow both "smoyle" and "Pernod" survived the cuts. Here are some that did not:

APFELSTRUDEL

[The Anglicized form, "apple strudel", appears to have taken over for the original German form.]

ARCCOSES

[This is supposed to be the plural of "arccos", itself a deleted word since it is merely an abbreviation for the arccosine function in trigonometry. Including "arccos" as a word is a somewhat understandable mistake, but pluralizing it as "arccoses" is fairly egregious, as no one ever writes such a thing. This is probably one of the glaring problems in the previous edition of the Collins Official Scrabble Words that caused the world Scrabble tournament people to reject it and retain their old list.]

AWESTRIKE
AWESTRIKING

[AWESTRUCK and AWESTRICKEN are apparently still fine. It turns out that no one awestrikes. The Chambers Dictionary has switched to a hyphenated form: "awe-strikes". From surveying the Internet, I'd say it's more popular to "strike awe".]

BARRACOOTA

[This is an obsolete spelling for "barracuda".]

BELLPUSH    a button used in ringing a bell

["Bellpull" is still fine.]

BRICKSHAPED

BROADMINDED    incapable of being shocked. Opposite of shockable.

CARDCASTLE

["Cardcastle" is apparently an obsolete synonym for a house of cards. The last three words have all switched to hyphenated forms.]

CARPARK    a space for parking cars

[I was a bit disappointed by this deletion until I looked up the one instance I know this phrase from (The Restaurant at the End of the Universe), and discovered that Douglas Adams also preferred writing it as two words:
"I'm in the car park," said Marvin.
"The car park?" said Zaphod, "what are you doing there?"
"Parking cars, what else does one do in a car park?"
"OK, hang in there, we'll be right down."
In one movement Zaphod leapt to his feet, threw down the phone and wrote "Hotblack Desiato" on the bill.
"Come on guys," he said, "Marvin's in the car park. Let's get on down."
"What's he doing in the car park?" asked Arthur.
"Parking cars, what else? Dum dum."
]

CHILIOI    one thousand

[Greek word meaning "thousand"; it can be singular or plural; seems to come up most often because it appears in the Book of Revelation]

CORNRENT    rent paid in corn

[Naturally.]

DEPENDACIE

[This was already a very rarely used word, meaning "submissiveness". Shakespeare used it in Antony and Cleopatra, but modern printings have substituted the word "dependency".]

EUROPEANISE
EUROPEANIZE    To cause to become like the Europeans in manners or character; to habituate or accustom to European usages.

[Someone realized that these words are almost always capitalized. On the other hand "Francization" and "Francisation", the noun forms of Francize and Francise (meaning to make something French), have just been added to the CSW.]

FLASHFORWARD

[I have a feeling that LOST fans will have something to say about this. FLASHBACK is still on the list.]

GRENZ    as in grenz rays, X-rays of long wavelength produced in a device when electrons are accelerated through 25 kilovolts or less [adj]

[Grenz rays (ultrasoft X-rays with wavelengths between 0.07 nanometers and 0.4 nanometers) were discovered by German physician Gustav Bucky. Bucky noted that the effects of this radiation on biological tissue were somewhat like ultraviolet light and somewhat like the adjacent X-ray part of the spectrum, so he called them "Grenz rays" from the German word Grenz, meaning "boundary". The term seems to have been confined to medicine and is now falling out of usage as Grenz ray therapy is giving way to other techniques.]

HAMBURGHER    a patty of ground beef

HEROE    a man revered for his bravery, courage etc, also HERO

HOWSOMEVER

[Apparently this is an archaic form of "however".]

PARAMAECIUM
PARAMOECIUM    Any of various freshwater ciliate protozoans of the genus Paramecium, usually oval and having an oral groove for feeding.

PLAYBUS    a bus with activities for children

[This seems to be a British concept. As far as I can tell, it's a bit like a bookmobile, except that rather than being a mobile library, it's more like a mobile playground with possibly some educational elements or facilities. From photos I've found, I'd define a playbus as a double-decker bus with ball pits, slides, tunnels, all with lots of padding and primary colors. "Playbus" has apparently transitioned to a capitalized form.]

POCKETPHONE

SHOTPUT

SIDESTREET

["Shot put" and "side street" are now standard.]

STOCKHORN

[This now extinct musical instrument is similar to the better known "hornpipe" and the less well known "pibgorn". It was a single-reed woodwind constructed from a sheep's shin bone and used a cow's horn for the flared part at the end that amplifies the sound. The stockhorn is the Scottish version of this instrument. It also goes by the name "stock-and-horn".]

SWONE    a fainting fit

UPSWARM    to send up in a swarm

[Now you "up-swarm" something (e.g., bees). Shakespeare used this one too, but he wasn't up-swarming bees.]

WASM    an outmoded policy

[This is apparently a portmanteau word, resulting from the combination of WAS and ISM. Or looked at differently, an outdated ISM becomes a WASM. This is one of the words that was removed because it was dropped from the Chambers dictionary (the other UK source dictionary) due to lack of usage.]

WYSIWYG    what you see is what you get, matching computer display with what will be printed (adj)

YOS

[The shortest deleted word was deemed to be an incorrect pluralization of the noun "yo", where "yo" is defined as
an expression of calling for attention
]

I find this list intriguing. It's like a graveyard for forgotten words. (Here lies "Grenz rays".)

Remember, if you want to keep your favorite words alive, you have to use them. Write books about them! Insert them gratuitously into blog comments! Or the ideas they represent may become wasms.

Wednesday, June 8, 2011

The new Scrabble words (if you use the British Scrabble dictionary)


A couple of people asked for my opinion on the "new Scrabble words", so I looked them over. The first and most important thing to point out is that these new words have been added to Collins Official Scrabble Words (CSW), effectively the Scrabble tournament dictionary for most of the world, but not for the United States, Canada, or Thailand. Since Collins Official Scrabble Words (equivalent to the "SOWPODS" word list) automatically includes the American Scrabble tournament word list, new words are only added from the British side when they are absent from the most current American list.

There were about 2800 words added to this list. I've picked out the most interesting ones to discuss:

The good

One category of additions that I found most welcome is the many new computer and Internet terms: autosave, blogosphere, inbox, linkrot, metadata, overclock, permalink, timestamp, and whitelist.

(Less welcome is the inclusion of "readme" as an adjective (as in referring to files named "README.TXT" as "readme files").)

Other terms that I have heard frequently and seem appropriate for such a word list are: afterparty, arthouse, beestung, breadstick, buzzkill, edamame, fanboy, nunchucks, regift, ribeye, spork, and upsell.

The absence of "spork" and "nunchucks" from the American Scrabble dictionary had bothered me, so I am glad to see these additions.

The not-so-good

Other new words I am more skeptical about. "VoIP", which is clearly an acronym (Voice over Internet Protocol), is listed as a new word. Apparently some pronounce it like /voyp/ rather than spelling it out (/vee oh eye pea/), but as long as it is spelt with any capital letters, it seems clear that it can't be played in a word game without risking fisticuffs.

And they've added "XRAY", even though any sensible spelling would be "X-ray" even when it's used in a phonetic alphabet (Alpha, Bravo, Charlie...). Adding this entire phonetic alphabet has also resulted in the inclusion of "India", "Juliet", "November", "Quebec", and "Yankee".

The word grok comes from Robert Heinlein's Stranger in a Strange Land where he defines it as "to understand so thoroughly that the observer becomes a part of the observed — to merge, blend, intermarry, lose identity in group experience." Grokking is a profound, transformative understanding of something. Unfortunately, the new Collins Scrabble Words list has added to the past and present participles ("grokked" and "grokking") some alternate spellings which are clearly wrong ("grocked", "groked", "grocking", and "groking"). Maybe these will get fixed in a future version.

Also, the CSW indicates that the word "quantum" has grown a second pluralization ("quantums" as opposed to the standard "quanta"). This may turn out to be one of the words that the Collins word list editors wind up eating (the traditional punishment for any quickly recalled words).

The improper verbs

The new words FACEBOOK and MYSPACE are both listed as verbs, meaning (depending on who you ask) 1) to search for someone's profile on the respective web sites, 2) to post something on these sites, or 3) just to generally use these sites. Since the words in the word list are written in all capital letters, it's not possible to tell whether these words are supposed to be retaining or dropping their capitalization in verb form. There is a history of brand names becoming lowercase verbs: Hoover ⇒ hoover (to clean with a vacuum cleaner; also, to suck up like a vacuum cleaner), Xerox ⇒ xerox (to photocopy), Velcro ⇒ velcro (to fasten together the two fabric pieces of a hook-and-loop fastener). If I had to extract or propose a rule of thumb, I'd say that a brand name can become a lowercase verb when it has been generalized beyond the original brand. The definitions of FACEBOOK and MYSPACE as verbs seem both overly broad (in the range of actions they can denote) and overly narrow (in each only referring to one particular site). In contrast, another brand name that has just appeared on this word list as a verb is PHOTOSHOP. This seems totally appropriate to me since it's been around long enough that "to photoshop" means (in my mind) to manipulate an image using graphics software, without being restricted to Adobe's Photoshop.

The rest

The definitions below are quotations from the Zyzzyva word study program, and my comments are in brackets.

BEATBOXING    a form of hip-hop music in which the voice is used to simulate percussion instruments

BETCHA    a spelling of 'bet you' representing colloquial pronunciation

BLOKART    a land vehicle with a sail

[While we can welcome this word for the whimsicality it embodies, a different spelling ("BLOWKART") has left the building. Most likely, heading in the direction of the prevailing wind.]

BOBBLEHEAD    a type of collectible doll, with head often oversized compared to its body

CATFLAP    a small opening in a door to let a cat through

CHEESESTEAK    a sandwich filled with grilled beef and cheese

CHESSBOXING    a hybrid sport which combines the sport of boxing with games of chess in alternating rounds

[While this was originally a made-up sport, there are now regular international chessboxing tournaments in London.]

CATAPHOR    a word that has the same reference as another word used later

[A cataphor is a phrase for which the meaning only becomes clear later in the sentence. Example: "Although he worked very hard at his wall-balancing lessons, Humpty Dumpty ultimately had to contend with the fact that he was still egg-shaped." The first "he" cataphorically refers to Humpty Dumpty.]

CROWDSOURCE    to outsource work to an unspecified group of people, typically by making an appeal to the general public on the Internet

CRIA    the offspring of a llama

[Apparently this word is often used in crossword puzzles.]

CUSPY    of a computer program, well written and easy to use

DISEMVOWEL    to remove the vowels from (a word in a text message, email, etc) in order to abbreviate it

EMERSED    (Of leaves) rising above the surface of water

[Plants that grow out of the water are said to be emersed. Contrast with "immersed".]

ENURN    to put into an urn, also INURN

EXERGY    a measure of the maximum amount of work that can theoretically be obtained from a system

GLAMPING    a form of camping in which participants enjoy physical comforts associated with more luxurious types of holiday

MONOTASKING    the act of performing one task at a time

MWAH    a representation of the sound of a kiss (interj)

PAREIDOLIA    a psychological phenomenon involving a vague and random stimulus being perceived as significant e.g. seeing faces in clouds

PORLOCK    to hinder by an irksome intrusion or interruption

[This term comes from Samuel Taylor Coleridge's story of how he emerged from a dream with the poem that would have been Kubla Khan fully formed in his mind. He claims to have written down the first 54 lines (the only ones that were eventually published) before being interrupted by a visitor from Porlock. Some scholars doubt this story, but "to Porlock" makes for a great new verb. It seems to be mainly used in British English, but I vote that everyone start using it.]

RISORIUS    a facial muscle situated at the corner of the mouth

[The risorius is the muscle people use when they fake a smile (smiling with upturned lips, but not with their eyes). An authentic smile uses the zygomaticus major and zygomaticus minor muscles to pull up the corners of the mouth and also uses the orbicularis oculi muscles to raise the cheeks and form crow's feet around the eyes. Other primates (like lemurs, macaques, orangutans, gibbons, and chimpanzees) do not even have a well-defined risorius muscle.]

SKYLESS    without a sky

[It is listed in the 1911 version of the Century Dictionary with the definition: "Without sky; cloudy; dark; thick." I first thought that this word meant literally without a sky (as in a planet that has no atmosphere), and while it is occasionally used that way in science fiction, it's more generally used figuratively, in melancholy descriptions.]

SPARTICLE    a shadow particle such as a SQUARK believed to have been produced at the time of the Big Bang

[There is a theory called "supersymmetry" which would tidy up a lot of little mathematical problems with the current physics theories of how fundamental particles and forces work. Supersymmetry says that every fundamental particle has a supersymmetric partner. This scheme of adding an S to the beginning of the names of some fundamental particles to denote their hypothetical supersymmetric partners has produced such words as "sfermion", "stau sneutrino", "smuon", and "sstrange squark". These must be fun to pronounce! Sparticles are the sorts of things that physicists would love to find evidence for in particle accelerators like the Large Hadron Collider.]

SPLISH    to splash

[The Wiktionary currently has two definitions:
splish, the noun: "(onomatopoeia, humorous) splash"

and

splish, the verb: "(intransitive) To make a light splashing sound."

The "light splashing" definition rings true to me.]

STOOZE    to borrow money at an interest rate of 0%, a rate typically offered by credit card companies as an incentive for new customers

[The Wikipedia entry indicates that "stoozing" money includes, not just borrowing money at a 0% interest rate, but then investing it (for instance, in a high interest savings account), and then paying it back. This is a sneaky technique for earning money, apparently named for Stooz, a user of the Motley Fool's Credit Card discussion board in the UK, who used and posted about this technique often. While it was originally referred to as "doing a Stooz", a variant spelling has developed that drops the capitalization and adds a silent E.]

STORMSTAYED    isolated or unable to travel because of adverse weather conditions, esp a snowstorm

[This is useful as a more general term than "snowed in". I suggest we import this as "stormstuck".]

SUNGAZING    the practice of staring directly at the sun at sunset or sunrise, esp in the belief that doing so allows one to survive without eating food

[Bananagrammer.com recommends stargazing or moongazing, if you value your retina. Also, eating food occasionally is a good idea, unless you can photosynthesize.]

TRUTHINESS    the quality of being considered to be true because of what the believer wishes or feels, regardless of the facts

[Coined by Stephen Colbert, this is a word more loaded with connotation than a line of text can easily convey.]

TURDUCKEN    a dish consisting of a partially deboned turkey stuffed with a deboned duck, which itself is stuffed with a small deboned chicken

VELLUS    as in vellus hair, short fine unpigmented hair covering the human body

[The opposite of "terminal hair" (dark, thicker body hair).]

WHOLPHIN    a hybrid of a whale and a dolphin

[At Sea Life Park in Hawaii, a bottlenose dolphin and a false killer whale that were being kept together unexpectedly produced offspring. The false killer whale is actually another species of dolphin, but the discrepancy in sizes (the false killer whale mother was 14 feet long and weighed 2000 pounds while the father was 6 feet long and weighed 400 pounds) and the fact that such a combination had never before been seen made the world's first known false-killer-whale/dolphin hybrid a surprise. The fully grown wholphin is 10 feet long and weighs 600 pounds. She is also midway between her parents in shape and color and number of teeth. (Bottlenose dolphins have 88 teeth, false killer whales, 44, and the wholphin has 66.) She, in turn, has mated with a dolphin and given birth to another wholphin. This is another surprise, as hybrid animals (like the mule) are usually infertile.

"Wholphin" is sometimes also spelt "wolphin", although this variant did not make it into the Collins Scrabble Words list.]

Two words that are not new additions, but that I learned from looking through these word lists are: SCOPA (the hair on the legs of many bees, which transport pollen from flower to flower) and UPTALKING:
UPTALKING the practice of speaking with a rising intonation at the end of each statement, as if one were asking a question

Collins provides a little Flash-based word checker you can use to see what is in their Scrabble word list. I've embedded it below.
At present, it is using the older Collins word list from 2007, but at some point, it should switch to the new 2012 list.


It makes sense that a word list that aspires to represent a more international flavor of English be larger. And I have heard on more than one occasion that British English is actually less conservative and is changing more rapidly than American English, so a faster-growing British Scrabble word list is not unexpected. Reading through all these words has been educational and, at times, fascinating, but I'm sure glad that I don't have to memorize them all!



Review of the new version of WordSquared

The brand new version of Word2 is now out. It comes with a different design (lots of wood grain and most squares on the board are off-white) and a host of different rules which definitely change the game.
One of the problems with the original version of WordSquared is that once you developed a big enough word structure, you could always find a place to put almost any letter. Consequently I quickly stopped needing the "Swap Tiles" option, and so accumulating lives (by building on the stars dispersed around the board) became irrelevant.

The new version solves this by no longer giving you blanks for free. You start off with three, and then if you want more, you have to buy them. (Five stars buy one blank tile, and swapping your seven tiles for a new rack now costs twenty stars.) Word formation is consequently more challenging. Your rack now consists of seven non-blank tiles and one slot for an arbitrary number of blanks.

Stars are acquired in a variety of ways. Stars are granted for logging in once a day and for achieving certain score benchmarks (called "Levels"). Stars may also be acquired by building a word over a square with a star on it. While you will often get just 1 star for building over a Star Square, occasionally you will get more. I've gotten 2 stars in a couple of instances and 3 stars once. (When beta-testing, I once got 8 stars.) Since the number of stars dispensed is random, this is an example of operant conditioning: varying the reward (as when training an animal) results in more rapid acquisition of the skill being trained for. In Word2, randomizing the reward should make you crave the stars more...

A new feature called "teleporting" is supposed to solve the problem of players being boxed in. On the one hand, it takes away the perpetual threat that you are going to be trapped somewhere if you don't check in with the game frequently enough. But on the other hand, such fear is really not a good way to motivate players. We ought to play games because they are fun, not to prevent others from building impenetrable double-walled word structures around us.

The new version also alters the tile values and tile distribution from the Scrabble parameters, based upon how Word2 players have played in the past.
The vowels are all still worth 1 point, except for the U, which is worth 2 (logically, as the U is the hardest vowel to use). A Y is worth 5 points.

This table compares Scrabble tile values with those for WordSquared:

Letter      A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Scrabble    1  3  3  2  1  4  2  4  1  8  5  1  3  1  1  3 10  1  1  1  1  4  4  8  4 10
Word2       1  5  4  3  1  5  4  4  1  9  6  2  4  2  1  4 10  2  1  1  2  5  5  8  5 10

There are a number of interface changes:
  • When you play a blank, instead of having to type in the letter you want to play, you are presented with a complete alphabet of tiles from which you may choose by clicking.
  • Placemarks are placed, not by double-clicking, but by dragging a placemark icon from the bottom of the placemark list.
  • For now at least, your word history is hidden in your profile, which you can get to by clicking on your avatar icon.
The best change is the new transition between the full map and the mini-map which is now a smooth animation, making it easier to maintain a sense of where you are in the WordSquared world.

This new version of Word2 breaks with the Scrabble rules and model in favor of making Word2 a better massively multiplayer game. The creators seem full of ideas, and I am interested to see what they do next.



Monday, May 9, 2011

Shady Characters: essays for punctuation lovers

There is a famous quote attributed to physicist John Wheeler:
Time is what prevents everything from happening at once...
Space is what prevents everything from happening to me!

After reading a cool article on punctuation, I realized something else:

Spacesarewhatpreventeverythingfrombeingwrittenlikethis.

A new blog, Shady Characters: The secret life of punctuation, is publishing a series of well-researched essays tracing the development of various punctuation marks. The first is about the pilcrow (the paragraph symbol which looks like this: ¶), but also includes a great history of the use of punctuation and spacing in written language, showing how primordial versions of commas, periods, and colons began as markers for showing where to pause when reading aloud. It touches on the advent of lowercase letters; it references a typography manifesto; it spans millennia; wheels turn; cogs mesh. For edifying multi-part essays on punctuation, I recommend it.


Wednesday, April 20, 2011

Innergrams and what they can tell us about word favoritism

So I was playing Word2, and I had a rack of A R T T U _ _ and a desire to build down from the word LEND. Reflexively, I spelled DART and was about to move on when I paused and asked myself "Why didn't I make DRAT?". DRAT is a perfectly fine word, and unlike DART, I don't recall ever playing it before. After building off the end of LEND, I was planning to build another word off the end of DART to extend my slaloming, weaving string of words off toward the horizon. This style of wordcrafting is not atypical, so many people must have previously encountered a similar choice between two words that start and end with the same letters. I began to wonder how they chose.

Fortunately WordSquared has a new feature that allowed me to find out. By clicking on a word on the board, you can pull up a pop-up box containing a little information about the word, including definitions, who has recently played it, and how many times it has been played (counting from when statistics were first kept, about a month before late March).

DART had been played 4789 times.

DRAT had been played 1723 times.

So it was not just me.

I decided to study this a little more. I compiled a long list of anagrams that share the same first and last letters (e.g., FORTH and FROTH, SEAHORSE and SEASHORE). Since it is only the inner letters that are scrambled, I decided to call them "innergrams".
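For anyone who wants to hunt for innergrams in their own word list, here is a minimal Python sketch. It is not how the list for this post was compiled; the function name `find_innergrams` and the tiny sample list are just for illustration. The idea is to group words by first letter, last letter, and the sorted inner letters, and keep any group with more than one member.

```python
from collections import defaultdict


def find_innergrams(words):
    """Group words that share first letter, last letter, and inner letters."""
    groups = defaultdict(set)
    for word in words:
        w = word.lower()
        if len(w) < 4:
            continue  # fewer than two inner letters: nothing to rearrange
        key = (w[0], w[-1], "".join(sorted(w[1:-1])))
        groups[key].add(w)
    # Keep only groups with at least two distinct words.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


sample = ["dart", "drat", "forth", "froth", "seahorse", "seashore", "lend"]
print(find_innergrams(sample))
# [['dart', 'drat'], ['forth', 'froth'], ['seahorse', 'seashore']]
```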

The table below shows the resulting innergram pairs with each word's respective word count (as taken from Word2 statistics pages like this one). The words are sorted so the more frequently used one is always in the first column. The fifth column shows the ratio of the two word counts.

Since I wanted to identify data that was not strong enough to draw conclusions from, I used formal hypothesis testing. The chi-square goodness of fit test I used is described in detail here. The essence of it is that the more data you have and the farther the ratio of word counts is from 1, the stronger the evidence is that one word is preferentially being used over the other. The chi-square parameter (in column 6) measures how strong this evidence is. I've sorted the table by increasing evidence strength.

Admittedly, there are lots of situations where one of these words would be favored over another for in-game reasons (like, CRAVE was already on the board and CRAVEN was made by just adding an N, or maybe a triple-letter score square made CAVERN a higher scoring choice). Averaged over many instances, some of these effects should cancel out.

The first three rows have such a small chi-square value that it's pretty certain that people are not (on average) favoring one of these words over another. (Maybe for every person who makes CRAVEN by adding an N to CRAVE, there is someone else making CAVERN by adding an N to CAVER.) The gray rows are weakly supported. The rest of the rows have a big enough chi-square parameter that we can say with greater than 95% certainty that Word2 players favor the first word over the second word. In the last column of the table, I suggest reasons why.
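To make the test concrete, here is a rough Python sketch of a chi-square goodness of fit calculation for a single pair. It assumes the null hypothesis is that both words are equally likely to be played, which may differ in detail from the linked description; the function name `chi_square_equal_use` is made up for this example. The 3.841 threshold is the standard critical value for one degree of freedom at the 95% level.

```python
def chi_square_equal_use(count_a, count_b):
    """Chi-square statistic for two counts against an expected 50/50 split."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Critical value for 1 degree of freedom at the 95% confidence level.
CRITICAL_95 = 3.841

# DART vs. DRAT, using the play counts quoted earlier in this post.
statistic = chi_square_equal_use(4789, 1723)
print(statistic > CRITICAL_95)  # True: the imbalance is far beyond chance
```

With only two categories the statistic simplifies to (a - b)² / (a + b), so a pair with small counts or a ratio near 1 stays below the threshold.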



Essentially, this is a listing of possible word blind spots. DART is a far more popular choice than DRAT, and unlike many of the examples on this list, this asymmetry cannot be explained by ART being a more frequently available hook than RAT. (RAT has been played 19,000 times and ART only 12,000 times.)

I have highlighted with orange the rows where there is strong evidence that the second word is a blind spot word. The green rows indicate that I suspect a blind spot exists, but other explanations could also account for the imbalance.

All innergrams are potentially useful tools for Bananagrams players since the ability to most quickly rearrange your grid can be the difference between winning and losing a game. Blind spot words are just the innergrams that you are most likely to not have immediately at hand... until now!

Blind spot words:
blot, causal, citric, coral, drat, garb, labile, prefect, reserve, rogue, slat, sloe, snag, stanch.

Other possible blind spot words:
brunt, clod, froth, gird, median, recuse, sidle, spilt



In the interest of completeness, below are the innergram pairs that I left out of the table because their word counts were too low. (No count exceeded 8.)

scalar       sacral
martial      marital
converse     conserve
eternity     entirety
preserve     perverse
coagulate    catalogue
seashore     seahorse
parental     paternal
compliant    complaint
observe      obverse
reprise      respire
metronome    monotreme
percept      precept

The last two pairs had word counts of zero. Build one of these words in Word2 and you may be the first!

Sunday, April 10, 2011

Lexicographer: the iPhone version of Guess My Word

As an addendum to my review of the online Guess My Word game, I'm reviewing the iPhone companion game, Lexicographer. Just as with Guess My Word, Lexicographer allows you to guess words and tells you whether your guess is before or after the secret word ("my word").

But rather than having two daily words to guess, Lexicographer will let you keep playing new rounds of "Guess my word!" all day long.


Once you start, there is a running timer (counting tenths of a second!). I found that this totally changed the guessing experience for me. Without a timer, I try to minimize the number of guesses I need to guess the word, choosing guesses in a calculating but leisurely fashion. With the timer constantly ticking away, I guessed words in a more frenzied manner.

There is an option that allows you to see how many words are left in the range that you have bracketed. This is a useful way of sharpening your sense of how words are distributed in the alphabet and where the best bisection point is.
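The bisection idea can be sketched in Python. This is only an illustration of the guessing strategy, not anything taken from Lexicographer itself: the function name `guess_by_bisection` and the seven-word list are made up, and the sketch assumes you have the game's word list in alphabetical order. Each guess is the middle word of the remaining range, and the "before/after" answer throws away half of it.

```python
def guess_by_bisection(sorted_words, secret):
    """Return the sequence of guesses made by always picking the middle word."""
    low, high = 0, len(sorted_words) - 1
    guesses = []
    while low <= high:
        middle = (low + high) // 2
        guess = sorted_words[middle]
        guesses.append(guess)
        if guess == secret:
            return guesses
        if guess < secret:
            low = middle + 1   # secret word comes after the guess
        else:
            high = middle - 1  # secret word comes before the guess
    raise ValueError("secret word is not in the list")


words = sorted(["apple", "parapets", "paraph", "paraphrase",
                "parasol", "talent", "zebra"])
print(guess_by_bisection(words, "paraph"))
# ['paraphrase', 'parapets', 'paraph']
```

Since each guess halves the range, a list of N words needs at most about log2(N) + 1 guesses, which is why knowing how many words remain between your brackets is so handy.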

We increased the difficulty level from 1 (where a typical word was "talent") to 10 and then struggled on the last few guesses until we narrowed the word down to four possible words (between "parapets" and "paraphrase"). But we were totally stumped at that point. After guessing "paraphobia" (which I hoped to be defined as the fear of parallel lines) and finding that it was not actually a real word, I randomly guessed "paraph". And it was, to my stunned triumph, correct.

(It turns out that a paraph is a flourish that someone adds below or at the end of their signature. The example that first comes to mind is the one below John Hancock's signature on the Declaration of Independence:


It's believed that paraphs originated during the Middle Ages to discourage forgery. I wondered how effective this might have been until I read in Joe Nickell's Detecting Forgery (browsable in Google Books) the following:
It might also be noted that the concept of individuality that today may be expressed in a distinctive signature was less valued in the penmanship of an earlier time, when adherence to strict copybook form was regarded as a virtue. As Jonathan Goldberg notes in Writing Matter: From the Hands of the English Renaissance, "in fact, what differentiated one italic signature from another is more often a paraph, flourish, than the letter itself." [...] Indeed, sometimes so distinctive was the eighteenth-century paraph ([...] like that of John Hancock's or Benjamin Franklin's) that it was sometimes used instead of the signature, thus concealing one's identity except to the initiate.

That was a really interesting word to learn about, but given that I got it through sheer luck, I think I'll be reducing the Lexicographer difficulty level to something more like 5 for now...)

I think that each version of the game has its merits. Guess My Word is fun because you can compete against other people (mostly just names on the leaderboard, unless you happen to know them or invite your friends to play) and see what sequence of words they chose by mousing over their guess history. Guess My Word words often feel more special, as though I can develop an intuition about the kind of words that are likely to be chosen; it's possible that Lexicographer words have a similar property, and I just haven't played enough to pick up on it. Lexicographer's strength is that it can be pulled out anytime and played with a group of friends, as many times as you like. Showing the number of words that you have bracketed helps you get better at picking the word that bisects the range. Not only does this improve your word-guessing ability, it also adds an extra layer to the game, making for a nice complement to Guess My Word.



Update:
  • You may notice that Lexicographer no longer seems to be available on its former iTunes Store page. My guess is that the developer decided not to renew his ($99/year) iOS developer license, and Apple has nixed his apps. Which is a pity.
  • You can still play the original Guess My Word game online. You can also read my previous post about Guess My Word.