1. Index
  2. Introduction
  3. Plan
  4. Pronunciation
  5. Dictionaries
    1. Wiktionary
    2. FEWL
    3. CMUDict


On this page I basically try to predict how English would naturally improve on its own. Overhauling language is uncomfortable. I suggest picking more regularized forms of a couple words that don't seem too strange to you, and using them.

This idea upsets some people. Living languages, languages that are in active use, change. Would you prefer those changes to be entirely accidental? Wouldn't it be better if some of those changes were carefully thought out?

Many constructed languages have been created in hopes of providing a somehow more useful alternative for international communication. The main problem is getting people to actually use them. English is the de facto international language, and takes a long time to learn because of how internally inconsistent it is. What if we created a language that was as easy to learn as possible, by being as internally consistent as possible, while also being as easy to read as possible for people who already know English? Make adoption easier, and encourage people to use pieces of it to make their English easier to understand, as they are comfortable. Possibly entirely, in places where people are particularly concerned about international accessibility, where things like Basic English, Special English, and Simplified English are used. It might also be useful for fiction which requires a futuristic language.


On this payj Ai baysikali trai too [predict] how Ingglish woud nacharali improov on its own. [overhauling] langgwij iz unkumfterbal. Ai sagjest piking mor [regularized] forms uv ay kupal wurds that downt seem too straynj too yoo, and yoosing them.

This aidia upsets sum pursans. Living langgwijs, langgwijs that ar in aktiv yoos, chaynj. Woud yoo prifur thowz chaynjs too bi intairli aksidentl? [wouldn't] it bi beder if sum uv thowz chaynjs wur [carefully] thingkd owt?

Meni kanstruktd langgwijs hav bin kreeaytd in howps uv pravaiding ay sumhow mor yoosfal awlturnativ for internashanal kumyoonikayshan. Thi mayn prablam iz geting pursans too akchooali yoos them. Ingglish iz thi [de] [facto] internashanal langgwij, and tayks ay lawng taim too lurn bikawz uv how [internally] inkansistant it iz. Whut if wi kreeaytd ay langgwij that wuz az izi too lurn az pasabl, bai biing az [internally] kansistant az pasabl, whail awlsow biing az izi too rid az pasabl for pursans hoo awlredi now Ingglish? Mayk [adoption] izier, and inkurrij pursans too yoos peess uv it too mayk theer Ingglish izier too understand, az thay ar kumftabal. Pasabli intairli, in playss wheer pursans ar patikyalerli kansurnd abowt internashanal aksesabilati, wheer things laik Baysik Ingglish, Speshal Ingglish, and Simplifaid Ingglish ar yoosd. It mait awlsow bi yoosfal for fikshan wich rikwairs ay [futuristic] langgwij.


All of the changes I'm suggesting are changes which I believe could happen from natural regularization, given enough time. I'm not certain of any of these choices, and am interested in suggestions.

Verb (past tense) and noun (plural) regularization happens naturally over time. For example, the plural of "cow" changed from "kine" to "cows", and the past tense of "help" changed from "holpe" to "helped". There are many of these straightforward changes left to do.

Another area where substantial consistency can be added is the relationship between sounds (phonemes) and spellings (graphemes). A popular example that is at least 140 years old: if the 'gh' in 'enough' is pronounced 'f', the 'o' in 'women' makes the short 'i' sound, and the 'ti' in 'nation' is pronounced 'sh', then the word 'ghoti' is pronounced just like 'fish'. Those problems can be fixed with these changes:


Then "fish" is pronounced like "fish".

My thoughts, so far, are:

  1. For every sound (phoneme) pick a single spelling (grapheme), preferably the one easiest to read for people who already read English, avoiding conflicting with other phonemes. For now, I want to stick to the existing 26 letter alphabet. (Purely phonetic spelling.)
  2. Regularize all verbs by adding "-d" (if it ends in 'e') or "-ed" to the present tense to make the past tense.
  3. Regularize all nouns by adding "-s" or "-es" (for words ending in "s" or IPA "ch") to make their plurals. If the word ends in "y" (IPA "i"), drop the "y" and add "-ees" (equivalent of English "-ies", IPA "iːs").
  4. Don't change the spellings of the previous two steps for phonetics (spell the end of "cats" and "dogs" the same, even though "dogs" is pronounced like "dogz").
  5. I think it would be good to pick one standard spelling based on some kind of ideal (most popular?) pronunciation, but continue to accept regional accents differing from the pronunciation matching that spelling (apparently exactly how Turkish works, in practice). General American may be useful.
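
As a rough illustration of rules 2 and 3 above, here is a minimal Python sketch. The function names are made up for this example, and it only looks at spelling: it treats any final "y" as the IPA "i" sound, which is an assumption that holds for phonetic spellings but not for every traditional one.

```python
def past_tense(verb: str) -> str:
    """Rule 2: add "-d" after a final "e", otherwise add "-ed"."""
    return verb + ("d" if verb.endswith("e") else "ed")


def plural(noun: str) -> str:
    """Rule 3: "-es" after "s" or "ch", "-ees" in place of a final "y", else "-s"."""
    if noun.endswith(("s", "ch")):
        return noun + "es"
    if noun.endswith("y"):
        # Assumes the final "y" spells the IPA "i" sound.
        return noun[:-1] + "ees"
    return noun + "s"


for verb in ("help", "come"):
    print(verb, "->", past_tense(verb))
for noun in ("cow", "church", "city"):
    print(noun, "->", plural(noun))
```

Rule 4 is why the sketch never looks at pronunciation: "cats" and "dogs" both just get "-s".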

Similar things have been done. I think it's important to mention the reason for all the decisions involved, so they can be discussed, and improved on.

For concrete results, I would like to create a few web browser spell checking dictionaries for people to use:

  1. One that only adds these more consistent spellings, leaving the old ones in place, so that people can choose whichever they wish without their spell checker telling them it's wrong, and stunting the natural progress of regularization.
  2. One that adds these more consistent spellings, and removes the old spellings of maybe a dozen ideal candidates, for those who wish to actively change their habits.
  3. One that contains only these new spellings, for enthusiasts who wish to play with it.
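
The three word lists could be derived from one base list roughly like this. This is only a sketch: `build_dictionaries`, `respell`, and `retire` are hypothetical names, and it assumes `respell` maps every old spelling to its new spelling. Packaging the result as a real browser dictionary is not shown.

```python
def build_dictionaries(words, respell, retire):
    """Return (additive, transitional, new_only) word sets.

    words   -- iterable of old spellings
    respell -- dict mapping each old spelling to its new spelling (assumed complete)
    retire  -- set of old spellings to drop from the transitional list
    """
    old = set(words)
    new = {respell[w] for w in old}
    additive = old | new                      # 1. old and new both accepted
    transitional = (old - set(retire)) | new  # 2. a few old spellings removed
    new_only = new                            # 3. new spellings only
    return additive, transitional, new_only


respell = {"have": "hav", "cat": "kat", "enough": "enouf"}
additive, transitional, new_only = build_dictionaries(respell, respell, {"enough"})
print(sorted(transitional))
```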


Here are all the IPA diaphonemes, with example words on the first line, and possible graphemes on the following lines, with my preference listed first.

æ   cat, black / trap lad bad cat
 a - don't think it's worth distinguishing from ɑ:
 ae - if it is worth distinguishing from ɑ:
ɑ is US for ɑ:? https://en.wiktionary.org/wiki/not#Pronunciation
ɑ:   arm, father / palm father
 a, same as ə?
ɒ   hot, rock / lot not wasp / cloth off loss cloth long dog chocolate
ɔ is US for ɔ:?
ɔ:     call,  / thought law caught all halt talk  (not using "four")
 aw   cawll  thawt law cawt awll hawlt tawk
 au   caull  thaut lau caut aull hault tauk
 ou   coull  thout lou cout oull hoult touk 
 ol   colll  tholt lol colt olll hollt tolk 
 al   call   thalt lal calt all  halt  talk 
 a    call   that  la  cat  all  halt  tak 
 o    coll   thot  lo  cot  oll  holt  tok 
ə   away, cinema / comma about  
 a, same as ɑ:? (ə is the schwa sound)
ɨ    / kit spotted year
 i                kit spottid yir  - new, not updated in dictionaries, was using "y"
 y, same as ɪ,i   kyt spottyd yyr - an old spelling of this sound, problematic spelling of "year" - "yyr" 
 e                ket spotted yer
ɪ   hit, sitting   /         sit english guitar (short i)
 i,              hit sitting sit inglish gitar - in use for aɪ (long i)
 y, same as ɨ,i  hyt syttyng syt ynglysh gytar
 e,              het setteng set englesh getar
i   / happy city be bee
 ee              happee citee bee bee
 i,              happi  citi  bi  bi
 y, same as ɪ,ɨ  happy  city  by  by
 e,              happe  cite  be  be
i:   see, heat / fleece see meat
 ee see  heet   fleece see meet
 ea sea  heat   fleace sea meat
eɪ   say, eight / face date day pain whey rein
 ay say   ayt  fayce dayte day payn whay rayn
 ai sai   ait  faice daite dai pain whai rain
 ei sei   eit  feice deite dei pein whei rein
 ey sey   eyt  feyce deyte dey peyn whey reyn
ɛ  / dress bed met
ɜr  / nurse burn herd earth bird
 ur   nurse burn hurd urth burd
 er   nerse bern herd erth berd
 ear  nearse bearn heard earth beard
 ir   nirse birn hird irth bird
ər  / letter winner massacre
 er   letter winner massacer
ɚ    weird
əɹ  - another form of /ɚ/ according to wiktionary
ʌ   cup, luck / strut run won flood
 u  cup  luck   strut run wun flud
ʊ   put, could / foot  hood  book
 ou  pout  coud  fout  houd  bouk  - used for u:
 oul poult could foult hould boulk
 oo  poot  cood  foot  hood  book  - used for u:
u may be same as u:
u:   blue, food / goose through you threw yew 
 oo  bloo  food   goose throo   yoo throo yoo
 ew  blew  fewd   gewse threw   yew threw yew
 ue  blue  fued   guese thrue   yue thrue yue 
aɪ   five, eye / price  my  wise  high flight mice  I  by  like  time (long i)
 ai faive ai     praice mai waise hai  flait  maice Ai bai laike taime - new, not updated in dictionaries - what does this do to words with "a" + "i" sounds?  there are none!?
                 prais  mai waiz  hai  flait  mais  Ai bai laik  taim - full regularization examples with "ai"
 i  five  i      price  mi  wise  hi   flit   mice  I  bi  like  time
 y  fyve  y      pryce  my  wyse  hy   flyt   myce  Y  by  lyke  tyme - in use for ɨ,ɪ,i (short i), and consonant y
 ii fiive ii     priice mii wiise hii  fliit  miice Ii bii liike tiime
 ay fayve ay     prayce may wayse hay  flayt  mayce ay bay layke tayme 
 oy foyve oy     proyce moy woyse hoy  floyt  moyce Oy boy loyke toyme
 oi foive oi     proice moi woise hoi  floit  moice Oi boi loike toime
 oe foeve oe     proece moe woese hoe  floet  moece Oe boe loeke toeme
 oa foave oa     proace moa woase hoa  float  moace Oa boa loake toame
ɔɪ   boy, join / choice boy hoist
 oi      join   choice boi hoist
 oy      joyn   choyce boy hoyst
oʊ   go, home  / goat no toe soap tow  folk soul roll cold
 ow gow howme  gowt now tow sowp toww fowk sowl rowll cowld
 oe goe hoeme  goet noe toe soep toew foek soel roell coeld
 o  go  home    got  no to  sop  tow  fok  sol  roll cold
 oa goa hoame  goat noa toa soap toaw foak soal roall coald
 ol gol holme  golt nol tol solp tolw folk soll rolll colld
 ou gou houme  gout nou tou soup touw fouk soul roull could
aʊ   now, out / mouth now trout
 ow      owt   mowth now trowt
         out   mouth nou trout
ɑr  / start arm car
ɪər / near deer here
 ear  near dear hear  
ɛər  / square mare there bear where air
 air   sqair mair thair bair
 are   sqare mare thare bare
 ere   sqere mere there bere
 ear   sqear mear thear bear
 uare  square muare thuare buare
ɔr  / north sort warm
 or   north sort worm
 ar   narth sart warm
ɔɹ may be same as ɔər https://en.wiktionary.org/wiki/for#Pronunciation
ɔər  / tore boar port
 or    tor  bor  port - same as ɔr
ʊər  / cure tour moor
 oar   coar toar moar
 our   cour tour mour
 ure   cure ture mure
 oor   coor toor moor
 eur   ceur teur meur
jʊər / cure pure europe your 
 ure   cure pure ureope ure
 eur   ceur peur europe eur
 your cyour pyour yourope your

p   pet, map / pen spin tip
b   bad, lab / but web
t   tea, getting / two sting bet
ɾ              alveolar flap
(d) may be same as d
d   did, lady    / do odd
t͡ʃ / tʃ check church / chair nature teach 
 ch     check church   chair nachure teach
 c      ceck  curc     cair  nacure  teac - not using it alone anywhere else?
d͡ʒ / dʒ  just, large / gin joy edge language
 j       just  larj    jin joy ej   languaj
k   cat, back    / cat kill skin queen unique thick
 k  kat  bak       kat kill skin kueen unik   thik  
g   give, flag   / go get beg
f   find, if / fool enough leaf off photo
 f  find  if   fool enouf  leaf of  foto
v   voice, five / voice have of
 v  voice  five   voice hav  ov
θ   think, both / thing teeth
 th same as ð
ð   this, mother / this breathe father
 th same as θ
s   sun, miss / see city pass mice
    sun  mis    see sity pas  mis
z   zoo, lazy / zoo rose
 z  zoo  lazy   zoo roze
ʃ   she, crash / she sure session emotion leash
 sh she  crash   she shure seshon emoshon leash  
ʒ   pleasure, vision / pleasure beige equation seizure
 zu                    pleazure beizu equazuon seizure
 su                    pleasure beisu equasuon seisure
 ge                    pleagere beige equageon seigere
 ti                    pleatire beiti equation seitire
x   / loch (Scottish) ugh
h   how, hello 
m   man, lemon / man ham
n   no, ten / no tin
n̩ hidden
 in hiddin
ŋ   sing, finger / ringer sing finger drink
 ng sing  finger   ringer sing finger dringk
 ng  - this eliminates /ŋg/ -> "ngg"  
l   leg, little / left bell
 l  leg  little   left bel
r   red, try / run very
w   wet, window / we queen
 w  wet  window   we qween
j   yes, yellow, year
 y  yes, yellow, year
 j  jes, jellow, jear - would probably be better, y always a vowel, but looks weird, and used for dʒ
 x  xes, xellow, xear - long term plan?
hw what

I was previously using "y" for the short "i" sound, and "i" for the long "i" sound, resulting in "I" -> "I", and "English" -> "Ynglysh". That also left 'y' serving as both a consonant and a vowel, causing "year" to become "yyr". I changed the short "i" sound from 'y' to 'i', and the long "i" to 'ai', based on its IPA /aɪ/, resulting in "I" -> "Ai", "English" -> "Inglish", and "year" -> "yir".
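
To make the phoneme-to-grapheme idea concrete, here is a small Python sketch that converts an IPA string using a handful of the first-choice graphemes from the list above. The table is deliberately incomplete, the names `GRAPHEMES` and `respell_ipa` are invented for this example, and stress marks and length marks are not handled.

```python
# A partial table: IPA symbol(s) -> preferred grapheme, taken from the list above.
GRAPHEMES = {
    "aɪ": "ai", "eɪ": "ay", "tʃ": "ch", "dʒ": "j",
    "æ": "a", "ɪ": "i", "ɨ": "i", "ʌ": "u",
    "ŋ": "ng", "ʃ": "sh", "j": "y", "k": "k",
}
# Consonants that keep their usual letter.
for letter in "pbtdgfvszhmnlrw":
    GRAPHEMES[letter] = letter


def respell_ipa(ipa: str) -> str:
    """Greedy longest-match conversion of an IPA string to graphemes."""
    ipa = ipa.replace("\u0361", "")  # drop the tie bar, so t͡ʃ is read as tʃ
    out = []
    i = 0
    while i < len(ipa):
        # Try two-symbol phonemes (aɪ, tʃ, ...) before single symbols.
        for size in (2, 1):
            chunk = ipa[i:i + size]
            if chunk in GRAPHEMES:
                out.append(GRAPHEMES[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no grapheme chosen yet for {ipa[i]!r}")
    return "".join(out)


for ipa in ("aɪ", "jɪr", "kæt", "sɪŋ"):
    print(ipa, "->", respell_ipa(ipa))
```

Trying the longer match first matters: without it, /aɪ/ would be read as two separate vowels instead of the single long "i" sound.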

I should look at Lojikl Inglish again.



Dictionary built from wiktionary.org.
Most common 100 words.

This is the one I'm currently working on.

Phonetic spelling, and regularization of nouns and verbs. 94,327 words. Columns are:

  1. Old spelling
  2. New spelling
  3. IPA
  4. Is it a noun?
  5. Is it a verb?
  6. Type of noun or verb

I've made a first pass at handling regularization of nouns and verbs. It wasn't as bad as I expected. I think I have some tweaking of rules to do. (When to add "d" vs "id" for past tense verbs, etc.) The IPA column shows the IPA for the base word, not all the noun / verb forms.

The entire contents of wiktionary.org are available for download. It's not the most convenient to parse.

Problems I've found parsing wiktionary.

Parsing wiktionary is inconvenient, dealing with stressed vs. unstressed, and senses. But I've made good progress on it.

To do:


Dictionary built using FEWL. It handles noun plurals, and verb past tenses. I stopped using it because wiktionary has much more data on irregular nouns and verbs, and more words. Columns are:

  1. Old spelling
  2. New spelling
  3. IPA
  4. FEWL

I'm pretending "data", "those", and "these" are not plural. I also haven't handled be/am/is/are,was/were,been. Ew. Perhaps one (set of) exception(s) is reasonable. Be / beed?

"Myself" gets a hyphen because the author of FEWL doesn't indicate the difference between words that were originally hyphenated, and compounded words.

Words missing from FEWL: states caused bothered converting sounds definitions letters handled emailed noticed apostrophes handling contractions

FEWL is useful because it provides pronunciation, noun plural, and verb past tense info. But I could get the same stuff by downloading wiktionary. Which would probably give me more words.


Old dictionary built using cmudict, lacking handling of verb past tenses and noun plurals.


Be/am/is/are/was/were/been may be the greatest example of how much of a mess English is. Seven distinct spellings for what is effectively a single word. This is exactly why English takes so much longer to learn to read than other languages. It's also not surprising that this word in particular is so problematic, because research has shown that more commonly used words take longer to naturally become regularized. This also provides me with a great example to use in detailing how I encourage you to use as much or as little of my spelling methods as you're comfortable with. This is going to look weird. Keep in mind that just as it now seems odd that the past tense of "help" was once "holpe" before it was naturally regularized into "helped", if English remains in use long enough, people will one day wonder what madness caused us to use seven apparently random different spellings of the same word. Also note that I'm not changing the pronunciation of the word "be", just its spelling.

Leave them alone       will be   I am   it is   you are   she was   they were   it has been
Consistent spelling    will by   I am   it yz   you ar    she wuz   they wur    it has byn
Reduce to three words  will by   I by   it by   you by    she wuz   they wuz    it has byn
Full regularization    will by   I by   it by   you by    she byd   they byd    it has byd
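
The four rows above can be written down as lookup tables, which makes it easy to try each level on a sentence. This is only a sketch: the level names and `respell_be` are made up for this example, and replaced words come out in lowercase.

```python
import re

# One table per row above; words not listed are left alone.
BE_LEVELS = {
    "leave": {},
    "consistent": {"be": "by", "is": "yz", "are": "ar",
                   "was": "wuz", "were": "wur", "been": "byn"},
    "three": {"be": "by", "am": "by", "is": "by", "are": "by",
              "was": "wuz", "were": "wuz", "been": "byn"},
    "full": {"be": "by", "am": "by", "is": "by", "are": "by",
             "was": "byd", "were": "byd", "been": "byd"},
}


def respell_be(text: str, level: str) -> str:
    """Replace forms of "be" in text according to the chosen level."""
    table = BE_LEVELS[level]
    return re.sub(
        r"[A-Za-z]+",
        lambda m: table.get(m.group(0).lower(), m.group(0)),
        text,
    )


for level in BE_LEVELS:
    print(level, "->", respell_be("she was here and they were not", level))
```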

This causes problems with the contraction "I'm".

The language toki pona doesn't include these words at all.

I haven't handled cases where multiple words are spelled the same but pronounced differently (heteronym homographs). I'd like to spell them differently. Like "read".

More extreme possibilities I can think of are a couple cases where this language could be made more like Lojban, while remaining mutually intelligible with English: 1) Lojban doesn't have verbs and nouns, it has primitives which are converted to verbs and nouns by adding suffixes. Perhaps this way we could eliminate some redundant words. 2) Lojban grammar / sentence structure? ("First they came for the verbs, and I said nothing because verbing weirds language. Then they arrival for the nouns, and I speech nothing because I no verbs.")

What else would make useful improvements? I think I've read that poets would love if "you" and "I"/"me" rhymed.

I should try typing with diacritics, maybe it's not as bad as I think, and they'd improve the number of phonemes available without using multiple letters.

I would strongly support adding new words from other languages, by mapping them to the provided phonemes.

I'd like to create a web application that makes it easy for anyone to play with selecting their own graphemes for all the phonemes, interactively displaying the resulting words' spellings.

It would probably be more important to create a web app where you can paste the IPA spelling of a word, and get it converted.

"Data", "those", "these" are plural?

Might be good to do something with she/her, he/him. I've heard young kids get these confused, and maybe they're redundant? (Always using "her", never "she", as in "Her was [doing something].") These are subject and object pronouns? Are there other similarly redundant words? What other words do small children and adults learning English tend to get "wrong"? "Us"/"we"? Subject pronouns: I, we, you, he, she, it, or they. Object pronouns: me, us, you, him, her, it, and them. Old Norse "hann" means "him" or "he", so apparently this happens elsewhere.

Make a movie with this language for promotion, used similarly to A Clockwork Orange, where they wanted a futuristic language?

In an ideal situation, every sound would correspond to a single letter. There are over 40 sounds in English, and 26 letters. Often people interested in this goal add more letters. I'm curious about taking away sounds, to have only 26 of them. Somebody must have tried this, but I have not yet found an example. On 2014-11-17, Allan Kiisk said that in the Estonian spelling reform, pronunciations adjusted to match new phonetic spellings.

There was a successful Turkish spelling reform. (Apparently made pronunciation more regular, but spelling less predictable.)

English language spelling reform on Wikipedia
"If you are interested in English spelling reform, you should check out the Yahoo! Saundspel group."
English spelling reform on wyrdplay.org
George F. Lahey's Inglish
Wijk's Regularized English

Some ideas

These are ideas people have submitted, not output from my program:


These are output from my program:


Creating a spell check dictionary add-on for firefox.

Bug preventing me from using it.

Firefox English spelling dictionary.

Untested because of above bug, Firefox spelling dictionary, with the following six changes:

Based on English United States Dictionary version 60 from Jul 23, 2018.

Tue Sep 25 21:57:11 EDT 2018