LINGUIST List help: Unicode and character set troubleshooter
Do you get funny boxes instead of normal characters? Is the data you entered not displaying correctly? Read on...
Each numbered item below gives the problem, its diagnosis, and the solution.
1.  Problem: On the new-style LINGUIST List pages, all the characters show up as little squares. In some cases, they may appear correctly for a second until you move the mouse over them.
Diagnosis: This is caused by a character set mismatch between our page and your internet browser. In order to display non-ASCII characters (such as IPA characters), all our new pages are now in Unicode. However, if the browser is set to override the page settings, it may display the characters incorrectly. This problem seems to occur only in Netscape version 4.x.
Solution: Change the browser's character encoding settings. The instructions that follow are specific to Netscape 4.x, though you can change the encoding on other browsers in a similar way. Some problems with Internet Explorer can be solved by refreshing the screen.

    1. In Netscape 4, go to the 'View' menu at the top, then to 'Character Set'.
    2. Choose 'Unicode UTF-8' from the Character Set menu.
    3. Go back to the Character Set menu and click 'Set Default Character Set' at the bottom.
    4. Refresh the page.
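
The character set mismatch described in item 1 can be reproduced outside a browser. The Python sketch below is an illustration only: the function name `show_under_charset` and the sample word are assumptions made for this example, and the exact symptom in a real browser (squares versus stray accented letters) depends on the browser and its fonts.

```python
def show_under_charset(text: str, charset: str = "latin-1") -> str:
    """Encode text as UTF-8 (as a Unicode page would be sent), then decode
    the bytes with `charset` (as a browser overriding the page would)."""
    page_bytes = text.encode("utf-8")
    # errors="replace" substitutes a placeholder for undecodable bytes
    # instead of raising an exception.
    return page_bytes.decode(charset, errors="replace")


if __name__ == "__main__":
    sample = "café"
    print("Read as UTF-8:  ", show_under_charset(sample, "utf-8"))
    print("Read as Latin-1:", show_under_charset(sample, "latin-1"))
```

Read with the correct character set, the word comes back unchanged. Read as Latin-1, the two bytes that encode 'é' are shown as two separate characters, which is the kind of damage that switching the browser to 'Unicode UTF-8' repairs.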
2.  Problem: Only a few characters are not showing up correctly. The bad ones seem to be IPA symbols, or European characters with diacritics.
Diagnosis: While your machine is generally showing Unicode characters correctly, it does not have the right font to show them all.
Solution: The best font to use is Arial Unicode MS. It is a hugely extended version of the normal Arial font (newer versions of which are also Unicode-compatible), and includes Chinese, Japanese and Korean characters, IPA and so on. For this reason, it is 23 MB! It is the fullest Unicode font widely available. TITUS is good too, and much smaller at 900 KB. A big list of Unicode-compatible fonts for both the Mac and the PC is available online; the fonts you need are the 'general purpose fonts' near the top of that list.
TITUS is available as a download for PC users.

The Arial Unicode MS font, for both Mac and PC, is no longer available for download from the web. However, it is included with Microsoft Office 2000 (or later), so you may have this font on a CD somewhere. Its size means it can slow your computer down.
3.  Problem: I just used one of your forms to enter some data. Some of the characters I put in, such as à, appeared differently on the 'You have entered...' confirmation page.
Diagnosis: This is probably because the text you entered was in a character set other than Unicode UTF-8. Even if you cut and paste from the latest version of Microsoft Word, some characters may display differently. Currently, there is no convenient way to enter Unicode for all characters via the internet.
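
To see why a character such as à can change on its way through a form, it may help to look at the bytes involved. The Python sketch below is not how the LINGUIST List forms are implemented; the function name `decode_form_bytes` and the choice of Windows-1252 (`cp1252`) as the fallback character set are assumptions made for illustration.

```python
from typing import Tuple


def decode_form_bytes(raw: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Return (text, charset_used).

    Try strict UTF-8 first. If the bytes are not valid UTF-8, assume they
    came from a legacy character set and decode with `fallback` instead.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace"), fallback


if __name__ == "__main__":
    as_utf8 = "à".encode("utf-8")     # two bytes
    as_legacy = "à".encode("cp1252")  # one byte
    print(as_utf8, decode_form_bytes(as_utf8))
    print(as_legacy, decode_form_bytes(as_legacy))
    # Forcing the legacy byte through UTF-8 loses the character:
    print(repr(as_legacy.decode("utf-8", errors="replace")))
```

The same letter is stored as different bytes in the two character sets, so a page that expects UTF-8 cannot read the legacy byte as à. Guessing a fallback, as this sketch does, only works when the guess matches the character set the text was really typed in.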
Solution: Many of our forms incorporate a Unicode input device, which can be invoked by either double clicking or right clicking in the input box. A double click brings up an IPA chart: click on the characters you need, then click 'send back' to paste them into your input box. A right click brings up a context-sensitive menu of characters that are similar to the one you typed. For example, after typing 'e', right clicking will bring up a menu including é, ê and ë.
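
The idea of a menu of 'similar' characters can be sketched with Python's standard `unicodedata` module. This page does not say how the LINGUIST List input device builds its menu, so the following is only one plausible approach: the function name `similar_characters` and the search range (U+00A0 up to, but not including, U+0250) are assumptions made for this example.

```python
import unicodedata
from typing import List


def similar_characters(base: str, search_end: int = 0x0250) -> List[str]:
    """Return precomposed characters whose canonical decomposition (NFD)
    starts with `base` followed by one or more combining marks."""
    matches = []
    for code_point in range(0x00A0, search_end):
        ch = chr(code_point)
        decomposed = unicodedata.normalize("NFD", ch)
        # A length above 1 means the character splits into base + marks.
        if len(decomposed) > 1 and decomposed[0] == base:
            matches.append(ch)
    return matches


if __name__ == "__main__":
    print(" ".join(similar_characters("e")))
```

For 'e', the result includes é, ê and ë along with other accented forms in the searched range. Upper-case variants such as É are not returned, because they decompose to 'E' rather than 'e'.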

If you need a different character, you can use Unipad, which is a basic Unicode text editor, or cut and paste from a web site that already displays, in Unicode, the characters you want. IPA characters can be found on the web, and one of our own pages contains almost all characters used in Latin-based scripts.

Other issues

• You will need at least Netscape 4 or Internet Explorer 4 to view all Unicode characters properly. Earlier versions were written before the standards for Unicode were fully established. You should be able to display most LINGUIST List pages with these earlier browsers, but you may run into odd problems. It's better just to upgrade.

Why did LINGUIST List change to Unicode?
We've made this change because of the huge advantages that Unicode gives linguists: it allows us to display IPA, and all the diacritics which linguists use in their work. This is not all it does, however. It also allows us to encode almost all scripts now used in the world, and many of the scripts used by ancient languages. In short, it's possible now to put up pages not just in basic Latin characters, but in IPA, Cyrillic, Greek, Chinese, Devanagari, and a host of other scripts.

More help is available from the resources below:

• Alan Wood's Unicode Resources site includes Unicode test pages so you can check your system, more places to find Unicode fonts, and instructions on how to make your browser Unicode-compatible.


If you are still having trouble, please send us an email describing the problem. Please let us know what sort of computer you're using, what operating system it runs, which browser you are using, whether it is a private computer or a public computer (those in libraries, for example, are often very restricted) and how often the problem occurs. Thank you.