Writing non-English languages with a QWERTY keyboard

QWERTY and nothing else

My first experience on a keyboard was on a Teletype Model 33. This was to learn how to program, not how to touch-type, which I still haven't mastered... But I did develop a preference for mechanical keyboards. When IBM introduced the IBM PC AT (Advanced Technology), they also introduced the IBM Model M keyboard. I have been using these keyboards ever since. My Model Ms have 101 keys in QWERTY layout and no key with a logo on it (some people call this the Windows key).

Picture of QWERTY layout

Microsoft International keyboard (with dead keys)

My native language is Dutch, but I read and write a few others. Most of these languages have 'foreign' (accented) characters, which obviously are not available on the keyboard. Starting with Windows 3.0, there was a solution: the US International layout. It uses dead keys. A dead key no longer generates a character by itself; it acts as a prefix for the next character. An apostrophe (') followed by the letter e will generate é (e-acute). Great stuff! But what if one needs to type an apostrophe? Just type ', followed by a space. Easy, but very annoying for programmers (like me), who have to type quotes and double quotes all the time.
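
To make the annoyance concrete, here is a toy model of the apostrophe dead key in plain shell. It is only a sketch: the function name type_us_intl is made up for this illustration, only the five vowels are handled, and the fallback for other keys (emit the apostrophe and the key) is a simplifying assumption, not a description of how Windows or X11 implement it.

```sh
#!/bin/sh
# Toy model: "'" prints nothing by itself and modifies the next key instead.
# type_us_intl is a hypothetical helper, not part of any keyboard tool.
type_us_intl() (
    rest=$1 out= pending=
    while [ -n "$rest" ]; do
        key=${rest%"${rest#?}"}   # first keystroke
        rest=${rest#?}            # remaining keystrokes
        if [ -n "$pending" ]; then
            case $key in
                a) out="${out}á" ;;
                e) out="${out}é" ;;
                i) out="${out}í" ;;
                o) out="${out}ó" ;;
                u) out="${out}ú" ;;
                " ") out="${out}'" ;;        # dead key + space: plain apostrophe
                *) out="${out}'${key}" ;;    # assumption: no accented form, emit both
            esac
            pending=
        elif [ "$key" = "'" ]; then
            pending=1                        # the key 'dies': no output yet
        else
            out="$out$key"
        fi
    done
    printf '%s\n' "$out"
)

type_us_intl "caf'e"          # café
type_us_intl "print(' x' )"   # print('x')
```

The second call shows the programmer's problem: every quote costs an extra space bar press.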

Picture of International with dead keys layout

The US International layout does have another feature: most keys will generate special characters when pressed while the right Alt key is held down. (The right Alt key is marked AltGr on most localized keyboards.) No sequence of ' and e is needed, just AltGr-e. The number of accented characters (in most languages) is rather small. The letter é is common in French, but its frequency is only 1.5%. I started wondering what would happen if I changed the Microsoft design so that the dead keys were no longer dead (allowing me to program freely) but still would have é at my fingertips through AltGr-e. Removing the dead keys posed a problem: ë (which I use in Dutch) is not available through an AltGr combination but only through a dead key.

International keyboard (with AltGr dead keys)

I experimented with how awkward it would be if I used AltGr to get to the dead keys as well: for à, I would have to type Shift-AltGr-`, followed by a. Awkward indeed, but à is not very common in Dutch (fortunately). I got rid of the very annoying " followed by a space to get a single double quote AND still had access to all characters! I used this layout for a number of years, modifying the .../X11/symbols/us file after every new installation of Linux.

Picture of International with altgr dead keys layout

Some people around me noticed my layout, which made me think of submitting it to ... (I didn't know). With some help, my proposal was accepted. Within one or two releases, I could stop modifying files and simply select "International AltGr dead keys" from a menu. Great.

Some people started using the layout, some of them helping others to find it in Linux. Some got accustomed to the layout and wanted it to work on Windows too.

Western European keyboard (with AltGr dead keys)

Years later I received an e-mail from user Enno, who writes German a lot more than I do. His issue (obviously) was with the (common) letters ä, ë, ö and ü. They're available, but all over the keyboard. A somewhat logical layout would be nicer, but that would require breaking the Microsoft 'standard'.

Which made me think. What makes the Microsoft International keyboard eh... international? What is the rationale behind supporting the letter ð (eth), a letter used in Old English, Middle English, Icelandic and Faroese, all languages with few speakers? The letter ã (a-tilde) is very common in Portuguese, but it is only accessible through a dead key. Or, as Enno put it: © (AltGr-c) is only common in Redmond.

We tried to come up with a layout that would support more languages a little better. In the MS layout, each vowel has the acute version 'on' the vowel. AltGr-a produces á, AltGr-e becomes é. Makes sense. But where to put the other variants so that the user will remember?

Eventually, we decided on a few 'ground rules':

That left us with a bunch of other accented letters used in languages we don't speak (or write). Stefan, who makes letter frequency tables for many languages, prepared a combined frequency table for 10 Western European languages (apart from English). We made all accented characters in this table available as a single AltGr- keystroke, with the highest frequency letters closest to the middle row. Œ ended up on the '.' key, but æ (used a lot in Danish, apparently) is on 'x'.
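
For readers who want to rank accented letters in their own texts, the following is a minimal sketch in plain shell. It is not Stefan's method (his tables were made separately); the function name accent_counts and the letter list are assumptions chosen for this illustration. It counts raw occurrences only, not percentages.

```sh
#!/bin/sh
# Count how often each accented letter from a fixed list occurs in the text
# on standard input, most frequent first. Matching uses fixed strings (-F),
# so it does not depend on regular-expression or locale behaviour.
# accent_counts and the letter list are illustrative assumptions.
accent_counts() (
    text=$(cat)
    for letter in á à â ä ã å æ ç é è ê ë í ï ñ ó ö ø œ ú ü ß; do
        # grep -o prints every match on its own line; wc -l counts them
        count=$(printf '%s\n' "$text" | grep -o -F -e "$letter" | wc -l | tr -d ' ')
        if [ "$count" -gt 0 ]; then
            printf '%d %s\n' "$count" "$letter"
        fi
    done | sort -k1,1nr
)

printf 'café élève\n' | accent_counts
```

Feeding it a large text file per language (accent_counts < some-text.txt) gives a rough idea of which letters deserve the most comfortable AltGr positions.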

This layout supports English (of course), Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish in the sense that all accented characters in these languages are available as one AltGr keystroke.

Picture of Western European with altgr dead keys layout

The layout is called altgr-weur. For languages of Eastern Europe, someone (who understands those languages) could make an altgr-eeur.

Compose keys

The altgr-intl layout has (AltGr-) dead keys for the Western European languages, but also (AltGr-) dead keys for other languages. The altgr-weur layout eliminated those dead keys, but that does not mean one cannot correctly type (let's say) a Hungarian name. The compose key can be enabled to have virtually all Unicode characters available. Start

gnome-tweaks

go to Keyboard and Mouse, and select a Compose Key. (I chose the RightCtrl key. This key sometimes is called the Multi_Key.) Composing a character works as follows: press the RightCtrl key (and release), then press apostrophe ('), then press e to get é. Almost any character can be generated this way, for example ⅜ (RightCtrl, '3', '8'), which is probably not (natively) available on any keyboard. AltGr-'a' (for á) is faster - that's why altgr-weur is based on frequency tables.

Still present on the proposed (altgr-weur) layout:

X11 name          Description                      Dead  Usage
dead_acute        right pointing apostrophe above  '
dead_cedilla      comma below                      ,
dead_circumflex   upward chevron above             ^
dead_diaeresis    two dots above                   "
dead_grave        left pointing apostrophe above   `
dead_tilde        approximation sign above         ~

Not available on the proposed layout (removed from altgr-intl):

X11 name          Description                      Dead  Usage
dead_abovedot     dot above                        .     Transcriptions (into Roman script) of historic languages
dead_abovering    ring above                       o     Scandinavian Å and with 4 other letters in Slavic languages
dead_belowdot     dot below                        !     Transcriptions and phonetic notations
dead_breve        rounded caron above              (     Languages around the Black Sea
dead_caron        inverted circumflex above        <     Finnish (only for transcriptions), Italian (for Slavic names)
dead_doubleacute  double acute                     =     Hungarian
dead_hook         question mark above              ?     Vietnamese
dead_horn         comma right above                +     Vietnamese (usually in combination with other accented vowels)
dead_iota         mini iota below                        Greek
dead_macron       bar above                        _     Balkan, Baltic, Polynesian, transcriptions
dead_ogonek       inverted cedilla                 ;     Poland, Lithuania, Native American
dead_stroke       (various) strokes through        /     Scandinavian Ø and special purposes

The character in the Dead column is the key that 'dies' after you press (and release) the Multi_Key (the one you set in gnome-tweaks as 'Compose Key', above): it will produce no output. You then follow with the letter that needs to have that accent. For instance: RightCtrl, '_', 'e' gives ē (e-macron, used in Latvian, for example). Similarly, RightCtrl, '+', 'o' gives (Vietnamese) ơ. For a list with many examples, see Ubuntu help.

Unicode input

As a last resort, you could input the hexadecimal code for the character you need. The altgr-weur layout has currency symbols for dollars and cents ($, ¢), the British pound (£), the Japanese yen (¥) and the euro (€). The Israeli shekel has Unicode code point U+20AA. Type Shift-Ctrl-u, followed by 2, 0, a, a and Enter (₪). Copyright is U+00A9: © (example provided to keep Redmond happy).
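
The same hexadecimal codes can be turned into characters on the command line, which is handy for checking a code point before typing it. Below is a minimal sketch in plain shell; the helper names put_byte and codepoint_to_utf8 are made up for this illustration. It computes the UTF-8 bytes by hand so that the result does not depend on the locale settings, and it expects bare hex digits (20AA, not U+20AA).

```sh
#!/bin/sh
# Print one byte, given its numeric value (hypothetical helper).
put_byte() {
    printf "\\$(printf '%03o' "$1")"
}

# Print the UTF-8 encoding of a Unicode code point given as hex digits.
codepoint_to_utf8() (
    cp=$((0x$1))
    if [ "$cp" -lt 128 ]; then                       # 1 byte: plain ASCII
        put_byte "$cp"
    elif [ "$cp" -lt 2048 ]; then                    # 2 bytes
        put_byte $(( 0xC0 | ($cp >> 6) ))
        put_byte $(( 0x80 | ($cp & 0x3F) ))
    elif [ "$cp" -lt 65536 ]; then                   # 3 bytes
        put_byte $(( 0xE0 | ($cp >> 12) ))
        put_byte $(( 0x80 | (($cp >> 6) & 0x3F) ))
        put_byte $(( 0x80 | ($cp & 0x3F) ))
    else                                             # 4 bytes
        put_byte $(( 0xF0 | ($cp >> 18) ))
        put_byte $(( 0x80 | (($cp >> 12) & 0x3F) ))
        put_byte $(( 0x80 | (($cp >> 6) & 0x3F) ))
        put_byte $(( 0x80 | ($cp & 0x3F) ))
    fi
)

codepoint_to_utf8 20AA; echo    # ₪
codepoint_to_utf8 00A9; echo    # ©
```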

Would you like to try the altgr-weur layout?

There are two methods for a test run. But: please check if the Western European layout is (already) available in the keyboard configuration tools (gnome-control-center, at the very bottom of this page) before you unnecessarily start modifying files. This page was last updated in November 2020 (when we were still working on this proposal).

Try altgr-weur just this session, for the current user

Make two directories (inside the hidden .config directory) to hold two files. Download the layout (file: altgr-weur) and the keyboard setup (file: map):

mkdir -p "$HOME/.config/xkb/symbols/"
wget -O "$HOME/.config/xkb/symbols/altgr-weur" \
    https://www.choam.eu/altgr-intl/altgr-weur
wget -O "$HOME/.config/xkb/map" \
    https://www.choam.eu/altgr-intl/map

Next, compile and activate the new map for the current display ($DISPLAY):

xkbcomp -w 0 "-I$HOME/.config/xkb" "$HOME/.config/xkb/map" "$DISPLAY"

"Success" means there is no further output. Press AltGr+'d': you should see ë (and not ð - if you were using altgr-intl before). The right Ctrl key serves as Multi_Key (or Compose) key. Once you're done testing: log out, log in again and your keyboard layout will be back the way it was. (The two small files will remain - for the next session, repeating just the xkbcomp command will suffice.)

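For later sessions, the xkbcomp step could be wrapped in a small shell function that first checks that both downloaded files are still present. This is only a sketch: the function name load_altgr_weur is made up, and the optional directory argument is an assumption added to make the function easy to try out; the xkbcomp call itself is the one shown above.

```sh
#!/bin/sh
# Re-activate altgr-weur for the current X session.
# load_altgr_weur is a hypothetical helper; $1 optionally overrides the
# directory that holds 'map' and 'symbols/altgr-weur'.
load_altgr_weur() {
    xkb_dir=${1:-$HOME/.config/xkb}
    for f in "$xkb_dir/map" "$xkb_dir/symbols/altgr-weur"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty: $f (download it first)" >&2
            return 1
        fi
    done
    # Same command as in the manual procedure
    xkbcomp -w 0 "-I$xkb_dir" "$xkb_dir/map" "$DISPLAY"
}

# Usage, in a terminal inside the graphical session:
#   load_altgr_weur
```
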
You could do all of the above in one move:

wget -q -O - https://www.choam.eu/altgr-intl/altgr-weur.sh | bash

That last command will download a small script ('program') with the manual procedure above and start executing it right away. Do this only if you think that altgr-weur.sh does not contain malware. It doesn't, but I wouldn't take anyone's word for it (hint: use the hyperlink to see what's in it).
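
A more careful variant is to download the script to a file, read it, and only then run it. The sketch below assumes a made-up helper name (review_then_run) and a download location of /tmp/altgr-weur.sh; the URL is the one used above.

```sh
#!/bin/sh
# Show a downloaded script and run it only after an explicit "y".
# review_then_run is a hypothetical helper, not part of altgr-weur.
review_then_run() {
    cat -- "$1"                          # use 'less' instead for longer scripts
    printf 'Run %s? [y/N] ' "$1" >&2
    read -r answer || answer=
    if [ "$answer" = "y" ]; then
        bash "$1"
    else
        echo "Not running $1." >&2
        return 1
    fi
}

# Usage (nothing is executed until you confirm):
#   wget -q -O /tmp/altgr-weur.sh https://www.choam.eu/altgr-intl/altgr-weur.sh
#   review_then_run /tmp/altgr-weur.sh
```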

Use altgr-weur every session, for all users that want it

You will need superuser access and you will need to know where the X11 directory is on your distribution for this to work (this directory is not the same on all distros). Change your current directory and start editing the file 'us' as superuser (you have been warned). (Substitute 'nano' with any other text editor you like: gedit (graphical), vi (die-hards), ...)

cd /usr/share/X11/xkb/symbols
sudo nano us

Insert altgr-weur just below altgr-intl (or at the very bottom of the file)

cd /usr/share/X11/xkb/rules
sudo nano evdev.xml

Insert altgr-weur_evdev.xml just below the section of altgr-intl

sudo nano xorg.lst

Add the following line just below altgr-intl

altgr-weur us: English (Western European AltGr dead keys)

You have now modified the config files, but the different tools still need to pick up the changes. On some (Debian-based) distros you could:

sudo dpkg-reconfigure xkb-data

...but a reboot seems to work on all distros. Next, you will need to select the new layout for your keyboard:

gnome-control-center

To select the layout, find "Region & Language", click "+" to add an input source, choose "English (US)" as your QWERTY keyboard, followed by "Western European AltGr dead keys". The layout(s) can be activated by pressing the Super (or Windows) key and the space bar. (An IBM Model M lacks that key: head over to keyboard settings in Devices->Keyboard and select another keyboard shortcut for "Switch to next input source".)

Note that this will only change your keyboard layout in the graphical environment, but not on the console (and also not on the login screen). It should, however, suffice for testing this layout until it's properly integrated.