Writing non-English languages with a QWERTY keyboard
My first experience with a keyboard was on a Teletype Model 33. The goal was to learn how to program, not how to touch-type, which I still haven't mastered... But I did develop a preference for mechanical keyboards. When IBM introduced the IBM PC AT (Advanced Technology), they also introduced the IBM Model M keyboard. I have been using these keyboards ever since. My Model M's have 101 keys in QWERTY layout and no key with a logo on it (some people call this the Windows key).
My native language is Dutch, but I read and write a few others. Most of these languages have 'foreign' (accented) characters, which obviously are not available on the keyboard. Starting with Windows 3.0, there was a solution: the US International layout. It uses dead keys. A dead key does not generate a character by itself; it acts as a prefix for the next character. An apostrophe (') followed by the letter e will generate é (e-acute). Great stuff! But what if one needs to type an apostrophe? Just type ', followed by a space. Easy, but very annoying for programmers (like me), who have to type quotes and double quotes all the time.
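To make the dead-key behaviour concrete, here is a minimal Python sketch. It is a simplified model for illustration only, not the actual Windows or X11 implementation; the table `DEAD_COMBOS` and the function `type_keys` are hypothetical names, and the table holds just a handful of combinations.

```python
# Simplified model of dead keys (illustration only, not the real driver logic).
# DEAD_COMBOS and type_keys are made-up names for this sketch.
DEAD_KEYS = {"'", '"', "`"}
DEAD_COMBOS = {
    ("'", "e"): "é",
    ("'", "a"): "á",
    ('"', "e"): "ë",
    ("`", "a"): "à",
}

def type_keys(keys):
    """Return the text produced by a sequence of key presses."""
    out = []
    pending = None  # the dead key waiting for its next character, if any
    for key in keys:
        if pending is None:
            if key in DEAD_KEYS:
                pending = key          # produces nothing yet
            else:
                out.append(key)
        else:
            if key == " ":
                out.append(pending)    # dead key + space gives the plain symbol
            else:
                # Known combination gives the accented letter; anything else
                # falls back to both characters (an assumption of this model).
                out.append(DEAD_COMBOS.get((pending, key), pending + key))
            pending = None
    return "".join(out)

print(type_keys("caf'e"))    # the apostrophe merges with the e
print(type_keys("' hi' "))   # every quote needs an extra space
```

The second call shows the annoyance for programmers: six key presses are needed to get the four characters of a quoted `hi`.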
The US International layout has another feature: most keys generate special characters when pressed while the right Alt key is held down. (The right Alt key is marked AltGr on most localized keyboards.) No sequence of ' and e is needed, just AltGr-e. The number of accented characters (in most languages) is rather small. The letter é is common in French, but its frequency is only 1.5%. I started wondering what would happen if I changed the Microsoft design so that the dead keys were no longer dead (allowing me to program freely) while é would still be at my fingertips through AltGr-e. Removing the dead keys posed a problem: ë (which I use in Dutch) is not available through an AltGr combination but only through a dead key.
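One way to picture the AltGr feature is as a table with four levels per key: plain, Shift, AltGr and Shift-AltGr. The Python sketch below is only an illustration under that assumption; `LEVELS` and `key_output` are hypothetical names and the table lists just three keys of the US International layout.

```python
# Each key maps to four levels: (plain, Shift, AltGr, Shift+AltGr).
# Illustrative excerpt only; LEVELS and key_output are made-up names.
LEVELS = {
    "e": ("e", "E", "é", "É"),
    "a": ("a", "A", "á", "Á"),
    "d": ("d", "D", "ð", "Ð"),
}

def key_output(key, shift=False, altgr=False):
    """Return the character a key produces for the given modifier state."""
    level = (2 if altgr else 0) + (1 if shift else 0)
    return LEVELS[key][level]

print(key_output("e", altgr=True))              # AltGr-e
print(key_output("e", shift=True, altgr=True))  # Shift-AltGr-e
```

With this picture, a dead key is simply one more possible entry in a level, which is why it can be moved from the plain level to an AltGr level.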
I experimented with how awkward it would be if I used AltGr to get to the dead keys as well: for à, I would have to type Shift-AltGr-`, followed by a. Awkward indeed, but à is not very common in Dutch (fortunately). I got rid of the very annoying " followed by a space to get a single double quote AND still had access to all characters! I used this layout for a number of years, modifying the .../X11/symbols/us file after every new installation of Linux.
Some people around me noticed my layout, which made me think of submitting it to ... (I didn't know). With some help, my proposal was accepted. Within one or two releases, I could stop modifying files and just select "International AltGr dead keys" from a menu. Great.
Some people started using the layout, some of them helping others to find it in Linux. Some got accustomed to the layout and wanted it to work on Windows too.
Years later I received an e-mail from user Enno, who writes German a lot more than I do. His issue (obviously) was with the (common) letters ä, ë, ö and ü. They're available, but all over the keyboard. A somewhat logical layout would be nicer, but that would require breaking the Microsoft 'standard'.
Which made me think. What makes the Microsoft International keyboard eh... international? What is the rationale behind supporting the letter ð (eth), a letter used in Old English, Middle English, Icelandic and Faroese - languages with few speakers? The letter ã (a-tilde) is very common in Portuguese, yet it is only accessible through a dead key. Or, as Enno put it: © (AltGr-c) is only common in Redmond.
We tried to come up with a layout that would support more languages a little better. In the MS layout, each vowel has the acute version 'on' the vowel. AltGr-a produces á, AltGr-e becomes é. Makes sense. But where to put the other variants so that the user will remember?
Eventually, we decided on a few 'ground rules':
That left us with a bunch of other accented letters used in languages we don't speak (or write). Stefan, who makes letter frequency tables for many languages, prepared a combined frequency table for 10 Western European languages (apart from English). We made all accented characters in this table available as a single AltGr- keystroke, with the highest frequency letters closest to the middle row. Œ ended up on the '.' key, but æ (used a lot in Danish, apparently) is on 'x'.
This layout supports English (of course), Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish in the sense that all accented characters in these languages are available as one AltGr keystroke.
The layout is called altgr-weur. For languages of Eastern Europe, someone (who understands those languages) could make an altgr-eeur.
The altgr-intl layout has (AltGr-) dead keys for the Western European languages, but also (AltGr-) dead keys for other languages. The altgr-weur layout eliminated those dead keys, but that does not mean one cannot correctly type (let's say) a Hungarian name. The compose key can be enabled to make virtually all Unicode characters available. Start gnome-tweaks, go to Keyboard and Mouse, and select a Compose Key. (I chose the RightCtrl key. This key is sometimes called the Multi_Key.) Composing a character works as follows: press the RightCtrl key (and release), then press apostrophe ('), then press e to get é. Almost any character can be generated this way, for example ⅜ (RightCtrl, '3', '8'), which is probably not (natively) available on any keyboard. AltGr-'a' (for á) is faster - that's why altgr-weur is based on frequency tables.
Still present on the proposed (altgr-weur) layout:

X11 name | Description | Dead | Usage |
---|---|---|---|
dead_acute | right pointing apostrophe above | ' | |
dead_cedilla | comma below | , | |
dead_circumflex | upward chevron above | ^ | |
dead_diaeresis | two dots above | " | |
dead_grave | left pointing apostrophe above | ` | |
dead_tilde | approximation sign above | ~ | |

Not available on the proposed layout (removed from altgr-intl):

X11 name | Description | Dead | Usage |
---|---|---|---|
dead_abovedot | dot, above | . | Transcriptions (into Roman script) of historic languages |
dead_abovering | ring above | o | Scandinavian Å and with 4 other letters in Slavic languages |
dead_belowdot | dot below | ! | Transcriptions and phonetic notations |
dead_breve | rounded caron above | ( | Languages around the Black Sea |
dead_caron | inverted circumflex above | < | Finnish (only for transcriptions), Italian (for Slavic names) |
dead_doubleacute | double acute | = | Hungarian |
dead_hook | questionmark above | ? | Vietnamese |
dead_horn | comma right above | + | Vietnamese (usually in combination with other accented vowels) |
dead_iota | mini iota below | | Greek |
dead_macron | bar above | _ | Balkan, Baltic, Polynesian, transcriptions |
dead_ogonek | inverted cedilla | ; | Poland, Lithuania, Native American |
dead_stroke | (various) strokes through | / | Scandinavian Ø and special purposes |

The character in the Dead column is the key that 'dies' after you press (and release) the Multi_Key (the one you set in gnome-tweaks as 'Compose Key', above): it will produce no output. You then follow with the letter that needs to have that accent. For instance: RightCtrl, '_', 'e' gives ē (Latvian e-macron). Similarly, RightCtrl, '+', 'o' gives (Vietnamese) ơ. For a list with many examples, see Ubuntu help.
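The accent-plus-letter sequences can be approximated in Python with Unicode combining marks. This is only a sketch: the real Compose mechanism uses a lookup table of key sequences rather than Unicode normalization, so sequences such as '3', '8' for ⅜ are not covered here. The names `COMBINING` and `compose` are hypothetical, and the mapping from each accent key to a combining mark is my own reading of the Dead column.

```python
import unicodedata

# Accent key (as in the Dead column) -> Unicode combining mark.
# COMBINING and compose are made-up names for this sketch.
COMBINING = {
    "'": "\u0301",  # dead_acute
    ",": "\u0327",  # dead_cedilla
    "^": "\u0302",  # dead_circumflex
    '"': "\u0308",  # dead_diaeresis
    "`": "\u0300",  # dead_grave
    "~": "\u0303",  # dead_tilde
    "_": "\u0304",  # dead_macron
    "+": "\u031b",  # dead_horn
    "=": "\u030b",  # dead_doubleacute
}

def compose(accent_key, letter):
    """Approximate 'Multi_Key, accent_key, letter' via NFC normalization."""
    return unicodedata.normalize("NFC", letter + COMBINING[accent_key])

print(compose("_", "e"))  # e with macron
print(compose("+", "o"))  # o with horn
print(compose("=", "o"))  # o with double acute
```

NFC normalization merges the letter and the mark into a single precomposed character whenever Unicode defines one, which is the case for all three examples.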
As a last resort, you could input the hexadecimal code for the character you need. The altgr-weur layout has currency symbols for dollars and cents ($, ¢), the British pound (£), the Japanese yen (¥) and the euro (€). The Israeli shekel has Unicode code point U+20AA. Type Shift-Ctrl-u, followed by 2, 0, a, a and Enter (₪). Copyright is U+00A9: © (example provided to keep Redmond happy).
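If you are unsure which hexadecimal code belongs to a character (or the other way around), a few lines of Python can tell you. This is just a convenience sketch; the function names `from_code_point` and `code_point_of` are hypothetical.

```python
def from_code_point(hex_digits):
    """Return the character for a hexadecimal code point such as '20aa'."""
    return chr(int(hex_digits, 16))

def code_point_of(char):
    """Return the U+XXXX notation for a single character."""
    return f"U+{ord(char):04X}"

print(from_code_point("20aa"))  # the shekel sign
print(from_code_point("00a9"))  # the copyright sign
print(code_point_of("€"))       # the digits to type after Shift-Ctrl-u
```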
Make two directories (inside the hidden .config directory) to hold two files. Download the layout (file: altgr-weur) and the keyboard setup (file: map):
Next, compile and activate the new map for the current display ($DISPLAY):
"Success" means there is no further output. Press AltGr+'d': you should see ë (and not ð - if you were using altgr-intl before). The right Ctrl key serves as Multi_Key (or Compose) key. Once you're done testing: log out, log in again and your keyboard layout will be back the way it was. (The two small files will remain - for the next session, repeating just the xkbcomp command will suffice.)
You could do all of the above in one move:
That last command will download a small script ('program') with the manual procedure above and start executing it right away. Do this only if you think that altgr-weur.sh does not contain malware. It doesn't, but I wouldn't take anyone's word for it (hint: use the hyperlink to see what's in it).
You will need superuser access and you will need to know where the X11 directory is on your distribution for this to work (this directory is not the same on all distros). Change your current directory and start editing the file 'us' as superuser (you have been warned). (Substitute 'nano' with any other text editor you like: gedit (graphical), vi (die-hards), ...)
Insert altgr-weur just below altgr-intl (or at the very bottom of the file)
Insert altgr-weur_evdev.xml just below the section of altgr-intl
Add the following line just below altgr-intl
Now, you have modified the config files, but the different tools need to pick up the changes. On some (Debian-based) distros you could:
...but a reboot seems to work on all distros. Next, you will need to select the new layout for your keyboard:
To select the layout, find "Region & Language", click "+" to add an input source, choose "English(US)" as your QWERTY keyboard, followed by "Western European AltGr dead keys". The layout(s) can be activated by pressing the Super (or Windows) key and the space bar. (An IBM Model M lacks that key: head over to keyboard settings in Devices->Keyboard and select another keyboard shortcut for "Switch to next input source".)
Note that this will only change your keyboard layout in the graphical environment, but not on the console (and also not on the login screen). It should, however, suffice for testing this layout until it's properly integrated.