Pinyin Zhou logo Pinyin Joe's
Chinese Computing Help Desk

Click. Work. Collect.

Pinyin Macros for MS Word and Excel

FAQs & Technical Notes

Tone Placement Rules ... Neutral Tone... Unicode Characters...

Word VBA Macro... Excel VBA Macro... AppleScript...

Troubleshooting


Downloads
To download the macros, please
visit the main pinyin macros page.

 

The Rules for Placing Pinyin Tone Marks

It's easy to know where to place the tone mark when there is only one vowel: over that vowel. When there is more than one vowel then it always appears over the first vowel, unless the first vowel is "i", "u" or "ü" in which case the tone mark goes over the second vowel. The letters "y" and "w" don't count as vowels under these rules.

To confirm these rules and learn more, see the Wikipedia pinyin article section on rules for placing pinyin tone marks. To start work on these macros I had to look this up myself, and now that I've got my macros done I'm prone to forget it again...

I'm not entirely certain the rules for pinyin tone mark placement were ever carefully and systematically taught to us in my US university Chinese courses. Maybe it's just my middle-aged memory. I don't have an old copy of my original DeFrancis textbooks to verify that we were not taught this. I do recall that we were never quizzed on it, but my professors knew exactly where to put the diacritics when asked.

«Top

 

What About the Neutral or "Fifth" Tone?

The neutral tone is often indicated by a dot over the vowel in language teaching, but this tone mark does not appear in standard dictionaries or other mainstream print usage of pinyin. It is not entirely necessary outside of school, but the main problem is that the mark is difficult to show in print.

The numeric representation of this tone is sometimes "0" and sometimes "5". I could have the macros convert either one to the same neutral tone, but the character sets included in the fonts bundled with Office do not offer what we need for every letter. In fact, I'm not certain any one font includes them all. For example, in your Insert Symbols dialog you will find that for "a" there is "å" but the "e" would look like "ė". Although an "a" with a small dot above it exists in Unicode it is outside the range of most fonts, and most web browsers for that matter. And so on.

I have seen some teachers in Taiwan use a "middle dot", also known as an "interpunct". To preserve the integrity of the character it must be placed in front of the entire syllable, like this: "·ma".  The full width version of this same dot is also used to separate Chinese characters in transliterated foreign or minority language names (乔治 · 华盛顿). It often shows up there in the narrow-width version (乔治·华盛顿) and it is also used in punctuation.

Modifying the macro to place a middle dot in front of the initial consonant in every possible syllable would require more lines of code than you may realize, so I'm loath to spend time on this given that I've never received a request for a middle dot (and only a couple questions about the dot above the vowel as well). Let me know what you think.

As I mentioned above, you can insert many of these from the Symbols list in Word, Excel and many other applications. There is also a predefined keyboard shortcut for the "å" (alt-0229 on the number pad with NumLock engaged), but for "ė" and most others you must assign custom shortcuts or use the Symbols list.

«Top

Unicode Characters

The Uncode Consortium website offers tables of all Unicode characters. I also recommend FileFormat.info. Although as of this writing FileFormat.info is not perfectly indexed, when it works that site is much more user-friendly for seeking the codes for Unicode, macros, html, Microsoft keyboard shortcuts or many other purposes.

On FileFormat you will find nicely laid-out tables and representations of the characters in both text and images. There is some jargon to learn, however. Here are examples using the the letter "a":

  • ā - Unicode Character 'LATIN SMALL LETTER A WITH MACRON' (U+0101)
  • á - Unicode Character 'LATIN SMALL LETTER A WITH ACUTE' (U+00E1)
  • ǎ - Unicode Character 'LATIN SMALL LETTER A WITH CARON' (U+01CE)
  • à - Unicode Character 'LATIN SMALL LETTER A WITH GRAVE' (U+00E0)

"Grave" is pronounced "grahv", I've learned. The Unicode number is on the end in parantheses.

«Top

Making the Word Macro in VBA

Macros are all about taking care of repetitive actions for you, and my plan required exactly that: finding and moving numbers repeatedly until they are next to the vowel that should be replaced, then replacing that vowel and the number with the appropriate Unicode character.

Having studied various versions of Basic in undergrad and grad school (along with Pascal, and some languages so long out of use I won't even mention them...), I felt confident I could figure this out quickly. Riiight. Humbled but not deterred with what I found while grazing through free information available on the MSDN site, plus various forums and blogs, I realized there were many new conventions to learn and this wasn't going to be a simple if/then thing. I decided to structure the macro as a long list of With statements, and then I set about learning what functions to use for find and replace.

Nothing ever being simple, Visual Basic's Unicode-to-character conversion function (ChrW) requires a prior conversion from Unicode notation, replacing the "U+" and all leading zeros with "&H" to indicate that the remaining number is hexadecimal:

  • ā - ChrW(&H101)
  • á - ChrW(&HE1)
  • ǎ - ChrW(&H1CE)
  • à - ChrW(&HE0)

Therefore this relatively simple macro winds up being a long list of statements like this:

With Selection.Find
    .Forward = False
    .Text = "a1"
    .Replacement.Text = ChrW(&H101)
    .Execute Replace:=wdReplaceAll
End With

Word automatically converts upper case letters by itself, so I only had to give it the lower case code points and did not have to look up the capital letter versions, or worry about initial vowel mixed-case situations like "Ai4" or "Ou1yang2". I would have no such luck when I finally turned to the Excel version.

«Top

Making the Excel Macro in VBA

I found myself eating more humble pie when trying to recreate this macro in Excel. It was not going to be as simple as pasting the Word macro in and hitting Run. Although the Word version will, without asking, convert selected text or else all the text behind the cursor, Excel is an entirely different animal and needs to be told whether to convert specific cells, a worksheet or an entire workbook.

Another problem, as I mentioned above, was that although Word automatically converts to upper-case letters when it sees them, Excel needs to be provided specific code points for each character in both upper- and lower-case. This required not only looking up the code points for the capital letter versions and doubling the number of sorting and replacement statements, but I also had to think through all the possible initial vowel mixed-case situations:

  • Ai
  • Ao
  • Ei
  • Ou

Is that all of the possible "proper case" initial double vowel situations? Let me know if I missed any.

I then had to decide if I really wanted to learn more about the Visual Basic for Applications commands necessary for Excel. I use Excel in all sorts of MBA-type activities, but I had no real interest in learning more VBA just to create this one freebie.

Instead I got some help from Tushar Kapila, an Excel macro maven who quickly and easily created an elegant solution. Please Contact me and not Tushar for technical support with this macro, but I do highly recommend his expertise for any Excel or other VBA-related development.

This was child's play for Tushar. He converted my With statements to Selection.Replace, and created three subroutines for three versions calling the same main code to convert cells, worksheet or workbook. This is an example:

Selection.Replace What:="a1", Replacement:=ChrW(&H101), LookAt:=xlPart, _

SearchOrder:=xlByRows, MatchCase:=False

Tushar had done the hard part, but only with lower case letters. I then changed MatchCase to True and went through duplicating all those statements for upper case. Lastly, I carefully added sort and replace statements for proper case double-vowel situations like "Ou".

Did we get it right? Please let me know how this new macro is working for you, via my contact page.

«Top

Converting to AppleScript for Mac Office 2008

Office 2008 for Macintosh does not support VBA macros, nor do do the iWorks applications Pages and Numbers. I hope to get these macros converted to AppleScript soon. I'm not sure if iWorks will be able to do handle these, but I'm sure Office 2008's Word and Excel will. Stay tuned...

«Top

Troubleshooting

If it won't run: Usually when someone tells me they can't get one of these macros to run I just ask them to delete and recreate the macros and that usually solves the problem:

  • In Word, be very careful not to erase the "Sub" and "End sub" lines that you will see when you first open a blank macro. If that doesn't solve the problem, you may be getting a "compile error" because there is something wrong with your copy of Word's "normal.dot" file. If so, please contact me and I'll try to help further.
  • In Excel, be very careful not to erase the "Sub" and "End sub" lines when you paste in the macro. Then rename the macro by changing the name after "Sub" as explained in the first comments in the file. If Excel is set to add "Option Explicit" at the top of each macro, make sure there is only one instance of those words in the entire macro.

Fonts: I list the fonts that include the necessary Unicode characters on the main pinyin macro page, and have never received a question about that. But I can't stress enough: if you use Office 2007's default Calibri font, some of the letters will display in Arial and your words will look very strange.

Security: Enabling macros requires more and more bravery due to the increasingly strident warnings each new version Word and Excel throw at you, as if Microsoft is driven by some kind of PTSD hyper-vigilance caused by past battles against the macros of evildoers. But even a quick glance at the contents of my macros seems to set everyone at ease. They just move and replace a few letters and numbers. Excel is particularly bad about giving scarey warnings, but just follow the instructions in any dialogs that pop up or any Help instructions you need to read, and you'll be fine.

Excel formulas: Version 1.x of the Excel macro will also convert variables that look like pinyin. (For example, in "=a1+b1" the "a1" will get converted.) We're working on a version that first checks for formulas. Please check back regularly for version 2. For now, if you have formulas in a spreadsheet please use the "selected cells" version of the macro and be very careful not to select any cells with a formula.

Compatibility: I've tested these macros in the Windows versions of Office 2007 (use one of the fonts listed above, not the default Calibri), Office 2003, Office XP 2002 and Office 97. I am told they also work in Macintosh Office 98 through 2004, and maybe even the mid-nineties version 4.2, but Mac Office 2008 is not compatible with this type of macro. I hope to get these VBA macros converted to AppleScript soon. Please send questions, comments or suggestions anytime via the contact page.

«Top
 

Vista Chinese Setup Vista Pinyin Setup Vista Zhuyin Setup    Vista Chinese Fonts Vista Language Packs (MUI)
XP East Asian Setup XP Pinyin Setup XP Zhuyin Setup    XP Chinese Fonts XP Chinese User Interface (MUI)
Vista Features Review Frequently Asked Questions 3rd Party Fonts/Apps    Free Downloads Home...About...Contact

Copyright © 2005 PinyinJoe.com.  All Rights Reserved.
"Microsoft", "Windows", "Vista" and any other trademarks on this site are the sole property of their respective owners.