| ||||||
|
|
Pinyin Macros for MS Word and ExcelFAQs & Technical Notes
Tone Placement Rules
Word VBA Macro
The Rules for Placing Pinyin Tone Marks It's easy to know where to place the tone mark when there is only one vowel: over that vowel. When there is more than one vowel then it always appears over the first vowel, unless the first vowel is "i", "u" or "ü" in which case the tone mark goes over the second vowel. The letters "y" and "w" don't count as vowels under these rules. To confirm these rules and learn more, see the Wikipedia pinyin article section on rules for placing pinyin tone marks. To start work on these macros I had to look this up myself, and now that I've got my macros done I'm prone to forget it again... I'm not entirely certain the rules for pinyin tone mark placement were ever carefully and systematically taught to us in my US university Chinese courses. Maybe it's just my middle-aged memory. I don't have an old copy of my original DeFrancis textbooks to verify that we were not taught this. I do recall that we were never quizzed on it, but my professors knew exactly where to put the diacritics when asked. «Top
What About the Neutral or "Fifth" Tone? The neutral tone is often indicated by a dot over the vowel in language teaching, but this tone mark does not appear in standard dictionaries or other mainstream print usage of pinyin. It is not entirely necessary outside of school, but the main problem is that the mark is difficult to show in print. The numeric representation of this tone is sometimes "0" and sometimes "5". I could have the macros convert either one to the same neutral tone, but the character sets included in the fonts bundled with Office do not offer what we need for every letter. In fact, I'm not certain any one font includes them all. For example, in your Insert Symbols dialog you will find that for "a" there is "å" but the "e" would look like "ė". Although an "a" with a small dot above it exists in Unicode it is outside the range of most fonts, and most web browsers for that matter. And so on. I have seen some teachers in Taiwan use a "middle dot", also known as an "interpunct". To preserve the integrity of the character it must be placed in front of the entire syllable, like this: "·ma". The full width version of this same dot is also used to separate Chinese characters in transliterated foreign or minority language names (乔治 · 华盛顿). It often shows up there in the narrow-width version (乔治·华盛顿) and it is also used in punctuation. Modifying the macro to place a middle dot in front of the initial consonant in every possible syllable would require more lines of code than you may realize, so I'm loath to spend time on this given that I've never received a request for a middle dot (and only a couple questions about the dot above the vowel as well). Let me know what you think. As I mentioned above, you can insert many of these from the Symbols list in Word, Excel and many other applications. There is also a predefined keyboard shortcut for the "å" (alt-0229 on the number pad with NumLock engaged), but for "ė" and most others you must assign custom shortcuts or use the Symbols list. «TopUnicode Characters The Uncode Consortium website offers tables of all Unicode characters. I also recommend FileFormat.info. Although as of this writing FileFormat.info is not perfectly indexed, when it works that site is much more user-friendly for seeking the codes for Unicode, macros, html, Microsoft keyboard shortcuts or many other purposes. On FileFormat you will find nicely laid-out tables and representations of the characters in both text and images. There is some jargon to learn, however. Here are examples using the the letter "a":
"Grave" is pronounced "grahv", I've learned. The Unicode number is on the end in parantheses. «TopMaking the Word Macro in VBA Macros are all about taking care of repetitive actions for you, and my plan required exactly that: finding and moving numbers repeatedly until they are next to the vowel that should be replaced, then replacing that vowel and the number with the appropriate Unicode character. Having studied various versions of Basic in undergrad and grad school (along with Pascal, and some languages so long out of use I won't even mention them...), I felt confident I could figure this out quickly. Riiight. Humbled but not deterred with what I found while grazing through free information available on the MSDN site, plus various forums and blogs, I realized there were many new conventions to learn and this wasn't going to be a simple if/then thing. I decided to structure the macro as a long list of With statements, and then I set about learning what functions to use for find and replace. Nothing ever being simple, Visual Basic's Unicode-to-character conversion function (ChrW) requires a prior conversion from Unicode notation, replacing the "U+" and all leading zeros with "&H" to indicate that the remaining number is hexadecimal:
Therefore this relatively simple macro winds up being a long list of statements like this: With Selection.Find Word automatically converts upper case letters by itself, so I only had to give it the lower case code points and did not have to look up the capital letter versions, or worry about initial vowel mixed-case situations like "Ai4" or "Ou1yang2". I would have no such luck when I finally turned to the Excel version. «TopMaking the Excel Macro in VBA I found myself eating more humble pie when trying to recreate this macro in Excel. It was not going to be as simple as pasting the Word macro in and hitting Run. Although the Word version will, without asking, convert selected text or else all the text behind the cursor, Excel is an entirely different animal and needs to be told whether to convert specific cells, a worksheet or an entire workbook. Another problem, as I mentioned above, was that although Word automatically converts to upper-case letters when it sees them, Excel needs to be provided specific code points for each character in both upper- and lower-case. This required not only looking up the code points for the capital letter versions and doubling the number of sorting and replacement statements, but I also had to think through all the possible initial vowel mixed-case situations:
Is that all of the possible "proper case" initial double vowel situations? Let me know if I missed any. I then had to decide if I really wanted to learn more about the Visual Basic for Applications commands necessary for Excel. I use Excel in all sorts of MBA-type activities, but I had no real interest in learning more VBA just to create this one freebie. Instead I got some help from Tushar Kapila, an Excel macro maven who quickly and easily created an elegant solution. Please Contact me and not Tushar for technical support with this macro, but I do highly recommend his expertise for any Excel or other VBA-related development. This was child's play for Tushar. He converted my With statements to Selection.Replace, and created three subroutines for three versions calling the same main code to convert cells, worksheet or workbook. This is an example:
Tushar had done the hard part, but only with lower case letters. I then changed MatchCase to True and went through duplicating all those statements for upper case. Lastly, I carefully added sort and replace statements for proper case double-vowel situations like "Ou". Did we get it right? Please let me know how this new macro is working for you, via my contact page. «TopConverting to AppleScript for Mac Office 2008 Office 2008 for Macintosh does not support VBA macros, nor do do the iWorks applications Pages and Numbers. I hope to get these macros converted to AppleScript soon. I'm not sure if iWorks will be able to do handle these, but I'm sure Office 2008's Word and Excel will. Stay tuned... «TopTroubleshooting If it won't run: Usually when someone tells me they can't get one of these macros to run I just ask them to delete and recreate the macros and that usually solves the problem:
Fonts: I list the fonts that include the necessary Unicode characters on the main pinyin macro page, and have never received a question about that. But I can't stress enough: if you use Office 2007's default Calibri font, some of the letters will display in Arial and your words will look very strange. Security: Enabling macros requires more and more bravery due to the increasingly strident warnings each new version Word and Excel throw at you, as if Microsoft is driven by some kind of PTSD hyper-vigilance caused by past battles against the macros of evildoers. But even a quick glance at the contents of my macros seems to set everyone at ease. They just move and replace a few letters and numbers. Excel is particularly bad about giving scarey warnings, but just follow the instructions in any dialogs that pop up or any Help instructions you need to read, and you'll be fine. Excel formulas: Version 1.x of the Excel macro will also convert variables that look like pinyin. (For example, in "=a1+b1" the "a1" will get converted.) We're working on a version that first checks for formulas. Please check back regularly for version 2. For now, if you have formulas in a spreadsheet please use the "selected cells" version of the macro and be very careful not to select any cells with a formula. Compatibility: I've tested these macros in the Windows versions of Office 2007 (use one of the fonts listed above, not the default Calibri), Office 2003, Office XP 2002 and Office 97. I am told they also work in Macintosh Office 98 through 2004, and maybe even the mid-nineties version 4.2, but Mac Office 2008 is not compatible with this type of macro. I hope to get these VBA macros converted to AppleScript soon. Please send questions, comments or suggestions anytime via the contact page. |
|||||
| ||||||