Today: Fun with Unicode, Regex and Java.
Some would say I now have three problems. This pattern
private final static Pattern placeholder = Pattern.compile("#\\{(\\w+?)\\}");
won’t match “Mot#{ö}rhead”, for example, because by default \w only covers the ASCII word characters [a-zA-Z_0-9].
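To see the problem in isolation, here is a minimal sketch. The class name AsciiWordDemo and the helper hasPlaceholder are made up for illustration, and it assumes the default java.util.regex behaviour with no extra flags:

```java
import java.util.regex.Pattern;

public class AsciiWordDemo {
    // Same pattern as in the post; without extra flags, \w means [a-zA-Z_0-9].
    public static final Pattern PLACEHOLDER = Pattern.compile("#\\{(\\w+?)\\}");

    // Hypothetical helper: true if the input contains a #{name} placeholder.
    public static boolean hasPlaceholder(String input) {
        return PLACEHOLDER.matcher(input).find();
    }

    public static void main(String[] args) {
        // ASCII name inside the braces: found.
        System.out.println(hasPlaceholder("Mot#{o}rhead"));      // true
        // U+00F6 (o with diaeresis) is not an ASCII word character: not found.
        System.out.println(hasPlaceholder("Mot#{\u00F6}rhead")); // false
    }
}
```

The umlaut is written as a \u00F6 escape so the result does not depend on the encoding of the source file.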
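To replace the word character \w, you either need a list of the Unicode blocks you care about, like [\p{InBasicLatin}\p{InLatin-1Supplement}...] (the block names are the ones “Character.UnicodeBlock.forName” accepts; note that there is no | inside a character class, it would be taken literally), or you’re lazy like me and just use the dot, which matches any character and not just word characters: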
private final static Pattern placeholder = Pattern.compile("#\\{(.+?)\\}");
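If the dot feels too greedy, two stricter variants might be worth a look. This is only a sketch, not what I actually used: the class and method names are invented for the example, the first pattern uses Unicode general categories (\p{L} letters, \p{M} combining marks, \p{Nd} decimal digits, \p{Pc} connector punctuation such as the underscore), and the second assumes Java 7 or newer, where the Pattern.UNICODE_CHARACTER_CLASS flag makes \w itself Unicode-aware.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodePlaceholderDemo {
    // Variant 1: spell out "word character" via Unicode general categories.
    public static final Pattern BY_CATEGORY =
            Pattern.compile("#\\{([\\p{L}\\p{M}\\p{Nd}\\p{Pc}]+?)\\}");

    // Variant 2 (assumes Java 7+): keep \w, but switch it to its Unicode definition.
    public static final Pattern BY_FLAG =
            Pattern.compile("#\\{(\\w+?)\\}", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: returns the first placeholder name, or null if none is found.
    public static String firstName(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Mot#{\u00F6}rhead"; // \u00F6 is the o with diaeresis

        // Both variants capture the umlaut as the placeholder name.
        System.out.println("\u00F6".equals(firstName(BY_CATEGORY, input))); // true
        System.out.println("\u00F6".equals(firstName(BY_FLAG, input)));     // true

        // Unlike the dot, the category class rejects a name containing a space.
        System.out.println(firstName(BY_CATEGORY, "#{a b}") == null);       // true
    }
}
```

The trade-off: the dot version accepts anything between the braces, including whitespace and punctuation, while these two only accept names made of word-like characters.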
Oh what a day… :/
This entry (permalink) was posted on Tuesday, November 3, 2009, at 4:32 pm by Michael, tagged with Code Snippets and categorized in English posts, Java.