<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>info.michael-simons.eu &#187; Encoding</title>
	<atom:link href="http://info.michael-simons.eu/tag/encoding/feed/" rel="self" type="application/rss+xml" />
	<link>http://info.michael-simons.eu</link>
	<description>Just another nerd blog</description>
	<lastBuildDate>Tue, 07 Sep 2010 09:00:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Create ZIP Archives containing Unicode filenames with Java</title>
		<link>http://info.michael-simons.eu/2010/01/05/create-zip-archives-containing-unicode-filenames-with-java/</link>
		<comments>http://info.michael-simons.eu/2010/01/05/create-zip-archives-containing-unicode-filenames-with-java/#comments</comments>
		<pubDate>Tue, 05 Jan 2010 14:01:54 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[English posts]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Encoding]]></category>
		<category><![CDATA[Zip]]></category>

		<guid isPermaLink="false">http://info.michael-simons.eu/?p=380</guid>
		<description><![CDATA[It is harder than I thought to create a simple ZIP archive from within Java that contains entries with Unicode names. I&#8217;m actually too lazy to read all the specs, but they say something to the effect that the entry names in a ZIP archive are encoded using &#8220;Cp437&#8221;. The built-in Java compression API has nothing to [...]]]></description>
			<content:encoded><![CDATA[<p>It is harder than I thought to create a simple ZIP archive from within Java that contains entries with Unicode names.</p>
<p>I&#8217;m actually too lazy to read all the specs, but they say something to the effect that the entry names in a ZIP archive are encoded using &#8220;Cp437&#8221;. The built-in Java compression API has nothing to offer for setting the encoding, so I tried <a href="http://commons.apache.org/compress/">Apache Commons Compress</a>.</p>
<p>The manual says the following about interop:</p>
<blockquote><p>For maximum interop it is probably best to set the encoding to UTF-8, enable the language encoding flag and create Unicode extra fields when writing ZIPs. Such archives should be extracted correctly by java.util.zip, 7Zip, WinZIP, PKWARE tools and most likely InfoZIP tools. They will be unusable with Windows&#8217; &#8220;compressed folders&#8221; feature and bigger than archives without the Unicode extra fields, though.</p></blockquote>
<p>That didn&#8217;t work for me.</p>
<p>After some cursing, this is my solution:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">final</span> ZipArchiveOutputStream zout <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ZipArchiveOutputStream<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">BufferedOutputStream</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">FileOutputStream</span><span style="color: #009900;">&#40;</span>fc.<span style="color: #006633;">getSelectedFile</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
zout.<span style="color: #006633;">setEncoding</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Cp437&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
zout.<span style="color: #006633;">setFallbackToUTF8</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
zout.<span style="color: #006633;">setUseLanguageEncodingFlag</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
zout.<span style="color: #006633;">setCreateUnicodeExtraFields</span><span style="color: #009900;">&#40;</span>ZipArchiveOutputStream.<span style="color: #006633;">UnicodeExtraFieldPolicy</span>.<span style="color: #006633;">NOT_ENCODEABLE</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>I&#8217;m specifying the encoding explicitly, but instead of UTF-8, which didn&#8217;t work for my UTF-8 strings (wtf??), I&#8217;m using the Cp437 from the specs plus a few other options (UTF-8 fallback, the language encoding flag, and Unicode extra fields only for names that can&#8217;t be encoded), and it works for me in 7zip, WinZip and even Windows&#8217; &#8220;compressed folders&#8221;.</p>
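<p>To get a feeling for which names are affected at all, here is a minimal sketch that uses only the JDK, not Commons Compress. It assumes that <code>NOT_ENCODEABLE</code> applies to names the configured encoding cannot represent, as the option name suggests. The class <code>Cp437Check</code> and its method are made up for illustration, and <code>IBM437</code> is the JDK&#8217;s canonical name for Cp437.</p>

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration; not part of Commons Compress.
class Cp437Check {

    // "IBM437" is the canonical JDK name of the code page also known as Cp437.
    private static final Charset CP437 = Charset.forName("IBM437");

    /** Returns true if every character of the name can be encoded in Cp437. */
    static boolean isCp437Encodable(String name) {
        // Create a fresh encoder per call because CharsetEncoder is not thread-safe.
        CharsetEncoder encoder = CP437.newEncoder();
        return encoder.canEncode(name);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of the compiler encoding.
        String[] names = {
            "readme.txt",               // plain ASCII
            "Gr\u00FC\u00DFe.txt",      // German umlaut and sharp s
            "\u65E5\u672C\u8A9E.txt",   // Japanese characters
            "\u20ACuro.txt"             // euro sign
        };
        for (String name : names) {
            System.out.println(name + " fits into Cp437: " + isCp437Encodable(name));
        }
    }
}
```

<p>Names for which the check returns <code>false</code> would be the ones that depend on the UTF-8 fallback or a Unicode extra field.</p>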
<p><ins datetime="2010-01-06T08:02:58+00:00">Edit:</ins> Unfortunately, in Mac OS X&#8217;s Unzip utility, names containing non-Cp437 characters come out broken. If anyone has a good idea, feel free to leave a comment.</p>
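<p>For later readers: from Java 7 on, <code>java.util.zip.ZipOutputStream</code> and <code>ZipInputStream</code> have constructors that take a <code>Charset</code>, so entry names can be written as UTF-8 without an extra library. The following is only a sketch of that route with a made-up class name (<code>Utf8ZipRoundTrip</code>). It round-trips one entry name in memory and says nothing about how Mac OS X&#8217;s Unzip utility or Windows&#8217; &#8220;compressed folders&#8221; treat such an archive.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical example class; requires Java 7 or newer.
class Utf8ZipRoundTrip {

    /** Writes a one-entry ZIP archive into memory, encoding the entry name as UTF-8. */
    static byte[] zipSingleEntry(String entryName, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content);
            zout.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Reads the archive back and returns the name of its first entry, or null if it is empty. */
    static String firstEntryName(byte[] archive) throws IOException {
        try (ZipInputStream zin =
                 new ZipInputStream(new ByteArrayInputStream(archive), StandardCharsets.UTF_8)) {
            ZipEntry entry = zin.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        // Unicode escapes keep the source file independent of the compiler encoding.
        String name = "\u65E5\u672C\u8A9E.txt";
        byte[] archive = zipSingleEntry(name, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("name survived round trip: " + name.equals(firstEntryName(archive)));
    }
}
```

<p>Whether this helps with a particular unzip tool still has to be tested with that tool.</p>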
]]></content:encoded>
			<wfw:commentRss>http://info.michael-simons.eu/2010/01/05/create-zip-archives-containing-unicode-filenames-with-java/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
