alt.hn

5/17/2026 at 12:26:04 AM

Make zip files smaller with zip shrinker

https://evanhahn.com/make-zip-files-smaller-with-zip-shrinker/

by zdw

5/19/2026 at 10:09:37 AM

I know you meant well but...

"It deletes empty folders" and "Let me know if this is a problem for you"

NEVER DO THAT. I know you meant well, but the first rule of any program is to NEVER automatically delete something without informing the user. NEVER. Users keep empty folders for structure, reminders, or placeholders because software will dump files into it later when it's run. If it was there when they zipped it up, it should be there when they unzip it. Otherwise they'll check the before and after and it will show some folders missing, create confusion, and the user will run off trying to find out if anything else is missing.

Example: A user zips up a program. Some programs are coded to look for a folder and dump files into it; if the folder is missing, the program will fail. I've run into that occasionally over the years. Not all programs will recreate a missing folder.
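
To make the concern concrete, here is a minimal Python sketch (standard-library `zipfile` only) that lists the empty-directory entries in an archive, i.e. the entries a folder-stripping pass would drop. The helper name `empty_dirs` is made up for illustration, and nothing here is taken from ZIP Shrinker itself.

```python
import zipfile


def empty_dirs(path):
    """Return directory entries in the archive that contain no other entry."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
    dirs = [n for n in names if n.endswith("/")]
    # A directory counts as empty if no other entry name starts with its path.
    return [d for d in dirs if not any(n != d and n.startswith(d) for n in names)]
```

Running something like this before and after shrinking would show exactly which folders went missing.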

by ChrisNorstrom

5/19/2026 at 10:15:50 AM

One thing I dislike about git is that it really does not support empty folders well, even though they often make sense, either now or for the future. There are decent reasons to have empty folders.

by Ekaros

5/19/2026 at 11:37:14 AM

I just work around it with a .gitkeep file.
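
For anyone who has not seen the trick: `.gitkeep` is only a naming convention, not a Git feature. Any file in the directory makes Git track the path. A rough Python sketch that drops a placeholder into every empty directory (the function name `add_gitkeep` is hypothetical):

```python
from pathlib import Path


def add_gitkeep(root):
    """Create an empty .gitkeep file in every empty directory under root."""
    root = Path(root)
    created = []
    # Collect the directories first so new files do not affect the walk.
    for d in sorted(p for p in root.rglob("*") if p.is_dir()):
        if ".git" in d.relative_to(root).parts:
            continue  # never touch Git's own metadata directory
        if not any(d.iterdir()):
            keep = d / ".gitkeep"
            keep.touch()
            created.append(keep)
    return created
```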

by svth

5/19/2026 at 12:28:29 PM

Seems we need a .zipkeep file then.

Just kidding, I don't see how the overhead of the directory entry is even remotely enough to warrant removal. Most of the magic can be left to efficient DEFLATE compatible blocks and removing entries not in the central directory in the first place (ZIP files can support concatenation of new data so long as you re-write the central directory at the end of the file).
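
To put a rough number on that overhead: each entry costs a local file header (30 fixed bytes plus the name) and a central directory record (46 fixed bytes plus the name), ignoring extra fields. The sketch below measures it with Python's `zipfile` by building two in-memory archives that differ only by one directory entry. The helper names and the `logs/` folder are made up for illustration.

```python
import io
import zipfile


def archive_size(entries):
    """Build an in-memory ZIP from (name, data) pairs and return its size in bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries:
            # A fixed timestamp keeps the two archives comparable.
            zf.writestr(zipfile.ZipInfo(name, date_time=(2026, 1, 1, 0, 0, 0)), data)
    return len(buf.getvalue())


def dir_entry_overhead(dirname="logs/"):
    """Bytes added by one empty-directory entry (local header + central directory record)."""
    base = [("readme.txt", b"hello")]
    return archive_size(base + [(dirname, b"")]) - archive_size(base)
```

For short names the result should land well under a couple of hundred bytes per folder.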

by ebolyen

5/19/2026 at 12:59:17 PM

Yeah, that probably should just be an option. Basically, the default should be to mangle the zip file as little as possible, with the more extreme steps turned on by flags. One of those could be 'remove empty folders'.
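
Something like this, as a purely hypothetical CLI sketch in Python. The program name and flag names are invented here and are not ZIP Shrinker's actual interface:

```python
import argparse


def build_parser():
    """Hypothetical CLI: conservative defaults, lossy steps are opt-in."""
    p = argparse.ArgumentParser(prog="zipshrink-sketch")
    p.add_argument("archive", help="path to the ZIP file to rewrite")
    p.add_argument("--remove-empty-dirs", action="store_true",
                   help="drop directory entries that contain no files")
    p.add_argument("--strip-comments", action="store_true",
                   help="drop the archive comment and per-entry comments")
    return p
```

With no flags given, both lossy steps stay off.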

by sumtechguy

5/19/2026 at 8:20:35 AM

While not very popular, ECT [1] is (still?) the best solution in this space and has been my go-to tool for this purpose.

[1] https://github.com/fhanau/Efficient-Compression-Tool

by lifthrasiir

5/19/2026 at 2:06:36 PM

I had not heard of ECT, but I'm not impressed. I've just benchmarked it against two other PNG optimizers, and here are the file sizes for default and max levels:

    1985457 oxipng-o6.png
    2030036 oxipng-o2.png
    2125459 ect-o9.png
    2144598 ect-o3.png
    2169351 optipng-o7.png
    2215086 optipng-o2.png
    2218326 original.png

    oxipng 9.1.5
    OptiPNG version 7.9.1
    Efficient Compression Tool Version 0.9.5

BTW, I could not compile ECT on my Linux system, because its CMake config was too old. I used the Windows release through Wine, but it shouldn't change the results above.

I tried to apply ECT to a few .gz files, but it complained it was not compatible, and I did not dig further.

[edited for a typo s/I/it/]

by idoubtit

5/19/2026 at 8:28:17 AM

I use ect on a monthly basis, at least. Especially for png files. It's pretty great!

by futune

5/19/2026 at 12:13:31 PM

Yeah, for how well it does with PNGs, it gets nowhere near as much attention as the other tools for the same job.

by zamadatix

5/19/2026 at 11:56:33 AM

Thank you for the pointer!

by useyourloaf

5/19/2026 at 12:32:52 PM

Obviously, the purpose of this tool isn't to preserve 100% compatibility. Things like removing empty directories make that clear.

But, why would you remove comments? Presumably, if those are there, they were added for a specific reason. And the author acknowledges the space savings are minimal.

by Wowfunhappy

5/19/2026 at 1:48:48 PM

> Things like removing empty directories make that clear.

I hope that's disabled by default. Something like: "turning this option on may reduce file size by a small percent, but could impact compatibility."

I suspect the option will be much more useful with file formats that are zip under the hood, where it's easier to test the small subset of applications that read those files and/or update the file specification.

by gwbas1c

5/19/2026 at 2:23:23 PM

Ken Silverman (of Build Engine fame) has written a few deflate-centric compression utilities[0]. The PNGOUT recompressor is the most famous of these (and is very good – it practically always beats OptiPNG), but the suite also includes a .zip archive recompressor called KZIP. I'd be curious to see how ZIP Shrinker compares to this tool.

[0]: https://advsys.net/ken/utils.htm

by MrDOS

5/19/2026 at 6:51:26 AM

> Typically, other archives like .tar.bz2 can be smaller. But those aren’t backwards-compatible!

Is there any point for (new) .bz2 archives in the era of Zstd?

by akx

5/19/2026 at 7:53:22 AM

Tooling?

It took years for bzip2 to be in every Linux distro, and we're _still_ doing gzip.

LZMA / xz tools are starting to get more support, but they are nowhere near universal.

No idea how long zstd will need.

by j16sdiz

5/19/2026 at 9:15:18 AM

xz is pretty universal across POSIX and clones though. It comes with any modern Linux distro, Busybox even has an .xz decompressor, so `tar xvJf file.tar.xz` does the right thing in *NIX land, which I presume includes macOS with Brew.

For Windows systems, 7-zip (.7z, similar compression to .xz) is a free download for Windows 10, and Windows 11 can open up a .7z file with a simple double click.

.zip and .gz no longer need to be used here in 2026.

by strenholme

5/19/2026 at 10:34:04 AM

.zip is used as a seekable container with some compression. There is no replacement comparable in simplicity. 7z is overcomplicated, compressed tar is not seekable.
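
The "seekable" part is easy to see with Python's standard `zipfile`: the central directory records where each member starts, so one member can be read without decompressing the ones before it. A minimal sketch (the helper name is made up):

```python
import zipfile


def read_member(path, name):
    """Read a single member; zipfile seeks to it using the central directory."""
    with zipfile.ZipFile(path) as zf:
        with zf.open(name) as member:
            return member.read()
```

With a compressed tar, getting at the last file generally means decompressing the stream up to that point.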

.gz/deflate is used when something very cheap and very fast is needed. xz/lzma is quite often too slow or requires too much memory even on decompression.

so no, .zip and .gz are very much needed in 2026.

by lstodd

5/19/2026 at 11:23:59 AM

Compared to xz and even parallel xz, gzip and parallel gzip are just better if speed is more important. The compression ratio is not superior, but it is already good compared with the uncompressed data. For long-term storage, it makes sense to invest the extra time for better compression, but if it's about transfer time, the slower compressor can cost you more processing time than the worse compression ratio costs you in transfer time. It's like with image formats: pick the right one for your use case.
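
If you want to check this on your own data, a quick sketch using Python's standard `gzip` and `lzma` modules at their default settings is below. The function name `compare` is made up, and the timings depend entirely on your data and machine, so treat it as a starting point rather than a benchmark.

```python
import gzip
import lzma
import time


def compare(data):
    """Return {name: (compressed_size, seconds)} for gzip and xz on the same bytes."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("xz", lzma.compress)):
        start = time.perf_counter()
        out = compress(data)
        results[name] = (len(out), time.perf_counter() - start)
    return results
```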

by adapiz

5/19/2026 at 5:55:21 PM

If you add zstd to the comparison matrix, it wins on both speed and compression ratio. Its adoption is quickly catching up to xz as a result, and I expect it to approach gzip in availability in a few years.

by MrDrMcCoy

5/19/2026 at 1:13:17 PM

GitHub won't let you upload a 7z file as an attachment for the issue tracker. Thus forcing me to use an inferior and obsolete compression format.

by Dwedit

5/19/2026 at 12:10:29 PM

gzip is very fast, universally supported, and good enough. It will be around forever.

You need Python 3.14 for zstd.

by jgalt212

5/19/2026 at 2:27:51 PM

Zstd is implemented in C?

by yjftsjthsd-h

5/19/2026 at 8:55:33 AM

Debian? Did they discover it yet?

by Am4TIfIsER0ppos

5/19/2026 at 10:18:30 AM

I think it's been in since Debian 11... at least 12. It's been in my default Ansible playbooks for a while.

by sigio

5/19/2026 at 8:46:23 AM

APKs need to be zipaligned, I don't see that mentioned.

by jurgenkesker

5/19/2026 at 10:08:36 AM

Fun fact, having page-aligned uncompressed .so files in the APK allows the dynamic linker to mmap them directly out of the ZIP.
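
A rough way to check this from Python: read the member's local file header, compute where its raw data begins, and test that offset against the page size. The helper names are made up, and the 4096-byte default is an assumption (some devices use larger pages).

```python
import struct
import zipfile


def data_offset(path, name):
    """Return (offset, compress_type) of a member's raw data inside the archive file."""
    with zipfile.ZipFile(path) as zf:
        info = zf.getinfo(name)
    with open(path, "rb") as f:
        f.seek(info.header_offset)
        header = f.read(30)  # fixed-size part of the local file header
    # The last two fields are the file-name length and the extra-field length.
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return info.header_offset + 30 + name_len + extra_len, info.compress_type


def is_mmap_friendly(path, name, page_size=4096):
    """True if the member is stored (uncompressed) and starts on a page boundary."""
    offset, method = data_offset(path, name)
    return method == zipfile.ZIP_STORED and offset % page_size == 0
```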

by Retr0id

5/19/2026 at 11:21:27 AM

Ooh, with btrfs you could reflink an uncompressed zip entry to its own file on disk.

by jleedev

5/19/2026 at 11:22:29 AM

…and you’ve rediscovered .a (ar) files.

by teddyh

5/19/2026 at 1:04:41 PM

You can also make ZIP files smaller by switching the compression from Deflate to Zstandard. In the one case I tried this, this resulted in a 60% file size decrease [1]. Unfortunately Info-ZIP which provides the unzip command hasn't had a release in 18 years, so it doesn't support this newer compression/decompression method. You have to use 7-Zip instead.

[1] https://github.com/UKGovernmentBEIS/inspect_ai/pull/3145
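
If you are unsure which method an existing archive uses, the per-entry method ID is in the central directory and can be listed without decompressing anything. A small Python sketch follows. The method numbers come from PKWARE's APPNOTE, the table is deliberately incomplete, and the helper name is made up.

```python
import zipfile

# Method IDs as listed in PKWARE's APPNOTE; only a few are shown here.
METHOD_NAMES = {0: "store", 8: "deflate", 12: "bzip2", 14: "lzma", 93: "zstd"}


def methods_used(path):
    """Map each member name to the name of its compression method."""
    with zipfile.ZipFile(path) as zf:
        return {i.filename: METHOD_NAMES.get(i.compress_type, f"method {i.compress_type}")
                for i in zf.infolist()}
```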

by KerrickStaley

5/19/2026 at 2:25:24 PM

What is the open standard?

As far as I know, the ISO standard for zip only specifies two compression methods: "store" (no compression) and "deflate". If I follow that, when I create a zip file, I know it's not performant, but at least it's almost universal (except for file ownership, permissions, character encoding and anything modern).

The corporate PKWARE has added other compressions to their original zip software, but those are not in the standard. They will not work for an EPUB, a LibreOffice file, etc. If I want a good compression, I reach for zstd (often through `tar`) or 7z if I want more portability.

by idoubtit

5/19/2026 at 1:09:17 PM

Then it's not a zip file anymore.

Just like if you modified PNG files to use zstandard instead of deflate, but otherwise be identical, it's still not a PNG file anymore.

by Dwedit

5/19/2026 at 1:16:20 PM

That's not true. Zip files have supported other compression algorithms since the late 90s.

by tiagod

5/19/2026 at 1:23:27 PM

I guess it's PNG v2 then? ;)

by giancarlostoro

5/19/2026 at 11:06:44 AM

Do any formats using ZIP as the underlying format use ZIP comments for metadata? Unless there's a lot of compressors leaving "Zip file generated by MySuperZipper™" then I imagine any comments left were probably done for a good reason.

by billpg

5/19/2026 at 12:31:17 PM

I'm not aware of any, but it wouldn't be insane to build a seekable deflate implementation by defining offsets in a zip comment. This would leave the zip file backwards compatible to usual decompression while allowing internal seeking within an individual file if the decompressor was aware of this index.
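
As a very rough sketch of the storage half of that idea only: Python's `zipfile` exposes the archive comment, so a small JSON index can be written there and read back while ordinary tools keep ignoring it. The index layout and the helper names are invented for illustration, and this does not implement the seekable deflate itself.

```python
import json
import zipfile


def write_index(path, index):
    """Store a small JSON index in the archive comment (limited to 65535 bytes)."""
    payload = json.dumps(index, sort_keys=True).encode("utf-8")
    if len(payload) > 0xFFFF:
        raise ValueError("index too large for a ZIP comment")
    with zipfile.ZipFile(path, "a") as zf:
        zf.comment = payload


def read_index(path):
    """Return the JSON index from the archive comment, or None if absent or not JSON."""
    with zipfile.ZipFile(path) as zf:
        try:
            return json.loads(zf.comment.decode("utf-8"))
        except ValueError:
            return None
```

A comment-stripping optimizer would of course silently throw such an index away.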

by ebolyen

5/19/2026 at 1:07:36 PM

For seekable gzip indexes in zip, there is SOZip: https://github.com/sozip/sozip-spec . However, it stores the indexes as files succeeding the actual file entry. To hide these index files and avoid extraction, they are not listed in the central directory, but a linear scan of the local headers, which some wrongly-behaved ZIP tools do, or which might be necessary for recovering broken ZIP files, would find those hidden indexes.
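
To illustrate the difference between the two views of an archive, here is a naive Python sketch that scans for local file header signatures and reports names the central directory does not list. It is not SOZip-aware. The signature can also occur by chance inside compressed data, so false positives are possible, and decoding names as UTF-8 is an assumption. The helper names are made up.

```python
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"


def local_header_names(path):
    """Names found by naively scanning for local file header signatures."""
    with open(path, "rb") as f:
        data = f.read()
    names, pos = [], data.find(LOCAL_SIG)
    while pos != -1:
        if pos + 30 <= len(data):
            # The file-name length sits at byte 26 of the fixed 30-byte header.
            name_len = struct.unpack("<H", data[pos + 26:pos + 28])[0]
            names.append(data[pos + 30:pos + 30 + name_len].decode("utf-8", "replace"))
        pos = data.find(LOCAL_SIG, pos + 4)
    return names


def hidden_entries(path):
    """Local-header names that are missing from the central directory."""
    with zipfile.ZipFile(path) as zf:
        listed = set(zf.namelist())
    return [n for n in local_header_names(path) if n not in listed]
```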

by mxmlnkn

5/19/2026 at 9:53:46 PM

I once made a Maven plugin that reprocesses JAR files. It can remove extra content such as comments and directories. In addition, it handles nested zip files to increase their compressibility. And all the features can be toggled individually.

https://luccappellaro.github.io/2015/03/01/ZopfliMaven.html

by luzifer42

5/19/2026 at 10:36:45 AM

> This has the side effect of removing empty directories

yeah, this will inevitably break things. excluding those from the directory stripping shouldn't be too hard (TM)
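
For what it's worth, a recompression pass that keeps directory entries and comments is short to write with Python's standard `zipfile`. This is only a sketch under simple assumptions: plain zlib Deflate at level 9, no special handling of unusual extra fields, and `src` and `dst` must be different files. It is not how ZIP Shrinker is implemented, and the function name is made up.

```python
import zipfile


def recompress(src, dst, level=9):
    """Rewrite src into dst with Deflate at the given level, keeping every entry."""
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED, compresslevel=level) as zout:
        zout.comment = zin.comment  # keep the archive comment
        for info in zin.infolist():
            if info.is_dir():
                # Directory entries carry no data; copy them so empty folders survive.
                zout.writestr(info, b"")
            else:
                zout.writestr(info, zin.read(info),
                              compress_type=zipfile.ZIP_DEFLATED, compresslevel=level)
```

Per-entry comments ride along because the original `ZipInfo` objects are reused.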

by seritools

5/19/2026 at 2:35:56 PM

So this tool:

- strips away comments, metadata and directories(!!)
- re-compresses the data with deflate (presumably on a higher setting)

It makes me uneasy that something which does lossy compression (metadata is lost) is called "ZIP Shrinker". I hope nobody gets surprised by this.

The real solution is to use lzma(2).

by Sweepi

5/19/2026 at 3:58:23 PM

Cool project! Now, Zip-Ada's ReZip does much better, even if you stick with the Deflate compression scheme. For Zip archives, you have more compression schemes available (BZip2, LZMA, ...) and even much better results.

by etrez

5/19/2026 at 8:50:12 AM

Nice, interesting to see if it helps docx much.

by stuaxo