I use Zip Bombs to Protect my Server

some_guy@lemmy.sdf.org · 6 months ago

I use Zip Bombs to Protect my Server

palordrolap@fedia.io · 6 months ago

The article writer kind of complains that they’re having to serve a 10MB file, which is the result of the gzip compression. If that’s a problem, they could switch to bzip2. It’s available pretty much everywhere that gzip is available and it packs the 10GB down to 7506 bytes.

That’s not a typo. bzip2 is way better with highly redundant data.

just_another_person@lemmy.world · edit-2 6 months ago

I believe he’s returning a gzip HTTP response stream, not just a file payload that the requester then downloads and decompresses.

Bzip isn’t used in HTTP compression.

sugar_in_your_tea@sh.itjust.works · edit-2 6 months ago

Brotli is an option, and it’s comparable to Bzip. Brotli works in most browsers, so hopefully these bots would support it.

I just tested it, and a 10G file full of zeroes is only 8.3K compressed. That’s pretty good, though a little bigger than BZip.

bss03@infosec.pub · 6 months ago

For scrapers that not just implementing HTTP, but are trying to extract zip files, you can possibly drive them insane with zip quines: https://github.com/ruvmello/zip-quine-generator or otherwise compressed files that contain themselves at some level of nesting, possibly with other data so that they recursively expand to an unbounded (“infinite”) size.

sugar_in_your_tea@sh.itjust.works · 6 months ago

Brotli gets it to 8.3K, and is supported in most browsers, so there’s a chance scrapers also support it.

Aceticon@lemmy.dbzer0.com · 6 months ago

Gzip encoding has been part of the HTTP protocol for a long time and every server-side HTTP library out there supports it, and phishing/scrapper bots will be done with server-side libraries, not using browser engines.

Further, judging by the guy’s example in his article he’s not using gzip with maximum compression when generating the zip bomb files: he needs to add -9 to the gzip command line to get the best compression (but it will be slower). (I tested this and it made no difference at all).

sugar_in_your_tea@sh.itjust.works · edit-2 6 months ago

You can make multiple files with different encodings and select based on the Accept-Encoding header.

Aceticon@lemmy.dbzer0.com · 6 months ago

Yeah, good point.

I forgot about that.

some_guy@lemmy.sdf.org · 6 months ago

TIL why I’m gonna start learning more about bzip2. Thanks!

Xanza@lemm.ee · edit-2 6 months ago

zstd is a significantly better option than anything else available unless you need something specific for a specific reason: https://github.com/facebook/zstd?tab=readme-ov-file#benchmarks

LZ4 is likely better than zstd, but it doesn’t have wide usability yet.

palordrolap@fedia.io · 6 months ago

You might be thinking of lzip rather than lz4. Both compress, but the former is meant for high compression whereas the latter is meant for speed. Neither are particularly good at dealing with highly redundant data though, if my testing is anything to go by.

Either way, none of those are installed as standard in my distro. xz (which is lzma based) is installed as standard but, like lzip, is slow, and zstd is still pretty new to some distros, so the recipient could conceivably not have that installed either.

bzip2 is ancient and almost always available at this point, which is why I figured it would be the best option to stand in for gzip.

As it turns out, the question was one of data streams not files, and as at least one other person pointed out, brotli is often available for streams where bzip2 isn’t. That’s also not installed by default as a command line tool, but it may well be that the recipient, while attempting to emulate a browser, might have actually installed it.

Xanza@lemm.ee · 6 months ago

No. https://github.com/lz4/lz4

LZ4 already has a caddy layer which interprets and compresses data streams for caddy: https://github.com/mholt/caddy-l4

It’s also very impressive.