Webserver DoS with Linux file moves, and broken file move semantics in webservers.

When a file is replaced or removed on Linux, any process that already has that file open still sees the old file. This means that if you move a new 2 GB file over the top of an old 2 GB file while some processes still have the old one open, about 4 GB of disk space is used until the old file is closed.
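Here is a minimal sketch of that behaviour, assuming a Linux filesystem. The file names and the `read_after_replace` helper are made up for illustration, and the open handle stands in for a server that is in the middle of sending the file.

```python
import os
import tempfile


def read_after_replace():
    """Return (bytes seen via the old handle, bytes seen via a fresh open)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "mirror.iso")      # hypothetical file name
        tmp = os.path.join(d, "mirror.iso.new")   # hypothetical file name

        with open(path, "wb") as f:
            f.write(b"old contents")

        # Stands in for a server process that is mid-download.
        old_handle = open(path, "rb")

        with open(tmp, "wb") as f:
            f.write(b"new contents")

        # Rename the new file over the old one, as mv does on one filesystem.
        os.replace(tmp, path)

        try:
            seen_by_old = old_handle.read()   # still the old file's data
            with open(path, "rb") as f:
                seen_by_new = f.read()        # the name now points at the new file
        finally:
            old_handle.close()                # only now can the old data be freed

        return seen_by_old, seen_by_new


if __name__ == "__main__":
    print(read_after_replace())
```

Until `old_handle` is closed, the filesystem has to keep both copies around, which is where the doubled disk use comes from.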

Some webservers keep a file open for as long as the client is downloading it. Apache is one webserver that does this. Others, like lighttpd, do not.

The problem with reopening a file for each smaller part as it is served to a web client is that it breaks Unix move semantics. The client can get a combination of both files, rather than one file or the other. This can be a problem in many cases. Consider a client downloading an HTML file that is replaced mid-download: the HTML tags won't balance, and the client ends up with a syntactically invalid file.
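The mixed-file effect can be sketched like this. It is not how any particular webserver is implemented; `serve_reopening`, `demo`, the file names and the tiny chunk size are all assumptions chosen to make the effect visible.

```python
import os
import tempfile

CHUNK = 4  # tiny chunk size, chosen only so the effect is easy to see


def serve_reopening(path, swap_after_first_chunk):
    """Read `path` chunk by chunk, reopening it by name for every chunk."""
    out = b""
    offset = 0
    while True:
        with open(path, "rb") as f:   # a fresh open each time, by name
            f.seek(offset)
            chunk = f.read(CHUNK)
        if not chunk:
            break
        out += chunk
        offset += len(chunk)
        if offset == CHUNK:
            # Simulate the file being replaced while the client is downloading.
            swap_after_first_chunk()
    return out


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "page.html")   # hypothetical file name
        tmp = path + ".new"
        with open(path, "wb") as f:
            f.write(b"AAAAAAAA")              # the old file

        def swap():
            with open(tmp, "wb") as f:
                f.write(b"BBBBBBBB")          # the new file
            os.replace(tmp, path)             # move the new file over the old

        return serve_reopening(path, swap)


if __name__ == "__main__":
    print(demo())
```

The client ends up with the first half of the old file followed by the second half of the new one. A server that kept a single handle open for the whole transfer would have sent the old file in full.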

So here is how a DoS can happen...

Say you have a big file mirror with lots of files that change fairly regularly, perhaps a Debian mirror or a shareware mirror. If an attacker wants to fill up a drive, and possibly cause corruption or an incomplete mirror, all they need to do is start slowly downloading many files close to the time when the files are due to be updated. This takes very few resources on the client side and causes massive resource use on the server side. One client could use less than 5 MiB of memory to hold 2000+ connections open and tie up, for example, 2000 * 2 GB (about 4 TB) of disk in old file copies, leading to a full disk.

Constantly reopening a file helps stop this type of DoS attack, but it might corrupt downloaded files.

The changed file move semantics are something to watch out for with some webservers. Servers like lighttpd stop this form of DoS attack, but break file move semantics in the process. So you cannot rely on atomic file moves in your applications when the files are served by such a server.
