Too little bandwidth #2

sosherof · 2021-10-20T04:33:43Z

I'm setting up a new server with mod_bw-0.92-2.4.x-x64-vs16 from ApacheHaus on Apache 2.4.51, on Windows Server 2019. I just discovered that with mod_bw enabled, files download very very slowly, far slower than the specified bandwidth. The server has a 10 gig interface. The bandwidth specified is very large (250 MB/sec), as I'm just trying to prevent multiple simultaneous downloads from overwhelming the interface.

Any ideas on the cause? I've tried various values for the packet size and lowered the bandwidth to 250 KB/sec (just took 3 zeros off the end). As the server is not in production yet, my test download was the only active session.

These are my settings:
BandWidthModule On
ForceBandWidthModule On
#Limit overall server bandwidth to 3 Gbps (which is actually 375 megabytes/sec)
BandWidth all 375000000
MinBandWidth all 0
BandWidthPacket 32768
#Limit large file (more than 10mb) transfers to 2Gbps
LargeFileLimit * 10240 250000000
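
As a quick sanity check on the units in the settings above, here is a minimal Python sketch of the conversions. The helper name `gbps_to_bytes_per_sec` is made up for illustration, and the assumption that the `LargeFileLimit` size argument is in kilobytes is taken from the "more than 10mb" comment, not verified against the module source.

```python
def gbps_to_bytes_per_sec(gbps: int) -> int:
    """Convert decimal gigabits per second to bytes per second."""
    return gbps * 1_000_000_000 // 8


# Values corresponding to the directives in the settings above.
overall_limit = gbps_to_bytes_per_sec(3)      # BandWidth all ...
large_file_limit = gbps_to_bytes_per_sec(2)   # LargeFileLimit * ... <rate>

# Assumption: the LargeFileLimit size argument is in kilobytes (1024 bytes).
large_file_threshold_bytes = 10240 * 1024

print(overall_limit, large_file_limit, large_file_threshold_bytes)
```

Under those assumptions, the numbers in the config match the comments: 375000000 bytes/sec for 3 Gbps, 250000000 bytes/sec for 2 Gbps, and a 10 MB threshold.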

I saw a post mentioning issues with timing resolution in Windows. Could that be the cause?

sosherof · 2021-10-20T23:14:31Z

I commented out your fixes for WIN32, which enforced a minimum 200ms delay between packets. It's working much better now. If I understand the fix correctly, with the bandwidth settings above it calls for a 50 MB packet every 200ms for my large (400 MB) file (250000000 / 5 = 50,000,000 bytes per packet). However, it looks like the problem is that the bucket size is only 8000 bytes, so the Windows fix ends up limiting the bandwidth to 8000 bytes per 200ms, or about 40 KB/sec.

Per the APR guide (https://apr.apache.org/docs/apr-util/trunk/group___a_p_r___util___bucket___brigades.html#ga82bf404af30875135c65e2c13ad035e5), the default bucket size is 8000 bytes.
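
To put numbers on that, here is a rough Python sketch of the effective ceiling. It assumes exactly one 8000-byte bucket goes out per 200ms delay, which is my reading of the behavior described above rather than something confirmed from the module source; the function and constant names are hypothetical.

```python
BUCKET_BYTES = 8000        # default APR bucket buffer size cited above
MIN_DELAY_MS = 200         # minimum delay enforced by the WIN32 path
REQUESTED_RATE = 250_000_000   # bytes/sec from LargeFileLimit
FILE_BYTES = 400_000_000       # the 400 MB test file


def capped_rate(bucket_bytes: int, delay_ms: int) -> int:
    """Bytes/sec if one bucket is sent per delay (assumed behavior)."""
    return bucket_bytes * 1000 // delay_ms


effective_rate = capped_rate(BUCKET_BYTES, MIN_DELAY_MS)
slowdown = REQUESTED_RATE // effective_rate
download_seconds = FILE_BYTES // effective_rate

print(effective_rate, slowdown, download_seconds)
```

If that assumption holds, the cap is 40000 bytes/sec regardless of the configured limit, which would explain why changing the bandwidth value made no visible difference.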

I'm not seeing a quick way to make the buckets bigger for file reads so it seems like the better approach is to calculate how many buckets it takes to get to the minimum timer resolution, say 10 ms, based on desired bandwidth, then pass on that number of buckets before sleeping. I haven't had a chance to work on that further today though.

sosherof · 2021-10-21T23:06:17Z

I've worked around this problem by re-writing your Windows-only section such that if the calculated sleep time is less than 10ms, another calculation determines how many buckets could be passed in 10ms and sets up a counter. The counter is decremented for each bucket passed until it reaches zero, then sleeping happens (for 10ms). If the available bandwidth changes before the counter reaches zero, the counter is updated to reflect the new rate minus the number of packets already sent. If that's less than zero (i.e. more packets already sent than the newly available bandwidth), the counter is zeroed so sleep happens after the current bucket is sent.
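
The counter logic described above could be sketched roughly as follows. This is an illustrative Python model, not the actual C code from the module: the class and function names are hypothetical, and the floor of one bucket per interval is an assumption added so that very low rates still make progress. It covers only the branch where the calculated sleep time is below 10ms.

```python
BUCKET_BYTES = 8000     # default APR bucket buffer size
INTERVAL_US = 10_000    # 10 ms sleep interval


def buckets_per_interval(rate_bytes_per_sec: int) -> int:
    """How many whole buckets fit in one interval at the given rate."""
    bytes_per_interval = rate_bytes_per_sec * INTERVAL_US // 1_000_000
    # Assumption: always allow at least one bucket per interval.
    return max(1, bytes_per_interval // BUCKET_BYTES)


class BucketBudget:
    """Tracks how many buckets may still pass before the next sleep."""

    def __init__(self, rate: int) -> None:
        self.rate = rate
        self.sent = 0
        self.remaining = buckets_per_interval(rate)

    def update_rate(self, new_rate: int) -> None:
        """Re-derive the counter when the available bandwidth changes."""
        if new_rate != self.rate:
            self.rate = new_rate
            # New budget minus what was already sent, zeroed if negative.
            self.remaining = max(0, buckets_per_interval(new_rate) - self.sent)

    def pass_bucket(self) -> bool:
        """Record one bucket; return True when the caller should sleep."""
        self.sent += 1
        if self.remaining > 0:
            self.remaining -= 1
        if self.remaining == 0:
            # Start a fresh budget for the interval after the sleep.
            self.sent = 0
            self.remaining = buckets_per_interval(self.rate)
            return True
        return False


steady = BucketBudget(250_000_000)
steady_results = [steady.pass_bucket() for _ in range(312)]

changed = BucketBudget(250_000_000)
for _ in range(100):
    changed.pass_bucket()
changed.update_rate(50_000_000)   # budget is now 62, but 100 already sent
sleep_after_change = changed.pass_bucket()
```

At 250000000 bytes/sec the budget works out to 312 buckets per 10ms, so only the 312th call asks for a sleep. In the second case the rate drops after 100 buckets have gone out, the counter is zeroed, and the very next bucket triggers the sleep.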

This has two side-effects in Windows:

1. It's not really possible to rate limit transmissions with a single small bucket. In theory a mess of clients requesting small files won't get a limit applied (think DDoS attack). This is also true when running on a non-Windows machine, as a sufficiently large bandwidth results in sleep times less than 1 microsecond for small content (though it won't take many clients to bring the sleep value above 1 microsecond again).
2. Passing along X number of buckets may take longer than 10ms if a client's own bandwidth is a limiting factor. That's also true for non-Windows but definitely more pronounced when sleeping for 10ms as this introduces a "long" pause that might not be necessary. The code could benefit from trying to determine the time elapsed between bucket passing. If the client's calculated bandwidth is already less than available bandwidth, don't sleep.

I didn't do a pull as I understand you have a new version "coming soon" but I'd be happy to share my suggested code updates.

IvnSoft · 2021-10-25T18:59:12Z

Hi,
It's been a long time since I compiled and tried this on Windows. So, @sosherof, if you managed improvements and want to share, do a pull :)

The new version has been frozen for a while (loooooooong while) due to work (or lack thereof), but it is coming.

sosherof · 2021-10-25T22:50:15Z

I'll see what I can do.. My previous experience/knowledge of git/github has been limited to posting issues and/or commenting on them.

In summary, I've really made three changes at this point:

1. Large files will result in brigades where the buckets are fixed to 8K. Each bucket read fetches an 8K DATA bucket and creates a new FILE bucket with an updated offset. I'm not sure if this is new to Apache 2.4 or what file size triggers this behavior. As I indicated, this causes a problem for Windows where you really can't split the bucket into a useful smaller size. So for Windows, the module now uses the calculated speed to determine how many packets it can pass along in 10ms without pausing. Why 10ms? Seemed like a good number. Since Windows seems to default to 15ms resolution, maybe 15ms would be a better number. I'll probably make that an easy-to-tweak constant, or even configurable, before posting my code updates. Each no-pause loop looks at the calculated sleep value and determines if the available bandwidth has changed. If it's changed, the number of packets to send before pausing (for 10ms) is adjusted up/down based on new packets-in-10ms minus packets already sent.
2. The module now makes an attempt to determine and track the client's bandwidth. This is done by storing the current time (in microseconds) at filter invocation, accumulating the number of bytes sent between client bandwidth calculations, accumulating the amount of time paused (so it can be subtracted out in the math) between client bandwidth calculations and then calculating the bandwidth. This is done roughly every 10ms. If the client's calculated bandwidth is already less than the available bandwidth, no pausing/sleeping occurs. This isn't specific to Windows but is especially important for Windows, since 1ms is the shortest sleep time and that introduces unnecessary bandwidth loss.
3. Per the documentation, and even the way this module behaves toward any downstream filters, a bucket brigade may contain only one bucket. The filter may instead get invoked multiple times. Based on that, it seemed necessary to track the above information across filter invocations using the ctx structure.
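
The client bandwidth tracking in the second and third changes could look something like the Python sketch below. It is a simplified model under stated assumptions, not the code from the PR: the class name, field names, and the rule to keep sleeping until a first sample exists are all hypothetical. Timestamps are passed in explicitly so the example is deterministic; in the module they would presumably come from something like `apr_time_now()`, and the object would live in the filter's ctx structure so it survives across invocations.

```python
from dataclasses import dataclass
from typing import Optional

SAMPLE_US = 10_000  # recompute the client's rate roughly every 10 ms


@dataclass
class ClientRateTracker:
    """Per-connection state kept across filter invocations."""
    window_start_us: int
    bytes_sent: int = 0
    paused_us: int = 0
    client_rate: Optional[float] = None  # bytes/sec, None until first sample

    def record_send(self, nbytes: int) -> None:
        self.bytes_sent += nbytes

    def record_pause(self, pause_us: int) -> None:
        self.paused_us += pause_us

    def maybe_update(self, now_us: int) -> None:
        """Recompute the rate once a full sample window has elapsed."""
        elapsed = now_us - self.window_start_us
        if elapsed < SAMPLE_US:
            return
        # Subtract time spent sleeping so it does not count against the client.
        active_us = elapsed - self.paused_us
        if active_us > 0:
            self.client_rate = self.bytes_sent * 1_000_000 / active_us
        self.window_start_us = now_us
        self.bytes_sent = 0
        self.paused_us = 0

    def should_sleep(self, available_rate: int) -> bool:
        """Skip the pause when the client is already slower than the limit."""
        # Assumption: keep throttling until a first measurement exists.
        return self.client_rate is None or self.client_rate >= available_rate


tracker = ClientRateTracker(window_start_us=0)
tracker.record_send(80_000)      # ten 8000-byte buckets
tracker.record_pause(2_000)      # 2 ms spent sleeping
tracker.maybe_update(12_000)     # 12 ms elapsed, 10 ms of it active

skip_pause = not tracker.should_sleep(250_000_000)
```

Here the client moved 80000 bytes in 10ms of active time, or 8000000 bytes/sec, which is well under the 250000000 bytes/sec limit, so no pause is needed.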

Anyway, I'm taking the crash course on git right now so hopefully I can upload my code soon.

IvnSoft · 2021-10-26T01:15:00Z

My GitHub expertise is ... basic, so don't worry! If you want, just post a diff here :)

sosherof · 2021-10-26T19:26:18Z

Ok.. I created a PR. I decided not to make the Windows sleep interval configurable, leaving it at 10ms.
