DragonFly kernel List (threaded) for 2005-07
Sorry for my late response.
> The algorithm is described in /usr/src/sys/netinet/tcp_subr.c, starting
> at around line 1783. It basically calculates the bandwidth delay
> product by taking the minimum observed RTT and multiplying it against
> the observed bandwidth. It's virtually impossible to calculate it any
> other way, because most of the parameters are unstable and would cause
> a positive feedback loop in the calculation to occur (== wildly unstable
Of course, I have read that code, but I doubt that the bandwidth can be
estimated accurately by this method.
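As I read the code in tcp_subr.c, the estimator divides the bytes acked by the elapsed clock ticks, folds each raw sample into the running estimate with 1/16 weight, and sets the inflight window to the smoothed bandwidth times the minimum RTT plus two segments of slop. A simplified Python sketch of that structure (the names and the constants-as-arguments are mine; this is a rough reading, not the kernel code):

```python
# Simplified sketch of the bandwidth/inflight estimator in tcp_subr.c.
# Names are mine; integer division mimics the kernel's fixed-point math.

HZ = 100  # kernel tick rate, used here only for illustration

def bw_sample(bytes_acked, ticks_elapsed, hz=HZ):
    """One raw bandwidth sample: bytes acked over elapsed clock ticks."""
    return bytes_acked * hz // max(ticks_elapsed, 1)

def smooth(snd_bandwidth, sample):
    """Fold a new sample into the running estimate with 1/16 weight."""
    return (snd_bandwidth * 15 + sample) // 16

def inflight_window(snd_bandwidth, min_rtt_ticks, maxseg, hz=HZ):
    """Bandwidth-delay product plus two segments of slop."""
    return snd_bandwidth * min_rtt_ticks // hz + 2 * maxseg
```

Note that ticks_elapsed only comes in whole ticks, which is where my concern about HZ below comes from.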
I ran further experiments and found that increasing HZ improves the
throughput. Note that bandwidth = 100Mbit/s = 12.5Mbyte/s
and RTT = 20ms in my experimental environment. Here are the times
to send 512KB of data without losses.
net.inet.tcp.inflight_enable = 0, HZ=100 167ms
net.inet.tcp.inflight_enable = 1, HZ=100 1305ms
net.inet.tcp.inflight_enable = 1, HZ=1000 398ms
net.inet.tcp.inflight_enable = 1, HZ=10000 189ms
(See http://www.demizu.org/~noritosi/memo/2005/0716/ )
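For scale, here is the back-of-the-envelope arithmetic for this setup: the bandwidth-delay product is about 250Kbyte (roughly half the transfer), and the serialization time alone is about 42ms, so the 1305ms figure at HZ=100 means the connection was running far below the pipe capacity:

```python
# Back-of-the-envelope numbers for the experimental setup above.
bandwidth = 12.5e6        # bytes/s (100 Mbit/s)
rtt = 0.020               # seconds (20 ms)
transfer = 512 * 1024     # bytes

bdp = bandwidth * rtt                  # bytes the pipe can hold
serialization = transfer / bandwidth   # pure line-rate send time

print(f"BDP            = {bdp/1024:.0f} KB")           # -> 244 KB
print(f"line-rate time = {serialization*1000:.1f} ms") # -> 41.9 ms
```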
In the second experiment, HZ=100 (1/HZ = 10ms) and RTT=20ms.
In this case, the estimated bandwidth (tp->snd_bandwidth) at the end
of the transfer was about 530Kbyte/s, while it should be 12.5Mbyte/s.
(These values are stored in kernel memory and printed out after
all data has been transferred and acked.)
On the other hand, in the fourth experiment, HZ=10000 (1/HZ = 0.1ms)
and RTT=20ms. In this case, the estimated bandwidth at the end of the
transfer was about 12.9Mbyte/s, which is close to the actual bandwidth.
Consequently, if one uses BDP limiting, I think 1/HZ should be set to
a value much smaller than the RTT.
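One way to see the problem: a bandwidth sample is bytes divided by an interval measured in whole ticks, so the measured interval can be off by about one tick, and the relative error is roughly (1/HZ)/RTT. A quick check of the worst case for the three HZ settings (my own simplification, not the kernel's arithmetic):

```python
# Worst-case relative error of an RTT-length interval measured in
# whole clock ticks: the reading can be off by about one tick.
rtt = 0.020  # 20 ms

for hz in (100, 1000, 10000):
    tick = 1.0 / hz
    worst_error = tick / rtt
    print(f"HZ={hz:5d}  tick={tick*1000:5.2f} ms  "
          f"worst-case error ~{worst_error:.1%}")
```

At HZ=100 a single sample can be off by ~50%, and because the underestimated bandwidth feeds back into the window (bwnd = bw * RTT), the errors compound downward, which would be consistent with the 530Kbyte/s I observed.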
By the way, I observed a very good trace. When HZ=10000 (1/HZ = 0.1ms)
and RTT=20ms, BDP limiting worked as expected.
(This is one of the best shots. In fact, the transfer often experienced
packet losses due to router queue overflow. Nevertheless, since the
congestion window did not grow too large, lost data were recovered
quickly.) Note that, without BDP limiting, the congestion window grows
exponentially, so it takes much longer to recover lost data as shown in
> There are a number of possible solutions here, including storing the
> bandwidth in the route table so later connections can start from the
> last observed bandwidth rather than from 0.
I think FreeBSD does this.
> Another way would be to keep
> track of the number of bandwidth calculations that have occurred and
> instead of averaging 1/16 in on each iteration the first few samples
> would be given a much bigger piece of the pie.
> Here is a patch that implements the second idea. See if it helps.
I am sorry I have not tested this patch.
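If I understand it correctly, the second idea (giving the first few samples a bigger piece of the pie) amounts to an EWMA whose gain starts at 1 and decays to the steady-state 1/16 as samples accumulate. A sketch of my own, which is an illustration of the idea and not the actual patch:

```python
def update_bandwidth(snd_bandwidth, sample, nsamples):
    """Fold a bandwidth sample into the estimate with an adaptive gain:
    the first sample is taken as-is, and the weight decays toward the
    steady-state 1/16 as more samples accumulate."""
    n = min(nsamples + 1, 16)
    return (snd_bandwidth * (n - 1) + sample) // n
```

With this scheme the estimate reaches the neighborhood of the true bandwidth after a handful of ACKs, instead of needing dozens of 1/16 steps to climb up from 0.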