[ath9k-devel] ath9k_htc kernel driver regression affecting throughput
UsuarioAnonimo
2016-08-25 19:27:22 UTC
My primary connection to the internet has been the Alfa AWUS036NHA
wireless adapter, which uses the Atheros AR9271 802.11n chipset. The
applicable Linux driver/module and firmware is ath9k_htc. Ever since
the upgrade from Linux kernel 4.3.x to 4.4.x, the throughput of the
adapter has been severely affected, making it basically
unusable.

It will still readily connect to a distant access point. But instead of
giving me the steady and reliable download throughput it used to, in the
range of 1/3 Mb/sec to almost a full Mb/sec with no packet loss, it now
slows down very quickly after a minute or two and stalls at 1 or
2 kb/sec and 100% packet loss. If I reconnect, sometimes I will get
good throughput for another minute or two before it stalls to nothing
again.

I know this is a software issue, not hardware, because on other
distros with Linux kernel 4.3.x, installed on the same computer or on
another computer, the wifi adapter works beautifully as before.

I also now strongly suspect that this is not firmware related but
kernel driver related: that a regression or bug was introduced
into one of the .ko driver modules (whether ath9k, ath9k_htc, or other)
in kernel 4.4.x and is still present in Linux kernel 4.7.x (Arch and
Manjaro Linux).

I am surprised that this breakage of basic functionality in managed mode
has persisted for so long without other people noticing it or addressing
it. I've been banging my head against a wall for 6 months now with failed
workarounds, and have finally settled on using other wifi adapters that
have much lower throughput than the Atheros chipset, but which I can
work with to some degree.

If anybody is willing or able to address this, I would be happy to run
any tests and send any log output to try to help solve this. Thank you
for your attention.
bruce m beach
2016-08-26 02:40:50 UTC
Hello

Just to mention that I'm using kernel 4.2.2, with a 35 ft USB cable going
up a 20 ft pole into a parabolic trough, through a forest, across a river
and through another forest to my access point about a mile away, and I
normally get about 1.2 Mbit/s. The only problem I have is that the USB
overheats during the hot summer days and the USB link goes down.

I have been working on the ar9271 firmware for 8 months now and that may
have something to do with it, although I doubt it.

Bruce
bruce m beach
2016-08-31 03:52:37 UTC
I'm starting to get concerned that it will never get fixed, especially since
I can't seem to convince anybody that it's real or that it's a problem. I've
got Wireshark installed and I can watch the signal gum up quickly with
malformed packets, dup ACKs, spurious retransmissions, and terminations, but
I don't know how to translate that into identifying the problem in the
driver/module.
I tried to boot linux-4.7.2 to see if I have the same problem, but the
video is broken on Samsung Exynos (again), so I can't do anything until
the video is fixed, unless there is some way to backport 4.7.x or 4.4.x
to 4.2.2 (anybody?). Meanwhile, have you tried getting a really good wireless
link within the immediate vicinity of your wireless device? Have you tried
a different link with a completely different AP?
"I am surprised this basic functionality in managed mode has persisted for so
long without other people noticing it or addressing it"
This is the thing: if it is true, then why are there no other reports of
this behaviour? Nonetheless, I wouldn't give up; these impossible things
usually get resolved and become history as long as you stick to it.


Bruce
Oleksij Rempel
2016-08-31 15:02:58 UTC
Here are the test results I get with a recent kernel:

uname -a
Linux ultralex 4.8.0-rc3-00201-gaf56ff2 #200 SMP Sun Aug 28 14:14:36 CEST 2016 x86_64 x86_64 x86_64 GNU/Linux


./tests zwerg.local
tcp from client to server
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to zwerg.local () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.04      24.80

tcp from server to client
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to zwerg.local () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00      48.61

udp from client to server
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to zwerg.local () port 0 AF_INET : demo
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   10.02        1062      0      55.57
212992           10.02        1062             55.57

Two TCP tests at the same time, both directions:
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to zwerg.local () port 0 AF_INET : demo
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to zwerg.local () port 0 AF_INET : demo
Interim result: 49.69 10^6bits/s over 5.001 seconds ending at 1472655100.562
Interim result: 50.81 10^6bits/s over 5.001 seconds ending at 1472655105.562
Interim result: 1.25 10^6bits/s over 19.644 seconds ending at 1472655115.205
Interim result: 25.92 10^6bits/s over 9.793 seconds ending at 1472655115.355
Interim result: 1.71 10^6bits/s over 5.059 seconds ending at 1472655120.264
Interim result: 50.22 10^6bits/s over 5.015 seconds ending at 1472655120.370
Interim result: 1.67 10^6bits/s over 5.036 seconds ending at 1472655125.300
Interim result: 49.50 10^6bits/s over 5.072 seconds ending at 1472655125.442
Interim result: 1.69 10^6bits/s over 5.036 seconds ending at 1472655130.336
Interim result: 50.05 10^6bits/s over 5.001 seconds ending at 1472655130.444
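
For anyone who wants to pick the slow intervals out of a long run of "Interim result" lines like these, here is a minimal Python sketch. The function name find_stalls and the 5 Mbit/s threshold are assumptions made up for illustration, not part of netperf; the sample lines are copied from the output above.

```python
import re

# Matches netperf demo-mode lines such as
# "Interim result: 49.69 10^6bits/s over 5.001 seconds ending at 1472655100.562"
INTERIM_RE = re.compile(
    r"Interim result:\s+([\d.]+)\s+10\^6bits/s\s+over\s+([\d.]+)\s+seconds"
)


def find_stalls(output, threshold_mbit=5.0):
    """Return (rate_mbit, duration_s) pairs for intervals below threshold_mbit.

    threshold_mbit is an arbitrary cut-off chosen for this example.
    """
    # Collapse whitespace so lines wrapped by a mail client still match.
    text = re.sub(r"\s+", " ", output)
    results = [(float(r), float(d)) for r, d in INTERIM_RE.findall(text)]
    return [(r, d) for r, d in results if r < threshold_mbit]


sample = """
Interim result: 49.69 10^6bits/s over 5.001 seconds ending at
1472655100.562
Interim result: 1.25 10^6bits/s over 19.644 seconds ending at
1472655115.205
Interim result: 25.92 10^6bits/s over 9.793 seconds ending at
1472655115.355
Interim result: 1.71 10^6bits/s over 5.059 seconds ending at
1472655120.264
"""

print(find_stalls(sample))
```

Note that the two simultaneous streams are interleaved in this output, so a low figure says only that one of the two directions was slow during that interval.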


Since this is not my reference setup I can't state the absolute maximum,
but what I get in this busy network with HT20 is not bad: 50 Mbit/s!

I would suggest starting with git bisect to track down the actual source
of the regression.
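
In practice this means running "git bisect start", then marking the endpoints with "git bisect bad" and "git bisect good" (here presumably v4.4 and v4.3, going by the original report), and testing each kernel that git checks out. The Python sketch below only illustrates the search that git automates, on a made-up linear history. The commit names and the is_bad test are hypothetical, and real kernel history is a graph rather than a list.

```python
def bisect_first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # regression is at mid or earlier
        else:
            good = mid     # regression is after mid
    return commits[bad]


# Hypothetical linear history between the last good and first bad release.
history = ["v4.3"] + [f"commit-{i}" for i in range(1, 13)] + ["v4.4"]

# Pretend the regression arrived with commit-9 (made up for the demo).
# With a real kernel, is_bad would be "build, boot, and watch for the stall".
first_bad = bisect_first_bad(history, lambda c: history.index(c) >= 9)
print(first_bad)
```

Each step halves the remaining range, so even several thousand commits between two releases need only a dozen or so build-and-test rounds.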
--
Regards,
Oleksij
bruce m beach
2016-09-01 04:01:58 UTC
Oleksij

I managed to boot 4.7.2 and got on an AP. The signal strength was around
-67 dB, which is about par for this connection. Download speed was about
290 kb/s, a little bit slow, but it is raining, so it seems that I'm
getting a mostly normal connection. The only anomalies are:
1 retry: not so bad
4828 misc: which is very unusual

These numbers are coming from /proc/net/wireless
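
For anyone who wants to log these counters over time, here is a rough Python sketch of reading that file. The function name is made up, and the sample text is a stand-in: only the level, retry and misc values come from the figures above, while the interface name wlan0 and the remaining numbers are invented so that the example runs anywhere.

```python
def parse_proc_net_wireless(text):
    """Parse the contents of /proc/net/wireless into {iface: counters}."""
    fields = ["status", "link", "level", "noise",
              "nwid", "crypt", "frag", "retry", "misc", "beacon"]
    stats = {}
    for line in text.splitlines()[2:]:      # the first two lines are headers
        if ":" not in line:
            continue
        iface, rest = line.split(":", 1)
        values = rest.split()
        if len(values) < len(fields):
            continue
        row = dict(zip(fields, values))
        # Quality values may carry a trailing "."; strip it before converting.
        stats[iface.strip()] = {
            key: (val if key == "status" else int(float(val.rstrip("."))))
            for key, val in row.items()
        }
    return stats


# Stand-in for open("/proc/net/wireless").read(); see the note above about
# which of these values are invented.
sample = """\
Inter-| sta-|   Quality        |   Discarded packets               | Missed | WE
 face | tus | link level noise |  nwid  crypt   frag  retry   misc | beacon | 22
 wlan0: 0000   43.  -67.  -256        0      0      0      1   4828        0
"""

stats = parse_proc_net_wireless(sample)
print(stats["wlan0"]["retry"], stats["wlan0"]["misc"])
```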

Bruce
bruce m beach
2016-09-01 04:42:08 UTC
No. I've been online for an hour or so and everything
seems to be fine. I downloaded a 329 Mbyte file and a
169 Mbyte file, got on IRC, talked for a while, and so on.

As for firmware, I am using a personal tree, but all of my
changes over the last 8 months have always ended with a

cmp htc_9271 htc_9271.reference

with no differences. I.e. I'm not actually changing how it works.
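
The same byte-for-byte check can be scripted, for example as part of a build. This is only a sketch: the function name is made up, and the two files are throwaway stand-ins created in a temporary directory rather than real firmware images.

```python
import filecmp
import os
import tempfile


def images_identical(built, reference):
    """True if both files hold exactly the same bytes, like cmp exiting 0."""
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(built, reference, shallow=False)


with tempfile.TemporaryDirectory() as d:
    built = os.path.join(d, "htc_9271")
    ref = os.path.join(d, "htc_9271.reference")
    for path in (built, ref):
        with open(path, "wb") as f:
            f.write(b"\x00\x01stand-in firmware bytes")
    same = images_identical(built, ref)

    # Append one byte to the "built" image; the check should now fail.
    with open(built, "ab") as f:
        f.write(b"\xff")
    filecmp.clear_cache()
    differs = not images_identical(built, ref)

print(same, differs)
```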

As far as I know the firmware has been static for a year or
so, so you shouldn't worry about it. It really sounds like
you have some strange anomaly with your OS. As I say, the
only thing that is strange is the very high miscellaneous
error count, which is currently 14975.

For ifconfig I have:
RX packets:373883 errors:0 dropped:15 overruns:0 frame:0
TX packets:217048 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:562,229,747 TX bytes:19,932,871
Oleksij Rempel
2016-09-01 07:21:29 UTC
Regarding the initial report:

Please test the latest Linux master branch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/

Nobody will investigate old and probably already fixed bugs (except for
companies that get paid for this).