Retries…

The dreaded retry. In wireless networks we cannot avoid them. Retries happen for several reasons, from non-responsive clients to collisions and more. The retry is an integral part of ensuring the frame gets through. But at what point do retries become a problem, and how do we know? This is where we take a closer look through packet analysis and some statistical analysis of those packets.

Wireshark is a packet analysis tool that is widely used within the industry and I will be referencing it here. Additionally, I am using some packet coloring rules developed by MetaGeek that can be downloaded here (shout out to Joel Crane for posting them!). I also use the technique of exporting the data into CSV from Tony Fortunato (view the YouTube Video here) for further manipulation and maybe adding the statistics into a report for my clients.

First, we open the PCAP file in Wireshark (I am using the latest version 2.4.5 found here). Next, we need to filter on just the Data frames and identify the retry events within those frames. For this part, we can use the Display filter at the top, as shown below:

Wireshark_filter_bar

In this example I am using a compound filter expression so I see only the frames that are Data frames (wlan.fc.type_subtype == 0x20) and, by joining the two tests with the double ampersand “&&”, are also retries (wlan.fc.retry == 1). With the coloring rules I mentioned earlier, this filter gives us the following coloring in the decode window:
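To make the two halves of that filter concrete, here is a minimal Python sketch of what they test in the 2-byte 802.11 Frame Control field. The function names are my own for illustration, and the assumption is that you already have the raw Frame Control bytes in hand (no radiotap header in front of them). It is not how Wireshark itself is implemented, just the same bit tests written out.

```python
def decode_frame_control(fc_bytes: bytes) -> dict:
    """Decode the Frame Control fields that the display filter tests."""
    if len(fc_bytes) < 2:
        raise ValueError("Frame Control is two bytes long")
    first, flags = fc_bytes[0], fc_bytes[1]
    frame_type = (first >> 2) & 0x3   # 0 = Management, 1 = Control, 2 = Data
    subtype = (first >> 4) & 0xF      # e.g. 5 = Probe Response when type is 0
    return {
        "type": frame_type,
        "subtype": subtype,
        # Combined value in the style of wlan.fc.type_subtype (Data = 0x20)
        "type_subtype": (frame_type << 4) | subtype,
        # The Retry flag is bit 3 of the second (flags) byte
        "retry": bool(flags & 0x08),
    }


def matches_data_retry(fc_bytes: bytes) -> bool:
    """Equivalent of: wlan.fc.type_subtype == 0x20 && wlan.fc.retry == 1"""
    fields = decode_frame_control(fc_bytes)
    return fields["type_subtype"] == 0x20 and fields["retry"]


# A Data frame with the Retry bit set matches the filter
print(matches_data_retry(bytes([0x08, 0x08])))  # True
# A retried Probe Response has the Retry bit set but is not a Data frame
print(matches_data_retry(bytes([0x50, 0x08])))  # False
```

Note that the second frame would still match wlan.fc.retry == 1 on its own, which matters once the first half of the filter is dropped.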

Data_frame_retry

One neat thing about Wireshark is that it gives you some statistical analysis by default, in this case down on the right side of the bottom bar in the window as seen below:

Statistics_wireshark_window

What this tells us is the total number of packets in the loaded PCAP file and how many of them match our applied Display filter, that is, Data frames with the Retry bit set. Here that works out to 1.4% of the total file. This is a pretty low number and is a good indicator of the efficiency of the network from a data frame perspective, depending on the snapshot of time this capture was taken. But what happens when we add in the additional frames that are not Data only? Let’s take a look.
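The percentage in the status bar is simply displayed packets divided by total packets. A small Python sketch of that arithmetic follows; the counts in the example are hypothetical numbers chosen to reproduce 1.4%, not the actual counts from my capture.

```python
def displayed_percentage(displayed: int, total: int) -> float:
    """Share of packets matching the display filter, as a percentage."""
    if total <= 0:
        raise ValueError("total packet count must be positive")
    if not 0 <= displayed <= total:
        raise ValueError("displayed count must be between 0 and total")
    return round(100.0 * displayed / total, 1)


# Hypothetical counts: 14 retried Data frames out of 1000 captured packets
print(displayed_percentage(14, 1000))  # 1.4
```

Keep in mind the denominator is every packet in the file, not just Data frames, so this is a share of the whole capture rather than a Data-frame retry rate.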

We remove the first part of the filter, leaving only the retry test (wlan.fc.retry == 1):

Wireshark_filter_bar_2

Now let us look at one frame within the result:

Probe_response_retry_1

Here we see a Probe Response frame that is a retry. The statistical analysis provided by Wireshark is shown below:

Statistics_wireshark_window_2

What happened? Why did the statistics jump from 1.4% to 20.9%? The answer lies in how the network is operating. First, keep in mind that for every SSID there are separate Management frames sent out by the infrastructure. As seen below, in this case there are 5 separate SSIDs in the environment, as shown by the 5 different BSSID addresses in the left column:

Probe_response_SSID

Thanks to our coloring rules (purple) we know these are Probe Responses, and thanks to our filter we know they are retries. So one explanation for the jump in the number of retries is this: a client sends a Probe Request for the “ANY” SSID, every SSID answers with its own Probe Response, and when the client fails to acknowledge them, all of those Probe Responses are retried.
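To see how quickly this multiplies, here is a rough Python sketch of the arithmetic for a single AP answering one “ANY” Probe Request. The number of retries per unacknowledged response is a hypothetical input (it depends on the vendor and configuration and is not something I measured in this capture), so treat the result as an illustration only.

```python
def probe_response_frames(ssid_count: int, retries_per_response: int) -> dict:
    """Count Probe Response frames one AP sends for one wildcard Probe Request
    when the client never acknowledges any of them."""
    if ssid_count < 0 or retries_per_response < 0:
        raise ValueError("counts must not be negative")
    first_attempts = ssid_count                   # one response per SSID
    retries = ssid_count * retries_per_response   # each response is retried
    total = first_attempts + retries
    retry_share = round(100.0 * retries / total, 1) if total else 0.0
    return {
        "first_attempts": first_attempts,
        "retries": retries,
        "total": total,
        "retry_share_pct": retry_share,
    }


# 5 SSIDs as in this capture, with an assumed 3 retries per response
print(probe_response_frames(5, 3))
# {'first_attempts': 5, 'retries': 15, 'total': 20, 'retry_share_pct': 75.0}
```

Under those assumptions a single unanswered Probe Request puts 20 frames on the air, 15 of them with the Retry bit set, and that is before counting other APs that heard the same request.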

Until now we have used the built-in calculators within Wireshark, but what if we want to display the data in a different format or include it in a report as a chart? This is where we use Tony Fortunato’s technique to export the data to CSV, bring it into Excel, and build Pivot Charts. An example of the above data in a chart is shown below:

Retry_Pivot_chart
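If you would rather script the same breakdown than build it in Excel, the exported CSV can be summarized with a few lines of Python. This is only a sketch: the column names "Type/Subtype" and "Retry" and the sample rows are assumptions for illustration, since the real header depends on which columns you have displayed in Wireshark when you export.

```python
import csv
import io
from collections import Counter

# Hypothetical export; a real file would be opened with open(path, newline="")
SAMPLE_CSV = """No.,Type/Subtype,Retry
1,Probe Response,1
2,Data,0
3,Probe Response,1
4,Beacon frame,0
5,Data,1
"""


def retry_counts_by_type(csv_text, type_col="Type/Subtype", retry_col="Retry"):
    """Return {frame type: (retried frames, total frames)} from CSV text."""
    totals, retries = Counter(), Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        frame_type = row[type_col]
        totals[frame_type] += 1
        # Treat "1" or "true" (any case) as the Retry bit being set
        if row[retry_col].strip().lower() in ("1", "true"):
            retries[frame_type] += 1
    return {t: (retries[t], totals[t]) for t in totals}


for frame_type, (retried, total) in retry_counts_by_type(SAMPLE_CSV).items():
    print(f"{frame_type}: {retried} of {total} frames are retries")
```

The per-type counts are the same numbers a Pivot Chart would plot, and they make it obvious whether the retries are coming from Data frames or from Management traffic.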

All this data analysis is great, but it does not answer our original questions: at what point do retries become a problem, and how do we know? Generally speaking, the lower the retry rate, the more efficient your network is. Think of it like a conversation between two people: the less you have to repeat yourself, the more time you have to convey new information. I like to see somewhere around 10 to 15% at the most, if possible. In the example shown, 20.9% is a little higher than I’d like to see, so I might take steps to try and bring that number down. Since we have 5 SSIDs being transmitted, I would look to see if one or more of those could be eliminated, cutting down on Management retries. Depending on the situation, I might try making the AP coverage cells smaller by lowering AP TX power and, in a Cisco system, possibly using RX-SOP changes, so that AP responses to clients farther away are cut down. We can also look for “hidden nodes” and other issues within the design, and use best practices to ensure our networks are operating as efficiently as possible.
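That rule of thumb is easy to turn into a quick check for a report. The sketch below uses my 10 to 15% preference as default thresholds; the function name and the three labels are made up for illustration, and the thresholds are a personal guideline rather than any standard.

```python
def assess_retry_rate(retry_pct: float, target: float = 10.0,
                      ceiling: float = 15.0) -> str:
    """Classify a retry percentage against a personal 10 to 15% guideline."""
    if retry_pct < 0 or retry_pct > 100:
        raise ValueError("retry percentage must be between 0 and 100")
    if retry_pct <= target:
        return "within target"
    if retry_pct <= ceiling:
        return "acceptable, keep an eye on it"
    return "higher than preferred, investigate"


# The two figures from this capture
print(assess_retry_rate(1.4))   # within target
print(assess_retry_rate(20.9))  # higher than preferred, investigate
```

A label like this is only a starting point; what the retried frames actually are still decides whether the number is a real problem.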

In the end we must understand that retries serve a specific purpose and that not all retries are bad. By using packet analysis and some statistical information, we can quickly see if this is a problem or if it is just a matter of course for the network. By the way, did I tell you this example capture was extremely quiet from a data point of view, or in other words the predominance of the traffic was Management and Control frames?  Does that change the analysis? You decide.
