TCP Selective Acknowledgments (SACK)
By stretch | Thursday, June 17, 2010 at 2:34 a.m. UTC
Last week, we examined how TCP uses sequence and acknowledgment numbers to keep track of data in bidirectional transit between two end hosts. However, we observed only an ideal TCP stream, where all packets are delivered to either end successfully, and in order. What happens if one goes missing?
The diagram below illustrates a TCP connection taking place between a client and server separated by a network. Time progresses vertically from top to bottom as packets are sent.
The client sends some request to the server, and the server formulates a response broken into four TCP segments (packets). The server transmits all four packets in response to the request. However, the second response packet is dropped somewhere on the network and never reaches the client. Let's walk through what happens.
1. Response segment #2 is lost.
2. The client receives segment #3. Upon examining the segment's sequence number, the client realizes this segment is out of order; there is data missing between the last segment received and this one. The client transmits a duplicate acknowledgment for packet #1 to alert the server that it has not received any contiguous data beyond packet #1.
3. As the server is not yet aware that anything is wrong (because it has not yet received the client's duplicate acknowledgment), it continues by sending segment #4. The client realizes that it is still missing data, and repeats its behavior from the previous step by sending another duplicate acknowledgment for packet #1.
4. The server receives the client's first duplicate acknowledgment for packet #1. Because the client has only confirmed receipt of the first of the four segments, the server must retransmit all three remaining segments in the response.
5. The second duplicate acknowledgment received from the client is ignored.
6. The client successfully receives and acknowledges the three remaining segments.
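The client's side of this walkthrough can be sketched in a few lines of Python. This is only an illustration: it numbers whole segments the way the example does (real TCP acknowledges byte sequence numbers), and the function name `cumulative_acks` is made up for this sketch.

```python
def cumulative_acks(arrivals, first_expected=1):
    """Return the ACK the client sends for each arriving segment (no SACK).

    Assumption for illustration: segments are numbered 1, 2, 3, ... and each
    ACK names the last segment received in order, as in the walkthrough.
    """
    received = set()
    last_in_order = first_expected - 1
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while last_in_order + 1 in received:
            last_in_order += 1
        acks.append(last_in_order)
    return acks


# Segment #2 is lost, so #3 and #4 each trigger a duplicate ACK for #1.
print(cumulative_acks([1, 3, 4]))  # [1, 1, 1]
```

Nothing in those repeated ACKs tells the server which later segments did arrive, which is the gap the next section addresses.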
Enter Selective Acknowledgments
You've probably noticed that this design is inefficient: although only packet #2 was lost, the server was required to retransmit packets #3 and #4 as well, because the client had no way to confirm that it had received those packets.
This problem was originally addressed by RFC 1072, and more recently by RFC 2018, by introducing the selective acknowledgment (SACK) TCP option. SACKs work by appending to a duplicate acknowledgment packet a TCP option containing a range of noncontiguous data received. In other words, it allows the client to say "I only have up to packet #1 in order, but I also have received packets #3 and #4". This allows the server to retransmit only the packet(s) that were not received by the client.
Support for SACK is negotiated at the beginning of a TCP connection; if both hosts support it, it may be used. Let's look at how our earlier example plays out with SACK enabled:
1. Response segment #2 is lost.
2. The client realizes it is missing a segment between segments #1 and #3. It sends a duplicate acknowledgment for segment #1, and attaches a SACK option indicating that it has received segment #3.
3. The client receives segment #4 and sends another duplicate acknowledgment for segment #1, but this time expands the SACK option to show that it has received segments #3 through #4.
4. The server receives the client's duplicate ACK for segment #1 and SACK for segment #3 (both in the same TCP packet). From this, the server deduces that the client is missing segment #2, so segment #2 is retransmitted. The next SACK received by the server indicates that the client has also received segment #4 successfully, so no more segments need to be transmitted.
5. The client receives segment #2 and sends an acknowledgment to indicate that it has received all data up to and including segment #4.
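The same exchange can be sketched in Python, under a few simplifying assumptions: segments are numbered as in the example instead of by byte sequence number, blocks are inclusive (first, last) pairs listed in ascending order, and the helper names `sack_blocks` and `missing_segments` are invented here. A real TCP stack tracks considerably more state than this.

```python
def sack_blocks(received, cum_ack):
    """Group segments received beyond cum_ack into inclusive (first, last) runs."""
    blocks = []
    for seg in sorted(s for s in received if s > cum_ack):
        if blocks and seg == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], seg)  # extend the current run
        else:
            blocks.append((seg, seg))          # start a new run
    return blocks


def missing_segments(cum_ack, blocks):
    """Segments the sender can infer are missing: gaps below the highest SACKed one."""
    if not blocks:
        return []
    sacked = set()
    for first, last in blocks:
        sacked.update(range(first, last + 1))
    highest = max(last for _, last in blocks)
    return [s for s in range(cum_ack + 1, highest) if s not in sacked]


# Client side: segment #2 was lost, #3 and then #4 arrive.
print(sack_blocks({1, 3}, cum_ack=1))     # [(3, 3)]
print(sack_blocks({1, 3, 4}, cum_ack=1))  # [(3, 4)]

# Server side: ACK for #1 plus SACK for #3-#4 means only #2 is needed.
print(missing_segments(1, [(3, 4)]))      # [2]
```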
Enough Theory, Here's a Capture
This packet capture contains a demonstration of SACKs in action. We know that both end hosts support selective acknowledgments by the presence of the SACK permitted option in the two SYN packets, #1 and #2.
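If you would rather check for the option programmatically than by eye, a rough Python sketch follows. It walks the raw options bytes of a SYN's TCP header looking for SACK-permitted (option kind 4, length 2). The function name and the sample bytes are made up for illustration; they are not taken from this capture.

```python
def has_sack_permitted(options):
    """Return True if the raw TCP options bytes contain SACK-permitted (kind 4)."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation: a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break            # truncated option
        length = options[i + 1]
        if length < 2:
            break            # malformed length; stop rather than loop forever
        if kind == 4 and length == 2:
            return True
        i += length
    return False


# Hypothetical SYN options: MSS, SACK-permitted, Timestamp, NOP, Window Scale.
syn_options = bytes.fromhex("020405b4" "0402" "080a0000000000000000" "01" "030307")
print(has_sack_permitted(syn_options))  # True
```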
Toward the end of the capture, we can see that packet #30 was received out of order, and the client has sent a duplicate acknowledgment in packet #31. This packet includes a SACK option indicating that the segment in packet #30 was received.
Of course, the SACK option cannot simply specify which segment(s) were received. Rather, it specifies the left and right edges of data that has been received beyond the packet's acknowledgment number. A single SACK option can specify multiple noncontiguous blocks of data (e.g. bytes 200-299 and 400-499).
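To make the edge format concrete, here is a minimal Python sketch that decodes a raw SACK option (kind 5) into its blocks. Per RFC 2018, each block is a pair of 32-bit sequence numbers, and the right edge is the sequence number just past the last byte received, so bytes 200-299 appear as the pair (200, 300). The function name is an assumption for this sketch, and the sample option is constructed by hand, not lifted from the capture.

```python
import struct

SACK_KIND = 5  # TCP option kind for SACK (RFC 2018)


def parse_sack_option(option):
    """Decode one raw SACK option (kind, length, edge pairs) into (left, right) blocks."""
    kind, length = option[0], option[1]
    if kind != SACK_KIND:
        raise ValueError("not a SACK option")
    if length != len(option) or length < 10 or (length - 2) % 8 != 0:
        raise ValueError("malformed SACK option length")
    blocks = []
    for offset in range(2, length, 8):
        # Each block is two 32-bit unsigned integers in network byte order.
        left, right = struct.unpack_from("!II", option, offset)
        blocks.append((left, right))
    return blocks


# Hand-built option covering bytes 200-299 and 400-499 (length = 8*2 + 2 = 18).
example = struct.pack("!BBIIII", SACK_KIND, 18, 200, 300, 400, 500)
print(parse_sack_option(example))  # [(200, 300), (400, 500)]
```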
We can see this duplicate acknowledgment repeated in packets #33, #35, and #37. In each, the SACK is expanded to include the noncontiguous segments the server has continued sending. Finally, the server retransmits the missing segment in packet #38, and the client updates its acknowledgment number appropriately in packet #39.
About the Author
Jeremy Stretch is a network engineer living in the Raleigh-Durham, North Carolina area. He is known for his blog and cheat sheets here at Packet Life. You can reach him by email or follow him on Twitter.
Posted in Packet Analysis
June 17, 2010 at 5:14 p.m. UTC
"We can see this duplicate acknowledgment repeated in packets #33, #5, and #37."
Should be #35
June 17, 2010 at 5:17 p.m. UTC
@Guest: Fixed, thanks for the heads up!
June 19, 2010 at 8:58 a.m. UTC
Thanks again stretch - your first TCP post was on the day of a job interview, where I needed to understand this (the company works in WAN acceleration products). I WAS asked about TCP and was able to provide an adequate reply.
I got the job!! Thanks.
Incredibly - they did also ask about SACK (you hadn't posted this yet) but I couldn't answer. They were happy enough though - they will be training me once I start.
Now I know what SACK is too.
July 1, 2010 at 5:38 a.m. UTC
But how can I enable selective acknowledgment in my OS?
August 12, 2010 at 8:09 a.m. UTC
It's probably on by default, but on Linux you can check with:
# sysctl -a | grep sack
net.ipv4.tcp_sack = 1
net.ipv4.tcp_dsack = 1
Check the sysctl(8) man page for more info (or you can alter the contents of the files in the /proc virtual filesystem directly).
Also, note tcp_dsack, which controls duplicate SACK (D-SACK), an extension that lets the receiver report data it has received more than once.
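For anyone who would rather read the values from a script, here is a small Python sketch. It relies on the Linux convention that a dotted sysctl name maps to a path under /proc/sys (net.ipv4.tcp_sack becomes /proc/sys/net/ipv4/tcp_sack). The helper name `read_sysctl` and its `root` parameter are made up for this example.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Read a sysctl value by dotted name, e.g. 'net.ipv4.tcp_sack'."""
    path = Path(root).joinpath(*name.split("."))
    return path.read_text().strip()


for key in ("net.ipv4.tcp_sack", "net.ipv4.tcp_dsack"):
    try:
        print(key, "=", read_sysctl(key))
    except OSError as exc:
        # Not Linux, or the key is not exposed on this system.
        print(key, "unavailable:", exc)
```

Changing a value still needs root, for example via sysctl -w.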
October 29, 2010 at 1:54 a.m. UTC
Great tutorial. Thanks. But in TCP, the ACK number is the next expected one. Why is the received one given here?
December 13, 2010 at 3:38 p.m. UTC
Agreed with the last one; it should show the one it is expecting, not the one already received.
March 6, 2011 at 8:05 p.m. UTC
I was wondering if the sender (in the capture example) actually uses the received SACK information in order to decide what to retransmit and when to do it.
I may be wrong but it seems to me that the retransmission of packet #38 is triggered by the fast retransmit algorithm. Wouldn't it have been the same even without the SACKs?
March 30, 2011 at 2:30 p.m. UTC
Thanks for the nice article. To further clarify the word "multiple" in:
"A single SACK option can specify multiple noncontiguous blocks of data"
the number of blocks is limited by the TCP header size, according to the RFC:
RFC 2018 - section 3. Sack Option Format
" A SACK option that specifies n blocks will have a length of 8*n+2
bytes, so the 40 bytes available for TCP options can specify a
maximum of 4 blocks. It is expected that SACK will often be used in
conjunction with the Timestamp option used for RTTM [Jacobson92],
which takes an additional 10 bytes (plus two bytes of padding); thus
a maximum of 3 SACK blocks will be allowed in this case."
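The arithmetic in that passage can be checked with a short Python sketch. The function name is made up; the 40-byte option space and the 8*n+2 formula come straight from the quoted RFC text.

```python
TCP_OPTION_SPACE = 40  # bytes available for options in a TCP header


def max_sack_blocks(other_option_bytes=0):
    """Largest n such that a SACK option of 8*n + 2 bytes fits in the space left."""
    remaining = TCP_OPTION_SPACE - other_option_bytes
    return max(0, (remaining - 2) // 8)


print(max_sack_blocks())    # 4  (SACK alone)
print(max_sack_blocks(12))  # 3  (Timestamp: 10 bytes plus 2 bytes of padding)
```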
June 10, 2011 at 6:44 a.m. UTC
Nice tutorial. And a cool CAPTCHA challenge! How did you implement this?
October 19, 2011 at 7:01 p.m. UTC
Nice description. This is what I need. If you have more of these, that would be great.
August 7, 2012 at 11:54 a.m. UTC
Awesome, dude... I want you to write about new technologies.
October 16, 2012 at 8:03 a.m. UTC
Great document, nicely described. I do have a question, which could be off-topic: why does every TCP packet in the capture have the ACK flag set? After the SYN-ACK, I was not expecting the client to send the HTTP GET with the ACK flag set. Can you clarify this?
April 13, 2013 at 11:50 p.m. UTC
Very good explanation with the capture. It helped me understand SACK. You are doing an awesome job helping people.
May 17, 2013 at 9:00 a.m. UTC
Thanks for the great explanation.
Just a question: when using SACK, as you demonstrated, the lost segment #2 is only retransmitted after all the other data has been received. Doesn't that mean segment #2 is delayed for a long time, assuming we have a lot of packets?
April 25, 2014 at 11:47 a.m. UTC
Good job with the diagram! Reading the RFC was a struggle until I found your ready-made diagram. Thanks!
June 19, 2014 at 8:39 p.m. UTC
Thanks a lot, a thorough explanation!!
June 20, 2014 at 7:45 p.m. UTC
Incredible post! This was so easy to understand the way it was presented. Thank you so much!
August 6, 2014 at 7:51 p.m. UTC
Thank you so much!
January 2, 2015 at 8:10 p.m. UTC
Really helpful to me. Thanks, guys.
February 24, 2015 at 5:44 p.m. UTC
Very good explanation of SACK. All these days I was shooting arrows in the dark trying to understand SACK.
I am really excited to know how you generated this tcpdump to demonstrate SACK.
April 5, 2015 at 4:41 p.m. UTC
Stretch, this is a very good piece of work. It is very clear. Thanks for the time and effort you put into this. Well done.
@Vinodh Kumar Every TCP segment carries an acknowledgment number and has the ACK flag turned on, as this does not add any extra overhead to TCP: the acknowledgment number field is always present in the TCP header. The only exception is the initial SYN segment used to open the connection (in the three-way handshake). Just thought I'd add this for anyone else who was curious about this.