Basic Overview of Sending over TCP/IP
TCP/IP Sends
Ver 0.8
Section 0: Preamble
The following is a list of the steps involved in sending data from one computer to another over a TCP/IP network. It only covers the travel of a single packet; it isn't meant to cover TCP/IP handshakes or the like. This is a quick and dirty overview of what happens, and I try not to go too deep, except for listing the actual header fields for each OSI layer. Those sections may be skipped if the reader wants. This is just to give someone a general understanding of how things work; it should not be treated as an authoritative reference.
The information here, while mostly correct, should not be looked at as 100% truth, because it's not. This is BASICALLY what happens, and it will give you a better understanding of how it works. In truth it's much more complex than this, so I “lie” in a few spots to make things seem less complex. None of this information is wrong, unless I got confused while writing a part, but a few parts aren't the “whole truth”. The order of the steps is one of the main things that is not 100% true. While the order of the layers is true, since they must be traveled in order, the order of the steps inside each layer is just a rough telling.
Also note, when I got to the switching part, I started writing about routers, and got confused about which one I was writing about. I think I cleaned it up, but if something seems out of place, that's why.
NOTE: This FAQ uses the OSI model, and while knowing how it works will help, it is not needed to understand this. A quick overview of the layers is all that's needed.
Section 1: Data Flow Layers of the OSI Model.
This section will cover layers two (Data Link), three (Network) and four (Transport) of the OSI model. Layer 1 is the Physical Layer, the hardware and wires used. It doesn't need to be covered here. The seven layers are as follows:
Layers 7~5: Application Layers:
Layer 7 – Application Layer. This is the program running on your computer, be it MSN Messenger or a paint program saving a file to a remote server.
Layer 6 – Presentation Layer. This layer is the data formatting of the application. ASCII, Unicode, .bmp and .jpg are all examples from Layer 6.
Layer 5 – Session Layer. This layer is used to keep data separate between applications, and it controls what programs receive what data.
Layers 4~1: Data Flow Layers:
Layer 4 – Transport Layer. In a TCP/IP network, this is where the transport protocol, TCP, comes into play. This is also where UDP would be used if the application used UDP instead. This is also the layer where ports come into play.
Layer 3 – Network Layer. This is where IP is used, and the layer where an IP address is used.
Layer 2 – Data Link Layer. This is where the MAC address is used.
Layer 1 – Physical Layer. This is the actual hardware itself.
Except for layer 1, each layer adds a header to the data being sent. On the receiving end, as the data is handed back up the ladder to the application, each layer strips off its own header. For example, say I sent the data “Hello World”. In Layer 4, a TCP header is stacked onto it, making it [Layer4]“Hello World”. Then a layer 3 header is added, [Layer3][Layer4]“Hello World”. And finally a layer 2 header is applied, [Layer2][Layer3][Layer4]“Hello World”. It's then sent across the wire (Layer 1) to its target. Each layer reads its own header, and decides what to do with the data.
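The stacking and stripping of headers can be sketched in a few lines of Python. This is only a toy: the bracketed strings stand in for real headers, and the function names encapsulate and decapsulate are made up for this example.

```python
def encapsulate(data: str) -> str:
    """Wrap data in placeholder headers: Layer 4 first, Layer 2 last."""
    packet = data
    for layer in ("Layer4", "Layer3", "Layer2"):
        packet = f"[{layer}]" + packet
    return packet


def decapsulate(packet: str) -> str:
    """Strip the placeholder headers in the order the receiver meets them."""
    for layer in ("Layer2", "Layer3", "Layer4"):
        header = f"[{layer}]"
        if not packet.startswith(header):
            raise ValueError(f"expected {header} header")
        packet = packet[len(header):]
    return packet


frame = encapsulate('"Hello World"')
print(frame)               # [Layer2][Layer3][Layer4]"Hello World"
print(decapsulate(frame))  # "Hello World"
```

Note that the sender adds the headers from Layer 4 down to Layer 2, while the receiver removes them from Layer 2 up to Layer 4.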
Section 1a: Header Information
So what's in these headers? This part may be skipped if need be, as it covers the basic details of the TCP, IP and Data Link headers.
The TCP header is normally 20 bytes, or 160 bits. The first 16 bits are for the Source Port, also called the Calling port. The next 16 bits are for the Destination Port, or Called port. This is the port that the packet is going to (like port 80 for HTTP). Together these cover bits 0~31.
Bits 32~63 are used for the Sequence Number. It is used to keep packets in order. So if one packet were to reach the target slightly later than the packet sent just after it, the target can reassemble them into the right order. The next 32 bits, bits 64~95, are used for the Acknowledgment Number. This is used for error control and handshaking, among other things. Then for 4 bits, bits 96~99, comes the Header Length field. It's the number of 32-bit words in the header. While the “header” is normally 20 bytes, there's a header options field that can be stuck on, which is why the length has to be stated.
After the header length comes the Reserved field, taking up bits 100~105 (6 bits). It's like it says, reserved, and normally set to 0. Next are the code bits, again a 6 bit field, taking up bits 106~111. These control many things in a session, like when it's canceled/terminated. These are also called the flags, or other terms like that. Next comes the Window, a 16 bit field covering bits 112~127. This basically tells the other side how much data the sender of this packet is willing to receive before it must get an acknowledgment. After the window is a checksum field. This is a 16 bit field, and is used to tell if the packet was damaged. It covers bits 128~143.
Bits 144~159 are the final field in the TCP header, the Urgent Pointer. Normally this is unused, unless the Urgent flag in the code bits is enabled. That tells the computer to process this data right now, and the Urgent Pointer tells the computer how much data must be processed right now. It's not normally a used field. After this, there may be an options field. Its length varies, but it is always padded out to a multiple of 32 bits. This is basically a field reserved for additions to TCP. After this field, then finally comes the data.
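To make the bit layout a little more concrete, here is a rough Python sketch that packs and unpacks a 20-byte TCP header with the struct module, following the field order described above. The function names, the sequence number and the window value are made up for illustration, and the checksum is simply left at 0 rather than calculated.

```python
import struct

TCP_HEADER_FORMAT = "!HHIIHHHH"  # network byte order, 20 bytes in total
SYN = 0x02                       # one of the six code bits


def build_tcp_header(src_port, dst_port, seq, ack, flags, window,
                     checksum=0, urgent=0):
    """Pack a 20-byte TCP header with no options (checksum NOT computed)."""
    header_words = 5  # 5 x 32 bits = 20 bytes
    # 4 bits of header length, 6 reserved bits (zero), 6 code bits
    length_and_flags = (header_words << 12) | (flags & 0x3F)
    return struct.pack(TCP_HEADER_FORMAT, src_port, dst_port, seq, ack,
                       length_and_flags, window, checksum, urgent)


def parse_tcp_header(raw):
    """Unpack the first 20 bytes of a TCP segment into a dict."""
    (src_port, dst_port, seq, ack, length_and_flags,
     window, checksum, urgent) = struct.unpack(TCP_HEADER_FORMAT, raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_words": length_and_flags >> 12,
        "flags": length_and_flags & 0x3F,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


header = build_tcp_header(2020, 80, seq=1000, ack=0, flags=SYN, window=8192)
fields = parse_tcp_header(header)
print(len(header), fields["src_port"], fields["dst_port"])  # 20 2020 80
```

A real TCP stack would also fill in the checksum and may append options, which this sketch skips.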
Ok, so after the TCP header is on the data, it gets handed down to the Network Layer, Layer 3. This is where the IP addresses are added to the packet. This means I get to write down another crappy bit map of the header, yay. And again, it's 20 bytes, not counting options. But this one has even MORE fields.
So here's a list of the IP header fields. The first 4 bits, 0~3, are the Version number, like version 4 (IPv4) or version 6 (IPv6). The layout listed here is the IPv4 one. The next 4 bits, bits 4~7, are the Header Length. Just like the TCP one, it counts 32-bit words, so it basically lets the device know if the options field is attached or not. Bits 8~15 are Priority and Type of Service. The use of these fields isn't really covered here, but the names should describe the basics of them. Next comes Total Length, bits 16~31. It tells the length of the IP header, TCP header, and data together.
Next, on bits 32~47, is the Identification field. It's basically used if the packet becomes fragmented, so the target can tell which fragments belong to the same original packet. To be honest, this field still confuses me. Next come the flags; basically they tell if fragmenting may occur. It's a 3 bit field, 48~50. After that is Fragment Offset, bits 51~63. Fragmentation will be mentioned later.
Next is the 8 bit TTL field, or Time-To-Live, bits 64~71. After this comes the Protocol field, bits 72~79. It says what the packet is carrying, for example TCP, UDP, ICMP (like ping) or IGRP (a routing protocol). This is not used in this FAQ. After that comes the header checksum, bits 80~95. Bits 96~127 are the Source IP Address, and bits 128~159 are the Destination IP Address. If IP Options are used, they start at bit 160 and are padded out to a multiple of 32 bits.
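As a rough illustration of the IPv4 layout above, the following Python sketch packs a 20-byte header (no options) and fills in the header checksum using the usual ones' complement sum of 16-bit words. The function names and the default values for TTL and Identification are assumptions picked for the example, and the fragmentation fields are simply left at zero.

```python
import ipaddress
import struct

IP_HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, network byte order


def ip_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ip_header(src, dst, payload_len, ttl=64, protocol=6, ident=0):
    """Pack an IPv4 header with no options; protocol 6 means TCP."""
    version_and_length = (4 << 4) | 5  # version 4, header of 5 words
    fields = [
        version_and_length,
        0,                 # Priority / Type of Service
        20 + payload_len,  # Total Length: header plus everything after it
        ident,
        0,                 # flags and fragment offset: not fragmented
        ttl,
        protocol,
        0,                 # checksum placeholder, filled in below
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    ]
    fields[7] = ip_checksum(struct.pack(IP_HEADER_FORMAT, *fields))
    return struct.pack(IP_HEADER_FORMAT, *fields)


ip_header = build_ip_header("192.168.0.200", "192.168.0.205", payload_len=20)
print(len(ip_header))          # 20
print(ip_checksum(ip_header))  # 0, which is how a receiver checks the header
```

Running the checksum over a header that already contains a correct checksum gives 0, which is the test a receiving device applies.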
Layer 2, the Data Link Layer, has 2 sub-layers. The layer closest to the IP layer is the LLC sublayer, or Logical Link Control. There's not just one type of header for LLC, but two different ones.
The first type of LLC header is SNAP, or Subnetwork Access Protocol. The second is SAP, or Service Access Point. These are defined under the 802.2 standard. These are not used for this FAQ, and do not need to be covered here. Some links for more reading on LLC headers:
http://www.google.com/search?sourceid=navclient&ie=UTF-8&q=802%2E2
http://www.google.com/search?hl=en&lr=&safe=off&q=LLC+Header
The final header is the MAC, or Media Access Control, sublayer. This is the final header added onto a packet before it's placed on the wire. The first 8 bytes, 0~7, are a preamble, an alternating pattern of 1's and 0's to let devices know a frame is coming. After the preamble is the Destination Address. This is the MAC address, or physical address, of the target's network card. Next comes the Source Address, or the MAC address of who's sending the packet. After that comes a length field again (except in the Ethernet II standard, which uses a TYPE field instead). Then it has the payload (including the LLC header), and at the end of it all it sticks on a 4 byte FCS field, which is basically a CRC check for the whole frame.
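Here is a hedged Python sketch of an Ethernet II style frame: destination MAC, source MAC, TYPE field, payload, then a 4 byte FCS. A few assumptions to flag: the preamble is left out because the network hardware produces it, the MAC addresses are made-up 6 byte ones (the examples in this FAQ use shortened addresses), 0x0800 is the TYPE value for IPv4, and the FCS is computed with zlib.crc32, which as far as I know uses the same CRC-32 as Ethernet. Padding of short payloads is ignored.

```python
import struct
import zlib

ETHERTYPE_IPV4 = 0x0800


def mac_bytes(mac: str) -> bytes:
    """Turn 'AA-BB-CC-DD-EE-FF' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split("-"))


def build_frame(dst_mac: str, src_mac: str, ethertype: int,
                payload: bytes) -> bytes:
    """Destination, source, TYPE, payload, then the FCS over all of it."""
    body = (mac_bytes(dst_mac) + mac_bytes(src_mac)
            + struct.pack("!H", ethertype) + payload)
    fcs = struct.pack("<I", zlib.crc32(body))  # sent low byte first
    return body + fcs


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over everything except the last 4 bytes."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs


# Hypothetical 6-byte addresses, not the ones used in the examples below
frame = build_frame("10-00-00-00-00-AA", "10-00-00-00-00-00",
                    ETHERTYPE_IPV4, b"Hello World")
print(fcs_ok(frame))                       # True
damaged = frame[:20] + b"X" + frame[21:]   # flip one payload byte
print(fcs_ok(damaged))                     # False
```

The point is just that a single changed byte anywhere in the frame makes the FCS check fail, so the receiving card can throw the frame away.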
Section 2:
Sending of data:
NOTE: To summarize what's inside the TCP header (Layer 4, Transport), it holds the Source and Destination port numbers. All other data in the header has little bearing on this article. The IP header (Layer 3, Network) carries the Source and Destination IP Address. The Data Link Layer, Layer 2, holds the Source and Destination MAC addresses.
So, now that all the layers are covered, what happens when a computer talks to another computer? Let's try a simulation.
Section 2a: Sending to a Local Subnet
Setup Information:
Computer A:
IP Address: 192.168.0.200
Subnet Mask: 255.255.255.0
Default Gateway: 192.168.0.1
MAC Address: 10-00-00-00
Computer B:
IP Address: 192.168.0.205
Subnet Mask: 255.255.255.0
Default Gateway: 192.168.0.1
MAC Address: 10-00-00-AA
Computer A will be sending to computer B. After the data goes through the application layers, Layers seven (7) through five (5), it reaches the transport layer, where the TCP header is made. By looking at data from the application layers, Layer 4 decides if it should be sent using UDP or TCP. In this case, we'll be using TCP. After TCP is picked, it has to decide what port numbers to use. The destination port number is decided by the application layers, for example, port 80 if it was going to HTTP. The source port is the port that the program will listen on for a reply. This port tells the Transport layer what application gets what packet. It's generally a randomly generated number above 1024. We'll pick 2020 for the hell of it. So the TCP packet has a destination port of 80 and a source port of 2020.
Next it's handed to the Network layer. First step, it has to figure out where the packet is going, so it adds on the destination IP address. If the computer has more than 1 network card, it needs to know what card to send it on, or needs to know if it's going to send the packets on more than one line. Since we have one card in this example, the source IP address is simple. It's the IP address of the card, 192.168.0.200. Now there's one last thing to find out. Is this data going to a computer on our subnet, or a different subnet?
That's what decides whether we use the default gateway. If you do a bitwise AND of the target IP and our own subnet mask, you get 192.168.0.0; that's the network of the target computer. We then take our own IP address and our own subnet mask, and find our own network by AND'ing again. And again, it's 192.168.0.0, so we know it's a local address.
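The AND'ing can be tried out directly. Below is a small Python sketch using the standard ipaddress module; the helper names network_of and is_local are made up for this example.

```python
import ipaddress


def network_of(ip: str, mask: str) -> str:
    """Bitwise AND of an IP address and a subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def is_local(my_ip: str, my_mask: str, target_ip: str) -> bool:
    """True if the target lands on the same network as we do."""
    return network_of(my_ip, my_mask) == network_of(target_ip, my_mask)


print(network_of("192.168.0.205", "255.255.255.0"))  # 192.168.0.0
print(network_of("192.168.0.200", "255.255.255.0"))  # 192.168.0.0
print(is_local("192.168.0.200", "255.255.255.0", "192.168.0.205"))  # True
```

Note that both AND operations use the sender's own subnet mask, since the sender has no way of knowing the target's mask.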
The LLC layer does not come into play for this FAQ, so while it is used, it's not covered in this example. So now we use the MAC sublayer. The Source MAC address is simple, it's burned onto our network card. The destination MAC address can be the hard part. We need to use ARP, Address Resolution Protocol, to get the MAC address of the target IP address. First thing, check the ARP cache, to see if we already know it. For this example, we'll say we don't. So, we'll send to all the computers on our local network, and ask who has the IP address of 192.168.0.205. This sending to all computers is known as a broadcast. In our case, the broadcast IP address would be 192.168.0.255, but ARP works below IP, so the request actually goes out to the broadcast MAC address, FF-FF-FF-FF-FF-FF. When the other computers see this, the one whose IP address matches the requested one replies to the sender, telling it the MAC address for the requested IP. Notice that the sender can not send to the target computer via IP address, only MAC address. This is how almost all networks work, even if it's not called a MAC address.
So we now have the final fields for the Data Link Layer, 10-00-00-AA in the destination field and 10-00-00-00 in the source field. Next the packet goes out onto the line. Note that it's not actually sent to 10-00-00-AA, because at this point it's just electricity. All the computers on this segment see the packet. This area is known as the collision domain. ( http://www.google.com/search?sourceid=navclient&ie=UTF-8&q=collision+domain ) It's called that because if two or more devices send at once, the data collides, destroying both sends. This is also how packet sniffers work ( http://www.google.com/search?hl=en&lr=&safe=off&q=%2Bhow+%22packet+sniffers%22+work ).
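The cache-then-broadcast idea behind ARP can be sketched like this in Python. It is only a simulation: the "network" is a plain dictionary of the two computers from the setup above, and the function names are made up for this example. No real ARP packets are sent.

```python
# Simulated segment: which IP address belongs to which MAC address
HOSTS_ON_SEGMENT = {
    "192.168.0.200": "10-00-00-00",  # Computer A
    "192.168.0.205": "10-00-00-AA",  # Computer B
}


def arp_broadcast(target_ip, hosts):
    """Every host sees the request; only the owner of the IP answers."""
    for ip, mac in hosts.items():
        if ip == target_ip:
            return mac
    return None  # nobody answered


def resolve_mac(target_ip, arp_cache, hosts):
    """Check the ARP cache first, fall back to a broadcast."""
    if target_ip in arp_cache:
        return arp_cache[target_ip]
    mac = arp_broadcast(target_ip, hosts)
    if mac is not None:
        arp_cache[target_ip] = mac  # remember it for next time
    return mac


cache = {}
print(resolve_mac("192.168.0.205", cache, HOSTS_ON_SEGMENT))  # 10-00-00-AA
print(cache)  # {'192.168.0.205': '10-00-00-AA'}
```

The second lookup for the same IP address would be answered straight from the cache, with no broadcast needed.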
The network cards look at the layer 2 data, and if the destination field matches their MAC address, they read the packet. Otherwise they just ignore it. So, in this case, computer B reads the packet, and removes the now unneeded layer 2 data.
Next, it makes sure the packet is addressed to its own IP address. This is a very important step; you'll see why in the next example. In this case they match, so it knows to keep the packet. So the Layer 3 data is stripped off, and there is now no more IP data in the packet. It's on to the Transport Layer, Layer 4.
On the Transport Layer, TCP checks the destination port number, and sees if any programs are listening for that number. Think of this like picking a number at a deli. If a program is using that port, then the packet is handed to it, and it goes on through to the application layers. Otherwise the packet is dropped. And since application layers are beyond this FAQ, that's the end of this example.
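The three checks on the receiving side (MAC address, then IP address, then port) can be put together in one short Python sketch. The nested dictionaries stand in for the real headers, and the names receive and listeners are made up for this example.

```python
def receive(frame, my_mac, my_ip, listeners):
    """Walk a received frame up through layers 2, 3 and 4."""
    # Layer 2: is the frame for our network card?
    if frame["dst_mac"] != my_mac:
        return "ignored at layer 2"
    packet = frame["payload"]  # layer 2 header stripped off

    # Layer 3: is the packet for our IP address?
    if packet["dst_ip"] != my_ip:
        return "not for this host at layer 3"
    segment = packet["payload"]  # layer 3 header stripped off

    # Layer 4: is any program listening on the destination port?
    handler = listeners.get(segment["dst_port"])
    if handler is None:
        return "dropped at layer 4"
    return handler(segment["data"])


frame = {
    "dst_mac": "10-00-00-AA", "src_mac": "10-00-00-00",
    "payload": {
        "dst_ip": "192.168.0.205", "src_ip": "192.168.0.200",
        "payload": {"dst_port": 80, "src_port": 2020, "data": "Hello World"},
    },
}
listeners = {80: lambda data: f"web server got: {data}"}

print(receive(frame, "10-00-00-AA", "192.168.0.205", listeners))
# web server got: Hello World
print(receive(frame, "10-00-00-BB", "192.168.0.206", listeners))
# ignored at layer 2
```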
Next example covers what happens if the packet is sent to a computer not on our local subnet.
Section 2b: Sending to non-local subnets
Well, the Transport Layer data, TCP in this case, is handled exactly the same as in example A. It's IP, layer 3, that knows about subnets. TCP doesn't even have a clue what a subnet is. So, let's say Computer A is sending to computer C:
Computer C
IP Address 64.233.161.99
Subnet Mask: 255.255.255.255
Default Gateway:
MAC Address: AA-AA-AA-AA
Again, the Network Layer, Layer 3, first finds the network of the target, computer C, by AND'ing its IP address with Computer A's subnet mask, and gets 64.233.161.0 for the other computer's network. It then ANDs its own IP address with the same mask, and gets the 192.168.0.0 network. Since the two don't match, it needs to send the packet outside the local network. That means the packet is sent to the default gateway.
So, this is a tricky step. The IP address fields stay the same. The destination IP address is still that of computer C, 64.233.161.99. Then the packet is handed down to the Data Link layer. The data link layer knows that this is going to the default gateway, and so it looks up the MAC address of the default gateway (192.168.0.1), using ARP just like in example A. Once it has it, it sends the frame to the default gateway.
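The tricky part is that the IP header and the Data Link header end up pointing at different machines. A minimal Python sketch of the decision, using Computer A's settings from above (the function name next_hop is made up for this example):

```python
import ipaddress


def next_hop(my_ip: str, my_mask: str, gateway: str, dst_ip: str) -> str:
    """Return the IP address whose MAC address the frame must be sent to."""
    my_network = ipaddress.IPv4Network(f"{my_ip}/{my_mask}", strict=False)
    if ipaddress.IPv4Address(dst_ip) in my_network:
        return dst_ip   # local: look up the target's own MAC address
    return gateway      # not local: look up the default gateway's MAC


# Computer A sending to Computer B (local) and Computer C (not local)
print(next_hop("192.168.0.200", "255.255.255.0", "192.168.0.1",
               "192.168.0.205"))  # 192.168.0.205
print(next_hop("192.168.0.200", "255.255.255.0", "192.168.0.1",
               "64.233.161.99"))  # 192.168.0.1
```

In the second case the destination IP address in the packet is still 64.233.161.99; only the destination MAC address is that of the gateway.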
When the default gateway gets it, what it does depends on its setup. A basic switch (not a layer 3,4,8 billion switch, just a switch) can only send via MAC addresses. If this was a switch, it would never even look at the IP address, so it wouldn't be able to do anything with a packet that needs to leave the subnet. A switch doesn't divide broadcast domains, so there is no point in making a switch a default gateway. If a switch is between the default gateway and the sender, then the switch comes into play. It reads the Data Link header, looks up which of its ports the destination MAC address lives on, and forwards the frame out that port unchanged. The source MAC is still that of the sender, and the destination is still the same as before, the default gateway.
The default gateway can be a router, or a computer that acts like a router, or something else, normally with the term router in its name. Routers work on layer 3, the network layer. This is where IP addresses live, remember. The default gateway sees that the MAC address matches its own, and reads the packet. The target IP address doesn't match its own, so it knows it's got to send the packet somewhere else. So what does it do? It takes the IP addresses of its interfaces (a gateway HAS to have 2 interfaces, or it's not going to do much) and once again finds the network of each interface by AND'ing the interface's IP address with its subnet mask, and then AND'ing the destination IP address with that same subnet mask. If they match, it knows the target is right next to it, and it sends to the target. The process is much the same as before: it keeps the Network Layer (Layer 3) header with the same source and destination IP addresses (only the TTL is lowered by one and the header checksum redone), makes a new Data Link layer header using its own MAC address as the source, and sends the packet to the target.
And if it's NOT going to a computer connected to the gateway? Well, it first checks the routing table. This is basically a map of where to send data. It tells the router what interface leads to what subnets. If the routing table lists the target's subnet, the packet is sent to the next router listed for that subnet. Otherwise, the router sends it to its own default gateway, if it has one. A router does not “flood” unknown destinations out of every interface the way a switch does with unknown MAC addresses; if there's no matching route and no default, the packet is dropped. From there, the next routers handle it just like this router did, until the target receives the packet.
If, in this case, it was an HTTP request, then the reply will be sent back with the ports swapped: the target port on the reply will be the source port (the randomly generated one) from the original sender. That's why Google doesn't connect to your port 80. It's also why you don't have to open port 80 on the OUTSIDE interface of your router if you're just web browsing, and not hosting a web site.
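A routing table lookup can be sketched roughly as follows. Everything in the table is hypothetical (the interface names, the 10.0.0.0/8 route and the 172.16.0.x next hops are invented for the example). The sketch also assumes two things that real routers commonly do: the most specific matching route wins, and a 0.0.0.0/0 entry acts as the router's own default route.

```python
import ipaddress

# Hypothetical routing table: (subnet, outgoing interface, next router)
ROUTES = [
    ("192.168.0.0/24", "eth0", None),        # directly connected
    ("10.0.0.0/8", "eth1", "172.16.0.2"),
    ("0.0.0.0/0", "eth1", "172.16.0.1"),     # default route
]


def lookup(dst_ip, routes):
    """Return (interface, next router) for the most specific match."""
    dst = ipaddress.ip_address(dst_ip)
    matches = []
    for subnet, interface, next_router in routes:
        network = ipaddress.ip_network(subnet)
        if dst in network:
            matches.append((network.prefixlen, interface, next_router))
    if not matches:
        return None  # no route at all: the packet would be dropped
    _, interface, next_router = max(matches, key=lambda m: m[0])
    return interface, next_router


print(lookup("192.168.0.205", ROUTES))  # ('eth0', None)
print(lookup("64.233.161.99", ROUTES))  # ('eth1', '172.16.0.1')
```

A next router of None here means the target is directly connected, so the router uses ARP for the target itself instead of for another router.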
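The port swap on the reply is simple enough to show in a few lines of Python. The dictionary fields and the function name make_reply are made up for this example; the addresses and ports are the ones used in the examples above.

```python
def make_reply(request):
    """Swap source and destination addresses and ports for the reply."""
    return {
        "src_ip": request["dst_ip"],
        "dst_ip": request["src_ip"],
        "src_port": request["dst_port"],
        "dst_port": request["src_port"],
    }


request = {"src_ip": "192.168.0.200", "src_port": 2020,
           "dst_ip": "64.233.161.99", "dst_port": 80}
reply = make_reply(request)
print(reply["src_port"], reply["dst_port"])  # 80 2020
```

The reply comes FROM port 80 on the web server and goes TO port 2020 on the original sender, which is the port the browser is listening on.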