The Data Link Layer
Table of Contents
Ethernet and MAC Addresses
Ethernet
Wireless and cellular internet access are quickly becoming some of the most common ways to connect computing devices to networks, and it's probably how you're connected right now. So you might be surprised to hear that traditional cable networks are still the most common option you find in the workplace and definitely in the data center.
The protocol most widely used to send data across individual links is known as Ethernet. Ethernet and the data link layer provide a means for software at higher levels of the stack to send and receive data.
One of the primary purposes of this layer is to essentially abstract away the need for any other layers to care about the physical layer and what hardware is in use. By dumping this responsibility on the data link layer, the Internet, transport and application layers can all operate the same no matter how the device they're running on is connected.
For the most part, though, the Ethernet in use today is comparable to the Ethernet standards as first published all those years ago.
In 1983, computer networking was very different from what it is today. One of the notable differences in LAN topology was that the switch, or switchable hub, hadn't been invented yet. This meant that many or all devices on a network frequently shared a single collision domain. You might remember from our discussion about hubs and switches that a collision domain is a network segment where only one device can speak at a time. This is because all data in a collision domain is sent to all the nodes connected to it. If two computers were to send data across the wire at the same time, the electrical currents representing our ones and zeros would literally collide, leaving the end result unintelligible.
Ethernet, as a protocol, solved this problem by using a technique known as carrier sense multiple access with collision detection, or CSMA/CD.
CSMA/CD is used to determine when the communications channel is clear and when a device is free to transmit data. The way it works is actually pretty simple. If there's no data currently being transmitted on the network segment, a node will feel free to send data. If two or more computers end up trying to send data at the same time, the computers detect this collision and stop sending data. Each device involved in the collision then waits a random interval of time before trying to send data again. This random interval helps to prevent all the computers involved in the collision from colliding again the next time they try to transmit anything.
When a network segment is a collision domain, all devices on that segment receive all communication across the entire segment. This means we need a way to identify which node a transmission was actually meant for. This is where something known as a media access control address, or MAC address, comes into play.
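The collide, wait, and retry cycle described above can be sketched as a toy simulation. This is a simplified model under stated assumptions, not how real Ethernet hardware behaves: time is divided into slots, every node wants to send in slot 0, and the random wait is drawn from a fixed window (`max_wait_slots`, a made-up parameter). The function name `simulate_csma_cd` is hypothetical as well.

```python
import random


def simulate_csma_cd(num_nodes, max_wait_slots=8, seed=0, max_slots=10_000):
    """Toy slotted model: return the slot in which each node sent its frame."""
    rng = random.Random(seed)
    # Assumption: every node has a frame ready in slot 0.
    next_attempt = {node: 0 for node in range(num_nodes)}
    sent_at = {}
    for slot in range(max_slots):
        if len(sent_at) == num_nodes:
            break
        # Nodes whose wait has expired find the link idle and start sending.
        ready = [n for n, t in next_attempt.items() if t == slot]
        if len(ready) == 1:
            # Only one sender in this slot: the frame goes through.
            sent_at[ready[0]] = slot
            del next_attempt[ready[0]]
        elif len(ready) > 1:
            # Collision: every sender stops and waits a random interval.
            for n in ready:
                next_attempt[n] = slot + 1 + rng.randrange(max_wait_slots)
    return sent_at


if __name__ == "__main__":
    for node, slot in sorted(simulate_csma_cd(4).items()):
        print(f"node {node} sent its frame in slot {slot}")
```

Because each node picks its own random wait, repeated collisions become less and less likely, and every node eventually gets a slot to itself.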
MAC Address
A MAC address is a globally unique identifier attached to an individual network interface. It's a 48-bit number normally represented by six groupings of two hexadecimal digits.
Hexadecimal: A way to represent numbers using 16 digits.
Octet: In computer networking, any number that can be represented by 8 bits.
A MAC address is split into two sections. The first three octets of a MAC address are known as the organizationally unique identifier, or OUI. These are assigned to individual hardware manufacturers by the IEEE, the Institute of Electrical and Electronics Engineers. This is a useful bit of information to keep in your back pocket, because it means that you can always identify the manufacturer of a network interface purely by its MAC address. The last three octets of a MAC address can be assigned in any way that the manufacturer would like, with the condition that they only assign each possible address once to keep all MAC addresses globally unique.
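As a rough illustration of the two sections, the sketch below splits a MAC address string into its OUI and the manufacturer-assigned part. The helper name `split_mac` and the example address are made up for this example; mapping an OUI to a manufacturer name would need the IEEE registry, which is not shown here.

```python
def split_mac(mac):
    """Return (oui, device_part) for a MAC such as '00:1a:2b:3c:4d:5e'."""
    # Accept ':' or '-' as the separator between octets.
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected 6 octets (48 bits), got {len(raw)}")
    # First three octets: organizationally unique identifier (OUI).
    oui = raw[:3].hex(":").upper()
    # Last three octets: assigned by the manufacturer.
    device_part = raw[3:].hex(":").upper()
    return oui, device_part


if __name__ == "__main__":
    oui, device_part = split_mac("00:1a:2b:3c:4d:5e")  # hypothetical address
    print("OUI:", oui)
    print("Manufacturer-assigned part:", device_part)
```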
Ethernet uses MAC addresses to ensure that the data it sends has both an address for the machine that sent the transmission, as well as the one that the transmission was intended for.
In this way, even on a network segment acting as a single collision domain, each node on that network knows when traffic is intended for it.
Unicast, Multicast, and Broadcast
Unicast
A unicast transmission is always meant for just one receiving address.
At the Ethernet level, this is done by looking at a special bit in the destination MAC address. If the least significant bit in the first octet of a destination address is set to zero, it means that Ethernet frame is intended for only the destination address. This means it would be sent to all devices on the collision domain, but only actually received and processed by the intended destination.
Multicast
If the least significant bit in the first octet of a destination address is set to one, it means you're dealing with a multicast frame. A multicast frame is similarly sent to all devices on the local network segment. What's different is that it will be accepted or discarded by each device depending on criteria other than its own hardware MAC address. Network interfaces can be configured to accept lists of configured multicast addresses for this sort of communication.
Broadcast
The third type of Ethernet transmission is known as broadcast. An Ethernet broadcast is sent to every single device on a LAN. This is accomplished by using a special destination known as a broadcast address. The Ethernet broadcast address is all Fs. Ethernet broadcasts are used so that devices can learn more about each other.
Ethernet broadcast address: FF:FF:FF:FF:FF:FF
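The three transmission types can be told apart from the destination MAC address alone. The following is a minimal sketch of that check; the function name `classify_destination` and the sample addresses are illustrative assumptions.

```python
def classify_destination(mac):
    """Classify a destination MAC as 'unicast', 'multicast', or 'broadcast'."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected 6 octets (48 bits), got {len(raw)}")
    # All Fs is the special broadcast address.
    if raw == b"\xff" * 6:
        return "broadcast"
    # Least significant bit of the first octet: 1 means multicast, 0 unicast.
    if raw[0] & 0x01:
        return "multicast"
    return "unicast"


if __name__ == "__main__":
    for address in ("00:1A:2B:3C:4D:5E", "01:00:5E:00:00:FB", "FF:FF:FF:FF:FF:FF"):
        print(address, "->", classify_destination(address))
```

The broadcast check comes first because the all-Fs address also has the multicast bit set.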
Ethernet Frame
Data Packet
A data packet is an all-encompassing term that represents any single set of binary data being sent across a network link.
The term data packet isn't tied to any specific layer or technology. It just represents a concept: one set of data being sent from point A to point B.
Data packets at the Ethernet level are known as Ethernet frames.
Ethernet Frame
An Ethernet frame is a highly structured collection of information presented in a specific order. This way, network interfaces at the physical layer can convert a string of bits travelling across a link into meaningful data, or vice versa. The fields of a frame, in order, are:
A preamble: 8 bytes or 64 bits long and can itself be split into two sections.
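Start frame delimiter (SFD): The last byte of the preamble. It signals to a receiving device that the preamble is over and that the actual frame contents will now follow.
Destination MAC address: The hardware address of the intended recipient.
Source MAC address: The hardware address of the interface that sent the frame.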
EtherType field: 16 bits long and used to describe the protocol of the contents of the frame.
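VLAN header: An optional field that can appear in place of the EtherType field. It indicates that the frame itself is what's called a VLAN frame. If a VLAN header is present, the EtherType field follows it.
Virtual LAN (VLAN): A technique that lets you have multiple logical LANs operating on the same physical equipment.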
Any frame with a VLAN tag will only be delivered out of a switch interface configured to relay that specific tag. This way you can have a single physical network that operates like it's multiple LANs. VLANs are usually used to segregate different forms of traffic. So you might see a company's IP phones operating on one VLAN, while all desktops operate on another.
Payload: In networking terms, the actual data being transported, which is everything that isn't a header.
Frame Check Sequence: A 4-byte or 32-bit number that represents a checksum value for the entire frame.
This checksum value is calculated by performing what's known as a cyclic redundancy check against the frame. A cyclic redundancy check, or CRC, is an important concept for data integrity and is used all over computing, not just in network transmissions.
Ethernet itself only reports on data integrity. It doesn't perform data recovery.
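The frame layout and the checksum check can be sketched together as follows. This is an illustration under several assumptions: the preamble and SFD are left out (software rarely sees them), minimum-size padding is ignored, the VLAN header is assumed to be an 802.1Q tag marked by the value 0x8100, and the FCS is assumed to match `zlib.crc32` stored least significant byte first. The function names and MAC addresses are made up for the example.

```python
import struct
import zlib


def build_frame(dst, src, ethertype, payload):
    """Assemble destination, source, EtherType, payload, then append the FCS."""
    body = dst + src + struct.pack("!H", ethertype) + payload
    # Assumption: FCS is CRC-32 over the frame, least significant byte first.
    return body + struct.pack("<I", zlib.crc32(body))


def parse_frame(frame):
    """Split a frame into its fields and report whether the FCS matches."""
    body, fcs = frame[:-4], frame[-4:]
    dst, src = body[0:6], body[6:12]
    (ethertype,) = struct.unpack("!H", body[12:14])
    offset, vlan_id = 14, None
    if ethertype == 0x8100:
        # VLAN header present: 16 bits of tag data, then the real EtherType.
        tag, ethertype = struct.unpack("!HH", body[14:18])
        vlan_id = tag & 0x0FFF  # low 12 bits carry the VLAN number
        offset = 18
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "vlan_id": vlan_id,
        "ethertype": ethertype,
        "payload": body[offset:],
        # Ethernet only reports a mismatch; recovery is left to higher layers.
        "fcs_ok": struct.pack("<I", zlib.crc32(body)) == fcs,
    }


if __name__ == "__main__":
    dst = bytes.fromhex("001a2b3c4d5e")  # hypothetical addresses
    src = bytes.fromhex("001a2baabbcc")
    frame = build_frame(dst, src, 0x0800, b"hello")
    print(parse_frame(frame))

    # Flip one bit in the payload to imitate corruption on the wire.
    damaged = bytearray(frame)
    damaged[14] ^= 0x01
    print("fcs_ok after corruption:", parse_frame(bytes(damaged))["fcs_ok"])
```

A receiver that sees a mismatch simply discards the frame, which matches the point that Ethernet detects corruption but does not repair it.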
References:
https://www.coursera.org/learn/computer-networking/lecture/z8FEX/ethernet-and-mac-addresses
https://www.coursera.org/learn/computer-networking/lecture/OpIS6/unicast-multicast-and-broadcast
https://www.coursera.org/learn/computer-networking/lecture/37kGv/dissecting-an-ethernet-frame