Internet Checksums: How the Internet Verifies Data Integrity
The Internet checksum is used in standard Internet protocols such as IP, UDP, and TCP to verify the integrity of data after transmission across the network. The sender includes a checksum value in the header so that the receiver can use that value to verify data integrity.
How are Internet Checksums Calculated?
The method for calculating the Internet checksum is described in RFC 1071 (1988) and can be summarized as follows:
- Convert data into a series of 16-bit integers;
- Calculate the sum of all 16-bit integers, allowing for carry bit wrap-around (any carry out of the top bit is added back into the low-order end of the sum);
- Take the 1’s complement of the final sum (flip every bit).
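The three steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `internet_checksum` is hypothetical, and the sketch assumes big-endian 16-bit words and zero-padding of odd-length input. The sample bytes are the ones used in the numerical example in RFC 1071.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit Internet checksum of `data` (illustrative sketch)."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length input with a zero byte

    # Step 1 and 2: treat the data as 16-bit words and add them up.
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word

    # Carry wrap-around: fold anything above 16 bits back into the low end.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # Step 3: 1's complement of the folded sum.
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))  # 0x220d
```

The words 0001, f203, f4f5, and f6f7 add up to 2ddf0; folding the carry gives ddf2, and flipping the bits gives 220d.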
What are Internet Checksums Used For?
The checksum is placed in the header of a data segment sent across the network. When the segment reaches its final destination, the receiving machine can verify its integrity using the checksum as follows:
- Convert the data segment into a series of 16-bit integers;
- Calculate the sum of all 16-bit integers, allowing for carry bit wrap-around;
- Add the checksum to the final sum total;
- If the final total is all 1’s, no error was detected and the data is accepted;
- If any 0’s are present, the data has been corrupted.
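The receiver-side steps can be sketched the same way. The helper names `wrapped_sum` and `verify` are hypothetical, and the sketch assumes the checksum is passed separately from the data rather than being read out of a real header layout.

```python
def wrapped_sum(data: bytes) -> int:
    """Sum `data` as big-endian 16-bit words with carry wrap-around."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def verify(data: bytes, checksum: int) -> bool:
    """Return True if sum(data) + checksum is all 1's (0xFFFF)."""
    total = wrapped_sum(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)  # wrap the final carry, if any
    return total == 0xFFFF


data = bytes.fromhex("0001f203f4f5f6f7")
print(verify(data, 0x220D))                 # True
corrupted = bytes.fromhex("0101f203f4f5f6f7")
print(verify(corrupted, 0x220D))            # False
```

An all-1's result only means that no error was detected; some corruptions, such as two 16-bit words swapping places, leave the sum unchanged.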
How are Internet Checksums Validated?
Checksum calculation is a very straightforward means of verifying transmitted data. The use of 1’s complement arithmetic also allows some additional flexibility, as noted in the original RFC:
- Sum calculation can be done the same way regardless of machine endianness;
- Byte-swapping can be used to avoid word-boundary issues;
- Parallel summation is possible on 32-bit machines (32 bits was modern at the time);
- Carry bits can be deferred;
- Checksumming can be combined with data copying;
- The checksum can be updated incrementally.
Note: The sixth item, incremental updating, was revised in the later RFC 1141.
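Three of these properties can be checked with a short Python sketch: deferring carries until the end of the sum, byte-order independence, and incremental updating. The helper names are hypothetical. The incremental formula used here, ~(~HC + ~m + m'), is the form given in RFC 1624, which followed RFC 1141; treat the sketch as an illustration rather than a faithful rendering of either RFC.

```python
def fold(total: int) -> int:
    """Wrap all carries above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def swap16(word: int) -> int:
    """Swap the two bytes of a 16-bit word."""
    return ((word & 0xFF) << 8) | (word >> 8)


def incremental_update(old_checksum: int, old_word: int, new_word: int) -> int:
    """New checksum after one 16-bit word changes: ~(~HC + ~m + m')."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold(total) & 0xFFFF


words = [0x0001, 0xF203, 0xF4F5, 0xF6F7]

# Deferred carries: add everything first, fold the carries once at the end.
native = fold(sum(words))

# Byte-order independence: summing byte-swapped words gives the
# byte-swapped sum, so either endianness yields the same bytes.
swapped = fold(sum(swap16(w) for w in words))
print(swapped == swap16(native))  # True

# Incremental update: change the first word and compare against a
# full recomputation over the modified data.
old_checksum = ~native & 0xFFFF
new_words = [0x1234] + words[1:]
full = ~fold(sum(new_words)) & 0xFFFF
print(incremental_update(old_checksum, words[0], 0x1234) == full)  # True
```

Incremental updating is useful when only one header field changes in transit, since the checksum can be adjusted without summing the whole header again.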
The theory of checksum calculation covers many use cases and can seem complex to those unfamiliar with it. Below is a simple illustration of how the checksum can be calculated for a data segment of 8 bits, separated into two 4-bit words.
The final 1101 value represents the 1’s complement of the total bit sum of the segment’s data. This value is inserted into the header for use in receiver-side verification. The receiver verifies the integrity of the data in much the same way the checksum was created: by organizing the data into 16-bit words, adding all the values, and accommodating wrap-around carry bits.
The checksum is then added to that total; if the result is all 1’s the integrity of the data is verified. If there are any 0’s the data is deemed to be corrupt. The following illustration shows the receiver-side verification process of the Internet checksum:
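The 4-bit walkthrough, on both the sender and receiver side, can also be sketched in Python. The two input words here are hypothetical, chosen only so that the checksum comes out to the 1101 mentioned above; they are not taken from the original illustration. The names `wrap_sum` and `checksum` are likewise assumptions made for this sketch.

```python
WIDTH = 4                   # toy word size; the real checksum uses 16 bits
MASK = (1 << WIDTH) - 1     # 0b1111


def wrap_sum(words):
    """Add 4-bit words, wrapping any carry back into the low end."""
    total = 0
    for w in words:
        total += w
        total = (total & MASK) + (total >> WIDTH)
    return total


def checksum(words):
    """1's complement of the wrapped sum, limited to 4 bits."""
    return ~wrap_sum(words) & MASK


# Hypothetical 8-bit segment split into two 4-bit words.
words = [0b1001, 0b1000]

# Sender: 1001 + 1000 = 1 0001, wrap the carry -> 0010, flip -> 1101
cksum = checksum(words)
print(format(cksum, "04b"))      # 1101

# Receiver: wrapped sum of the data plus the checksum should be all 1's.
received = wrap_sum(words + [cksum])
print(format(received, "04b"))   # 1111
```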
The Internet checksum is a relatively lightweight error-detection mechanism that supports reliable data transport. It is crucial to the guarantees made by TCP. The checksum