- maintain a window over the input stream that is divided into two parts:
search buffer | look-ahead buffer
(usually big).......(not-so-much)
- take the first part of the look-ahead buffer and scan backwards through the search buffer for the longest match; return the offset of the longest match or, if several are equally long, the most recently seen (last) one
- tokens consist of [offset, length, next symbol in look-ahead buffer]
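
The notes above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names `lz77_encode` and `lz77_decode`, the buffer sizes, and the convention that an offset of 0 means "no match" are all assumptions made for the example.

```python
def lz77_encode(data, search_size=16, lookahead_size=8):
    """Return a list of (offset, length, next_symbol) tokens (sizes are assumed)."""
    tokens = []
    pos = 0  # boundary between search buffer and look-ahead buffer
    while pos < len(data):
        best_offset, best_length = 0, 0  # offset 0 means "no match found"
        window_start = max(0, pos - search_size)
        # Keep one symbol in reserve to serve as the token's "next symbol".
        max_length = min(lookahead_size, len(data) - pos) - 1
        # Scan backwards; the strict ">" below means that among equally
        # long matches the most recently seen one is kept.
        for start in range(pos - 1, window_start - 1, -1):
            length = 0
            while length < max_length and data[start + length] == data[pos + length]:
                length += 1
            if length > best_length:
                best_offset, best_length = pos - start, length
        tokens.append((best_offset, best_length, data[pos + best_length]))
        pos += best_length + 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the original string from (offset, length, next_symbol) tokens."""
    out = []
    for offset, length, symbol in tokens:
        start = len(out) - offset
        for i in range(length):
            out.append(out[start + i])  # a copy may overlap its own output
        out.append(symbol)
    return "".join(out)


print(lz77_encode("abab"))  # [(0, 0, 'a'), (0, 0, 'b'), (2, 1, 'b')]
```

The decoder is included only so the tokens can be checked by a round trip: `lz77_decode(lz77_encode(s))` should give back `s`.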
The basic idea of CRC algorithms is simply to treat the message as an enormous binary number, to divide it by another fixed binary number, and to make the remainder from this division the checksum. Upon receipt of the message, the receiver can perform the same division and compare the remainder with the "checksum" (transmitted remainder).
Example: Suppose the message consisted of the two bytes (6,23) as in the previous example. These can be considered to be the hexadecimal number 0617, which can be considered to be the binary number 0000-0110-0001-0111. Suppose that we use a checksum register one byte wide and a constant divisor of 1001; then the checksum is the remainder after 0000-0110-0001-0111 is divided by 1001. While in this case the calculation could obviously be performed using common garden variety 32-bit registers, in the general case this is messy. So instead, we'll do the division using good ol' long division, which you learnt in school (remember?).
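
The arithmetic in this example can be checked with a short Python sketch. It assumes ordinary integer division (with borrows), exactly as the paragraph describes at this stage; the helper name `long_divide` is made up for the illustration and does not come from any library.

```python
def long_divide(dividend_bits, divisor):
    """Schoolbook binary long division of a bit string by an integer.

    Returns (quotient_bits, remainder). Uses ordinary subtraction.
    """
    quotient_bits = []
    remainder = 0
    for bit in dividend_bits:
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | int(bit)
        if remainder >= divisor:
            remainder -= divisor       # the divisor "goes into" this prefix
            quotient_bits.append("1")
        else:
            quotient_bits.append("0")
    return "".join(quotient_bits), remainder


message = bytes([6, 23])                            # hexadecimal 0617
bits = "".join(f"{byte:08b}" for byte in message)   # "0000011000010111"
quotient, checksum = long_divide(bits, 0b1001)
print(quotient, checksum)                           # 0000000010101101 2

# The receiver repeats the division and compares remainders.
assert int.from_bytes(message, "big") % 0b1001 == checksum
```

Under these assumptions the remainder, and hence the checksum, is 2 (binary 0010): 0617 hexadecimal is 1559 decimal, and 1559 = 173 * 9 + 2.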