Comments on Better Embedded System SW: Seven Deadly Sins of CRCs and Checksums

Corruption of the CRC field itself is accounted fo...

2022-08-18T20:28:01.773-04:00

Corruption of the CRC field itself is accounted for in the Hamming distance (HD). The HD covers any combination of bit corruptions in the codeword, which includes both the dataword and the Frame Check Sequence (the CRC value).
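A minimal sketch of that point, assuming a CRC-16 with polynomial 0x1021, initial value 0xFFFF, no final XOR, and the CRC sent high byte first (these parameters and all function names are illustrative choices, not taken from any particular protocol). It flips every 1-bit and 2-bit combination across the whole codeword, FCS bits included, and counts how many corrupted frames still pass the check:

```python
from itertools import combinations

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR (assumed variant)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def passes(frame: bytes) -> bool:
    """Receiver check: the last two bytes are the FCS over everything before them."""
    return crc16_ccitt(frame[:-2]) == int.from_bytes(frame[-2:], "big")

def flip(frame: bytes, positions) -> bytes:
    """Return a copy of frame with the given bit positions inverted."""
    out = bytearray(frame)
    for p in positions:
        out[p // 8] ^= 0x80 >> (p % 8)
    return bytes(out)

payload = bytes(range(1, 9))                      # hypothetical 8-byte dataword
codeword = payload + crc16_ccitt(payload).to_bytes(2, "big")
n_bits = len(codeword) * 8                        # dataword bits plus FCS bits

undetected = 0
for k in (1, 2):
    for positions in combinations(range(n_bits), k):
        if passes(flip(codeword, positions)):
            undetected += 1

print("undetected 1- and 2-bit corruptions:", undetected)
```

Flips that land only in the FCS, only in the dataword, or in both are all treated the same way here, which is the sense in which the HD applies to the whole codeword.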

What are the effects of corruptions in the CRC its...

2022-08-18T20:13:53.876-04:00

What are the effects of corruptions in the CRC itself?

I'm assuming that a single-bit corruption in the CRC makes all HD promises invalid.

If that's the case, how does this affect the error detection analysis?

Thanks for the story. This sort of issue is why m...

2020-10-08T09:20:39.483-04:00

Thanks for the story. This sort of issue is why many protocols start with a length field in a known place.

Anecdote on length field importance and rule 6: Re...

2020-10-07T23:23:54.961-04:00

Anecdote on length field importance and rule 6: I recently used CRC16-CCITT for an embedded design. The packet length is known only once the line goes silent.
I assumed the CRC would magically detect errors and wrong lengths, and that the chance of an undetected error was remote... Turns out line silence isn't always properly detected, so a 0x00 sometimes ends up (erroneously) as the last byte of the packet.
Now check this out: EVERY valid CRC16-CCITT packet with any number of extra (bogus) 0x00 bytes at the end also has a valid CRC and passes the CRC check!!
Example:
D: data byte
X: CRC byte 1
Y: CRC byte 2
0: extra (bogus) 0x00 byte

DDDDXY    <- XY is the valid CRC for DDDD
DDDDXY0   <- Y0 is a valid CRC, but X (CRC byte 1) will be taken as data!!
DDDDXY00  <- 00 is a valid CRC, but XY will be taken as data!!
DDDDXY000 <- 00 is a valid CRC, but XY0 will be taken as data!!
and so on...

At first I thought I had found a 1-in-65536 (at least) test case, which would not be so bad. Fortunately I wondered how likely it really was and wrote a Monte Carlo program to find out. Was I surprised: it happened every single time!
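A rough sketch of what such a Monte Carlo check might look like (this is not the original program). It assumes the CRC16-CCITT variant with polynomial 0x1021, initial value 0xFFFF, no final XOR, and the CRC transmitted high byte first; the helper names are made up for illustration:

```python
import random

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR (assumed variant)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def make_packet(payload: bytes) -> bytes:
    """Append the CRC, high byte first."""
    return payload + crc16_ccitt(payload).to_bytes(2, "big")

def check_packet(packet: bytes) -> bool:
    """Receiver with no length field: the last two bytes are taken as the CRC."""
    return crc16_ccitt(packet[:-2]) == int.from_bytes(packet[-2:], "big")

def trial(rng: random.Random) -> bool:
    """Build a random valid packet, append 1 to 4 bogus 0x00 bytes, run the check."""
    payload = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 32)))
    extra = rng.randrange(1, 5)
    return check_packet(make_packet(payload) + b"\x00" * extra)

rng = random.Random(1)          # fixed seed so the run is repeatable
trials = 1000
false_accepts = sum(trial(rng) for _ in range(trials))
print(f"{false_accepts} of {trials} over-long packets passed the CRC check")
```

Under these assumptions the CRC register is left holding only the low CRC byte after it absorbs the high one, and it is all zeros once both CRC bytes are absorbed, so trailing 0x00 bytes keep producing a matching "CRC". Variants with a nonzero final XOR may behave differently, so the result should not be read as a statement about every CRC-16.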

I guess I will have to change my algorithm.

Thank you for the information you have provided to the community.

Guess what happens in IEEE 802.15.4 / Zigbee.... l...

2014-03-06T01:42:47.102-05:00

Guess what happens in IEEE 802.15.4 / Zigbee.... last I checked - no CRC or checksum protection of a leading length field. Nobody I mentioned this to seemed to find it a problem. It's a disaster waiting to happen.

The most robust protocols get around these kinds of problems by using things like illegal sequences (coding or symbol violations) to insert start / end of frame markers, and then using things like bit stuffing to ensure such symbols can never appear inside a valid data frame (e.g. HDLC).
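As a small illustration of the bit-stuffing idea, here is a sketch in the style of HDLC: the flag is 01111110, and the sender inserts a 0 after every run of five 1s so the flag pattern cannot occur inside the frame body. The function names are hypothetical and the bits are kept as plain Python lists for readability:

```python
FLAG = [0, 1, 1, 1, 1, 1, 1, 0]   # frame delimiter: six consecutive 1s

def stuff(bits):
    """Sender side: insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)          # stuffed bit
            run = 0
    return out

def unstuff(bits):
    """Receiver side: drop the bit that follows every run of five 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:                   # this is the stuffed 0, discard it
            skip, run = False, 0
            continue
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            skip = True
    return out

def contains(haystack, needle):
    """True if needle occurs as a contiguous sub-list of haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

payload = [0, 1, 1, 1, 1, 1, 1, 0] + [1] * 12   # payload that imitates the flag
body = stuff(payload)
frame = FLAG + body + FLAG
print(contains(body, FLAG), unstuff(body) == payload)
```

Bit errors can still create or destroy a flag on the wire, so this addresses framing ambiguity in error-free data rather than replacing the error-detection analysis.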

The problem is always one of distinguishing control fields in the presence of bit errors. Ignoring the issue does not make it go away. Of course, redundant coding (and thus decent HD) can help, but it's not nice for bandwidth. You don't get something for nothing.

It does make the whole packet HD=3, and that is pr...

2013-07-10T08:20:15.402-04:00

It does make the whole packet HD=3, and that is probably why some protocols use the same HD for header protection as packet body protection.

I don't know why USB did it that way. But the obvious argument would be that the header is a smaller target to hit with random bit errors than the packet body. You might, for example, argue that for a particular random independent bit error rate, HD=3 on the header is good enough because it is very unlikely to get 3 errors there, while HD=4 is necessary for the packet body because the body is so big that 3 errors will happen there often enough to be a problem. To make this argument for your system you'd need to decide that random independent bit errors are the right fault model, and then do the probability math to decide whether the number of undetected header errors is low enough to be acceptable to you.
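A sketch of the first step of that probability math, assuming random independent bit errors (a binomial model). The field sizes and bit error rate below are purely hypothetical numbers chosen for illustration, not USB's actual values, and the function name is made up:

```python
def prob_at_least(n_bits: int, k_min: int, ber: float) -> float:
    """P(at least k_min bit errors in n_bits) under independent errors at rate ber.

    Sums the binomial tail directly, building each term from the previous one,
    so tiny probabilities are not lost to rounding in a '1 - P(fewer)' subtraction.
    """
    term = (1.0 - ber) ** n_bits          # P(exactly 0 errors)
    ratio = ber / (1.0 - ber)
    total = 0.0
    for k in range(1, n_bits + 1):
        term *= (n_bits - k + 1) / k * ratio   # now P(exactly k errors)
        if k >= k_min:
            total += term
    return total

BER = 1e-6            # hypothetical bit error rate
HEADER_BITS = 16      # hypothetical header size, including its check bits
BODY_BITS = 8192      # hypothetical body size, including its check bits

p_header = prob_at_least(HEADER_BITS, 3, BER)
p_body = prob_at_least(BODY_BITS, 3, BER)
print(f"P(>=3 errors in header) = {p_header:.3e}")
print(f"P(>=3 errors in body)   = {p_body:.3e}")
```

These are the probabilities that enough errors land in each field to possibly defeat an HD=3 code, not the undetected error probabilities themselves. Only a fraction of 3-bit patterns go undetected, and that fraction depends on the specific polynomial and length, so this gives an upper bound on exposure to compare against your acceptable rate.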

Just one last question... The 'CRC on the mess...

2013-07-09T19:14:59.833-04:00

Just one last question... Regarding the 'CRC on the message header' approach, I noticed that on USB the header CRC is HD 3 yet the data is HD 4. I've seen this elsewhere as well. Is there some logic behind why the header can be weaker? This effectively makes the whole packet HD 3.

Thanks

James

Right again -- this is a good example of how this...

2013-07-08T00:47:36.735-04:00

Right again -- this is a good example of how this topic gets pretty tricky if you want to plug all the gaps. (As I get time I'll post things I've run into... but time always seems to be scarce.)

Quick version: Yes, you have to think of any way in which a message can be misunderstood that can affect the length. That includes length implicit in message headers. One way to solve this is to make all messages the same length so the problem can't happen, fragmenting if necessary.

Another way is to put a CRC on the message header to detect header corruption (FlexRay does this; for a protocol that doesn't do this you could make the first byte of the payload a CRC on the header field).

A third way is to have headers that have a high Hamming distance between each other. One way to do this is, for example, to use the first few bits of the header as the actual header and the rest of the header bits as a CRC protecting those first few bits. The network protocol will just see this as a sparsely populated header space, but you'll know that it is hard for a single bit flip (or perhaps more) to accidentally change the header and therefore change the implicit length.
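A toy sketch of that third approach, assuming a one-byte header split into a 4-bit command and a 4-bit CRC using the polynomial x^4 + x + 1. The split, the polynomial, and the function names are illustrative assumptions rather than any real protocol's layout:

```python
from itertools import combinations

POLY = 0b10011   # x^4 + x + 1, an assumed 4-bit CRC polynomial

def crc4(nibble: int) -> int:
    """Remainder of nibble * x^4 divided by POLY over GF(2)."""
    reg = nibble << 4
    for bit in range(7, 3, -1):
        if reg & (1 << bit):
            reg ^= POLY << (bit - 4)
    return reg & 0xF

def encode_header(cmd: int) -> int:
    """Header byte: high nibble is the command, low nibble is its CRC."""
    return (cmd << 4) | crc4(cmd)

def decode_header(byte: int):
    """Return the command, or None if the CRC nibble does not match."""
    cmd = byte >> 4
    return cmd if crc4(cmd) == (byte & 0xF) else None

# Only 16 of the 256 possible byte values are legal headers.
codewords = [encode_header(cmd) for cmd in range(16)]

# Smallest number of bit flips that turns one legal header into another.
min_distance = min(bin(a ^ b).count("1") for a, b in combinations(codewords, 2))
print("legal headers:", [f"0x{c:02X}" for c in codewords])
print("minimum Hamming distance between headers:", min_distance)
```

With this particular choice, one or two flipped bits can never turn one legal header into another, so a corrupted command (and the length it implies) is rejected before the receiver decides where the packet's CRC should be.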

The last way I've seen it done is with a high-Hamming-distance set of synchronization bit patterns (I think this was Train Control Network), with the sync bit pattern determining which of a few message types is being used and thus implicitly the length.

Actually, I've been pondering this today and f...

2013-07-06T13:53:29.632-04:00

Actually, I've been pondering this today and framing seems to be real trouble...

For instance, in Modbus, a single flipped start bit after a packet would make the receiver see an additional FF character during the 'idle 3.5 characters', shifting the CRC check to half of the real CRC + FF.

If instead one used a high bit (address etc.) to indicate the first byte of a packet, and the CRC were directly after, two flips would get you checking the CRC at the wrong location as well.

Do you have any pointers/references/etc. for how to avoid this problem? (Or have I missed something that makes it less of an issue?)

Thanks

James

James -- this is a great point! Glad you found it...

2013-07-06T07:12:48.822-04:00

James -- this is a great point! Glad you found it sooner rather than later.

Cheers,
-- Phil

Hello, Length field is a subtle one: if your prot...

2013-07-05T23:53:20.009-04:00

Hello,

Length field is a subtle one: if your protocol has a 'command number' at the start and different commands have different lengths, one's effectively the other. Glad I realized this yesterday instead of next week when it'd have been too late...

Thanks

James