
Friday, January 11, 2013

SunSpec Modbus/RTU Encapsulation over ZigBee XBee


A growing number of solar and renewable energy devices send SunSpec Modbus
maps through Digi ZigBee XBee radios. By default, they use source and destination
endpoints of 232 (0xE8), with Digi's profile 0xC105 and cluster ID 0x11.
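To make those defaults concrete, here is a minimal Python sketch of how a Modbus/RTU poll could be wrapped in an XBee API "Explicit Addressing Command" frame (frame type 0x11). The function name, the frame ID, the 0xFFFE "unknown" 16-bit address, and the zeroed broadcast radius and transmit options are illustrative assumptions. The sketch also assumes API mode without escaping, so check it against the Digi documentation for your module before relying on it.

```python
import struct

def xbee_explicit_tx_frame(dest64: bytes, payload: bytes, frame_id: int = 1) -> bytes:
    """Wrap a payload in an XBee API explicit addressing frame (type 0x11)."""
    body = struct.pack(
        ">BB8sHBBHHBB",
        0x11,      # frame type: Explicit Addressing Command
        frame_id,  # non-zero asks the local XBee for a transmit status (assumed choice)
        dest64,    # 64-bit address of the remote node
        0xFFFE,    # 16-bit network address unknown (assumed choice)
        0xE8,      # source endpoint 232
        0xE8,      # destination endpoint 232
        0x0011,    # cluster ID 0x11
        0xC105,    # Digi profile ID
        0x00,      # broadcast radius (assumed default)
        0x00,      # transmit options (assumed default)
    ) + payload
    # Checksum: 0xFF minus the low byte of the sum of all bytes after the length.
    checksum = 0xFF - (sum(body) & 0xFF)
    return b"\x7e" + struct.pack(">H", len(body)) + body + bytes([checksum])

# Poll slave 1 for 10 holding registers starting at offset 0 (8 bytes with CRC16).
poll = bytes.fromhex("01030000000ac5cd")
frame = xbee_explicit_tx_frame(bytes.fromhex("0013a20040564d47"), poll)
print(frame.hex())
```

The 8-byte poll grows to 32 bytes on the serial side, but only the 8 payload bytes count against the over-the-air packet size discussed below.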

How is packet fragmentation handled?


Since ZigBee inherently limits a single over-the-air radio packet to between 65 and 84 bytes
(depending on options), a maximum-sized Modbus/RTU message of 255 bytes will be
fragmented into at least 4 packets, with gaps of potentially dozens or hundreds of
milliseconds between them. This is more of a problem for SLAVE RESPONSES, since the
normal outgoing request (or poll) is likely only 8 bytes.

If you wish to avoid fragmentation, move data 30 registers or fewer at a time: 30 registers
produce a 65-byte Modbus/RTU response (60 data bytes plus 5 bytes of overhead), which
should generally move as a single ZigBee packet.
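The arithmetic behind those numbers can be sketched in a few lines of Python. The function names are hypothetical, and the per-packet payload is passed in as a parameter because the real limit depends on the XBee options in use.

```python
import math

def rtu_read_response_len(register_count: int) -> int:
    """Slave address + function code + byte count + 2 bytes per register + 2 bytes CRC16."""
    return 3 + 2 * register_count + 2

def fragments_needed(message_len: int, payload_per_packet: int) -> int:
    """Number of radio packets needed to carry one Modbus/RTU message."""
    return math.ceil(message_len / payload_per_packet)

def max_registers_per_packet(payload_per_packet: int) -> int:
    """Largest register count whose read response still fits in one packet."""
    return (payload_per_packet - 5) // 2

print(rtu_read_response_len(30))       # 65
print(fragments_needed(255, 84))       # 4
print(fragments_needed(255, 65))       # 4
print(max_registers_per_packet(65))    # 30
```

Even with the most generous 84-byte payload, a 255-byte message still needs 4 packets, which is why limiting polls to 30 registers is the safer rule of thumb.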

Sending Modbus/RTU in API Mode: A Digi XBee in API mode uses the ZigBee
fragmentation headers, which allow the remote receiving node to recognize the
fragments and reassemble them. This also allows faster recovery of a single lost fragment,
which means the Modbus/RTU remote should receive the correct message without a CRC16
failure and retry.

Sending Modbus/RTU in AT Transparent Mode: A Digi XBee in AT Transparent mode
treats each fragment as an unrelated packet. The XBee detects a full buffer and sends
the packet with no regard for any data that follows. A lost packet is completely
gone. For example, a Modbus/RTU response of 255 bytes might be turned into 4 packets,
and nothing stops the recreated Modbus/RTU message from consisting of only the 1st, 2nd,
and 4th packets; hopefully the CRC16 then fails, which indicates the loss of the 3rd packet.
In theory (though it is unlikely) the packets might even arrive in the wrong order, so the
response gets recreated as the 1st, 2nd, 4th, and then 3rd segment of data.
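The AT Transparent failure cases can be simulated offline. The sketch below builds a 255-byte response, splits it into 4 pieces, and checks the CRC16 of the damaged reassemblies. The 72-byte fragment size and the data pattern are arbitrary assumptions chosen only for illustration.

```python
def crc16_modbus(data: bytes) -> int:
    """Standard Modbus CRC16 (initial value 0xFFFF, reflected polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def rtu_frame_ok(frame: bytes) -> bool:
    """True if the last two bytes match the CRC16 of the rest (low byte first)."""
    if len(frame) < 4:
        return False
    return crc16_modbus(frame[:-2]) == int.from_bytes(frame[-2:], "little")

# Address, function code, byte count, 250 data bytes, then CRC16: 255 bytes in total.
body = bytes([0x01, 0x03, 250]) + bytes(range(250))
response = body + crc16_modbus(body).to_bytes(2, "little")

FRAGMENT = 72  # assumed over-the-air payload size, within the 65 to 84 byte range
parts = [response[i:i + FRAGMENT] for i in range(0, len(response), FRAGMENT)]

lost_third = parts[0] + parts[1] + parts[3]            # 3rd packet never arrives
swapped = parts[0] + parts[1] + parts[3] + parts[2]    # 3rd packet arrives last

print(len(parts), rtu_frame_ok(response))
print(len(lost_third), rtu_frame_ok(lost_third))
print(len(swapped), rtu_frame_ok(swapped))
```

The intact response passes the check. The two damaged versions are expected to fail, but a 16-bit CRC cannot guarantee that for every possible loss pattern, which is why the text above says "hopefully".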

Receiving Modbus/RTU in API Mode: The API receive frame depends on how the data
was sent. If the sender used API mode, then a single API receive frame of the
correct size should be seen. If the sender used AT Transparent mode, then multiple API
receive frames will be seen, with no hint that they are related.

Receiving Modbus/RTU in AT Transparent Mode: As with API mode, the AT
Transparent serial stream depends on how the data was sent. If the sender used API mode,
then a single serial stream without gaps should be seen; even in AT Transparent
mode, the XBee honors the ZigBee fragmentation protocol. If the sender used AT Transparent
mode, then the serial data might have gaps.

What are Your Risks?


Stale response: the number one 'unexpected' risk is a stale response. Because ZigBee can
include route-discovery events which literally delay a response by up to 5 seconds, it is
very possible for a master to receive a response to a request it has already given up on. This
conflicts with traditional serial Modbus/RTU masters, which have a default timeout of 1
second. The solution is for masters to wait at least 5 seconds for a response before
calling it a 'time-out'; the choice of API versus AT Transparent mode has little impact here.

Mismatched response: the number one-and-a-half 'unexpected' risk is a mismatched
response, which is really a side effect of the stale response risk above. Suppose a master
polls for 10 registers starting at 4x00001 from a remote ZigBee node with MAC
00:13:a2:00:40:56:4d:47!. The response will be 25 bytes long but contain NO reference
to it being from 4x00001. Suppose the master times out waiting for that response and now
sends a second poll for 10 registers, this time starting at 4x00027, to the same ZigBee node.
The response will still be 25 bytes long, and when it arrives it is IMPOSSIBLE to know whether
it holds 10 registers starting at 4x00001 or starting at 4x00027! The wrong
data can therefore be applied to the internal database, with serious consequences. The
solution is for masters to wait long enough, but also to poll different sizes. In
the above example, if the master polled for 10 registers from 4x00001 but 11 from
4x00027, then the response would be either 25 or 27 bytes respectively.
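One way a master could apply the "different sizes" trick is to keep a table of its polls and accept a response only when exactly one poll matches its length. This is a rough sketch; the poll table layout and the function names are hypothetical.

```python
from typing import Dict, Optional

def expected_response_len(register_count: int) -> int:
    """Address + function + byte count + 2 bytes per register + 2 bytes CRC16."""
    return 5 + 2 * register_count

def identify_poll(response: bytes, polls: Dict[int, int]) -> Optional[int]:
    """Return the start register of the single poll this response can belong to.

    polls maps start register -> register count. Returns None when no poll
    matches or when more than one poll matches (the ambiguous case).
    """
    if len(response) < 3:
        return None
    matches = [
        start for start, count in polls.items()
        if response[2] == 2 * count and len(response) == expected_response_len(count)
    ]
    return matches[0] if len(matches) == 1 else None

# 10 registers from 4x00001 and 11 registers from 4x00027, as in the example above.
polls = {1: 10, 27: 11}
late_response = bytes([0x01, 0x03, 20]) + bytes(22)  # 25 bytes, CRC not checked here
print(identify_poll(late_response, polls))            # 1
print(identify_poll(late_response, {1: 10, 27: 10}))  # None: same size, cannot tell
```

A late 25-byte response is then attributed to the 4x00001 poll even if the master has already moved on to 4x00027, instead of being stored under the wrong registers.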

Unexpected gaps: as mentioned above, one risk is that parts of the Modbus/RTU
message might have artificially long idle gaps between them, which is a problem since
Modbus/RTU defines end-of-message by an idle gap of 3.5 character times (about 4
milliseconds at 9600 baud). The solution is to make sure the receiving device can support
longer end-of-message settings, likely of at least 500 milliseconds. Messages sent via
XBee API mode should not have gaps; those sent by XBee AT Transparent mode will likely
have gaps.
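The effect of the end-of-message setting can be shown with a small simulation. The sketch below replays timestamped chunks of serial data through an idle-gap framer; the 120 millisecond spacing between chunks is an assumed value, not a measurement.

```python
from typing import List, Tuple

def split_on_idle_gap(chunks: List[Tuple[float, bytes]], gap_s: float) -> List[bytes]:
    """Group (arrival_time_seconds, data) chunks into messages.

    A new message starts whenever the time since the previous chunk
    exceeds gap_s, which mimics an idle-gap end-of-message detector.
    """
    messages: List[bytes] = []
    current = b""
    last_time = None
    for arrival, data in chunks:
        if last_time is not None and arrival - last_time > gap_s and current:
            messages.append(current)
            current = b""
        current += data
        last_time = arrival
    if current:
        messages.append(current)
    return messages

# One 255-byte response arriving as 4 fragments, 120 ms apart (assumed spacing).
arrivals = [(0.00, bytes(72)), (0.12, bytes(72)), (0.24, bytes(72)), (0.36, bytes(39))]

print([len(m) for m in split_on_idle_gap(arrivals, 0.004)])  # [72, 72, 72, 39]
print([len(m) for m in split_on_idle_gap(arrivals, 0.5)])    # [255]
```

With the traditional 4 millisecond gap, the receiver sees four broken messages; with a 500 millisecond setting, it sees the single 255-byte response.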

Lost segments: since a large Modbus/RTU message is split into multiple segments, large
portions of the message might be lost. This could expose bugs in receivers, since
traditionally they only see bit errors which cause the CRC16 to be bad; it is rare on a
serial line that, say, only 180 bytes of a 255-byte message arrive. Messages sent via XBee
API mode should have the lost segments retried a modest number of times, so this issue
may be hidden; those sent by XBee AT Transparent mode will always suffer from it.

Recommendations?
1. Use API Mode whenever possible.
2. Make sure receiving devices have a configurable idle-gap or end-of-message
timeout which can be set to at least 500 milliseconds.
3. Default master requests to a timeout of at least 5 seconds.
4. Avoid sending read requests with different offsets but the same register count to the
same remote. If required, read a few extra registers and have your master confirm that
the BYTE count matches the expected size, based on the requested register count.
5. To overcome the longer timeout, design your master to 'probe' the incoming
response to detect end-of-message. For example, a response to a
read-multiple-holding-registers request will always have the byte count as the third byte.
So this simple paradigm can be considered:
   o Wait for at least 3 bytes to be seen.
   o Take the 3rd byte (byte count) and add 5 to it (the slave address, function code,
     byte count, and 2 bytes of CRC). This is the estimated response length.
   o Wait for the estimated byte count. When seen, immediately test the CRC16 and
     exit, or handle a response timeout if required.
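The probe paradigm in recommendation 5 could be sketched as follows. The function name and the status strings are hypothetical. The exception-response branch is an addition beyond the three steps above: a Modbus exception reply sets the high bit of the function code and is always 5 bytes long, so its third byte is not a byte count.

```python
from typing import Optional, Tuple

def crc16_modbus(data: bytes) -> int:
    """Standard Modbus CRC16 (initial value 0xFFFF, reflected polynomial 0xA001)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def probe_read_response(buffer: bytes) -> Tuple[str, Optional[bytes]]:
    """Probe the bytes received so far for a complete read-registers response.

    Returns ("incomplete", None) while more bytes are needed, otherwise
    ("ok", frame) or ("bad_crc", frame).
    """
    if len(buffer) < 3:
        return "incomplete", None        # step 1: need at least 3 bytes
    if buffer[1] & 0x80:
        expected = 5                     # exception response: fixed 5 bytes
    else:
        expected = buffer[2] + 5         # step 2: byte count plus 5 bytes of overhead
    if len(buffer) < expected:
        return "incomplete", None        # step 3: keep waiting for the estimate
    frame = buffer[:expected]
    received_crc = int.from_bytes(frame[-2:], "little")
    status = "ok" if crc16_modbus(frame[:-2]) == received_crc else "bad_crc"
    return status, frame

# A 10-register response: 3 header bytes, 20 data bytes, 2 CRC bytes.
body = bytes([0x01, 0x03, 20]) + bytes(20)
response = body + crc16_modbus(body).to_bytes(2, "little")
print(probe_read_response(response[:12])[0])  # incomplete
print(probe_read_response(response)[0])       # ok
```

A master would call this each time new bytes arrive and fall back to the 5 second response timeout only while the status stays "incomplete", so a healthy response is accepted as soon as its last byte is in.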

Future Improvements?
Although I have not done any work on this yet, I have held some discussion with others
about creating a new Modbus 'dialect' to overcome the largest risks. For example, adding a
LENGTH field and SEQUENCE NUMBER would overcome most of the risks listed
above.

A secondary expansion would be to enable what I would call a sticky poll. Imagine I
wish to read 7 Modbus registers every 5 seconds. If I send a poll every 5 seconds, and I
receive 1 response each time, I have consumed 2 units of radio bandwidth. What if I could
send the poll once in a special format, then have the remote node refresh the response
automatically every 5 seconds until some future event (or time period) stops it? Where is
the value? Imagine I have 200 Modbus devices to poll: being able to immediately cut the
traffic in half means I can either poll the data twice as fast or poll double the number of
remotes.

Resources:
1. SunSpec ( http://www.sunspec.org/ ), which is creating standard Modbus and XML upload
specs for renewable energy.
2. Modbus Organization ( http://www.modbus.org/ )
3. Lynn's Modbus Protocol Information ( http://iatips.com/modbus.html )