A comms error is marked when a controller fails to communicate with a device several times in a row. For example, a controller may mark a comms error after querying whether the device is present and failing to receive an answer on several consecutive attempts.
This issue is removed immediately if the controller receives a successful answer from the device. If the device continues to cycle in and out of comms error, the controller may show the issue "Intermittent disappearing devices", which indicates that the bus, line or devices are experiencing a periodic fault.
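The marking behaviour described above can be pictured with a minimal sketch. This is not the controller's actual implementation: the threshold of 3 failures, the `DeviceStatus` class and the `error_episodes` counter are assumptions made for illustration, since the article only states that "several" consecutive failures are needed.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real number of attempts is not documented here.
FAILURE_THRESHOLD = 3


@dataclass
class DeviceStatus:
    consecutive_failures: int = 0
    comms_error: bool = False
    error_episodes: int = 0  # how many times the device has entered comms error

    def record_query(self, answered: bool) -> None:
        if answered:
            # Any successful answer clears the comms error immediately.
            self.consecutive_failures = 0
            self.comms_error = False
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD and not self.comms_error:
            self.comms_error = True
            self.error_episodes += 1


status = DeviceStatus()
# A device that answers, drops out, recovers, then drops out again.
for answered in [True, False, False, False, True, False, False, False]:
    status.record_query(answered)
print(status.comms_error, status.error_episodes)
```

A device whose `error_episodes` keeps climbing is the kind of pattern the "Intermittent disappearing devices" issue points to.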
Example of an intermittently disappearing device
Sources of comms error
Typically, comms errors occur for the following reasons:
- The device has no mains power
- The device has no DALI power
- The device has experienced a firmware issue and is no longer responding
- The device has been damaged as a result of adverse conditions (for example, we have seen DALI circuitry blown after an LED dimmer output was mistakenly wired to the DALI bus)
- There is a physical fault (induced power, cable length/gauge issue, etc.) in its area
- There is a fault on the line that allows devices to hear commands but NOT respond to queries. This happens!
For wireless controllers, there is an additional possibility of the device being out of range or experiencing interference.
Diagnosing the source of the comms error
If the device is found to be at a different address than marked in the database, zencontrol controllers will return the device to the correct address. Therefore, if the device is not resident at any address on the cloud, it is extremely likely that it is not responding or is responding with trash.
A device responding to broadcast commands does NOT confirm that the line or device is free of faults.
Physical issues:
Observe the number of comms errors on the line. If all devices are in comms error, or the errors are localised to one area, the DALI wiring or the mains power to the devices is the most likely source.
If the comms errors appear at the end of a string of lights, check that your cable run does not exceed the length suggested in the document below:
As DALI is a communications interface, measuring the DC voltage of the bus at the end of the run is not the be-all and end-all. If the cable gauge is insufficient or the run is excessively long, the inductance of the line can make the data packet unusable by the time it reaches the end of the run. Typically, this manifests as failures to address, devices being marked as unrecognised, or clones being detected (due to trash answers to queries).
Device issues
If the comms errors are not confined to a specific area, there may be individual faults in the drivers. From time to time, we do see devices crash and become unresponsive until the next cycle of mains power.
- Check if the controller has a DALI short marked against it in the issues column. If so, locate and rectify the short. The short will typically be caused by either a line fault or a faulty device.
- Broadcast identify the line with your app to check if the devices in comms error are receiving. If they are receiving commands, there may be a line issue that prevents them from answering.
- Cycle mains power for the lights and observe if they reappear. If so, there may be a firmware issue with the device which causes it to become unresponsive.
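The checks in this section can be summarised as a small decision helper. This is only a sketch of the reasoning in this article, not a zencontrol tool: the function name `suggest_checks` and all of its parameters are hypothetical, and the observations are assumed to have been collected by hand.

```python
def suggest_checks(total_devices, devices_in_error, errors_localised,
                   dali_short_marked, respond_to_broadcast_identify,
                   recover_after_power_cycle):
    """Map observations about a line to the checks suggested in this article."""
    suggestions = []
    if dali_short_marked:
        suggestions.append("Locate and rectify the DALI short (line fault or faulty device).")
    if devices_in_error == total_devices or errors_localised:
        # Whole line or one area affected: physical causes are most likely.
        suggestions.append("Check the DALI wiring and mains power to the affected devices.")
    else:
        # Scattered errors: individual devices are the more likely cause.
        suggestions.append("Suspect individual device faults.")
    if respond_to_broadcast_identify:
        suggestions.append("Devices receive commands: look for a line issue that stops them answering.")
    if recover_after_power_cycle:
        suggestions.append("Devices return after a power cycle: suspect a device firmware issue.")
    return suggestions


# Example: 3 of 40 devices in error, scattered, and they come back after a power cycle.
for line in suggest_checks(40, 3, False, False, False, True):
    print(line)
```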
Issues With Non Compliant Devices
For non-compliant drivers (any driver that has the same EAN/serial number credentials as another device on the line), an additional possible cause of comms errors is that they have lost their DALI scene information.
In the event the device cannot be uniquely identified by the EAN/serial number combination, our controllers will use a backup identification method of storing unique numbers in the last 3 scenes on a device. This allows us to return devices to their correct address, should they lose it.
From time to time, devices without unique identification values have other problems. Some devices will lose their scene information because they misinterpret commands or fail to save their memory correctly on power loss.
This is easily seen on the controller as a growing number of comms errors. This is best described with an example:
The controller addresses 5 non-unique devices at addresses 0-4 and gives each one a database entry, storing unique values in the last 3 scenes.
Over time, the devices lose these scenes (most often, they are found to have reset values of 255). They may also lose their address at the same time.
The controller encounters these devices with nothing in the scenes and interprets them as new devices, creating 5 new database entries and moving the devices to addresses 5-9.
The original 5 entries are marked as comms error and the controller now appears to have 10 devices registered.
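The example above can be reproduced with a small simulation. This is a simplified sketch under stated assumptions, not the controller's real logic: the `Controller` class, its `commission` and `scan` methods, and the single `marker` value (standing in for the unique numbers stored in the last 3 scenes) are all hypothetical.

```python
UNSET = 255  # reset value most often found in lost scenes, per the article


class Controller:
    def __init__(self):
        self.database = {}  # entry id -> {"address": int, "comms_error": bool}
        self.next_id = 1

    def commission(self, device, address):
        entry_id = self.next_id
        self.next_id += 1
        device["address"] = address
        device["marker"] = entry_id  # stands in for the values in the last 3 scenes
        self.database[entry_id] = {"address": address, "comms_error": False}

    def scan(self, devices):
        seen = set()
        for device in devices:
            if device["marker"] not in self.database:
                # Unknown marker: treated as a new device at the next free address.
                used = {entry["address"] for entry in self.database.values()}
                free = next(a for a in range(64) if a not in used)
                self.commission(device, free)
            seen.add(device["marker"])
        for entry_id, entry in self.database.items():
            # The controller never deletes entries; unseen ones become comms errors.
            entry["comms_error"] = entry_id not in seen


controller = Controller()
devices = [{"address": None, "marker": UNSET} for _ in range(5)]
for address, device in enumerate(devices):
    controller.commission(device, address)

# Defective firmware: every device loses its scenes and its address.
for device in devices:
    device["marker"] = UNSET
    device["address"] = None

controller.scan(devices)
in_error = sum(entry["comms_error"] for entry in controller.database.values())
print(len(controller.database), in_error)
```

After the scan, the database holds 10 entries for 5 physical devices, and the 5 original entries are in comms error.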
If this occurs, the device in question has defective firmware that causes it to lose scene information. The controller does not reset scenes and maintains perfect functionality with thousands, if not millions, of non-unique devices daily.
It is important to note that the controller does not have permission to remove database entries without user intervention. Our devices carry their full history, and we cannot remove it without user permission.
We have a further mode to deal with this but it should be noted that the controller must operate with reduced functionality. If a device with this fault loses its address, it cannot be recovered.
Issues With End Device Firmware (e.g. LED Drivers, Relays, etc.)
From time to time, we encounter firmware bugs that cause devices to become unresponsive, resulting in comms errors.
This can often appear as devices that disappear over time. If you suspect that this is the case, resetting the power to the fitting will almost certainly bring them back (though some can come back with faulty memory and require readdressing, which the zencontrol controller will automatically detect in due time).
It should be noted that whilst these issues are few and far between, they are most often a result of devices being designed and tested without exposure to DALI-2 control systems, which tend to have a lot more communication occurring on the bus. Where possible, use DALI-2 certified devices, as their manufacturers are far more likely to have taken this into account when testing their firmware. Manufacturers may also reduce processing power to cut costs, which may affect a device's ability to function correctly. In situations like this, the device may not answer reliably in due time, or at all.
Typical sources of these bugs are:
- Devices going into standby mode and suffering faults during this process, consequently becoming unresponsive until they are reset
- Devices misinterpreting 24-bit DALI Control Device commands as 16-bit DALI Control Gear commands. This is most often the source of lights going to unexpected levels. For instance, a DALI light sensor report event looks exactly like a direct arc lighting command if the DALI processing of a device ignores the first 8 bits.
- Devices experiencing information loss as a result of power loss events.
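The 24-bit misinterpretation in the list above can be illustrated with a short sketch. The decoder follows the standard 16-bit control gear frame layout (an address byte whose lowest bit selects between a direct arc power level and a command, followed by a data byte). The helper names and the 24-bit value `0x8404C8` are assumptions chosen for illustration; the value is arbitrary and is not a real light sensor event encoding.

```python
def truncate_24_to_16(frame24):
    """Model a faulty device that ignores the first 8 bits of a 24-bit frame."""
    return frame24 & 0xFFFF


def decode_gear_frame(frame16):
    """Decode a 16-bit control gear forward frame into (target, kind, value)."""
    address_byte = (frame16 >> 8) & 0xFF
    data_byte = frame16 & 0xFF
    if address_byte >> 1 == 0x7F:
        target = "broadcast"
    elif address_byte & 0x80 == 0:
        target = f"short address {(address_byte >> 1) & 0x3F}"
    elif address_byte & 0xE0 == 0x80:
        target = f"group {(address_byte >> 1) & 0x0F}"
    else:
        target = "special/reserved"  # not decoded further in this sketch
    # Lowest bit of the address byte: 0 = direct arc power level, 1 = command.
    kind = "command" if address_byte & 0x01 else "direct arc power"
    return target, kind, data_byte


# Arbitrary 24-bit frame on the bus (hypothetical value, for illustration only).
frame24 = 0x8404C8
print(decode_gear_frame(truncate_24_to_16(frame24)))
```

With this value, a device that drops the first byte would treat the remaining 16 bits as a direct arc power level of 200 for short address 2, which is how a light can jump to an unexpected level.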