I would consider it resolved for our case. Thanks for your help
We have been testing this for the last few weeks and haven't seen the issue arise again
Hi Vladimir,
We have now split the server in two; one controlling just the KNX, the other controlling everything else.
With the KNX server using the version you sent me above, we haven't seen this issue in quite a while.
We are also seeing a problem where, when the driver goes offline, it sometimes stays offline until the server is restarted
Hi
We've run the server without a script and it fails in the same way
Here's a screenshot of Wireshark showing the disconnect
Just a thought: is it possible to run one instance of the KNX IP interface on a different machine (maybe one of your embedded servers), but run our main server on the PC?
That way we would isolate the real-time aspects of the KNX Tunnel protocol from other aspects of the Iridium server
When the fault happens, it then repeats every minute or so, and is always triggered by an internal write on the KNX bus that the Iridium driver takes too long to acknowledge
When the server is in this state, then it doesn't recover - we have to restart the server
That is the ONLY thing that we do - restart the Iridium server
Everything else remains untouched
So it does point to the fact that something has gone wrong in the server that is causing this issue
I don't believe some other PC service is causing this issue, as it is resolved immediately on restarting the Iridium server. It seems to be a state the Iridium server gets into after running for a while
If the KNX IP device sends a request disconnect, then the Iridium server MUST handle it as the Tunnel is no longer valid
Currently you carry on trying (for 40-50 seconds) during which time all reads / writes fail
In this case, you can reconnect immediately
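To make the request concrete, here is a minimal sketch of the handling being asked for, not the Iridium driver's actual code. The service type codes and frame layout are the KNXnet/IP ones as I understand them (DISCONNECT_REQUEST 0x0209, DISCONNECT_RESPONSE 0x020A); the function names and the `(reply, tunnel_still_valid)` return shape are hypothetical, chosen only for illustration.

```python
import struct

# KNXnet/IP header: header length, protocol version, service type, total length
HEADER = struct.Struct("!BBHH")
DISCONNECT_REQUEST = 0x0209
DISCONNECT_RESPONSE = 0x020A
E_NO_ERROR = 0x00


def parse_service_type(frame: bytes) -> int:
    """Validate the 6-byte KNXnet/IP header and return the service type."""
    header_len, version, service, total_len = HEADER.unpack_from(frame)
    if header_len != 0x06 or version != 0x10 or total_len != len(frame):
        raise ValueError("not a valid KNXnet/IP frame")
    return service


def build_disconnect_response(channel_id: int, status: int = E_NO_ERROR) -> bytes:
    """Header (6 bytes) + channel id + status = 8 bytes in total."""
    return HEADER.pack(0x06, 0x10, DISCONNECT_RESPONSE, 8) + bytes([channel_id, status])


def handle_frame(frame: bytes, channel_id: int):
    """Hypothetical handler: returns (reply, tunnel_still_valid).

    On a DISCONNECT_REQUEST for our channel, answer straight away and
    report the tunnel as gone so the caller can open a new one at once
    instead of retrying on the dead channel.
    """
    if parse_service_type(frame) == DISCONNECT_REQUEST and frame[6] == channel_id:
        return build_disconnect_response(channel_id), False
    return None, True
```

With handling along these lines, the caller would send the reply and start a fresh CONNECT_REQUEST as soon as `tunnel_still_valid` comes back false, rather than letting reads and writes fail for 40-50 seconds.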
Polling all devices doesn't cause the problem
It's not that a device isn't responding
The problem is that the Iridium server is slow in sending an ACK to the KNX bus when there is a write transaction between 2 devices on the bus
Are there any advanced diagnostics that can be retrieved from your driver to help diagnose what is going wrong?
Extract from the KNX/IP Tunnelling specification
The BAOS IP device is behaving exactly as expected
Looking at this in more detail, I believe the following is happening....
Line 901 - KNX bus does a write, which should be acknowledged by tunnel connection
Line 1023 - 1 second later KNX bus repeats this (as unacknowledged by tunnel)
Line 1102 - a further 1 second later (2 seconds after original write) the Tunnel connection issues a Disconnection request - since the internal write has been unacknowledged
Lines 6590 / 6591 - a further 200ms later the Iridium server responds with ACK for both the transmissions
The Iridium server then doesn't respond to the tunnel disconnection request for 30 seconds
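For reference, the ACK the IP device is waiting for in the trace above is a small fixed-size frame. The sketch below shows how a client could build it from the incoming TUNNELLING_REQUEST. It is an illustration under assumptions, not the driver's code: the service codes (0x0420 / 0x0421) and the 1 second timeout with a single repeat are the KNXnet/IP Tunnelling values as I understand them, which also match the timings in the capture, and the function name is hypothetical.

```python
import struct

TUNNELLING_REQUEST = 0x0420
TUNNELLING_ACK = 0x0421
# Assumed from the Tunnelling spec and consistent with the capture:
# the sender waits 1 s for the ACK, repeats once, then disconnects.
TUNNELLING_REQUEST_TIMEOUT_S = 1.0


def build_tunnelling_ack(request: bytes) -> bytes:
    """Build the TUNNELLING_ACK for a received TUNNELLING_REQUEST.

    The ACK echoes the channel id and sequence counter from the request's
    4-byte connection header and carries a status byte (0x00 = no error).
    """
    header_len, version, service, total_len = struct.unpack_from("!BBHH", request)
    if header_len != 0x06 or version != 0x10 or total_len != len(request):
        raise ValueError("not a valid KNXnet/IP frame")
    if service != TUNNELLING_REQUEST:
        raise ValueError("not a TUNNELLING_REQUEST")
    # Connection header: structure length, channel id, sequence counter, reserved
    conn_len, channel_id, sequence, _reserved = struct.unpack_from("!BBBB", request, 6)
    if conn_len != 0x04:
        raise ValueError("unexpected connection header length")
    header = struct.pack("!BBHH", 0x06, 0x10, TUNNELLING_ACK, 10)
    return header + bytes([0x04, channel_id, sequence, 0x00])
```

Building the ACK needs nothing beyond the first 10 bytes of the request, so it could be sent from the receive path before the cEMI payload is processed. That is the kind of isolation that would keep the reply inside the 1 second window even when the rest of the server is busy.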
So there are 2 problems
1. The Iridium Server is slow to respond sometimes, resulting in tunnel disconnection.
**NOTE** the problem isn't with the KNX bus - it's with the Iridium KNX driver
The fault occurs when the Tunnel connection reports an internal write on the bus, and the IP device is waiting for an ACK from the Iridium server - which is delayed
This problem repeats every minute (when there is an internal write on the bus)
If I restart the Iridium Server, then the behaviour goes away for 1-2 days and then re-occurs - so it's clearly a fault in the Iridium KNX driver
2. The Iridium Server doesn't honour the KNX Disconnection request - it should handle this and open a new tunnel connection if it fully implements the KNX/IP protocol
That makes sense - Thank you