Author Topic: Ethernet Failure - BRX Stops Communicating  (Read 6304 times)

AC

  • Jr. Member
  • **
  • Posts: 10
  • I know what I know, and I don't know, what I don't
Ethernet Failure - BRX Stops Communicating
« on: January 11, 2022, 10:08:36 AM »
We are using many BRX controllers in the field and communicate with them over Modbus from our software. Once a month or so a BRX will completely stop communicating and has to be physically reset (power cycled) to fix the issue. I had spoken with a tech who said a similar issue had been fixed in the latest patch, but still, once or twice a week (three if I'm really lucky), I get the luxury of going out to reset these bad boys. Is there something I can do, or do any of you have suggestions? I have tried the following, and it seems to cut down on it sometimes, but I need a real fix, not a band-aid.

I used an MRX instruction to read from another PLC; the Success bit is required to keep a timer from running. Once the Success bit is off, the timer begins (10 min). When the timer times out, the following occurs:

INC - Reboots (D0) Number of times this instruction has run
INC - HoldOut (D1) Number of times this hour the PLC has tried to reboot
RSTT - Resets the Timer
REBOOT - Reboots the PLC

Once an hour I use the Delta ($Now.Hour) instruction to reset the HoldOut register (MOVE 0 to D1). If HoldOut reaches 2, it locks out the timer until the hour has passed.
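
For readers who want to reason about the watchdog outside the PLC, the logic described above can be sketched in Python. This is only a simulation of the rung logic as described in this post, not Do-more code; the class name `CommWatchdog`, the `tick` method, and the one-step-per-call timing model are assumptions made for illustration.

```python
from dataclasses import dataclass

TIMEOUT_S = 10 * 60        # comms must be down this long before a reboot
MAX_REBOOTS_PER_HOUR = 2   # HoldOut limit described in the post


@dataclass
class CommWatchdog:
    reboots: int = 0       # D0: total reboots requested
    hold_out: int = 0      # D1: reboots requested during the current hour
    timer_s: int = 0       # accumulated time without a successful read
    last_hour: int = -1    # used to detect the hour rollover

    def tick(self, success: bool, hour: int, dt_s: int = 1) -> bool:
        """Advance one step; return True when a reboot should be issued."""
        if hour != self.last_hour:       # stands in for Delta($Now.Hour)
            self.hold_out = 0            # MOVE 0 to D1
            self.last_hour = hour
        if success or self.hold_out >= MAX_REBOOTS_PER_HOUR:
            self.timer_s = 0             # timer held off, or locked out
            return False
        self.timer_s += dt_s
        if self.timer_s < TIMEOUT_S:
            return False
        self.reboots += 1                # INC D0
        self.hold_out += 1               # INC D1
        self.timer_s = 0                 # RSTT
        return True                      # REBOOT


if __name__ == "__main__":
    wd = CommWatchdog()
    # One hour of failed reads, sampled once a minute.
    fired = [wd.tick(success=False, hour=0, dt_s=60) for _ in range(60)]
    print("reboots requested in the first hour:", sum(fired))
```

Stepping the model through a full hour of failed reads is a quick way to check that the lockout caps the number of reboots before the hour rolls over.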

The problem with this solution is that in most cases I am running pumps and other equipment. It's a real pain in the butt to have to shut down pumps in the middle of what they are doing just because I can't rely on the communications. I have this problem with NO OTHER PLCs on our property. We use MicroLogix, PLCnext from Phoenix, and Sixnet RTUs, all of which work fine. Just the BRX has this problem. Thanks again for any suggestions!
« Last Edit: January 11, 2022, 10:16:56 AM by AC »
Never be too busy that you miss everything there is to be thankful for.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #1 on: January 11, 2022, 02:52:09 PM »
If there is still an issue in the firmware, we very much want to fix it. We've tried many ways to duplicate it, but so far aren't able to do so.

Is it the MRX/MWX instructions that are failing (PLC talking to other things), or the Modbus/TCP server (other things talking to the PLC)?

When it fails, are other comms possible, or is all Ethernet comm dead?

Would it be possible for us to get your PLC program and a basic description of what it is interfacing with that stops working?
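
One way to answer the "is all Ethernet dead, or only one service?" question from a remote PC is a plain TCP connect probe. The sketch below is an assumption-laden illustration, not a vendor tool: `probe` and `classify` are hypothetical helper names, port 502 is the standard Modbus TCP port, and any other port you probe is something you would have to choose for your own site. A "refused" result still means the PLC's IP stack answered, which is useful evidence that the port itself is alive.

```python
import socket
from typing import Dict


def probe(host: str, port: int, timeout_s: float = 3.0) -> str:
    """Try one TCP connect and report how the remote end reacted."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "open"
    except ConnectionRefusedError:
        return "refused"        # host answered, nothing listening
    except (socket.timeout, TimeoutError):
        return "timeout"        # no answer at all within timeout_s
    except OSError:
        return "unreachable"    # routing or local network error


def classify(results: Dict[int, str]) -> str:
    """Summarise probe results for several ports on the same host."""
    answered = [s for s in results.values() if s in ("open", "refused")]
    if not answered:
        return "no response: whole Ethernet port (or the path to it) looks dead"
    if all(s == "open" for s in results.values()):
        return "all probed services are accepting connections"
    return "host is reachable but at least one service is not accepting connections"
```

Logging the result of `classify({502: probe(plc_ip, 502)})` every minute from the SCADA machine (with `plc_ip` set to the controller's address) would at least timestamp when a hang starts and whether the stack still answers.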

"It has recently come to our attention that users spend 95% of their time using 5% of the available features. That might be relevant." -BobO

AC

  • Jr. Member
  • **
  • Posts: 10
  • I know what I know, and I don't know, what I don't
Re: Ethernet Failure - BRX Stops Communicating
« Reply #2 on: January 12, 2022, 09:32:23 AM »
It is doing this in simple programs to complex programs. Some of them are on a tank site whose program literally is just one MRX instruction, and in almost all cases, with very rare exceptions, I am unable to communicate even with the Do-more software. If I am on site and connect, about once out of every 15 times this happens I am able to connect and force a reboot, and then everything starts working again, but as I said, most of the time I cannot connect to the PLC once this takes place. All Ethernet comms lock up, so I am unable to get any information from the PLC before a reboot takes place.

I might add that on our pump stations and all the other sites alike, the PLC continues to function properly and will start/stop pumps according to pressure (most of them are set up so that if they cannot see the tank they fail over to pressure only), but it is a safety issue to leave our pump stations running without comms. We need to be able to see them.

And yes, I have no problem sending a program over.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #3 on: January 12, 2022, 11:02:56 AM »
After hearing more about your system from Mike, I think the beta firmware may fix the issues. Given that you are on various forms of unstable connections (radios, etc.), I'm thinking you might be experiencing both of the two bugs that we've fixed. One would hang the Ethernet port totally, we think due to either the MAC hanging or a mismatch between the MAC and PHY, and the other could cause an MRX/MWX to end up hung. We'll get you fixed.

Bolt

  • Hero Member
  • *****
  • Posts: 550
Re: Ethernet Failure - BRX Stops Communicating
« Reply #4 on: January 13, 2022, 03:16:57 AM »
I have a BRX communicating with multiple Modbus TCP Devices, as both a Server and Client.  Yesterday, one of the Clients was taken offline (power removed) for repairs, and when the Client came back online (power restored), the PLC would not communicate with the Client.  It was still MRX/WRX'ing fine with all the other Clients, and still responding to Modbus TCP requests as a Server.

I could access the Client's web interface.  I could read and write to Client via command line Modbus TCP commands.

I created a simple MRX instruction with the Client in $Main, manually edge triggered, read 2 registers.  It would set neither the Success nor the Error bit.  It would not set a Warning in DmD's System Status.

Hours later, when the process allowed for the PLC to be shut down, I did a Program Mode transition to STOP, and back to RUN.  No change.

I went to the PLC, set Mode Switch to STOP, removed control power from the PLC, restored control power, Switch to RUN, Switch to TERM.

The Client's Device came alive instantly.  $Main's WRX instruction set its long delayed Error bit, and upon each new edge trigger, it would set the Success bit.

I re-enabled the original WRX/MRX logic with the Client, and it worked as expected.

1.   What could cause this?
2.   Can it be prevented?
3.   Can it be restored without power cycling?
4.   Does a Program Mode transition not perform the same effect as a power cycle?
5.   Is a REBOOT instruction the same effect as a power cycle?

RBPLC

  • Hero Member
  • *****
  • Posts: 585
Re: Ethernet Failure - BRX Stops Communicating
« Reply #5 on: January 13, 2022, 07:40:41 AM »
For what it's worth, I suspect the similar issues that I've seen are somewhat tied to intermittent/failed and then restored comms with slave devices. I believe some of the changes in the last firmware helped the issue but from my re-post a couple of weeks ago, I'm still experiencing intermittent (albeit less frequent) issues. 

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #6 on: January 13, 2022, 10:32:46 AM »
That sounds like 1 of 2 bugs we think we've identified and fixed. While refactoring some of the lower level TCP code to support new features (making code common to be shared) we think a window got introduced where it was possible to hang permanently while establishing the TCP connection. It could cause isolated TCP features to hang, while others would work. We believe that is fixed in our current beta.

The other is related to low-level Ethernet hardware. It would cause the Ethernet port to stop altogether. We believe that is also fixed in the current beta.

Hope to have software released soon.

A power-on reset loads the FPGA and resets a few other peripherals, whereas REBOOT only resets the processor itself. For almost anything that matters to execution, they are the same.


WRT2

  • Newbie
  • *
  • Posts: 8
Re: Ethernet Failure - BRX Stops Communicating
« Reply #7 on: January 27, 2023, 01:50:51 PM »
Help. I am ready to pull BRX PLCs from the field for this very reason. It is very hard to get hands on the PLC when the Ethernet port stops functioning after a week or a month in the middle of rural Alaska. Only a power cycle restarts comms. No one seems to have reported this problem for the last year. As far as I know I am up to the latest firmware.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #8 on: January 27, 2023, 02:40:14 PM »
The only remaining issue that we are aware of is when radios are used to talk to TCP servers (generally Modbus). We have had no reports of anything else as of the most recent firmware. Can you give me details?

As for the Modbus server issue, we're 95% sure we have a fix. We've concluded that it really isn't broken exactly; it's just that when radios misbehave (become very unreliable) it is possible for the listening socket to enter a state with a very long (20+ minute) timeout, during which the server will not accept more connections. If the client were to stop (erratically) attempting connections, eventually the timeouts clear and it starts functioning. Unfortunately, once it gets in that state, if the radio continues to be dodgy, the system won't recover and it appears that it's locked up. Once we were able to duplicate it, we were able to change the timeout behavior for that state, reducing it to 15 seconds or so. This should prevent the issue.

If that is your situation, I'm happy to give you the beta firmware to try. If that isn't your situation, then please help us understand exactly what's failing.

WRT2

  • Newbie
  • *
  • Posts: 8
Re: Ethernet Failure - BRX Stops Communicating
« Reply #9 on: January 30, 2023, 12:39:58 PM »
Hi Bob, that scenario might be close enough, although radios are not involved here. Comms run over a distance using CAT cable and fiber transceivers and are sometimes noisy. One device being interrogated seems to become unresponsive at times. The PLC is interrogated remotely over a satellite link using Modbus TCP and a VPN tunnel. It maintains about a dozen Modbus TCP tasks running asynchronously.

BX-DM1E-36ED13-D, with a BX-06RTD (missing RTDs) and a BX-08TD1
Serial Port programmed for Modbus RTU Unit ID = 1 using @IntSerial RS485, but not used yet.
@IntEthernet RX/WX Ack = 100, RX/WX Cmd = 1000, Retries = 2
Modbus TCP server active with maximum 16 sessions, timeout = 60s
Interrupt on X0 OR rising edge
HSIO: Function 1 is freerun edge timer on X0, Function 2 same on X1.
No I/O Master or Scanning
Only Modbus TCP comms used + Do-more

I have another BX installation with a quite different setup which also hangs after a month or two.

I would be willing to settle for the REBOOT command clearing the hang-up instead of a power cycle. Not great but survivable.
« Last Edit: January 30, 2023, 01:33:57 PM by WRT2 »

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #10 on: January 30, 2023, 02:23:01 PM »
The specific issue we found happens during establishment of the TCP connection. If the PLC receives a SYN packet, and then a second SYN packet from the same remote port, it drops into a state where it starts using the "connected" timeouts/retries, but it never got connected, so the higher level code (the server itself) never gets control from the stack. The timeout/retry sequence is painfully long by default, so the associated listening socket is hung until the timeout clears. If memory serves, we have the number of pending sessions set to 4, so once all 4 are hung, it loses the ability to receive new connections. The fix we have eliminates the long timeout from the session establishment.
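
To see why the length of that timeout matters so much, here is a toy model in Python. It is only a sketch of the behaviour described above, not the actual stack: `deaf_seconds` is a hypothetical helper, each "duplicate SYN" event is assumed to hang exactly one pending slot for the full timeout, and events that arrive while all slots are hung are assumed to be dropped. The 4 pending slots, the 20-minute figure, and the 15-second figure come from this thread.

```python
from typing import Iterable, List


def deaf_seconds(hang_times: Iterable[int], hang_timeout_s: int,
                 slots: int = 4, horizon_s: int = 3600) -> int:
    """Seconds within the horizon during which every pending slot is hung."""
    release: List[int] = []      # release time of each currently hung slot
    events = sorted(hang_times)
    i = 0
    deaf = 0
    for t in range(horizon_s):
        # Free any slot whose timeout has expired.
        release = [r for r in release if r > t]
        # Apply the duplicate-SYN events that occur at this second.
        while i < len(events) and events[i] <= t:
            if len(release) < slots:
                release.append(events[i] + hang_timeout_s)
            i += 1               # events hitting a full queue are lost
        if len(release) >= slots:
            deaf += 1            # server cannot accept a new connection
    return deaf


if __name__ == "__main__":
    bad_link = range(0, 3600, 60)    # one duplicate SYN per minute
    print("20 min timeout:", deaf_seconds(bad_link, 20 * 60), "s deaf")
    print("15 s timeout:  ", deaf_seconds(bad_link, 15), "s deaf")
```

In this model a steady trickle of bad handshakes keeps all four slots occupied once the long timeout is in play, while the short timeout frees each slot before the next bad handshake arrives.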

If you have Modbus clients connecting and disconnecting often, and you have packet losses due to poor media, the basic conditions exist to get it in this state. I don't think it would be dependent on radios specifically, but anything unreliable with a client (or media) resending a SYN packet using the same source port number could cause it. If the client(s) stayed connected, rather than connecting and disconnecting, I think it would greatly reduce the potential for it.

If that is your issue, and it sounds like it could be, the new OS should fix it.



WRT2

  • Newbie
  • *
  • Posts: 8
Re: Ethernet Failure - BRX Stops Communicating
« Reply #11 on: January 31, 2023, 03:51:20 AM »
Just to be clear, just reducing the establishment time-out should resolve these hang-ups? What OS version are you referring to? Remember that I thought I was completely up-to-date.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #12 on: January 31, 2023, 09:05:21 AM »
No. This is a change we've made to the TCP stack, but it hasn't been released yet. I was offering access to the beta, if you are interested in trying it.

The only thing that might affect it externally is for clients to remain connected, rather than connect/poll/disconnect, which, when combined with an unreliable network, is what I believe to be causing the issue.
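
For anyone polling the PLC from their own software, "remain connected" can look like the Python sketch below. It uses only the standard library and builds a Modbus TCP "Read Holding Registers" (function 0x03) frame by hand; the function names, the register address 0, the count of 2, and the poll and back-off periods are placeholders chosen for illustration, not values from this thread. The point is that the socket is opened once and reused for every poll, and a new connection is attempted only after an error, after a pause.

```python
import socket
import struct
import time
from typing import List


def build_read_request(transaction_id: int, unit_id: int,
                       start_addr: int, count: int) -> bytes:
    """MBAP header plus a function 0x03 (Read Holding Registers) PDU."""
    pdu = struct.pack(">BHH", 0x03, start_addr, count)
    # Length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_response(frame: bytes) -> List[int]:
    """Return the register values from a function 0x03 response frame."""
    _tid, _proto, _length, _unit, func, byte_count = struct.unpack(
        ">HHHBBB", frame[:9])
    if func & 0x80:
        # In an exception response the byte after the function code is the code.
        raise ValueError(f"Modbus exception code {frame[8]}")
    return list(struct.unpack(f">{byte_count // 2}H", frame[9:9 + byte_count]))


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes the connection."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed by peer")
        buf += chunk
    return buf


def poll(host: str, port: int = 502, unit_id: int = 1,
         period_s: float = 5.0, backoff_s: float = 30.0) -> None:
    """Poll forever over ONE long-lived connection; reconnect only on error."""
    while True:
        try:
            with socket.create_connection((host, port), timeout=5.0) as sock:
                tid = 0
                while True:
                    tid = (tid + 1) & 0xFFFF
                    sock.sendall(build_read_request(tid, unit_id, 0, 2))
                    header = recv_exact(sock, 7)
                    length = struct.unpack(">H", header[4:6])[0]
                    body = recv_exact(sock, length - 1)
                    print(parse_read_response(header + body))
                    time.sleep(period_s)
        except (OSError, ValueError) as exc:
            print("poll failed, backing off:", exc)
            time.sleep(backoff_s)   # avoid hammering the server with SYNs
```

Calling `poll("192.0.2.10")` (a documentation-only address) would start the loop. The back-off after a failure is a design choice for this sketch: it limits how many new connection attempts an unreliable link can generate.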

WRT2

  • Newbie
  • *
  • Posts: 8
Re: Ethernet Failure - BRX Stops Communicating
« Reply #13 on: January 31, 2023, 04:31:57 PM »
Yes Please  (Happy Dance)

I assume that REBOOT has never cleared the TCP stack and still doesn't?

Tell me what I have to do.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 5996
  • Yes Pinky, Do-more will control the world!
Re: Ethernet Failure - BRX Stops Communicating
« Reply #14 on: January 31, 2023, 05:09:14 PM »
REBOOT absolutely resets the stack. It's as close to a power-on reset as you can get without actually powering off.

Gimme till tomorrow to make sure the current build passes the overnight testing. I'll send bits out to the email associated with your forum account. If you want me to send it elsewhere, just PM me with the alternate details.