Author Topic: Troubleshooting Ethernet Slave Timeouts  (Read 19076 times)

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Troubleshooting Ethernet Slave Timeouts
« on: October 06, 2016, 02:23:18 PM »


Every so often, one of our coolers (equipped with a DM1E) quits operating due to "a required Ethernet slave being offline". The unit runs every day; sometimes it goes weeks without an issue, and other times the dropouts are only a day apart.

Based on the Retry Count in Ethernet I/O Monitor, is it safe to say the problem is not a particular slave, but rather perhaps a common Ethernet switch or the DM1E itself? What steps might I take to narrow my search for the culprit?
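For anyone wanting to make the "is it one slave or all of them" check less of an eyeball judgment, here is a minimal sketch. It assumes the retry counts have been copied by hand out of the Ethernet I/O Monitor into a dictionary. The function name, the slave names, the counts, and the 50% tolerance are all hypothetical and not taken from this thread.

```python
from statistics import mean


def retries_look_balanced(retry_counts, tolerance=0.5):
    """Return True when every slave's retry count lies within
    `tolerance` (a fraction of the mean) of the mean across all slaves.

    A balanced result hints at a shared component (switch, cable run,
    master); a skewed result hints at one particular slave.
    The 0.5 default is an arbitrary assumption, not a vendor figure.
    """
    counts = list(retry_counts.values())
    avg = mean(counts)
    if avg == 0:
        return True  # no retries anywhere, so nothing stands out
    return all(abs(c - avg) <= tolerance * avg for c in counts)


# Hypothetical counts for illustration only.
balanced = {"slave1": 40, "slave2": 44, "slave3": 38}
skewed = {"slave1": 3, "slave2": 120, "slave3": 5}

print(retries_look_balanced(balanced))  # True
print(retries_look_balanced(skewed))    # False
```

A balanced result does not prove the slaves are healthy; it only says no single slave stands out from the rest.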

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 6164
  • Yes Pinky, Do-more will control the world!
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #1 on: October 06, 2016, 02:39:11 PM »
That's pretty balanced...and yeah, I agree that it points to an outside interference. Best thing is to Wireshark the network and look for junk. Any significant burst of broadcast traffic can overwhelm the EBC's FIFOs and cause dropped packets. PCs can spew a frightening amount of junk. We've also seen smart switches crank out large numbers of gratuitous ARPs that cause retries.

Looking closer, the PLC isn't suggesting a SPAM storm (no missed frames or dropped packets), so network hardware might be most likely.

If you can tolerate the I/O dropping offline for 15 seconds, you can also disable the shutdown when a base drops. Obviously you want to find and eliminate the cause, but keeping the PLC running may be preferable to not.
"It has recently come to our attention that users spend 95% of their time using 5% of the available features. That might be relevant." -BobO
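As a rough illustration of what "look for junk" means at the frame level, the sketch below classifies raw, untagged Ethernet II frames as gratuitous ARP, other broadcast, or neither. It uses only the standard ARP layout (sender and target IP at fixed offsets). The function names and the sample addresses are hypothetical, and VLAN-tagged frames are deliberately not handled. In practice Wireshark's own filters do this job; this only shows the logic.

```python
from collections import Counter

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def classify_frame(frame: bytes) -> str:
    """Classify a raw untagged Ethernet II frame.

    Returns 'gratuitous_arp', 'broadcast', or 'other'.
    """
    if len(frame) < 14:
        return "other"  # too short to hold an Ethernet header
    dst = frame[0:6]
    ethertype = int.from_bytes(frame[12:14], "big")
    # ARP payload starts at byte 14; sender IP is at 28..32 and
    # target IP at 38..42 for IPv4 over Ethernet.
    if ethertype == ETHERTYPE_ARP and len(frame) >= 42:
        if frame[28:32] == frame[38:42]:
            return "gratuitous_arp"  # sender announces its own address
    if dst == BROADCAST_MAC:
        return "broadcast"
    return "other"


def make_arp_frame(src_mac: bytes, sender_ip: bytes, target_ip: bytes) -> bytes:
    """Build a broadcast ARP request (hypothetical helper for the demo)."""
    eth = BROADCAST_MAC + src_mac + ETHERTYPE_ARP.to_bytes(2, "big")
    arp = (
        b"\x00\x01"      # hardware type: Ethernet
        + b"\x08\x00"    # protocol type: IPv4
        + b"\x06\x04"    # hardware and protocol address lengths
        + b"\x00\x01"    # operation: request
        + src_mac + sender_ip
        + b"\x00" * 6 + target_ip
    )
    return eth + arp


def tally(frames):
    """Count frames per category."""
    return Counter(classify_frame(f) for f in frames)


mac = bytes.fromhex("020000000001")      # made-up MAC address
ip_a = bytes([192, 168, 0, 10])          # made-up IP addresses
ip_b = bytes([192, 168, 0, 20])
frames = [
    make_arp_frame(mac, ip_a, ip_a),     # gratuitous ARP
    make_arp_frame(mac, ip_a, ip_b),     # ordinary ARP request
    mac + mac + b"\x08\x00" + b"\x00" * 46,  # unicast IPv4 filler
]
print(tally(frames))
```

A capture where the first two categories spike just before a dropout would support the broadcast-burst theory; a flat capture points back at hardware.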

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #2 on: October 06, 2016, 02:44:55 PM »
It's on the same network as 3 other units with identical network topology and hardware, yet none of the other units ever drops out, so I think I can rule out an overabundance of network traffic. I do appreciate the suggestion that it could be hardware specific. I can swap out any one part very easily. Unfortunately I may have to wait a month before I can begin to assume I have solved the issue.  :-\

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 6164
  • Yes Pinky, Do-more will control the world!
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #3 on: October 06, 2016, 02:51:30 PM »
You're set to 4 retries with 100ms timeout, so something is falling down for at least a half second. Do you have to reset anything to get back to RUN mode, or just switch the mode?

What does the status look like on the other system?
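The "at least a half second" figure can be reproduced with a little arithmetic. This sketch assumes the master makes one initial attempt plus each configured retry, and waits the full timeout on every attempt; that counting is an assumption about how the setting behaves, not something confirmed in this thread. The function name is hypothetical.

```python
def minimum_outage_ms(retries: int, timeout_ms: int) -> int:
    """Shortest outage that exhausts every attempt.

    Assumes one initial attempt plus `retries` retries, each waiting
    the full `timeout_ms` before giving up.
    """
    attempts = retries + 1
    return attempts * timeout_ms


print(minimum_outage_ms(4, 100))  # 500
```

If the initial attempt is not counted separately, the window would be 4 x 100 ms = 400 ms instead, which is still far longer than a single dropped packet.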

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #4 on: October 06, 2016, 02:59:33 PM »
The operators are well versed in the "switch to RUN and back to TERM" method. Whatever is causing the dropouts remedies itself, for a while anyway. The other units have low retry counts that match each other. The logs show no slave timeouts since May. Nothing chronic for sure.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 6164
  • Yes Pinky, Do-more will control the world!
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #5 on: October 06, 2016, 03:05:31 PM »
So none of the Host equipment requires a power cycle? That's good. Just trying to rule out general hardware issues.

My money is on something in the network infrastructure...switch, cable, etc., but I guess it could be the DM1E. The DM1E seems less likely if you aren't having to power cycle it though.

ADC Product Engineer

  • Hero Member
  • *****
  • Posts: 270
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #6 on: October 07, 2016, 08:19:35 AM »
My money would be on a bad cable or switch socket.  Having one or more bad sockets on a switch can do some pretty weird stuff.

Controls Guy

  • Internal Dev
  • Hero Member
  • ****
  • Posts: 3612
  • Darth Ladder
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #7 on: October 07, 2016, 01:50:32 PM »
I've actually seen that issue where one port goes bad on a switch.  The first time it was tough to troubleshoot because I tended to think the switch would be either good or bad, globally, and that if the other connections were all working, then the problem must be a bad cable, but no, one port can die.
I retract my earlier statement that half of all politicians are crooks.  Half of all politicians are NOT crooks.  There.

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #8 on: October 07, 2016, 01:58:05 PM »
We swapped out one of the switches and saw a dropout within the hour. We have a couple of cables and one other switch to try, but I have a hunch it's the CPU. Is there a way to copy all of the internal memory values from the CPU to disk? I'd love to NOT have to manually put all of those values in again.

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 6164
  • Yes Pinky, Do-more will control the world!
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #9 on: October 07, 2016, 02:53:04 PM »
Quote from: OrionHE
Is there a way to copy all of the internal memory values from the CPU to disk?

Memory Image Manager.

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #10 on: October 07, 2016, 02:57:33 PM »
I'm definitely going to give that a try. Thanks!

OrionHE

  • Sr. Member
  • ****
  • Posts: 90
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #11 on: October 27, 2016, 11:54:40 AM »
I've swapped out the CPU. A couple of hours later the PLC threw an error in the event log. "Program was halted due to a critical error."

What might this indicate?

BobO

  • Host Moderator
  • Hero Member
  • *****
  • Posts: 6164
  • Yes Pinky, Do-more will control the world!
Re: Troubleshooting Ethernet Slave Timeouts
« Reply #12 on: October 28, 2016, 12:35:42 PM »
PLC shut down due to a fatal error. Sounds like the same issue you've been seeing all along.

The specific error code is stored in a small stack in DST30-DST37, which is also displayed on the System Status page of the Info dialog.