Profinet stopped working (spidev spi0.0: SPI transfer timed out)
#1
We have several NIOT-E-NPI3-51-EN-RE units that have been running for over a year in a machine tool. Recently one machine stopped because of a PROFINET failure.
In the syslog (attached) we found the following error:

Code:
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="1"] [726792.631399] spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="2"] [726792.633581] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="3"] [726792.841406] spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="4"] [726792.843638] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="5"] [726792.846249] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:34+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:34+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="6"] [726793.051407] spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:34+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="7"] [726793.053790] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-03-31T00:58:34+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-03-31T00:58:34+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated

We use the following software:
Firmware version V1.1.6.0
https://hub.docker.com/r/schranz/netpi-n...ldbus-fram based on (HILSCHERNETPI_NODERED_FB_VERSION 1.0.0.0)

Is anything known about 'spidev spi0.0: SPI transfer timed out'?
I would really prefer not to update the firmware if at all possible, since the system is running incredibly stably!
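
To see how often and when the two kernel messages occur across a whole syslog file, a small script can help. The following is only a minimal sketch, assuming the log lines keep the format shown above; the function name `count_errors` and the labels `spi_timeout` and `dma_stuck` are made up for illustration.

```python
import re
from collections import Counter

# Header of the RFC 5424-style lines in the log above:
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME ...
HEADER_RE = re.compile(r"^<(\d+)>1 (\S+) (\S+) (\S+) ")

# Message texts copied from the log excerpt; the keys are made-up labels.
ERRORS = {
    "spi_timeout": "SPI transfer timed out",
    "dma_stuck": "DMA transfer could not be terminated",
}


def count_errors(lines):
    """Count kernel SPI/DMA error messages per (timestamp, label)."""
    counts = Counter()
    for line in lines:
        header = HEADER_RE.match(line)
        # Skip lines that are not syslog records or not from the kernel.
        if header is None or header.group(4) != "kernel":
            continue
        timestamp = header.group(2)
        for label, text in ERRORS.items():
            if text in line:
                counts[(timestamp, label)] += 1
    return counts


# Three lines taken from the log excerpt above.
SAMPLE = [
    '<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - '
    '[meta sequenceId="1"] [726792.631399] spidev spi0.0: SPI transfer timed out',
    '<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - '
    '[meta sequenceId="2"] [726792.633581] bcm2835-dma 3f007000.dma: '
    'DMA transfer could not be terminated',
    '<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - - '
    'spidev spi0.0: SPI transfer timed out',
]

if __name__ == "__main__":
    for (timestamp, label), n in sorted(count_errors(SAMPLE).items()):
        print(timestamp, label, n)
```

For a real file, `count_errors(open("syslog"))` would do the same; the file name is a placeholder.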


Attached Files
.zip   syslog.zip (Size: 35.36 KB / Downloads: 3)
#2
First answer before anything else: so far I have not received any report of a similar problem from other netPI customers, neither here in the forum nor from anywhere else.

My second point is a question: what did you do to get the system operational again? Re-power the device? Just restart the container? Or, in the worst case, does the unit no longer work at all and seem to be broken?
„You never fail until you stop trying.“, Albert Einstein (1879 - 1955)

#3
(September-10th-2020, 08:38 AM)Armin@netPI Wrote: First answer before anything else: so far I have not received any report of a similar problem from other netPI customers, neither here in the forum nor from anywhere else.

My second point is a question: what did you do to get the system operational again? Re-power the device? Just restart the container? Or, in the worst case, does the unit no longer work at all and seem to be broken?

By re-powering the device. The unit has been working again since then.
#4
The development team checked our own database and did not find any known issue. They are now searching the internet and kernel patches for any note related to DMA or SPI.

Nevertheless, if this really is a firmware-related problem, none of the firmware versions we have released so far will help, since none of them contains a fix related to this issue. So a firmware update on your side makes no sense right now.

I also agree: never touch a running system. You never know what new issues a new firmware would bring. So I would keep the current one as long as possible.

#5
Unfortunately this 'SPI transfer timed out' error happened again. It causes the PROFINET controller to stop working (e-stop on the machine tool). Could it be that only the device hardware is causing it? Is there a timeout we could increase? Could we configure things so that at least the PROFINET controller does not stop?


Here is part of the log:
Code:
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit 18269 - - SYSCALL arch=40000028 syscall=11 per=800000 success=yes exit=0 a0=144f83c0 a1=145d7070 a2=14dbf580 a3=28 items=2 ppid=999 pid=18269 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="docker-runc" exe="/usr/bin/docker-runc" key="docker"
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit - - - EXECVE argc=2 a0="docker-runc" a1="--version"
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit - - - CWD cwd="/"
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit - - - PATH item=0 name="/usr/bin/docker-runc" inode=10273 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit - - - PATH item=1 name="/lib/ld-linux-armhf.so.3" inode=1255 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T06:57:50+02:00 NTB827EB5B7A86 audit - - - PROCTITLE proctitle=646F636B65722D72756E63002D2D76657273696F6E
<78>1 2020-10-13T07:01:01+02:00 NTB827EB5B7A86 CROND 18310 - - (root) CMD (       cd / && run-parts /etc/cron.hourly)
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit 18332 - - SYSCALL arch=40000028 syscall=11 per=800000 success=yes exit=0 a0=145f0960 a1=14e1c950 a2=148def60 a3=28 items=2 ppid=999 pid=18332 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="docker-runc" exe="/usr/bin/docker-runc" key="docker"
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit - - - EXECVE argc=2 a0="docker-runc" a1="--version"
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit - - - CWD cwd="/"
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit - - - PATH item=0 name="/usr/bin/docker-runc" inode=10273 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit - - - PATH item=1 name="/lib/ld-linux-armhf.so.3" inode=1255 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T07:02:50+02:00 NTB827EB5B7A86 audit - - - PROCTITLE proctitle=646F636B65722D72756E63002D2D76657273696F6E
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit 18391 - - SYSCALL arch=40000028 syscall=11 per=800000 success=yes exit=0 a0=144f8ec0 a1=145c7570 a2=14e2d620 a3=28 items=2 ppid=999 pid=18391 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="docker-runc" exe="/usr/bin/docker-runc" key="docker"
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit - - - EXECVE argc=2 a0="docker-runc" a1="--version"
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit - - - CWD cwd="/"
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit - - - PATH item=0 name="/usr/bin/docker-runc" inode=10273 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit - - - PATH item=1 name="/lib/ld-linux-armhf.so.3" inode=1255 dev=00:13 mode=0100755 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL
<37>1 2020-10-13T07:07:50+02:00 NTB827EB5B7A86 audit - - - PROCTITLE proctitle=646F636B65722D72756E63002D2D76657273696F6E
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="1"] [85835.442589] spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="2"] [85835.444512] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="3"] [85835.652614] spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="4"] [85835.654543] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="5"] [85835.656844] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="6"] [85835.862625] spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - [meta sequenceId="7"] [85835.864434] bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - spidev spi0.0: SPI transfer timed out
<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - - bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
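
The bracketed number in front of each kernel message can be used to see how long the device had been up when the error appeared. The following is a minimal sketch, assuming that number is the usual kernel printk timestamp (seconds since boot); `uptime_at` is a made-up helper name.

```python
import re
from datetime import timedelta

# Matches "[726792.631399]" but not "[meta sequenceId=...]".
UPTIME_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def uptime_at(line):
    """Return the time since boot encoded in a kernel log line, or None."""
    match = UPTIME_RE.search(line)
    if match is None:
        return None
    return timedelta(seconds=float(match.group(1)))


# First error line of each of the two incidents in this thread.
FIRST_INCIDENT = (
    '<3>1 2020-03-31T00:58:33+02:00 NTB827EB5B7A86 kernel - - '
    '[meta sequenceId="1"] [726792.631399] spidev spi0.0: SPI transfer timed out'
)
SECOND_INCIDENT = (
    '<3>1 2020-10-13T07:07:55+02:00 NTB827EB5B7A86 kernel - - '
    '[meta sequenceId="1"] [85835.442589] spidev spi0.0: SPI transfer timed out'
)

if __name__ == "__main__":
    for line in (FIRST_INCIDENT, SECOND_INCIDENT):
        print(uptime_at(line))
```

Under that assumption the first incident happened roughly 8.4 days after the last boot and the second one a little under 24 hours after boot.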


Attached Files
.zip   syslog-20201013.zip (Size: 14.83 KB / Downloads: 3)
#6
Hello again,

one question related to the "PROFINET controller stops" issue. I have seen several Siemens S7 PLCs of type 1500 or 1200 in my life, but none of them stops the PROFINET network just because a single PROFINET device such as netPI has a problem. They just switch on a red status LED and continue to communicate with the rest of the PROFINET network while trying to re-establish communication with the missing device.

So has someone configured the PLC to go into forced stop mode when a device is missing?

What exactly do you or the end customer have to do to get the system running again? A restart of netPI, or just a container restart? Or a reset on the PLC side?

Is it always the same device that shows the problem?

The netX51 Industrial Network Controller PROFINET device firmware is version V3.12.0.2 from 2017. Of course we have newer software versions for this controller, but before we change anything in the software we should focus on the hardware first. When the problem occurs, is there any special activity on the PROFINET side that might be triggered at the same time? Or is it a 24/7 machine that runs unobserved with no human on site?

#7
(November-2nd-2020, 02:39 PM)Armin@netPI Wrote: Hello again,

one question related to the "PROFINET controller stops" issue. I have seen several Siemens S7 PLCs of type 1500 or 1200 in my life, but none of them stops the PROFINET network just because a single PROFINET device such as netPI has a problem. They just switch on a red status LED and continue to communicate with the rest of the PROFINET network while trying to re-establish communication with the missing device.

So has someone configured the PLC to go into forced stop mode when a device is missing?

What exactly do you or the end customer have to do to get the system running again? A restart of netPI, or just a container restart? Or a reset on the PLC side?

Is it always the same device that shows the problem?

The netX51 Industrial Network Controller PROFINET device firmware is version V3.12.0.2 from 2017. Of course we have newer software versions for this controller, but before we change anything in the software we should focus on the hardware first. When the problem occurs, is there any special activity on the PROFINET side that might be triggered at the same time? Or is it a 24/7 machine that runs unobserved with no human on site?

Thank you for your quick response!

Please see my answers below:
Yes, it might be possible to configure the PLC so that it does not go into forced stop. I would then need the netPI to be able to recover by itself.
Up to now the customer has only restarted the netPI to recover from the PROFINET failure (no need to reset the PLC).
Yes, so far it is always the same device.
The error occurred while machining (no human intervention and no PROFINET activation/configuration).
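
As a starting point for such a self-recovery, the error burst could at least be detected automatically. The following is only a sketch under assumptions: the class name `BurstDetector`, the threshold of 3 messages and the 1 second window are made up, and what to do once a burst is detected is left open. So far only a restart of the whole netPI is known to help.

```python
from collections import deque

SPI_TIMEOUT_TEXT = "SPI transfer timed out"  # text taken from the syslog


class BurstDetector:
    """Report when several SPI timeouts arrive within a short time window."""

    def __init__(self, threshold=3, window_s=1.0):
        self.threshold = threshold  # hypothetical value, tune as needed
        self.window_s = window_s    # hypothetical value, tune as needed
        self._times = deque()

    def feed(self, t, message):
        """Feed one log message with its time in seconds.

        Returns True once at least `threshold` SPI timeouts have been seen
        within the last `window_s` seconds.
        """
        if SPI_TIMEOUT_TEXT not in message:
            return False
        self._times.append(t)
        # Drop timeouts that are older than the window.
        while t - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) >= self.threshold


if __name__ == "__main__":
    detector = BurstDetector()
    # Kernel times of the SPI timeouts from the second incident.
    for t in (85835.442589, 85835.652614, 85835.862625):
        if detector.feed(t, "spidev spi0.0: SPI transfer timed out"):
            print("SPI timeout burst detected at", t)
```

The timeouts in both logs are about 0.21 seconds apart, so three of them fit into the assumed 1 second window.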
#8
So what about ordering a new netPI and loading the very same container onto it, in order to rule out a hardware problem?

Your phenomenon is indeed a problem I have not heard of before. As I said, instead of searching for software problems, I would rather replace your device with new hardware. After one year of running fine, and then another failure after about 2-3 months of runtime, it is difficult to track down a software problem anyway.

In the 2-3 years since I made the first Node-RED container and the FRAM container, nearly all software components have changed: new netX software, new Node-RED version, new netPI OS. But I do not see any of those changes solving your problem right now. In the latest netPI version V1.2.1.0 we fixed a problem with an open interrupt in the Linux kernel ... but I don't think this relates to your problem.

What I see is that on one side the software reports that it fails to transfer DMA data, and on the other side the PLC stops because it also detects a problem. To me it seems the Industrial Network Controller has a malfunction that shows up on two sides: internal communication and network communication.

Thx
Armin


