Random issue: machine disconnection - reconnect - queue continues when clicking jog #40

Open
ghost opened this issue Jun 27, 2017 · 37 comments

Comments

@ghost

ghost commented Jun 27, 2017

Not exactly sure when or why this happens; it's too random to put a finger on yet. But it has always been an issue for me:

  1. Sometimes the machine randomly disconnects (I'll blame Smoothieboard's known USB noise issues for that).
  2. Reconnect: the machine is not moving because Smoothie was reset, but I think the LW queue is still "active", just waiting for an OK back.
  3. As usual after a fuckup like losing comms to your machine, your first action tends to be a jog, so you hit jog to start moving.
  4. And voilà, lw.comm-server resumes sending the queue when it gets the OK back from the jog commands (or I think that's why) and starts running. Endmills break, curses are uttered and days are ruined.
@cprezzi
Member

cprezzi commented Jun 27, 2017

It's not the OK from the jog that causes the queue to continue (the jog is fed in at the end of the queue); it's the jog command from the frontend itself that kicks the queue into action.

What do you suggest to do? I could clear the queue on each machine connect, so nothing unexpected happens.

@ghost
Author

ghost commented Jun 27, 2017

Not sure, actually. If you only lost the connection (but the board didn't reset, so the current position is held) you can usually resume the queue. In other cases you will definitely need to clear the queue. It depends on what happened. Perhaps after a reconnect, check whether there is a queue, then warn the user: "You just reconnected a machine which has a queue. Would you like to Resume or Clear?" Open to discussion here; as I said, I haven't even nailed down exactly under which circumstances it happens. I do feel like the jog gets through though, as I had two instances where it disconnected while down in a cut, and I clicked Z+ to back out of the hole: the resulting move went to the next point while retracting out of the hole. That was very weird. I hit the emergency stop as soon as that happened (both times, unfortunately), as I realised the queue had resumed but wasn't sure whether I wanted it to, so the e-stop was the quickest instinct.
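
A rough sketch of what that "Resume or Clear?" idea could look like, just to make the discussion concrete. Everything here (the `JobQueue` class and its method names) is hypothetical and not taken from lw.comm-server; the only assumption is that the server can tell an unexpected disconnect apart from a normal one.

```javascript
// Hypothetical sketch: none of these names exist in lw.comm-server.
class JobQueue {
  constructor() {
    this.lines = [];   // pending G-code lines
    this.held = false; // true after an unexpected disconnect with work pending
  }

  add(line) {
    this.lines.push(line);
  }

  // Call when the port closes unexpectedly: freeze whatever is left.
  onDisconnect() {
    if (this.lines.length > 0) this.held = true;
  }

  // Call on reconnect: returns a prompt if the user must decide, else null.
  onReconnect() {
    return this.held
      ? `You just reconnected a machine with ${this.lines.length} queued lines. Resume or Clear?`
      : null;
  }

  // Apply the user's answer to the prompt.
  resolve(choice) {
    if (choice === 'clear') this.lines = [];
    else if (choice !== 'resume') throw new Error(`unknown choice: ${choice}`);
    this.held = false;
  }

  // Nothing is handed to the sender while the queue is held.
  next() {
    if (this.held || this.lines.length === 0) return null;
    return this.lines.shift();
  }
}
```

The point of the `held` flag is that a jog (or anything else) cannot restart the old job until the user has explicitly picked Resume or Clear.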

@cprezzi
Member

cprezzi commented Jun 27, 2017

Jog should not be able to jump over an existing queue, because I use addQ, which adds the command to the end of the queue (see: https://github.com/LaserWeb/lw.comm-server/blob/master/server.js#L2602).
I will try to simulate that issue by unplugging the cable while a job is running.
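
To illustrate the ordering described above, here is a toy FIFO. It is not the real code from server.js: the `addQ` here is a stand-in with the same name, `sendNext` is a made-up sender, and the G-code lines are invented for the example.

```javascript
// Illustrative only: a toy FIFO, not the actual addQ/send code in server.js.
const queue = [];
const sent = [];

// Append to the end of the queue, as described above.
function addQ(line) {
  queue.push(line);
}

// Hypothetical sender: always takes the oldest line first.
function sendNext() {
  if (queue.length > 0) sent.push(queue.shift());
}

// Two job lines are still queued when the connection drops.
addQ('G1 X10 Y10');
addQ('G1 X20 Y10');

// After reconnecting, the user jogs Z up; the jog lands behind the stale lines.
addQ('G0 Z5');

// The jog kicks the sender, which sends the oldest queued line, not the jog.
sendNext();
```

With a stale job still in the queue, the first line to go out after the jog click is the old `G1 X10 Y10`, which would match the symptom of the job continuing when jog is clicked.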

@ghost

ghost commented Jun 27, 2017

Just some input:
We have random disconnections reported by a few users. Most of the time it was solved by switching to a different USB port or computer. We have not been able to find a common denominator with respect to hardware or O/S either.

Please let me know if there is a specific test or situation you would like us to try.

With respect to what happens after a disconnect, I would think the 'safest' approach would be to alert the users and not continue with the queue. Clearing the queue is the safest.

I have always wanted to see a more prominent status shown on the left tab for connection status and queue information. Having it always there and visible regardless of what tab you are in would be very useful. At the moment you need to switch between either COMMS or CONTROL to see whether you are connected to your machine.

@ghost
Author

ghost commented Jun 27, 2017 via email

@cojarbi

cojarbi commented Jun 27, 2017

Since this seems to be widespread: I have also had random disconnects with my Grbl board. I'm on OS X too, but haven't been able to replicate it. At some point I thought it was interference on the cable.

@ghost
Author

ghost commented Jun 27, 2017 via email

@ghost

ghost commented Jun 27, 2017

@cojarbi We also tested this with customers who reported the issue but once again it was not conclusive. We supplied new USB cables with correct ferrites installed but it didn't appear to make a difference.

@ghost

ghost commented Jun 27, 2017

@openhardwarecoza Thanks for this info. We will have users check this the next time it's reported.

@ghost
Author

ghost commented Jun 27, 2017 via email

@ghost

ghost commented Jun 27, 2017

@openhardwarecoza We don't run it off the USB power.

@tbfleming
Member

Has anyone seen this with GRBL-LPC?

@ghost
Author

ghost commented Jun 28, 2017 via email

@cprezzi
Member

cprezzi commented Jul 3, 2017

I have started implementing SD card support on the backend side (https://github.com/LaserWeb/lw.comm-server/blob/SD_support/server.js#L2157) but got stuck at the frontend part. I could use some help with the frontend from the GUI specialists ;) We need some sort of file manager popup.

@ghost
Author

ghost commented Jul 3, 2017 via email

@ghost
Author

ghost commented Jul 3, 2017 via email

@jorgerobles
Contributor

What kind of file manager do you need?
Only loading files, or also uploading, renaming, etc.?

@ghost
Author

ghost commented Jul 3, 2017 via email

@jorgerobles
Contributor

@ghost
Author

ghost commented Jul 3, 2017 via email

@cprezzi
Member

cprezzi commented Jul 3, 2017

@openhardwarecoza If I understand your code correctly, you use the Smoothie SD card via the mounted device, not over USB serial commands, right?

@ghost
Author

ghost commented Jul 3, 2017 via email

@ghost
Author

ghost commented Jul 3, 2017 via email

@ghost
Author

ghost commented Jul 3, 2017 via email

@ghost
Author

ghost commented Jul 3, 2017 via email

@cprezzi
Member

cprezzi commented Jul 3, 2017

Yes, I also prefer not to use USB MSD, as it's better to switch that off for performance reasons (and it means fewer disconnect problems). I already have the USB implementation (mostly) ready; I just need a React-compatible presentation in the frontend.

@openhardwarecoza I will check your frontend solution and also the link @jorgerobles posted.

@ghost
Author

ghost commented Jul 3, 2017 via email

@tbfleming
Member

Can Smoothie report if MSD is present so we can warn?

@ghost
Author

ghost commented Jul 3, 2017

@ghost
Author

ghost commented Jul 3, 2017

Oops, all this SD talk should've been in #30.

Oh well.

Sidenote: I fixed some of my random disconnects by throwing out the Nexbook and stealing the wife's old Asus for the workshop. But that has allowed me to reproduce the original issue in this thread reliably:

  1. Run a mill job.
  2. Hit Abort while it's moving.
  3. Hit Clear All (typo... I meant Clear Alarm).
  4. Hit Z+ (obviously, after an abort I want to withdraw the endmill from the cut).
  5. The Z+ move is accompanied by an unasked-for XY move (no idea where it comes from). The expected result is a straight-up Z move.

(Note: this is still an issue on the latest build downloaded this morning, and since it's a different PC, that's also eliminated some...)

The only thing I don't have nailed down is why it doesn't always happen. But it's probably two thirds of the time.

@cprezzi
Member

cprezzi commented Jul 4, 2017

Strange. When we abort a job, Smoothieware should go into the alarm state, waiting for the alarm to be cleared by clicking Abort again. But it should not accept any moves while in the alarm state.
I will try to replicate that behaviour.

@ghost
Author

ghost commented Jul 4, 2017

Typo: "Clear All" should have been "Clear Alarm". It doesn't move while in alarm. The first move after taking it out of alarm has the weird XY in addition to the requested Z.

@ghost
Author

ghost commented Jul 18, 2017

Ruined another $20 chunk of aluminum today because of this bug :(

@DouglasPearless

I too am hit by this bug and I have spent time trying every scenario I can think of; it happens whether connected via USB, ESP8266, or serial (USB<->TTL serial connected to a Smoothie running the latest Smoothie firmware).

An example from the log file:

INFO: Connecting to USB,/dev/cu.usbmodem14211,115200
INFO: Connected to /dev/cu.usbmodem14211 at 115200
Smoothieware detected (edge-049af91, Fri Aug 18 2017)
Run Job (561806)
Done: 500 of 33029 (ave. 45 lines/s)
Done: 1000 of 33029 (ave. 71 lines/s)
Done: 1500 of 33029 (ave. 94 lines/s)
Done: 2000 of 33029 (ave. 111 lines/s)
Done: 2500 of 33029 (ave. 125 lines/s)
Done: 3000 of 33029 (ave. 136 lines/s)
Done: 3500 of 33029 (ave. 146 lines/s)
Done: 4000 of 33029 (ave. 154 lines/s)
Done: 4500 of 33029 (ave. 161 lines/s)
Done: 5000 of 33029 (ave. 167 lines/s)
Done: 5500 of 33029 (ave. 172 lines/s)
Done: 6000 of 33029 (ave. 176 lines/s)
Done: 6500 of 33029 (ave. 181 lines/s)
Done: 7000 of 33029 (ave. 184 lines/s)
Done: 7500 of 33029 (ave. 188 lines/s)
Done: 8000 of 33029 (ave. 190 lines/s)
Done: 8500 of 33029 (ave. 193 lines/s)
Done: 9000 of 33029 (ave. 196 lines/s)
Done: 9500 of 33029 (ave. 198 lines/s)
Done: 10000 of 33029 (ave. 200 lines/s)
Done: 10500 of 33029 (ave. 202 lines/s)
Done: 11000 of 33029 (ave. 204 lines/s)
Done: 11500 of 33029 (ave. 205 lines/s)
Done: 12000 of 33029 (ave. 207 lines/s)
Done: 12500 of 33029 (ave. 208 lines/s)
Done: 13000 of 33029 (ave. 210 lines/s)
Done: 13500 of 33029 (ave. 211 lines/s)
Done: 14000 of 33029 (ave. 212 lines/s)
Done: 14500 of 33029 (ave. 213 lines/s)
Done: 15000 of 33029 (ave. 214 lines/s)
Done: 15500 of 33029 (ave. 215 lines/s)
Done: 16000 of 33029 (ave. 216 lines/s)
INFO: Port closed

I am not sure WHY the port was closed, but I am fairly certain LW closes it, as Smoothie is not closing it on its side. Is it possible for LW to identify why it closed the port?

@cprezzi
Member

cprezzi commented Aug 22, 2017

@DouglasPearless The message INFO: Port closed is caused by a close event we get from the driver, which means we don't close the port; it's closed by the OS or the other side.
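
As a rough idea of how the log could say more about why the port closed, here is a minimal sketch. It is not what lw.comm-server does today: `watchClose` and the log texts are made up, and it assumes the driver passes an error object with `disconnected` set to true on the close event when the device goes away (which, as far as I know, recent node-serialport versions do). A plain `EventEmitter` stands in for the serial port so the sketch runs without hardware.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: logs a reason for each close event.
// Returns a function to call just before we close the port on purpose.
function watchClose(port, log) {
  let closingOnPurpose = false;
  port.on('close', (err) => {
    if (closingOnPurpose) {
      log('INFO: Port closed by LaserWeb');
    } else if (err && err.disconnected) {
      // Assumption: the driver flags unexpected disconnects this way.
      log('INFO: Port closed (device disconnected)');
    } else {
      log('INFO: Port closed (OS or remote side)');
    }
    closingOnPurpose = false;
  });
  return () => { closingOnPurpose = true; };
}

// Stand-in for a real serial port.
const fakePort = new EventEmitter();
const messages = [];
const markIntentionalClose = watchClose(fakePort, (m) => messages.push(m));

fakePort.emit('close', { disconnected: true }); // simulated cable pull
markIntentionalClose();
fakePort.emit('close');                         // simulated user disconnect
```

This would not explain why the OS dropped the port, but it would at least separate "we closed it" from "it was closed under us" in the log.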

I can see from your log, that you are not using the CNC firmware as we expect. Not sure if this makes any difference, but could you try again with the CNC version of Smoothieware? You could also try to disable MSD in config.txt.

@DouglasPearless

I will try the CNC version and will separately disable the MSD as well, but I won't be able to do that for several days :-)

@cprezzi
Member

cprezzi commented Aug 23, 2017

@DouglasPearless No problem, take your time.


5 participants