Disconnected after 30 seconds on servers upgraded to 5.10 (or even 5.9) #15627

Closed
DanS-User opened this issue Jan 3, 2025 · 8 comments · Fixed by #15680
Assignees
Labels
Bug Issues that were confirmed to be a bug @ Network Regression Something that used to work no longer does
Milestone

Comments

@DanS-User

Luanti version

2025-01-02 18:35:51: INFO[Main]: Luanti 5.10.0 (Linux)
2025-01-02 18:35:51: INFO[Main]: Using LuaJIT 2.1.0-beta3
2025-01-02 18:35:51: INFO[Main]: Built by GCC 11.4
2025-01-02 18:35:51: INFO[Main]: Running on Linux/6.8.0 x86_64
2025-01-02 18:35:51: INFO[Main]: BUILD_TYPE=Release
2025-01-02 18:35:51: INFO[Main]: RUN_IN_PLACE=0
2025-01-02 18:35:51: INFO[Main]: USE_CURL=1
2025-01-02 18:35:51: INFO[Main]: USE_GETTEXT=1

Operating system and version

Ubuntu 22.04

CPU model

Intel i5-3570K

GPU model

Intel HD 4000

Active renderer

No response

Summary

After the developers of Exile upgraded their servers to 5.10, I started getting disconnected about 30 seconds after logging in, roughly 50% of the time. It happens with multiple client versions (5.10, 5.8, 5.6) on both Intel and AMD computers. On request, they downgraded one of the servers to 5.7, and the problem disappeared on that particular server. I launched Luanti in verbose mode from the terminal and got these messages (the ConnectionSend re-send appears 12 times; this is the last one):
VERBOSE[ConnectionSend]: con(19/2439)RE-SENDING timed-out RELIABLE to 104.157.94.37(t/o=0.1): count=12, channel=0, seqnum=65500
INFO[ConnectionSend]: con(19/2439)RunTimeouts(): Peer 1 has timed out (outgoing reliables channel=0)
INFO[Main]: Client::deletingPeer(): Server Peer is getting deleted (timeout=1)
ERROR[Main]: Access denied. Reason: Connection timed out.
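
When sifting through a long verbose log for this symptom, it can help to pull out the re-send lines and see which reliable packet keeps timing out. The following is a minimal sketch in Python; the regular expression is inferred only from the single log line quoted above (not from the Luanti source), and the function names are hypothetical.

```python
import re

# Pattern for the verbose re-send lines quoted above. The field layout is
# an assumption based on that one excerpt, not on the Luanti source code.
RESEND_RE = re.compile(
    r"RE-SENDING timed-out RELIABLE to (?P<addr>[\d.]+)"
    r"\(t/o=(?P<timeout>[\d.]+)\): "
    r"count=(?P<count>\d+), channel=(?P<channel>\d+), seqnum=(?P<seqnum>\d+)"
)


def parse_resend(line):
    """Return the re-send fields as a dict, or None if the line does not match."""
    m = RESEND_RE.search(line)
    if m is None:
        return None
    return {
        "addr": m.group("addr"),
        "timeout": float(m.group("timeout")),
        "count": int(m.group("count")),
        "channel": int(m.group("channel")),
        "seqnum": int(m.group("seqnum")),
    }


def summarize_resends(lines):
    """Map (channel, seqnum) to the highest re-send count seen for that packet."""
    worst = {}
    for line in lines:
        info = parse_resend(line)
        if info is None:
            continue
        key = (info["channel"], info["seqnum"])
        worst[key] = max(worst.get(key, 0), info["count"])
    return worst


sample = (
    "VERBOSE[ConnectionSend]: con(19/2439)RE-SENDING timed-out RELIABLE to "
    "104.157.94.37(t/o=0.1): count=12, channel=0, seqnum=65500"
)
print(summarize_resends([sample]))  # {(0, 65500): 12}
```

A single (channel, seqnum) pair whose count climbs until the timeout suggests that one specific reliable packet is never being acknowledged, rather than general packet loss.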

Steps to reproduce

Clients (including 5.10) that log in to servers upgraded to 5.10 get disconnected after 30 seconds, but not consistently: about 50% of the time. Ping times to the server are good and the internet connection is fast. It happens only when the server mods upgrade their servers from 5.8 (or earlier) to 5.10 (even 5.9 triggers it).

@DanS-User DanS-User added the Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible label Jan 3, 2025
@DanS-User
Author

MT510ServerError.txt

@sfan5 sfan5 added the @ Network label Jan 3, 2025
@sfan5
Collaborator

sfan5 commented Jan 3, 2025

The first reliable packet a client sends is the empty "connection start" packet. But if that got lost, you wouldn't be able to connect at all. So there's something else going on, and in fact there might be an edge case where this could happen.
Can you additionally provide a pcap of the problem? (use e.g. tcpdump -v -w file.pcap "udp port 30014")

@sfan5 sfan5 added the Regression Something that used to work no longer does label Jan 3, 2025
@Zughy
Contributor

Zughy commented Jan 3, 2025

It happens to me as well, about 5% of the time, both when connecting to a locally hosted server and a public one. I don't know if it only happens when the server has just launched

@Zughy Zughy added Bug Issues that were confirmed to be a bug and removed Unconfirmed bug Bug report that has not been confirmed to exist/be reproducible labels Jan 3, 2025
@Zughy Zughy added this to the 5.11.0 milestone Jan 3, 2025
@SwissalpS
Contributor

SwissalpS commented Jan 3, 2025

I don't know if it only happens when the server has just launched

Players have complained about sudden disconnects even after the server had been running for hours, if not days.
Often Luanti does not report them as having timed out (i.e. functions registered using core.register_on_leaveplayer(function(player, timed_out)....end) don't have the timed_out argument set).

@DanS-User
Author

DanS-User commented Jan 4, 2025

The first reliable packet a client sends is the empty "connection start" packet. But if that got lost, you wouldn't be able to connect at all. So there's something else going on, and in fact there might be an edge case where this could happen. Can you additionally provide a pcap of the problem? (use e.g. tcpdump -v -w file.pcap "udp port 30014")

@Zughy
Contributor

Zughy commented Jan 4, 2025

Possible useful info:

  • The fastest way to tell whether I'm affected by the bug is to look at HUDs, as they stop updating: when I log in, I hold a weapon that hides the default crosshair and shows a HUD with a custom one. This works. However, when I change the item in my hand (e.g. to an empty hand), the custom HUD is supposed to be hidden and the default crosshair shown again. This doesn't work; I'm stuck with the custom HUD
  • If I notice that I'm affected by the bug and disconnect before the game kicks me out, I can log back in instantly. If instead the game kicks me, the account remains online for a few seconds

@sfan5
Collaborator

sfan5 commented Jan 4, 2025

file.zip

thanks.
it's exactly the edge case I suspected:

  • client sends "connection start" packet from peer id 0 (unassigned)
  • everything works, server assigns a new peer id to the client
    • but the ack for the "connection start" packet gets lost
  • client resends "connection start" packet, still from peer id 0 (!)
  • server goes "what is this guy doing, I already told him his peer id" and ignores the packet
  • client gives up eventually and declares the connection broken

the fix:
the client needs to take the new peer id into account when re-sending packets
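
The sequence above can be illustrated with a small toy model. This is only a sketch in Python, not Luanti's actual connection code: the ToyServer class, the connect function, and the starting peer id value are hypothetical, and the sequence number and re-send limit are simply the values that appear in the reporter's log. It assumes, as the description implies, that the client learns its new peer id even though the ack for the "connection start" packet is lost.

```python
UNASSIGNED_PEER_ID = 0   # "peer id 0 (unassigned)" in the description above
FIRST_SEQNUM = 65500     # seqnum seen in the reporter's log
MAX_RESENDS = 12         # re-send count seen in the reporter's log


class ToyServer:
    """Hypothetical server that strictly checks the sender's peer id."""

    def __init__(self):
        self.peer_ids = {}      # client address -> assigned peer id
        self.next_peer_id = 2   # arbitrary starting value for this sketch

    def receive(self, addr, sender_peer_id, seqnum):
        """Return (assigned_peer_id, acked_seqnum), or None if ignored."""
        assigned = self.peer_ids.get(addr)
        if assigned is None:
            if sender_peer_id != UNASSIGNED_PEER_ID:
                return None
            assigned = self.next_peer_id
            self.next_peer_id += 1
            self.peer_ids[addr] = assigned
        elif sender_peer_id != assigned:
            # The client was already told its peer id, so a packet that
            # still claims to come from peer id 0 is dropped.
            return None
        return assigned, seqnum


def connect(resend_uses_new_peer_id):
    """Simulate a handshake whose first ack is lost; True if it recovers."""
    server = ToyServer()
    addr = "client-addr"

    # First transmission: the server assigns a peer id and the client
    # learns it, but the ack for the "connection start" packet is lost.
    assigned, _lost_ack = server.receive(addr, UNASSIGNED_PEER_ID, FIRST_SEQNUM)
    my_peer_id = assigned

    for _ in range(MAX_RESENDS):
        sender = my_peer_id if resend_uses_new_peer_id else UNASSIGNED_PEER_ID
        reply = server.receive(addr, sender, FIRST_SEQNUM)
        if reply is not None:
            return True   # this ack gets through, the packet is confirmed
    return False          # the client gives up: connection timed out


print(connect(resend_uses_new_peer_id=False))  # False
print(connect(resend_uses_new_peer_id=True))   # True
```

With the old behaviour every re-send is stamped with peer id 0 and ignored, so the client eventually times out; once re-sends carry the newly assigned peer id, the server accepts the packet and can acknowledge it.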

@DanS-User
Author

I would like to add that this happens only when the server code is updated to 5.10. When I asked Mantar to do me a big favour and temporarily change the server code back to 5.7, the bug stopped, even with my client at 5.10.

@sfan5 sfan5 self-assigned this Jan 5, 2025
sfan5 added a commit to sfan5/luanti that referenced this issue Jan 14, 2025
Otherwise a desync could occur since the server does strict checking.
fixes luanti-org#15627
@sfan5 sfan5 linked a pull request Jan 14, 2025 that will close this issue
sfan5 added a commit that referenced this issue Feb 1, 2025
Otherwise a desync could occur since the server does strict checking.
fixes #15627