Skip to content

tested locally on my mac #35

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

tested locally on my mac #35

wants to merge 3 commits into from

Conversation

mortenjc
Copy link
Collaborator

@mortenjc mortenjc commented May 8, 2025

I added a status server to the filewriter

Since this is the first non EFU that reports a status a few changed were necessary:

type_fw = 5 was added
check_fw_pipeline() was added to parse the response from the server
check_service() was updated to handle fw
some formatting

Comment on lines 142 to 155
def check_fw_pipeline(self, ipaddr, port):
if self.test:
return 5
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((ipaddr, port))
s.send(b"getstatus")
data = s.recv(256)
data2 = int(data.decode("utf-8").split()[1][0])
s.close()
return data2
except:
self.dprint("connection reset (by peer?)")
return 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest switching to JSON for status messages from the services we poll in the dashboard and be more verbose with the status codes.

pros:

  • easier parsing on the Python side: we can access human readable keys directly as a dictionary, rather than splitting strings and working with indexes[0][1].

  • more specific error handling - for example, if the JSON is malformed, we get a JSONDecodeError. If a key is missing, we can catch a KeyError and handle it accordingly.

Right now, any issue in the try block is treated as a connection error, which can be misleading if the actual problem is with the message format.

class ServiceStatus(IntEnum):
    WORKING = 5
    IDLE = 1
    ERROR = 2
    STOPPED = 3
    INVALID = 4
def request_status(host, port):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.send(b"getstatus")

        # Read until newline delimiter
        data = b""
        while not data.endswith(b"\n"):
            chunk = s.recv(256)
            if not chunk:
                break
            data += chunk

        message = data.decode('utf-8').strip()
        try:
            return json.loads(message)
        except JSONDecodeError:
            logger.error(f"Error decoding JSON: {message}")
            return { "status": ServiceStatus.INVALID.value, "message": f"Didn't receive valid message from the service: {host}:{port}" }

then we can define check_status like that:

def check_status(self):
    response = request_status(host, port)
    try:
        status = ServiceStatus(int(response["status"]))
    except KeyError:
        logger.error(f"Status message must contain 'status' field. Ignoring message: {response}")
    except ValueError:
        logger.error(f"Status message contains invalid 'status' value. Valid status values: {list(ServiceStatus)}. Ignoring message: {response}")
    else:
        logging.debug("Setting new status: {self.idx} {status.name}")
        self.lab.setstatus(self.idx, status.value)

the debug message could look like this:

Setting new status: 127.0.0.1 WORKING

example message json:

{"status": 1, "message": "optional debug message"} # for filewriters

{"status": 5, "stage1": 5, "stage2": 5, "stage2": 5} # for efus

{"status": 2, "message": "Unhealthy", "uptime": "2d4h"} # anything else

If we agree on a common status message format for all services, we can reuse request_status and only implement a separate check_status function that extracts the necessary information from the JSON, depending on the service being polled.

I think this will make the code easier to read and debug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those ServiceStatus enumerations are what I was thinking of on the call @milnowak

I would suggest:
WRITING (green)
IDLE (green with lines)
STARTING (light green)
FINISHING (green)
FAILED (red)

and a status for OFFLINE (grey - response timed out; would need to be checked separately with ping like for EFUs)

Hopefully makes sense as a first pass from the meeting

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants