-
Notifications
You must be signed in to change notification settings - Fork 1
tested locally on my mac #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
dashboard/generate.py
Outdated
def check_fw_pipeline(self, ipaddr, port): | ||
if self.test: | ||
return 5 | ||
try: | ||
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
s.connect((ipaddr, port)) | ||
s.send(b"getstatus") | ||
data = s.recv(256) | ||
data2 = int(data.decode("utf-8").split()[1][0]) | ||
s.close() | ||
return data2 | ||
except: | ||
self.dprint("connection reset (by peer?)") | ||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest switching to JSON for status messages from the services we poll in the dashboard and be more verbose with the status codes.
pros:
-
easier parsing on the Python side: we can access human readable keys directly as a dictionary, rather than splitting strings and working with indexes[0][1].
-
more specific error handling - for example, if the JSON is malformed, we get a JSONDecodeError. If a key is missing, we can catch a KeyError and handle it accordingly.
Right now, any issue in the try
block is treated as a connection error, which can be misleading if the actual problem is with the message format.
class ServiceStatus(IntEnum):
WORKING = 5
IDLE = 1
ERROR = 2
STOPPED = 3
INVALID = 4
def request_status(host, port):
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.connect((host, port))
s.send(b"getstatus")
# Read until newline delimiter
data = b""
while not data.endswith(b"\n"):
chunk = s.recv(256)
if not chunk:
break
data += chunk
message = data.decode('utf-8').strip()
try:
return json.loads(message)
except JSONDecodeError:
logger.error(f"Error decoding JSON: {message}")
return { "status": ServiceStatus.INVALID.value, "message": f"Didn't receive valid message from the service: {host}:{port}" }
then we can define check_status like that:
def check_status(self):
response = request_status(host, port)
try:
status = ServiceStatus(int(response["status"]))
except KeyError:
logger.error(f"Status message must contain 'status' field. Ignoring message: {response}")
except ValueError:
logger.error(f"Status message contains invalid 'status' value. Valid status values: {list(ServiceStatus)}. Ignoring message: {response}")
else:
logging.debug("Setting new status: {self.idx} {status.name}")
self.lab.setstatus(self.idx, status.value)
the debug message could look like this:
Setting new status: 127.0.0.1 WORKING
example message json:
{"status": 1, "message": "optional debug message"} # for filewriters
{"status": 5, "stage1": 5, "stage2": 5, "stage2": 5} # for efus
{"status": 2, "message": "Unhealthy", "uptime": "2d4h"} # anything else
If we agree on a common status message format for all services, we can reuse request_status
and only implement a separate check_status
function that extracts the necessary information from the JSON, depending on the service being polled.
I think this will make the code easier to read and debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those ServiceStatus enumerations are what I was thinking of on the call @milnowak
I would suggest:
WRITING (green)
IDLE (green with lines)
STARTING (light green)
FINISHING (green)
FAILED (red)
and a status for OFFLINE (grey - response timed out; would need to be checked separately with ping like for EFUs)
Hopefully makes sense as a first pass from the meeting
I added a status server to the filewriter
Since this is the first non EFU that reports a status a few changed were necessary:
type_fw = 5 was added
check_fw_pipeline() was added to parse the response from the server
check_service() was updated to handle fw
some formatting