can_read() in read_command.c cause partial output in rare case. #68

xmlijhu · 2024-07-23T02:28:02Z

In rare cases, when Logcollector runs a command and collects its output, only part of the command output is captured.

Sample configuration in ossec.conf:

  <localfile>
    <log_format>command</log_format>
    <command>sleep 13; nice -n 10 bash /var/ossec/etc/shared/sample.shell</command>
    <alias>sample_shell</alias>
    <out_format>$(timestamp) $(hostname) sample_shell: $(log)</out_format>
    <frequency>180</frequency>
  </localfile>

Most of the time the command produces 80 lines of output, but in rare cases fewer than 80 lines are collected.

After drilling down and building a customized wazuh-logcollector with additional debug output, we found that the culprit is the can_read() function, which returns false partway through the fgets() loop.

https://github.com/wazuh/wazuh/blob/master/src/logcollector/read_command.c#L43

What is the purpose of can_read() when reading command output? I think it is mainly intended for file monitoring, for cases where the file is rotated or truncated.
Further investigation showed that can_read() is NOT used in read_fullcommand.c.
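
To illustrate the failure mode described above, here is a minimal sketch in C. It is not the Wazuh source: `read_gate`, `can_read_sketch()`, `read_lines_sketch()` and `simulate_sketch()` are hypothetical names, and the "other thread" that closes the gate is simulated inline. It only shows that a gate checked in the loop condition can stop the read while output is still pending.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for Logcollector's read gate; not the Wazuh implementation. */
static bool read_gate = true;

static bool can_read_sketch(void) { return read_gate; }

/*
 * Count the lines read from `stream`.
 * When `gate_in_loop` is true, the gate is checked before every fgets(),
 * which is the pattern described in this report.
 * `close_gate_after` simulates another thread closing the gate once that
 * many lines have been read (pass a negative value to never close it).
 */
static int read_lines_sketch(FILE *stream, bool gate_in_loop, int close_gate_after)
{
    char buf[256];
    int lines = 0;

    while ((!gate_in_loop || can_read_sketch()) &&
           fgets(buf, sizeof buf, stream) != NULL) {
        lines++;
        if (lines == close_gate_after) {
            read_gate = false; /* simulated configuration-refresh request */
        }
    }
    return lines;
}

/* Build a fake command output of `n` lines and report how many were read. */
static int simulate_sketch(int n, bool gate_in_loop, int close_gate_after)
{
    FILE *stream = tmpfile();
    if (stream == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        fprintf(stream, "line %d\n", i + 1);
    }
    rewind(stream);

    read_gate = true;
    int lines = read_lines_sketch(stream, gate_in_loop, close_gate_after);
    fclose(stream);
    return lines;
}
```

Under these assumptions, `simulate_sketch(80, true, 1)` stops after the first line because the gate closes before the second `fgets()`, while `simulate_sketch(80, false, 1)` drains all 80 lines.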

From ossec.log after setting debug=2:

Most of the time, 80 lines are read, as shown below:
2024/07/19 20:08:36 wazuh-logcollector[12738] read_command.c:73 at read_command(): DEBUG: Read 80 lines from command 'sleep 13; nice -n 10 bash /var/ossec/etc/shared/sample.shell'

In rare cases, only 1 line is read:
2024/07/19 23:06:23 wazuh-logcollector[12738] read_command.c:73 at read_command(): DEBUG: Read 1 lines from command 'sleep 13; nice -n 10 bash /var/ossec/etc/shared/sample.shell'

juliancnn · 2024-07-31T13:25:23Z

Hi @xmlijhu,

The set_read and can_read functions are part of an older synchronization mechanism between the reader threads and the thread that manages runtime configuration in Logcollector. Because can_read() is part of the while loop condition, it can occasionally return false in the middle of a read, so the command output is not fully captured.
I am still not sure whether the check should go before or after the execution of the command, because of the implications either choice may have, but as a condition of the while loop it can, in rare cases, prevent all the logs from being sent.

Suggested Workaround:
Consider using the Command wodle, which provides a more stable mechanism for executing and capturing command outputs. More details can be found here:

I will escalate this as a potential bug for further review and enhancement. Thank you for bringing this to our attention!

Regards

jeffery-jen · 2024-08-01T03:16:27Z

@juliancnn Thanks for looking at this.

From the commit history, this implementation has been there for quite a while. By comparison, read_fullcommand.c captures the ENTIRE command output and delivers it through w_msg_hash_queues_push without checking the lock in can_read().

With this in mind, read_command.c checks other input threads for no apparent reason.
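
As a rough illustration of the read_fullcommand.c behaviour described above, the following sketch drains a stream into a single buffer without consulting any gate. It is an assumption-laden example, not the Wazuh code: `capture_all_sketch()` is a hypothetical name, and the real function delivers the result through w_msg_hash_queues_push, which is not modelled here.

```c
#include <stdio.h>
#include <string.h>

/*
 * Drain `stream` into `out` (capacity `cap`, always NUL-terminated when
 * cap > 0) without checking any read gate. Returns the number of bytes
 * stored. Output that does not fit in the buffer is still consumed from
 * the stream but discarded.
 */
static size_t capture_all_sketch(FILE *stream, char *out, size_t cap)
{
    char chunk[128];
    size_t used = 0;

    if (cap == 0) {
        return 0;
    }
    out[0] = '\0';

    while (fgets(chunk, sizeof chunk, stream) != NULL) {
        size_t n = strlen(chunk);
        size_t room = cap - 1 - used; /* keep one byte for the terminator */
        if (n > room) {
            n = room;
        }
        memcpy(out + used, chunk, n);
        used += n;
        out[used] = '\0';
    }
    return used;
}
```

The only loop condition here is `fgets()` itself, so the read ends at end of output rather than when another thread asks the readers to pause.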

juliancnn · 2024-08-01T12:56:25Z

Hi @jeffery-jen, yes, it is old code. I think the main reason for the check is to make all the reader threads finish their tasks as soon as possible so that the main thread can refresh its configuration. Following this logic, I think the can_read() check should come before the command execution, although this does not address the timeout problems.
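
The idea of checking the gate once before the command runs could look roughly like the sketch below. Everything here is hypothetical and for illustration only: `read_gate`, `can_read_sketch()`, the `launch_fn` type, `fake_command_sketch()` and `run_gated_once_sketch()` are not Wazuh names, and the "command" is a temporary file rather than a real process.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the read gate; not the Wazuh implementation. */
static bool read_gate = true;

static bool can_read_sketch(void) { return read_gate; }

/* Hypothetical launcher type: starts a command and returns its output stream. */
typedef FILE *(*launch_fn)(void);

/* Fake command used for illustration: a temporary file holding 80 lines. */
static FILE *fake_command_sketch(void)
{
    FILE *stream = tmpfile();
    if (stream == NULL) {
        return NULL;
    }
    for (int i = 0; i < 80; i++) {
        fprintf(stream, "line %d\n", i + 1);
    }
    rewind(stream);
    return stream;
}

/*
 * Check the gate once, before launching. If it is closed, skip this run
 * and return -1. Otherwise launch the command and drain its output
 * completely, returning the number of lines read (-2 if the launch fails).
 */
static int run_gated_once_sketch(launch_fn launch)
{
    char buf[256];
    int lines = 0;

    if (!can_read_sketch()) {
        return -1; /* skipped: nothing was started, nothing is truncated */
    }

    FILE *out = launch();
    if (out == NULL) {
        return -2;
    }

    /* The gate is not consulted again until the output is exhausted. */
    while (fgets(buf, sizeof buf, out) != NULL) {
        lines++;
    }

    fclose(out);
    return lines;
}
```

With the gate open, `run_gated_once_sketch(fake_command_sketch)` reads all 80 lines; with it closed, the run is skipped as a whole instead of being cut short. As noted above, this ordering does nothing for a command that runs too long, so timeouts would need separate handling.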

jeffery-jen · 2024-08-19T14:42:28Z

A related issue has also been observed since 2021:

wazuh/wazuh#9130

Would any action be taken on this?

juliancnn self-assigned this Jul 31, 2024

juliancnn added the type/bug, module/logcollector, and reporter/community labels Jul 31, 2024

juliancnn removed their assignment Jul 31, 2024

juliancnn transferred this issue from wazuh/wazuh Aug 8, 2024
