
Logging Not Fully Downloaded When Log is Large #36

Open
engmtcdrm opened this issue Jun 28, 2024 · 1 comment

Comments

@engmtcdrm

If a piece of code takes less time to run than its log takes to download, the SASStudioOperator will not finish downloading the log. This is due to how the method _run_job_and_wait works: it loops until the status of the job changes out of running or pending (let's ignore the unknown logic for a moment). While it is looping, it calls the method stream_log, which downloads the log for Airflow. However, if the job completes before the log can be fully downloaded, the log will be incomplete.
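For context, here is a rough Python sketch of the polling pattern described above. This is not the provider's actual source; the helper get_job_status, the stream_log signature, and poll_interval are assumptions for illustration only.

import time

def _run_job_and_wait_sketch(session, job_id, poll_interval=10):
    # Hypothetical simplification of the loop described above, not the real code.
    num_log_lines = 0
    while True:
        job = get_job_status(session, job_id)  # placeholder: fetch the job resource
        # Stream whatever log lines are currently available into the Airflow task log.
        num_log_lines = stream_log(session, job, num_log_lines)  # placeholder signature
        if job["state"] not in ("running", "pending"):
            # The loop exits as soon as the job leaves running/pending, even if the
            # server still holds log lines that were never streamed.
            break
        time.sleep(poll_interval)
    return job, num_log_lines

As the comments note, the loop exits as soon as the job finishes, regardless of how much of the log has actually been streamed.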

The SAS code to replicate this is rather simple: it intentionally spits out a ton of NOTEs to the log to see how far the log can be pushed before breaking. If the uncommented 500000 iterations do not reproduce the issue, just change the count to a higher number and try again.

%macro test_log_lines();
    /* %do i = 1 %to 250000; */
    %do i = 1 %to 500000;
    /* %do i = 1 %to 1000000; */
    /* %do i = 1 %to 2000000; */
    /* %do i=1 %to 40000000; */
    /* %do i=1 %to 80000000; */
        %put NOTE: hi mom &i.!;

        /* data _null_;
            sleep(1);
        run; */
    %end;
%mend test_log_lines;

%test_log_lines;

The solution is to add a second check after the while loop on line 347 that tests whether num_log_lines < job['logStatistics']['lineCount']. This check should itself be a while loop that replaces the logic on line 379, because that logic only grabs the last 99999 lines of the log, not ALL of the rest.
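A minimal sketch of that second check, under the same assumptions as the sketch above (stream_log and its return value are placeholders, not the provider's real signature):

def drain_remaining_log(session, job, num_log_lines):
    # Keep streaming until the streamed line count catches up with what the server
    # reports for the finished job. A single extra call is not enough because each
    # call only returns a bounded chunk of lines.
    while num_log_lines < job["logStatistics"]["lineCount"]:
        new_total = stream_log(session, job, num_log_lines)  # placeholder: returns new total
        if new_total == num_log_lines:
            break  # nothing new came back; bail out rather than spin forever
        num_log_lines = new_total
    return num_log_lines

Calling something like this once, right after the polling loop returns, would let the operator finish downloading whatever the loop missed.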

Lastly, here is a screenshot of the log from the code above, as shown in the Airflow UI. You can see how the log abruptly stops.
[screenshot: Airflow task log truncated mid-output]

@rizhansas

rizhansas commented Jul 2, 2024

Polling the job status in the operator is suboptimal. Implementing a deferrable operator is a better choice, because it frees up the precious CPU task slot and lets the Triggerer take care of polling.

We have experienced a similar log collection delay in a customized operator. A second check or a delayed collection are two possible solutions.
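For reference, a rough sketch of the deferrable pattern being suggested, using Airflow's public trigger API (the SAS-specific pieces, namely submit_job, check_job_state, download_full_log, and the trigger classpath, are placeholders, not the provider's real code):

import asyncio

from airflow.models.baseoperator import BaseOperator
from airflow.triggers.base import BaseTrigger, TriggerEvent


class SASJobTrigger(BaseTrigger):
    """Polls the job state inside the Triggerer instead of holding a worker slot."""

    def __init__(self, job_id: str, poll_interval: float = 10.0):
        super().__init__()
        self.job_id = job_id
        self.poll_interval = poll_interval

    def serialize(self):
        # The classpath must match wherever this trigger actually lives.
        return ("my_provider.triggers.SASJobTrigger",
                {"job_id": self.job_id, "poll_interval": self.poll_interval})

    async def run(self):
        while True:
            state = await check_job_state(self.job_id)  # placeholder async call
            if state not in ("running", "pending"):
                yield TriggerEvent({"job_id": self.job_id, "state": state})
                return
            await asyncio.sleep(self.poll_interval)


class DeferrableSASStudioOperatorSketch(BaseOperator):
    def execute(self, context):
        job_id = self.submit_job()  # placeholder: kick off the SAS Studio job
        # Give up the worker slot; the Triggerer resumes the task when the job finishes.
        self.defer(trigger=SASJobTrigger(job_id), method_name="execute_complete")

    def execute_complete(self, context, event=None):
        # Back on a worker after the trigger fires; the job is done, so the full
        # log can be downloaded here without racing the job.
        self.download_full_log(event["job_id"])  # placeholder

Because the task only resumes after the trigger reports a terminal state, the log download naturally happens once the job is finished, which also sidesteps the truncation described in this issue.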
