-
-
Notifications
You must be signed in to change notification settings - Fork 273
Adding option for configuring custom log Regexp timeout #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean and simple change. I'll do the changes to the doc myself since I think we'd need a rewrite of the logging options in the configuration guide.
|
||
# Set custom timeout for log filtering | ||
# which is useful when working with large payloads | ||
config.log_regexp_timeout = 4 # seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not categorise this as basic logging. Let's remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be worth having a separate section called "Advanced logging options" where to put also the logger and log file options. Since that requires a rewrite of the section, I'm willing to take that on myself.
There's one problem with this: Regexp.timeout is only available from Ruby 3.1. Could you make it conditional to that having Ruby >= 3.2? |
Should we raise an error if a user sets |
I'd say output a warning and proceed |
@log_file = $stdout | ||
@log_level = ENV['RUBYLLM_DEBUG'] ? Logger::DEBUG : Logger::INFO | ||
@log_stream_debug = ENV['RUBYLLM_STREAM_DEBUG'] == 'true' | ||
@log_regexp_timeout = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('3.2.0') ? (Regexp.timeout || 1.0) : nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that in ruby_llm
spec setup, Regexp.timeout
ends up being nil
by default (probably caused by some recent change as specs passed before), and confirmed that Ruby itself doesn’t set a default outside of Rails. I think it's a good idea to introduce a healthy default of 1.0
second if Regexp.timeout
is nil
, at the same time, I see the case for leaving nil
untouched if that’s an intentional choice by the underlying system.
Do you have any thoughts or preferences on whether we should enforce a fallback, or keep respecting nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention over configuration! 1 second is perfect.
@log_file = $stdout | ||
@log_level = ENV['RUBYLLM_DEBUG'] ? Logger::DEBUG : Logger::INFO | ||
@log_stream_debug = ENV['RUBYLLM_STREAM_DEBUG'] == 'true' | ||
@log_regexp_timeout = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('3.2.0') ? (Regexp.timeout || 1.0) : nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention over configuration! 1 second is perfect.
if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('3.2.0') | ||
logger.filter( | ||
Regexp.new('[A-Za-z0-9+/=]{100,}', timeout: @config.log_regexp_timeout), | ||
'data":"[BASE64 DATA]"' | ||
) | ||
logger.filter(Regexp.new('[-\\d.e,\\s]{100,}', timeout: @config.log_regexp_timeout), '[EMBEDDINGS ARRAY]') | ||
else | ||
if @config.log_regexp_timeout | ||
RubyLLM.logger.warn("log_regexp_timeout is not supported on Ruby #{RUBY_VERSION}") | ||
end | ||
logger.filter(Regexp.new('[A-Za-z0-9+/=]{100,}'), 'data":"[BASE64 DATA]"') | ||
logger.filter(Regexp.new('[-\\d.e,\\s]{100,}'), '[EMBEDDINGS ARRAY]') | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should simplify all that and use the same approach as Rails: https://github.com/rails/rails/pull/53490/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I fully follow.
@crmne are you proposing that ruby_llm
drops support for Ruby 3.1 in a newer version of ruby_llm
? 😀
Rails 8 is compatible with Ruby 3.2+ (source) and installing rails 8 on Ruby 3.1.3 gives error: rails-8.0.0 requires Ruby version >= 3.2.0. The current ruby version is 3.1.3.
Because rails 8 requires Ruby 3.2+, it is possible to set Regexp.timeout
for it.
In ruby_llm
case, if Ruby 3.1 wasn't supported, we could drop else
and revert this block to:
logger.filter(
Regexp.new('[A-Za-z0-9+/=]{100,}', timeout: @config.log_regexp_timeout),
'data":"[BASE64 DATA]"'
)
logger.filter(Regexp.new('[-\\d.e,\\s]{100,}', timeout: @config.log_regexp_timeout), '[EMBEDDINGS ARRAY]')
If you are proposing to use global Regexp.timeout
(e.g. from parent environment where gem is used) then ruby_llm
is already doing it, but it doesn't solve my use-case where I need Regexp.timeout
to be different for this logging.
What this does
Overview
This PR adds support for a configurable
log_regexp_timeout
used specifically byRubyLLM's
logging filters. It allows you to override the globalRegexp.timeout
for logging without affecting the rest of your application.Problem It Solves
When handling large or complex files, the default logging filters in
RubyLLM
may hit aRegexp::TimeoutError
. This is because many applications set a low globalRegexp.timeout
(e.g. 1 second) to protect against ReDoS, which can be too restrictive for logging large payloads.Solution
This change adds support for a
log_regexp_timeout
configuration value. If provided,RubyLLM
will use this timeout when compiling regular expressions for its Faraday logger filters. This isolates logging-related regex behavior from the broader application, improving reliability without compromising global safety.Example usage:
Type of change
Scope check
Quality check
overcommit --install
and all hooks passmodels.json
,aliases.json
)API changes
Related issues
Implements #331