Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add query-log-path configuration #25710

Open
wants to merge 3 commits into
base: master-1.x
Choose a base branch
from

Conversation

devanbenz
Copy link

@devanbenz devanbenz commented Dec 26, 2024

This PR adds the ability for an admin to define 'query-log-path' as a configuration item field.

A new configuration option is created

[data]
# Setting a logging path will enable query logging
query-log-path = "/var/influx/query.log"

This will enable query logging.

Logged queries example:

{"level":"info","ts":1735248393.084461,"msg":"Executing query","query":"SHOW DATABASES"}
{"level":"info","ts":1735248395.092188,"msg":"Executing query","query":"SHOW DATABASES"}
{"level":"info","ts":1735248398.58039,"msg":"Executing query","query":"SELECT * FROM stress.autogen.m0 LIMIT 20"}

@devanbenz devanbenz marked this pull request as ready for review December 30, 2024 16:56
query/executor_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not look at the tests

etc/config.sample.toml Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
query/executor.go Outdated Show resolved Hide resolved
Copy link
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick review (didn't look at tests) on this, treating it like a draft PR, per your request.

coordinator/points_writer.go Outdated Show resolved Hide resolved
// WatcherInterface is used for any file watch functionality using fsnotify
type WatcherInterface interface {
Event() chan fsnotify.Event
Error() chan error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a direction on this channel?

e.Logger.Debug("received file event", zap.String("event", event.Name))

if event.Op == fsnotify.Remove || event.Op == fsnotify.Rename {
e.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps toss this block into a lambda and use defer mu.Unlock()? That protects against future additional return paths.

case err, ok := <-w.Error():
closeErr := w.Close()
if closeErr != nil {
return fmt.Errorf("failed to close watcher: %w", closeErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to throw away the first error. Use errors.Join to get the original error and the Close error.

return f.fsWatcher.Events
}

func (f *FileLogWatcher) Error() chan error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

channel direction?

query/file_log_watcher.go Show resolved Hide resolved
logFile, err := os.OpenFile(f.path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
if err != nil {
f.logger.Error("failed to open log file", zap.Error(err), zap.String("path", f.path))
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return err ??

existingCore := f.logger.Core()
encoder, err := l.NewEncoder(f.formatterConfig)
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to Close the file here? If this isn't a fatal error for the database we should probably close the file to avoid leaking the handle.

f.logger = zap.New(newCore)

if err := f.fsWatcher.Add(f.path); err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. What needs to be cleaned up here, if anything?

}

func (f *FileLogWatcher) Close() error {
if err := f.currFile.Sync(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Do we need mutex protection here?
  • We should always close the file, even if the sync fails.
  • We should close the fsWatcher even if the sync or close fails
  • errors.Join is your friend.


for time.Now().Before(deadline) {
if mockWatcher.ChangeEvents.Load() == 1 {
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you breaking here, or running out the deadline?

Copy link
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes for debugging the CI problem, some general changes

require.Equal(t, uint64(1), mockWatcher.ChangeEvents.Load())
require.Equal(t, uint64(1), mockWatcher.RemovedPaths.Load())

err = mockWatcher.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already in the defer above. Move the require.NoError into a lambda for the defer.

require.Equal(t, uint64(1), mockWatcher.ChangeEvents.Load())
require.Equal(t, uint64(1), mockWatcher.RemovedPaths.Load())

err = mockWatcher.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, the Close is already in the defer. Move the require.NoError into a lambda in the defer

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops yeah meant to adjust that. Going to move to the lambdas as per your comment

Op: fsnotify.Remove,
}

for time.Now().Before(deadline) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use time.After instead of a loop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call I'll try that out


for time.Now().Before(deadline) {
if mockWatcher.ChangeEvents.Load() == 1 {
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set a bool variable here to see if this is the loop exit or the ChangeEvents.Load is

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

This PR adds the ability for an admin to define 'query-log-path' as a configuration item field.

A new configuration option is created

```toml
[data]
query-log-path = "/var/influx/query.log"
```

This will enable query logging being written to a log file on disk.

Logged queries example:
```
{"level":"info","ts":1735248393.084461,"msg":"Executing query","query":"SHOW DATABASES"}
{"level":"info","ts":1735248395.092188,"msg":"Executing query","query":"SHOW DATABASES"}
{"level":"info","ts":1735248398.58039,"msg":"Executing query","query":"SELECT * FROM stress.autogen.m0 LIMIT 20"}
```
@devanbenz devanbenz force-pushed the db/459/query-loggin branch from 2109d11 to efcf13c Compare January 8, 2025 20:18
@devanbenz devanbenz marked this pull request as ready for review January 8, 2025 21:11
// FileWatcher is not meant to be used with interactive shell sessions so
// console as a logging format will not be supported.
if format == "console" {
logger.Error("unknown logging format", zap.String("format", format), zap.String("path", path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console isn't an unknown logging format. It's an unsupported format for query logging.

format = "logfmt"
}

logFile, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally it's better to use a permission of 0666 and let the system's umask value get the permissions the admin really wants (source: lots of headaches). It's likely the admin might want the file to be group writable for the external log rotation to work properly. Hard coding 0644 here might cause issues with said log rotation.

@@ -487,6 +487,16 @@ func (s *Server) Open() error {
// so it only logs as is appropriate.
s.QueryExecutor.TaskManager.Logger = s.Logger
}
if s.config.Data.QueryLogPath != "" {
path := s.config.Data.QueryLogPath
flw := query.NewFileLogWatcher(s.QueryExecutor, path, s.Logger, s.config.Logging.Format)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either query.NewFileLogWatcher is misnamed, or QueryExecutor.WithLogWriter is misnamed. It looks like the wrong thing is getting passed around.

}
}

func (e *Executor) WithLogWriter(w WatcherInterface, ctx context.Context) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx should be the first parameter. From the docs:

The Context should be the first parameter, typically named ctx.

// FileWatcher is not meant to be used with interactive shell sessions so
// console as a logging format will not be supported.
if format == "console" {
logger.Error("unknown logging format", zap.String("format", format), zap.String("path", path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it would be cleaner for NewFileLogWatcher to return an additional error value and not take the logger object in. The caller should log the error returned.

Stylistically, a function that on error only returns nil feels like it should be left in the world of C.

if flw != nil {
s.QueryExecutor.WithLogWriter(flw, context.Background())
} else {
s.Logger.Error("error creating log writer", zap.String("path", path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if NewFileLogWatcher returned an error, you could log the error right here instead of this generic log message.

@@ -65,7 +65,7 @@ func newEncoder(format string) (zapcore.Encoder, error) {
}
}

func newEncoderConfig() zapcore.EncoderConfig {
func NewEncoderConfig() zapcore.EncoderConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any uses of NewEncodingConfig outside of the logger package. Is there a reason to change it to public?

@@ -289,6 +306,94 @@ func (e *Executor) WithLogger(log *zap.Logger) {
e.TaskManager.Logger = e.Logger
}

func startFileLogWatcher(w WatcherInterface, e *Executor, ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startFileLogWatcher made me think this was going to spawn a goroutine, but it doesn't. It's the body of a goroutine. Maybe either doFileLogWatch or simply fileLogWatch?

@@ -242,6 +257,8 @@ type Executor struct {

// expvar-based stats.
stats *Statistics

mu sync.Mutex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not put mu at the bottom of the struct. A developer going into this code might miss mu and not lock appropriately.

My preference is to put mu above all the members that it will protect, along with a comment that mu protects all following members. Members that don't require protection can go above mu.

@gwossum
Copy link
Member

gwossum commented Jan 9, 2025

@devanbenz The hardest part of this, IMO, is correctly handling the fsnotify events. It would be great to have some tests that actually exercise that functionality.

@gwossum
Copy link
Member

gwossum commented Jan 9, 2025

@devanbenz You might even consider just using the real thing for all the tests instead of the mocked interface. Use testing.T.TempDir() to create a temp directory to drop the log files into. To check the contents, slurp the log files in with io.ReadAll and a regex or even a simple strings.Contains to check for the expected statement in the logs. You could then get rid of the complexity of the interface and deal directly with the real object.

@gwossum
Copy link
Member

gwossum commented Jan 9, 2025

@devanbenz The more I think about it, the more I like getting rid of the mock. I'm interested in how the fsnotify behaves when:

  1. influxd creates a log new file.
  2. influxd appends to an existing log file.
  3. A log rotator renames the current log.
  4. A log rotator renames the current log and then starts a new log file.
  5. Make sure we don't have other fsnotify related issues during normal operation.

@devanbenz
Copy link
Author

devanbenz commented Jan 10, 2025

@devanbenz The more I think about it, the more I like getting rid of the mock. I'm interested in how the fsnotify behaves when:

  1. influxd creates a log new file.
  2. influxd appends to an existing log file.
  3. A log rotator renames the current log.
  4. A log rotator renames the current log and then starts a new log file.
  5. Make sure we don't have other fsnotify related issues during normal operation.

This makes sense to me. I'm usually under the mindset to not "test" third party libraries but I could foresee this being an exception to that since its correctness is core to the changes. Do you think I should get rid of the interface and just make everything concrete as well?

Comment on lines +492 to +494
flw := query.NewFileLogWatcher(s.QueryExecutor, path, s.Logger, s.config.Logging.Format)
if flw != nil {
s.QueryExecutor.WithLogWriter(flw, context.Background())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devan I had an idea last night on how to make this way more flexible and reusable.

First, change QueryExecutor.WithLogWriter(*FileLogWatcher) to QueryExecutor.WithQueryLogger(*zap.Logger). By taking a *zap.Logger, we have way more flexibility in how queries can be logged. It will also decouple log rotation logic from the QueryExecutor if the code ends up being split between OSS and Enterprise.

Second, change query.NewFileLogWatcher(...) *FileLogWatcher to query.NewRotatableLogger(...) (*zap.Logger, error) (or logger.NewRotatableLogger...). This logger can then be used anywhere we need a log file that can be rotated by an external log rotator without making changes to the client code using the logger. Note: The goroutine to watch the signal channel will keep the references to the required data so it won't be garbage collected.

@gwossum
Copy link
Member

gwossum commented Jan 10, 2025

Adding this comment from slack for posterity:

I think we've taken the wrong approach on handling log rotation. I reviewed the logrotate config files on my system and did some research on how major packages handle it. Most packages handle log rotation by listening for a signal (e.g. SIGHUP or SIGUSR1) and reopening logs when the signal is received. logrotate is then configured to send the signal to the process when it rotates logs.

Here's a couple of examples from package docs describing reopening log files on SIGHUP:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants