What is the max data size which can be processed by netdata-timescale-relay? #3

Open
oleotiger opened this issue Jan 20, 2021 · 3 comments

Comments

@oleotiger

Here is my problem.
When I export metrics every 1s (as collected) from the parent node, which has one child node, I get an error from netdata like:
2021-01-18 17:20:31: netdata ERROR : MAIN : EXPORTING: failed to write data to 'localhost:14866'. Willing to write 2092079 bytes, wrote 1766662 bytes. Will re-connect. (errno 11, Resource temporarily unavailable).

When I add the filter send charts matching = !cpu.cpu* !ipv6* !users.* nfs.rpc net.* net_drops.* net_packets.* !system.interrupts* system.* disk.* disk_space.* disk_ops.* mem.*, it works well.
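To see which charts a filter like this lets through, here is a minimal sketch of netdata-style simple pattern matching. It assumes the documented simple-pattern rules: space-separated patterns, `*` as a wildcard, `!` for a negative match, and the first matching pattern wins. The helper names (`compile_simple_pattern`, `chart_is_sent`) are hypothetical and this is not netdata's own implementation.

```python
import re

def compile_simple_pattern(expr: str):
    """Compile a space-separated pattern list into (negated, regex) pairs."""
    compiled = []
    for token in expr.split():
        negated = token.startswith("!")
        glob = token[1:] if negated else token
        # Escape everything, then turn the escaped '*' back into a wildcard.
        regex = re.compile(re.escape(glob).replace(r"\*", ".*") + r"\Z")
        compiled.append((negated, regex))
    return compiled

def chart_is_sent(chart_id: str, compiled) -> bool:
    """First matching pattern decides; a chart matching nothing is not sent."""
    for negated, regex in compiled:
        if regex.match(chart_id):
            return not negated
    return False

# The filter from this comment (illustrative use only).
FILTER = compile_simple_pattern(
    "!cpu.cpu* !ipv6* !users.* nfs.rpc net.* net_drops.* net_packets.* "
    "!system.interrupts* system.* disk.* disk_space.* disk_ops.* mem.*"
)
```

Under these assumptions the per-core `cpu.cpu*` charts are dropped and any chart family not listed (for example `apps.*`) is not exported at all, which would explain a much smaller payload.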

So I guess the error is caused by the data size: too much data is being exported to TimescaleDB through netdata-timescale-relay.

Am I right?

If yes, what is the max data flow that netdata-timescale-relay can handle?
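For context on the error itself: errno 11 on Linux is EAGAIN ("Resource temporarily unavailable"), which a write returns when the socket cannot accept more data right now, typically because the receiver is not draining it fast enough. The sketch below reproduces that generic behaviour with a local socket pair whose reader never reads. It is an illustration only; whether netdata's exporter uses a non-blocking socket or a send timeout is not confirmed in this thread, and `write_until_blocked` is a hypothetical helper name.

```python
import errno
import socket

def write_until_blocked(total_bytes: int):
    """Write into a socket nobody reads from; return (bytes_sent, errno or None)."""
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    payload = memoryview(b"x" * total_bytes)
    sent = 0
    err = None
    try:
        while sent < total_bytes:
            # send() may accept only part of the buffer; keep the running total.
            sent += writer.send(payload[sent:])
    except BlockingIOError as exc:
        # The kernel buffer is full and the peer has not read anything.
        err = exc.errno
    finally:
        writer.close()
        reader.close()
    return sent, err
```

A short write followed by EAGAIN is therefore a sign of a slow or stalled reader rather than proof of a hard size limit.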

@mahlonsmith
Owner

Hmm, good question. There shouldn't be a practical limit, it just reads until the stream stops. I currently can't replicate this locally, so some more information is appreciated.

My local tests, with no netdata matching filter, put a metrics row at about 14k. A filtered one is 3k (or so).
(Got this from select pg_size_pretty( pg_column_size(metrics)::bigint ) from netdata;)

Yours is 2 megs or so -- share your conf with me ([email protected] if you'd like to do it privately) - I'm interested in how you have a 2MB sample as a routine send, wow. Some public questions:

  • Versions of everything. PostgreSQL, the relay, netdata.
  • Are you using the newer exporting.conf or the deprecated backends config in netdata?
  • Anything interesting or relevant in your PostgreSQL log files?
  • Are you using a pooler or any middleware?

In the meantime I'll work up a test scenario that just injects N bytes of data straight to the relay and see how it does.
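A test of that kind could be sketched roughly as follows: build at least N bytes of newline-delimited JSON and push it to the relay over a single TCP connection. This is only an assumption about what such a test might look like. The JSON fields are placeholders rather than the schema the relay expects, and `build_payload` and `inject` are hypothetical names.

```python
import json
import socket

def build_payload(n_bytes: int) -> bytes:
    """Build newline-delimited JSON totalling at least n_bytes."""
    lines = []
    size = 0
    i = 0
    while size < n_bytes:
        # Placeholder fields; replace with a real netdata JSON line for a true test.
        line = json.dumps({"chart_id": f"test.chart{i}", "value": i}) + "\n"
        encoded = line.encode("utf-8")
        lines.append(encoded)
        size += len(encoded)
        i += 1
    return b"".join(lines)

def inject(host: str, port: int, payload: bytes, timeout: float = 10.0) -> int:
    """Send the whole payload over one TCP connection; return the byte count."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # sendall() blocks until everything is written or raises on failure.
        sock.sendall(payload)
    return len(payload)
```

For example, `inject("localhost", 14866, build_payload(3_300_000))` would approximate the roughly 3.3 MB sends reported later in this thread, assuming the relay listens on that port.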

Thanks.

@oleotiger
Author

Exporting conf of parent node:

[json:timescaledb_instance]
    enabled = yes
    destination = localhost:14866
    remote write URL path = /write
    data source = as collected
    prefix = netdata
    update every = 1
    send hosts matching = *
    buffer on failures = 100
#   send charts matching = !cpu.cpu* !ipv6* !users.* nfs.rpc net.* net_drops.* net_packets.* !system.interrupts* system.* disk.* disk_space.* disk_ops.* mem.*

Stream conf of child node:

[stream]
    enabled = yes
    destination=xxx
    timeout seconds = 60
    default port = 19999
    send charts matching = *
    buffer size bytes = 1048576
    reconnect delay seconds = 5
    initial clock resync iterations = 60

All child nodes and the parent node are collecting all metrics at 1Hz.

  • Versions of everything. PostgreSQL, the relay, netdata.
    PostgreSQL:12.5
    Relay: I just cloned the master branch and compiled it.
    Netdata: netdata v1.28.0-128-g852bbdf
    OS:CentOS Linux release 7.7.1908 (Core)
  • Are you using the newer exporting.conf or the deprecated backends config in netdata?
    I'm using exporting.conf
  • Anything interesting or relevant in your PostgreSQL log files?
    I will reproduce the error and post it later.
  • Are you using a pooler or any middleware?
    No, just: child node netdata (stream) -> parent node netdata (exporting) -> netdata-timescale-relay -> timescaledb

@oleotiger
Author

I reproduced it. Error log of netdata:

2021-01-27 10:30:15: netdata ERROR : MAIN : EXPORTING: failed to write data to 'localhost:14866'. Willing to write 3289916 bytes, wrote 2154360 bytes. Will re-connect. (errno 11, Resource temporarily unavailable)
2021-01-27 10:30:15: netdata ERROR : MAIN : Failed to connect to '::1', port '14866' (errno 111, Connection refused)
2021-01-27 10:30:23: netdata INFO  : WEB_SERVER[static3] : POLLFD: LISTENER: client slot 2 (fd 75) from 150.1.68.34 port 59858 has not sent a complete request in 60 seconds - closing it.
2021-01-27 10:30:29: netdata ERROR : MAIN : EXPORTING: failed to write data to 'localhost:14866'. Willing to write 3289931 bytes, wrote 2088868 bytes. Will re-connect. (errno 11, Resource temporarily unavailable)
2021-01-27 10:30:29: netdata ERROR : MAIN : Failed to connect to '::1', port '14866' (errno 111, Connection refused)
2021-01-27 10:30:42: netdata ERROR : MAIN : EXPORTING: failed to write data to 'localhost:14866'. Willing to write 3289939 bytes, wrote 2154336 bytes. Will re-connect. (errno 11, Resource temporarily unavailable)
2021-01-27 10:30:42: netdata ERROR : MAIN : Failed to connect to '::1', port '14866' (errno 111, Connection refused)
2021-01-27 10:30:53: netdata INFO  : STREAM_RECEIVER[150.1.68.37,[150.1.68.37]:57942] : RRDSET: chart name 'cpu.cpu68_interrupts' on host '150.1.68.37' already exists.
2021-01-27 10:30:56: netdata ERROR : MAIN : EXPORTING: failed to write data to 'localhost:14866'. Willing to write 3289941 bytes, wrote 2154318 bytes. Will re-connect. (errno 11, Resource temporarily unavailable)
2021-01-27 10:30:56: netdata ERROR : MAIN : Failed to connect to '::1', port '14866' (errno 111, Connection refused)
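Each failure in this log stalls at roughly 2.1 MB of a roughly 3.3 MB buffer. A small sketch like the one below can pull those numbers out of a longer netdata error log to check whether the stall point is consistent. The regular expression is based only on the message format shown in this thread, and `short_writes` and `summarize` are hypothetical helper names.

```python
import re

# Matches the short-write message format seen in the log lines above.
WRITE_ERROR = re.compile(r"Willing to write (\d+) bytes, wrote (\d+) bytes")

def short_writes(log_text: str):
    """Return (wanted, written) byte counts for every short-write error."""
    return [(int(wanted), int(written))
            for wanted, written in WRITE_ERROR.findall(log_text)]

def summarize(pairs):
    """Return (wanted, written, fraction_written) for each short write."""
    return [(wanted, written, written / wanted) for wanted, written in pairs]
```

If the written count stays nearly constant while the wanted count varies, a fixed-size buffer somewhere along the path is a plausible suspect; that is a hypothesis to test, not a conclusion.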

And here is the log of netdata-timescale-relay:


Client 127.0.0.1 closed socket.
Client 127.0.0.1 closed socket.
Client 150.1.68.32 closed socket.
Client 127.0.0.1 closed socket.
Client 127.0.0.1 closed socket.
Client 127.0.0.1 closed socket.
Client 150.1.68.32 closed socket.
Client 127.0.0.1 closed socket.
Client 127.0.0.1 closed socket.
Client 127.0.0.1 closed socket.

No error messages were found in the PostgreSQL log.
