
Export Functionality #96

Open
lordlycastle opened this issue Mar 11, 2021 · 15 comments
Labels
enhancement New feature or request

Comments

@lordlycastle

lordlycastle commented Mar 11, 2021

Would love some export functionality for all the data that is collected.

The simplest export would be a CSV file where each line is the latency and start time of each request, in the order they were made.

Eventually we could expand this to include other things about the request, like the response size, start timestamps, the number of requests that were in flight when it started, etc.

I don't want to export processed data. Since those are fairly basic calculations, I don't think that weight should fall on this tool; it is not its main focus.

@nakabonne
Owner

@lordlycastle Good idea. Actually, I'm already looking into export functionality, but I feel like including the full set of requests and their timestamps is too much. I'm guessing writing this summary to a file is enough:
[image: summary screenshot]

@nakabonne nakabonne added the enhancement New feature or request label Mar 13, 2021
@ghowardMSD

I think it should be a time-stamped entry for each attack, in a format that can be submitted to RRDtool.

@solarisfire

Or InfluxDB... Way more reliable than RRDtool, and the data can then be graphed in Grafana super easily :-)

@nakabonne
Owner

@ghowardMSD Okay, you want to plot the results in external tools, right? Initially the scope of this tool was just plotting in the terminal, so I wasn't considering it. But it looks interesting. Honestly, I'm not very familiar with that area, so give me a moment to think about it.

@lordlycastle
Author

lordlycastle commented Mar 20, 2021

Supporting a proprietary format may not be ideal. Almost all tools accept standard formats like CSV (the most universal) or JSON (newer). This is perfectly tabular data, so CSV will be the most intuitive.

@nakabonne I believe the raw info would be best, like RAW photos: people can do their own processing, and with tools like xsv/jq/fx that's simple. Is it too difficult to collect this info? It could be written to disk every X seconds to reduce memory usage for long attacks.

I must say processed data is useful too, though. Otherwise you need to do the manipulations to compare multiple export results outside of ali, which is difficult 😅

One question: what about requests that returned a different code? This info is important because you might wish to filter only successful/failed/429/5xx requests.

id, start_unix_time, stop_unix_time, http_code would be important. What do you guys think? Can we think of something better than unix_time? I don't like it because you can't just read it.
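To illustrate the filtering use case, here is a minimal Python sketch against the proposed columns (the sample data and the `filter_by_code` helper are hypothetical, not part of ali):

```python
import csv
import io

# Hypothetical export using the columns proposed above.
data = """id,start_unix_time,stop_unix_time,http_code
1,1616627534,1616627535,200
2,1616627534,1616627536,429
3,1616627535,1616627535,503
"""

def filter_by_code(text, predicate):
    """Keep rows whose http_code satisfies predicate, e.g. 5xx only."""
    rows = csv.DictReader(io.StringIO(text))
    return [r for r in rows if predicate(int(r["http_code"]))]

failed = filter_by_code(data, lambda c: c >= 500)
print([r["id"] for r in failed])  # ['3']
```

The same predicate approach covers the successful/failed/429/5xx splits mentioned above.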

@nakabonne
Owner

@lordlycastle

This is perfectly tabular data where CSV will be most intuitive.

Yes, I feel the same way 👍

I believe the raw info would be best like the RAW photos.

I must say processed data is useful too though.

Collecting raw data is a little tedious but isn't so difficult. By "processed data" you mean something like the request latency, right? If so, I'm guessing processed data alone is enough, like k6's export feature. Also, as you mentioned, the HTTP status codes should be included, and input data such as the URL and method as well.

What I'm thinking is:

timestamp,latency,url,method,status_code
1595325560,438000000,http://host.xz,GET,200

I feel like we only need to include the start time as the timestamp.
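As a sketch, the proposed format could be produced like this in Python (the `results` rows and the `to_csv` helper are illustrative, not ali's actual internals):

```python
import csv
import io

# Hypothetical per-request records; keys mirror the proposed header.
# latency is in nanoseconds, timestamp in Unix seconds.
results = [
    {"timestamp": 1595325560, "latency": 438000000,
     "url": "http://host.xz", "method": "GET", "status_code": 200},
]

def to_csv(rows):
    """Render rows using the proposed column order."""
    fields = ["timestamp", "latency", "url", "method", "status_code"]
    buf = io.StringIO()
    w = csv.DictWriter(buf, fieldnames=fields)
    w.writeheader()
    w.writerows(rows)
    return buf.getvalue()

print(to_csv(results))
```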

@nakabonne
Owner

Can we think something better than unix_time I don't like it as you can't just read it.

I don't know of anything better than unix_time.

@lordlycastle
Author

lordlycastle commented Mar 25, 2021

Maybe we could use the ISO 8601 format: https://en.wikipedia.org/wiki/ISO_8601

It would make timestamps readable, and the format is recognised by most tools. It should also parse on all systems regardless of locale and timezone.

Date and time in UTC

  • 2021-03-24T23:12:14+00:00
  • 2021-03-24T23:12:14Z
  • 20210324T231214Z
  • 2021-02-02T15:14:01.9177924Z (common in logs where it needs to be human readable but also parsable) ✅

A decimal fraction covers sub-second precision:

A decimal fraction may be added to the lowest order time element present, in any of these representations. A decimal mark, either a comma or a dot (following ISO 80000-1 according to ISO 8601-1:2019, which does not stipulate a preference except within International Standards, but with a preference for a comma according to ISO 8601:2004) is used as a separator between the time element and its fraction. To denote "14 hours, 30 and one half minutes", do not include a seconds figure. Represent it as "14:30,5", "T1430,5", "14:30.5", or "T1430.5". There is no limit on the number of decimal places for the decimal fraction. However, the number of decimal places needs to be agreed to by the communicating parties. For example, in Microsoft SQL Server, the precision of a decimal fraction is 3, i.e., "yyyy-mm-ddThh:mm:ss[.mmm]".
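As a sketch, converting a raw nanosecond timestamp to such an ISO 8601 UTC string in Python (the `iso_utc` helper is illustrative; note Python's `datetime` caps the fraction at microseconds):

```python
from datetime import datetime, timezone

def iso_utc(ts_ns):
    """Convert nanoseconds since the Unix epoch to an ISO 8601 UTC
    string, using 'Z' instead of '+00:00' for compactness."""
    dt = datetime.fromtimestamp(ts_ns / 1e9, tz=timezone.utc)
    return dt.isoformat().replace("+00:00", "Z")

print(iso_utc(1_616_627_534_000_000_000))  # 2021-03-24T23:12:14Z
```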

@lordlycastle
Author

Do we need url? I thought that was always fixed for a single run.

Yeah, it doesn't matter whether you include the latency or the stop time; one of them is enough.

@nakabonne
Owner

Looks good. I have no objection to the ISO format for now.

Do we need url?

We don't need it in the case where you only ever handle one target at a time. But for those who want to export results for multiple targets and save them into a single time-series DB, I feel like it would be nice to have such input data.

I'm still thinking this topic through, but I'm wondering whether our CSV should be convertible to InfluxDB's line protocol. InfluxDB is one of the most popular time-series DBs, and this format looks relatively versatile. I mean, it would be something like:

m,url,method,status_code,value,time
latency,http://host.xz,GET,200,438000000,2020-01-01T00:00:00Z
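For illustration, a minimal Python sketch that turns one such row into an InfluxDB line-protocol point (the row keys are assumed; real line protocol also requires escaping commas and spaces in tag values, and expects the timestamp as nanoseconds since the epoch, both simplified here):

```python
def to_line_protocol(row):
    """Build one line-protocol point: measurement,tags fields timestamp.
    url/method/status_code become tags, latency the field value."""
    return (
        f"latency,url={row['url']},method={row['method']},"
        f"status_code={row['status_code']} "
        f"value={row['latency']} {row['time_ns']}"
    )

row = {"url": "http://host.xz", "method": "GET", "status_code": 200,
       "latency": 438000000, "time_ns": 1577836800000000000}
print(to_line_protocol(row))
```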

@alwaysastudent

Right now, I am not even able to copy the metrics from the GUI. Is there a way to report the summary to a file?

@nakabonne
Owner

@alwaysastudent Actually, even such a simple summary export doesn't exist yet. I'm looking to support it in the near future, something like: #96 (comment)

@nakabonne
Owner

Currently, I'm in the middle of implementing the time-series storage layer; once that's finished, I'm going to support the export functionality, so just a moment.

@wangzhankun

Has this feature been supported yet?

@ocervell

3 years later, is there any progress? I feel like the tool can only be prod-ready with this feature added, no?
