Retention fix distributed #770

Commits on Jan 15, 2024

  1. fix for parseablehq#573 corrected error message in case of deployment mismatch
    
    data directory creation should not happen in case of deployment mismatch
    staging should be overwritten in case of new staging
    nikhilsinhaparseable committed Jan 15, 2024
    e70e144
  2. fix for parseablehq#573 corrected error message in case of deployment mismatch
    
    data directory creation should not happen in case of deployment mismatch
    staging should be overwritten in case of new staging
    default staging and data directory should not be created if env var has different path
    nikhilsinhaparseable committed Jan 15, 2024
    2e77714
  3. f8780ee
  4. 6b21462

Commits on Jan 16, 2024

  1. 505bf01

Commits on Apr 19, 2024

  1. temp: allow configurable buffer size while ingestion (parseablehq#624)

    Co-authored-by: Nitish Tiwari <[email protected]>
    2 people authored and nikhilsinhaparseable committed Apr 19, 2024
    e575a41
  2. c6e71fd
  3. Bump h2 from 0.3.17 to 0.3.24 (parseablehq#630)

    Bumps [h2](https://github.com/hyperium/h2) from 0.3.17 to 0.3.24.
    - [Release notes](https://github.com/hyperium/h2/releases)
    - [Changelog](https://github.com/hyperium/h2/blob/v0.3.24/CHANGELOG.md)
    - [Commits](hyperium/h2@v0.3.17...v0.3.24)
    
    ---
    updated-dependencies:
    - dependency-name: h2
      dependency-type: indirect
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored and nikhilsinhaparseable committed Apr 19, 2024
    e4fa127
  4. 9b6e179
  5. fix: clean up storage validation logic further (parseablehq#633)

    Standardise the error message and also use the new logg.ing
    domain for short URLs.
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    007fd88
  6. feat: separate out ingest and query functionality of the server (parseablehq#634)
    
    Add P_MODE with options `ingest`, `query` and `all`. The default mode is `all`.
    More changes are still required for both modes to work well. These will be
    added in subsequent PRs.
    
    Fixes parseablehq#617
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    96ef250
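
The mode selection described in this commit can be sketched as follows. This is a minimal Python illustration, not the server's Rust code. Only the `P_MODE` variable, its three options and the `all` default come from the commit message; the function name `resolve_mode`, the case-insensitive matching and the error on unknown values are assumptions.

```python
import os

VALID_MODES = ("ingest", "query", "all")


def resolve_mode(env=None):
    """Return the server mode read from P_MODE, defaulting to 'all'.

    `env` is any mapping; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    # Assumption: surrounding whitespace and letter case are ignored.
    mode = env.get("P_MODE", "all").strip().lower()
    if mode not in VALID_MODES:
        # Assumption: an unknown value is rejected rather than ignored.
        raise ValueError(f"invalid P_MODE {mode!r}; expected one of {VALID_MODES}")
    return mode
```

Passing a plain dict instead of the real environment keeps the function easy to try out, for example `resolve_mode({"P_MODE": "query"})`.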
  7. feat: allow configurable duration for data push to S3 (parseablehq#626)

    Create a parquet file by grouping all arrow files (in staging) for the duration
    provided in the env variable P_STORAGE_UPLOAD_INTERVAL. Also check that the
    arrow files vector is not empty, then sort the arrow files and create the key
    for the parquet file from the last file in the sorted arrow files vector.
    
    Fixes parseablehq#616
    nikhilsinhaparseable committed Apr 19, 2024
    9eb9d3f
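
The key derivation described in this commit (empty check, sort, take the last file) can be sketched in a few lines. This is a Python illustration under assumptions: the function name `parquet_key_for` and the `.arrows`/`.parquet` file names are hypothetical, and grouping by upload interval is left out.

```python
from pathlib import PurePath


def parquet_key_for(arrow_files):
    """Derive a parquet file name from a group of arrow file names.

    Returns None for an empty group. Otherwise the names are sorted and
    the last one supplies the key, with its final suffix swapped for
    '.parquet'.
    """
    if not arrow_files:
        return None
    last = sorted(arrow_files)[-1]
    return str(PurePath(last).with_suffix(".parquet"))
```

Sorting works here because the hypothetical names order lexicographically; real file names would need a naming scheme with the same property.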
  8. fix: allow only GET /logstream in query mode (parseablehq#643)

    Previously, in query mode, all log stream endpoints were allowed.
    But it is better that only the ingester is allowed to create streams.
    
    Fixes parseablehq#641
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    fe860bd
  9. Update CNAME

    Signed-off-by: Nitish Tiwari <[email protected]>
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    3e017bd
  10. 14d1da2
  11. 1ef78ea
  12. fix: init_scheduler (parseablehq#649)

    Earlier, a separate scheduler was initialized for each
    stream at load time or whenever a retention period was set.
    Now, a single scheduler is initialized, which checks the retention
    config of all the streams and performs the retention cleanup.
    
    Fixes parseablehq#636
    gurjotkaur20 authored and nikhilsinhaparseable committed Apr 19, 2024
    6604151
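
The idea of one scheduler pass that covers every stream can be illustrated as below. This is a hedged Python sketch, not the project's implementation: the function name `retention_cleanup`, the `retention_days` and `dates` keys, and the day-level granularity are all assumptions made for the example.

```python
from datetime import date, timedelta


def retention_cleanup(streams, today):
    """Run one retention pass over all streams.

    `streams` maps a stream name to a dict with an optional
    'retention_days' value and a list of 'dates' that hold data.
    Returns {stream name: sorted list of dates to delete}.
    """
    to_delete = {}
    for name, info in streams.items():
        days = info.get("retention_days")
        if days is None:
            # No retention config for this stream, nothing to clean up.
            continue
        cutoff = today - timedelta(days=days)
        expired = sorted(d for d in info["dates"] if d < cutoff)
        if expired:
            to_delete[name] = expired
    return to_delete
```

A single periodic job would call this once per tick, instead of registering one job per stream.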
  13. fix: removed use of environment var P_STORAGE_UPLOAD_INTERVAL (parseablehq#653)
    
    added const of 60 secs to be used for local to storage sync
    
    fixes parseablehq#651
    nikhilsinhaparseable committed Apr 19, 2024
    f873fd1
  14. e742a2e
  15. b806697
  16. 9b192a1
  17. 8f180a8
  18. 0e7fe60
  19. prepare for release v0.8.0 (parseablehq#663)

    includes console release v0.4.0
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    31f6efc
  20. 7c060c6
  21. delete parseable-license.html (parseablehq#665)

    It is better that users generate it themselves as needed.
    
    Signed-off-by: Nitish Tiwari <[email protected]>
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    c8bbaa6
  22. Revert "fix: avoid removal of retention configuration while updating snapshot" (parseablehq#666)
    
    Reverts parseablehq#661 because with this change we're backward incompatible with older 
    versions where .stream.json doesn't contain retention field.
    
    This reverts commit 121bf01.
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    898773b
  23. 5ca0ab5
  24. do not lock PRs after merge (parseablehq#674)

    Signed-off-by: Nitish Tiwari <[email protected]>
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    f522d95
  25. fix: add stream creation time in get stats api (parseablehq#632)

    Changes done in the PR -
    1. adds the first_event_at property (from the min value of p_timestamp of the first parquet file listed in the first manifest file from the snapshot of the stream.json) to the stats api and writes it to the stream.json file when get stats is requested.
    2. updates the first_event_at in case of retention
    
    Fixes : parseablehq#587
    gurjotkaur20 authored and nikhilsinhaparseable committed Apr 19, 2024
    ba865fe
  26. fix: avoid removal of retention configuration while updating snapshot (parseablehq#673)
    
    current: when server restarts, the retention config gets deleted from stream.json
    change: retention config does not get deleted on server restart
    Fixes: parseablehq#654
    gurjotkaur20 authored and nikhilsinhaparseable committed Apr 19, 2024
    e20034a
  27. 3c590a5
  28. 93c6636
  29. 855a5ce
  30. 891eba6
  31. Bump mio from 0.8.8 to 0.8.11 (parseablehq#684)

    Bumps [mio](https://github.com/tokio-rs/mio) from 0.8.8 to 0.8.11.
    - [Release notes](https://github.com/tokio-rs/mio/releases)
    - [Changelog](https://github.com/tokio-rs/mio/blob/master/CHANGELOG.md)
    - [Commits](tokio-rs/mio@v0.8.8...v0.8.11)
    
    ---
    updated-dependencies:
    - dependency-name: mio
      dependency-type: indirect
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored and nikhilsinhaparseable committed Apr 19, 2024
    8e6031d
  32. feat: allow historical data ingestion based on user defined time (parseablehq#683)
    
    This PR adds an enhancement to use a user-provided timestamp for partitioning
    ingested logs instead of using server time.
    
    The user needs to add the custom header X-P-Time-Partition (optional) in the
    stream creation api to allow ingestion/query using a timestamp column from the
    log data instead of the server time p_timestamp.
    
    * The time_partition field name is stored in stream.json and in the in-memory
    STREAM_INFO.
    * In the ingest api, the server checks if the timestamp column name exists in
    the log event; if not, it throws an exception. It also checks if the timestamp
    value can be parsed into a datetime; if not, it throws an exception.
    * The arrow file name gets the date, hr, mm from the timestamp field (if
    defined in the stream); else the file name gets the date, hr, mm from the
    server time.
    * The parquet file name gets a random number attached to it. This is because a
    lot of log data can have the same date, hr, mm value of the timestamp field,
    and with this random number the parquet file will not get overwritten.
    * In the console, the query from and to dates are matched against the value of
    the timestamp column of the log data (if defined in the stream); else they are
    matched against the p_timestamp column.
    
    Fixes parseablehq#671 
    Fixes parseablehq#685
    nikhilsinhaparseable committed Apr 19, 2024
    56011c7
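
The ingestion checks and file naming described in this commit can be sketched as follows. This is a Python illustration under stated assumptions, not the server's code: the function names, the `date=...hour=...minute=...` prefix layout, ISO 8601 input and the six-digit random suffix are all hypothetical choices made for the example.

```python
import random
from datetime import datetime, timezone


def partition_prefix(event, time_partition=None, now=None):
    """Build a date/hour/minute prefix for a staged file.

    With a time partition column configured, the column must exist in the
    event (KeyError otherwise) and must parse as a datetime (ValueError
    otherwise). Without one, server time is used.
    """
    if time_partition is None:
        ts = now if now is not None else datetime.now(timezone.utc)
    elif time_partition not in event:
        raise KeyError(f"time partition column {time_partition!r} missing from event")
    else:
        # Assumption: the column holds an ISO 8601 timestamp.
        ts = datetime.fromisoformat(str(event[time_partition]))
    return ts.strftime("date=%Y-%m-%d.hour=%H.minute=%M")


def parquet_name(prefix, rng=random):
    """Append a random number so files sharing a prefix do not collide."""
    return f"{prefix}.{rng.randrange(10**6):06d}.parquet"
```

A random suffix only makes overwrites unlikely, not impossible; a counter or a hash would be needed for a hard guarantee.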
  33. Remove incorrect release v0.9 (parseablehq#693)

    Fixes parseablehq#690 
    
    Signed-off-by: Nitish Tiwari <[email protected]>
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    61d6960
  34. fix: Query Optimization (parseablehq#696)

    modified function that checks if query has starttime before the 1st manifest lower bound time
    to ensure server uses manifest for count(*) where starttime is greater than manifest creation date
    
    fix for ingestion with time partition
    fix for query with time partition
    fix for query optimization
    fixed get_first_event call for time partition
    random number generation logic changed while creating parquet file name
    nikhilsinhaparseable committed Apr 19, 2024
    c7dbef1
  35. fix: removed time partition logic from ingestion flow (parseablehq#703)

    * removed the X-P-Time-Partition header in log stream creation API
    * removed the logic that partitions the ingested log based on the
    X-P-Time-Partition header, the value of which was stored in `stream.json`
    * Query still uses the logic, to keep queries compatible with the external
    tool that ingests based on time partition
    nikhilsinhaparseable committed Apr 19, 2024
    753925c
  36. feat: allow historical ingestion with custom date column in log (parseablehq#716)
    
    This PR allows historical ingestion only when the date column provided in
    the x-p-time-partition header and the server time are within the same minute;
    there is no change for default ingestion.
    nikhilsinhaparseable committed Apr 19, 2024
    f3130c0
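
One way to read the "within the same minute" condition is that both timestamps fall in the same calendar minute once normalised to UTC. The Python sketch below implements that reading; the interpretation itself, the function name `within_same_minute` and the example values are assumptions, and the server may compare differently (for instance, within 60 seconds).

```python
from datetime import datetime, timedelta, timezone

# Example offset used only for illustration (+05:30).
IST = timezone(timedelta(hours=5, minutes=30))


def within_same_minute(event_time, server_time):
    """Return True if both timezone-aware datetimes share a UTC minute."""
    if event_time.tzinfo is None or server_time.tzinfo is None:
        raise ValueError("both datetimes must be timezone-aware")
    a = event_time.astimezone(timezone.utc).replace(second=0, microsecond=0)
    b = server_time.astimezone(timezone.utc).replace(second=0, microsecond=0)
    return a == b


# 15:59:08 at +05:30 is 10:29:08 UTC.
EXAMPLE_EVENT_TIME = datetime(2024, 3, 27, 15, 59, 8, tzinfo=IST)
```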
  37. feat: add a new stream info API (parseablehq#720)

    Add API - GET /logstream/{logstream}/info
    Response -
    {
        "created-at": "2024-03-27T15:58:28.418792+05:30",
        "first-event-at": "2024-03-27T15:59:08.980+05:30",
        "cache_enabled": false,
        "time_partition": "source_time"
    }
    
    Also removed logic of first-event-at and created-at 
    from stats API response
    nikhilsinhaparseable committed Apr 19, 2024
    6443db3
  38. feat: add feature to provide static schema in logstream creation

    set header x-p-static-schema-flag = true and provide the schema in the body in the format below -
    {
       "fields":[
        {
            "name": "name of the field",
            "data_type": "data type of the field, out of these (int, double, float, boolean, string, datetime, string_list, int_list, double_list, float_list, boolean_list)"
        }
       ]
    }
    once provided, the schema is persisted in the storage and in metadata
    if a static schema is provided, the ingest api verifies that the event log schema matches the static schema provided at stream creation
    if the schema does not match, ingestion is rejected
    nikhilsinhaparseable committed Apr 19, 2024
    ba4ba1c
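
The schema check at ingestion can be illustrated with a small Python sketch. It compares field names only and requires an exact match; both of those rules, and the function name `schema_mismatch`, are assumptions for the example, since the commit message does not say how strictly the server compares.

```python
def schema_mismatch(static_schema, event):
    """Return the sorted field names on which schema and event disagree.

    An empty list means the event's field names match the static schema
    exactly. Data types are not checked in this sketch.
    """
    expected = {field["name"] for field in static_schema["fields"]}
    return sorted(expected.symmetric_difference(event))


def ingest(static_schema, event):
    """Reject the event if its field names do not match the schema."""
    mismatch = schema_mismatch(static_schema, event)
    if mismatch:
        raise ValueError(f"event does not match static schema: {mismatch}")
    return event
```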
  39. df1ef5e
  40. allow type casting in static schema (parseablehq#729)

    if field is defined as float,
    below values (examples) can be accepted -
    100,-100.45, 100.23, "200.45"
    nikhilsinhaparseable committed Apr 19, 2024
    b196039
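
The casting behaviour for a `float` field can be sketched as below, using the example values from the commit message. This is a Python illustration; the function name `cast_float` and the decision to reject booleans and non-numeric strings are assumptions.

```python
def cast_float(value):
    """Cast an incoming JSON value to float for a field declared as float.

    Accepts integers, floats and numeric strings. Raises ValueError for a
    non-numeric string and TypeError for any other type.
    """
    if isinstance(value, bool):
        # Assumption: True/False are not treated as 1.0/0.0.
        raise TypeError("boolean cannot be cast to float")
    if isinstance(value, (int, float, str)):
        return float(value)
    raise TypeError(f"cannot cast {type(value).__name__} to float")
```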
  41. ae08d56
  42. fix: delete node endpoint (parseablehq#731)

    delete ingester if offline
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    2dea510
  43. fix: stats response (parseablehq#733)

    * fix: stats response
    
    * fix: s3 get objects
    
    Refactor object storage to filter objects by starts_with_pattern
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    a686d81
  44. Add node_url field to Cli struct and update related code (parseablehq#734)
    
    * Add node_url field to Cli struct and update related code
    
    * Update required flag for Node URL in CLI
    
    * updated logic to have server address (ip:port) in parquet file name similar to other json files
    
    ---------
    
    Co-authored-by: Nikhil Sinha <[email protected]>
    Eshanatnight and nikhilsinhaparseable committed Apr 19, 2024
    97ff9bf
  45. f828cbb
  46. Rm staging query result (parseablehq#740)

    * remove staging query from the query result (for distributed)
    
    * Refactor get_schema method to handle missing schema in object storage
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    8c7bb86
  47. chore(deps): bump h2 from 0.3.24 to 0.3.26 (parseablehq#741)

    Bumps [h2](https://github.com/hyperium/h2) from 0.3.24 to 0.3.26.
    - [Release notes](https://github.com/hyperium/h2/releases)
    - [Changelog](https://github.com/hyperium/h2/blob/v0.3.26/CHANGELOG.md)
    - [Commits](hyperium/h2@v0.3.24...v0.3.26)
    
    ---
    updated-dependencies:
    - dependency-name: h2
      dependency-type: indirect
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored and nikhilsinhaparseable committed Apr 19, 2024
    4c0b6ed
  48. fix analytics for the cluster (parseablehq#732)

    * fix analytics for the cluster
    * Add active_ingesters and inactive_ingesters metrics
    * updated ingesters' count and event related metrics
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    4aac841
  49. Fix: OAuth User to Role Mapping Fix (parseablehq#742)

    This PR adds fixes for 
    
    1. Default role not assigned to the OAuth user if group does not exist
    2. User name used instead of id
    
    fixes parseablehq#638
    fixes parseablehq#868
    nikhilsinhaparseable committed Apr 19, 2024
    87e9a08
  50. 14b12f1
  51. Update README.md

    Signed-off-by: Nitish Tiwari <[email protected]>
    nitisht authored and nikhilsinhaparseable committed Apr 19, 2024
    30c90b5
  52. df03bc7
  53. fix for staging size metrics not resetting to 0 (parseablehq#743)

    fix for staging size metrics not resetting to 0 even when 
    local to storage sync is completed and no arrow/parquet 
    file is left in staging folder
    nikhilsinhaparseable committed Apr 19, 2024
    acbc92d
  54. Fetch schema and stream from obj store (parseablehq#745)

    * Refactor object storage to use filter_func instead of 
    starts_with_pattern in get_objects method
    * Refactor fetch_schema method to use object storage instead of HTTP requests
    * Refactor metadata.rs and storage.rs
    * refactor ingest logic
    * fetch stream info from store if stream info is not present in memory.
    error if stream info does not exist in S3 and memory
    Eshanatnight authored and nikhilsinhaparseable committed Apr 19, 2024
    b3f6841
  55. 8ff0e45
  56. bc58ca3
  57. 4fe0249
  58. multiple fixes on server (parseablehq#753)

    1. fixed banner spacing
    2. modified server mode: All to Standalone, Ingest to Distributed (Ingest), Query to Distributed (Query)
    3. updated server mode in about API response
    4. updated logic for env var P_INGESTOR_URL to use HOSTNAME and PORT from env
    5. remove put cache api from querier
    6. added put cache api to ingestor
    7. renamed ingester to ingestor
    8. corrected cache flow for ingestors and standalone
    9. removed query, other logstream apis for ingestors
    nikhilsinhaparseable committed Apr 19, 2024
    e11f9fb
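
The renaming in item 2 of this commit amounts to a lookup from the configured mode to a display label. A minimal Python sketch, where the labels come from the commit message and the names `MODE_LABELS` and `mode_label` are hypothetical:

```python
# Display labels as listed in the commit message.
MODE_LABELS = {
    "all": "Standalone",
    "ingest": "Distributed (Ingest)",
    "query": "Distributed (Query)",
}


def mode_label(mode):
    """Return the display label for a configured server mode."""
    return MODE_LABELS[mode.lower()]
```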
  59. fix for GET /cache API (parseablehq#754)

    also fixed P_INGESTOR_URL fetch from env variables
    nikhilsinhaparseable committed Apr 19, 2024
    fa3174a
  60. 09606b9
  61. 6fd9c36
  62. b9057c5
  63. fix: add delete stream API in distributed mode (parseablehq#766)

    with this PR, when delete stream is called, the querier
    deletes the stream folder from the storage, then calls the
    delete stream API for each ingestor. Finally, each ingestor
    deletes the stream from its local map
    nikhilsinhaparseable committed Apr 19, 2024
    86620bd
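
The delete flow described in this commit can be modelled with in-memory stand-ins. This is a Python sketch only: a dict plays the role of object storage, a direct method call plays the role of the HTTP request to each ingestor, and the names `Ingestor` and `querier_delete_stream` are hypothetical.

```python
class Ingestor:
    """Stand-in for an ingestor node holding a local map of streams."""

    def __init__(self, streams):
        self.streams = set(streams)

    def delete_stream(self, name):
        # Step 3: the ingestor drops the stream from its local map.
        self.streams.discard(name)


def querier_delete_stream(name, storage, ingestors):
    """Delete a stream from storage, then notify every ingestor."""
    # Step 1: remove everything under the stream's folder in storage.
    prefix = f"{name}/"
    for key in [k for k in storage if k.startswith(prefix)]:
        del storage[key]
    # Step 2: call the delete stream API on each ingestor.
    for ingestor in ingestors:
        ingestor.delete_stream(name)
```

The sketch ignores partial failure; a real fan-out would need to decide what happens when one ingestor is unreachable.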
  64. fix: update files in distributed mode to use hash (parseablehq#761)

    This PR ensures all metadata and data files (json and parquet) use
    a simple sha256 based hash name mechanism. Each ingestor
    allocates itself a unique hash which is used in all file names
    relevant to that ingestor. This hash is also persisted in the metadata
    file content and is supposed to stay the same for the lifecycle
    of the ingestor.
    ---------
    
    Co-authored-by: Nikhil Sinha <[email protected]>
    Eshanatnight and nikhilsinhaparseable committed Apr 19, 2024
    03b92ae
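
A sha256-based naming scheme of the kind this commit describes might look like the Python sketch below. The commit message does not say what the hash is computed from, how long it is, or what the file names look like, so the seed (a node address), the 15-character truncation and the `ingestor.<hash>.<suffix>` pattern are all assumptions.

```python
import hashlib


def ingestor_hash(seed, length=15):
    """Derive a stable identifier from a seed string.

    Assumption: the seed is something fixed for the node, for example
    its address, so the identifier survives restarts.
    """
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:length]


def metadata_file_name(ingestor_id, suffix="stream.json"):
    """Build a per-ingestor file name (hypothetical pattern)."""
    return f"ingestor.{ingestor_id}.{suffix}"
```

Because the identifier is a pure function of the seed, recomputing it after a restart yields the same file names, which matches the requirement that the hash stay the same for the lifecycle of the ingestor.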
  65. fix: cache fix ingestors (parseablehq#767)

    Caching for distributed mode can be
    enabled from the querier UI. The querier calls the PUT
    /cache API on all ingestors. An ingestor first checks
    if the stream exists; if it is not found in the local map, it checks
    in S3 and creates the stream. It then checks if the caching
    env vars are set. If yes, it adds the cache_enabled flag to
    STREAM_INFO and updates its stream.json in S3.
    
    Fixes: parseablehq#764
    nikhilsinhaparseable committed Apr 19, 2024
    363d318
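
The ingestor-side steps in this commit can be walked through with a small Python sketch. Dicts stand in for the in-memory `STREAM_INFO` map and for S3; the function name `enable_cache`, the `<stream>/stream.json` key layout and the boolean `cache_env_set` argument are assumptions made for the example.

```python
def enable_cache(stream, stream_info, store, cache_env_set):
    """Handle a PUT /cache request on an ingestor (illustrative only).

    Returns True if caching was enabled, False if the caching
    environment variables are not set. Raises LookupError if the stream
    exists neither locally nor in the store.
    """
    key = f"{stream}/stream.json"
    if stream not in stream_info:
        # Not in the local map: fall back to the store and create it.
        if key not in store:
            raise LookupError(f"stream {stream!r} not found")
        stream_info[stream] = dict(store[key])
    if not cache_env_set:
        return False
    # Flag the stream locally, then persist the updated stream.json.
    stream_info[stream]["cache_enabled"] = True
    store[key] = dict(stream_info[stream])
    return True
```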
  66. 918c3b0
  67. 96ef239
  68. fix for retention for distributed deployment

    fix for stream info api
    nikhilsinhaparseable committed Apr 19, 2024
    052c6de

Commits on Apr 20, 2024

  1. d0d0ac8