"froster index [folders...]" command usage #28

macheitor · 2024-04-16T14:27:36Z

macheitor
Apr 16, 2024
Collaborator

The froster index [folders...] command extracts metadata from the given folders using the pwalk crawler.
It also has a couple of options:

--pwalk-copy PATH copies the pwalk-generated CSV to PATH.
--pwalk-csv FILE uses FILE as the CSV input for the indexer instead of running pwalk again.

I have several questions regarding this command:

What should be the standard workflow for a user? Something like this?
- Use froster index [folders...] to index folders and spot hotspots.
- Use visidata, or check the generated CSV file directly, to spot folders worth archiving.
- Use froster archive FOLDER to archive FOLDER in the AWS S3 bucket.
- Use froster delete or froster delete [FOLDERS...] to delete archived files/folders.
Why is the hotspots output copied again with the folder the user has write access to?
- Why does it matter?
- How does the user know which hotspot file to check?
- Can we avoid copying the file again and have a command to retrieve this info? Like: froster index --archivable-hotspots [FOLDERS...]?
--pwalk-copy FOLDER
- Why would a user use this file?
- Is this file used elsewhere? As an input for --pwalk-csv FILE maybe?
--pwalk-csv FILE
- Why use it? If FILE has already been generated, the hotspots file has been generated too.
- It seems to me that --pwalk-csv FILE works fine for a single user but struggles in a shared configuration, because the user's colleagues may not know whether a folder has been indexed or where the output is located.
- As a future feature: I assume that in a large HPC system, storing the raw output of the pwalk command is not feasible due to size constraints. If so, would it be possible to store the folder metadata currently used to locate hotspots in the data_dir or shared_data_dir? That way, when a user runs froster index [folders...] on a folder that has already been indexed, the command could automatically retrieve the stored information instead of running pwalk again. There could be an option to force the pwalk execution and refresh the folder's metadata.
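
To make the future-feature idea in the last point more concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not froster's actual implementation: the names cache_file and index_folder, the JSON cache layout, and the force parameter are all hypothetical, and crawl is only a stand-in for the expensive pwalk run.

```python
import hashlib
import json
from pathlib import Path


def cache_file(data_dir: Path, folder: Path) -> Path:
    """Map a folder to a stable cache file inside data_dir (hypothetical layout)."""
    digest = hashlib.sha256(str(folder.resolve()).encode()).hexdigest()[:16]
    return data_dir / f"index-{digest}.json"


def index_folder(folder: Path, data_dir: Path, crawl, force: bool = False):
    """Return (metadata, from_cache).

    `crawl` is a callable standing in for the pwalk run; it takes the folder
    and returns a JSON-serialisable summary. With force=True the cached copy
    is ignored and refreshed.
    """
    target = cache_file(data_dir, folder)
    if target.exists() and not force:
        # Folder was indexed before: reuse the stored metadata.
        return json.loads(target.read_text()), True

    metadata = crawl(folder)
    data_dir.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(metadata))
    return metadata, False
```

If data_dir pointed at the shared location, a colleague indexing the same folder would get the stored result without needing to know where an earlier pwalk CSV was saved, and passing force=True would play the role of the "refresh" option.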
