
Feature/rteco 814 implement build info collection for hugging face #368

Open
naveenku-jfrog wants to merge 6 commits into jfrog:main from naveenku-jfrog:feature/RTECO-814-Implement-Build-Info-collection-for-Hugging-Face

Conversation

naveenku-jfrog (Collaborator) commented Feb 15, 2026

  • All tests passed. If this feature is not already covered by the tests, I added new tests.
  • All static analysis checks passed.
  • Appropriate label is added to auto generate release notes.
  • I used gofmt for formatting the code before submitting the pull request.
  • PR description is clear and concise, and it includes the proposed solution/fix.

What: Added build info collection logic for the Hugging Face download and upload commands.
Depends on other PR: No.
Testing: Done, on an Artifactory instance.

naveenku-jfrog added the "new feature" (Automatically generated release notes) and "safe to test" (Approve running integration tests on a pull request) labels Feb 15, 2026
github-actions bot removed the "safe to test" (Approve running integration tests on a pull request) label Feb 15, 2026
installCmd.Stdout = os.Stdout
installCmd.Stderr = os.Stderr
if err := installCmd.Run(); err != nil {
// If --user fails, try with --break-system-packages as fallback
Collaborator

This flag carries risks: it can break OS packages that depend on Python. I hope we understand the consequences before moving forward with it.

Collaborator Author

Addressed it.
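
For context on the concern above, one way to avoid both `--user` and `--break-system-packages` is to install into a dedicated virtual environment. The sketch below only builds the two commands and does not run them. It is not necessarily what the PR ended up doing; `venvPython`, `venvInstallCommands`, and the `/tmp/hf-venv` directory are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// venvPython returns the interpreter path inside a virtual environment.
// The venv module places it under Scripts\ on Windows and bin/ elsewhere.
func venvPython(venvDir string) string {
	if runtime.GOOS == "windows" {
		return filepath.Join(venvDir, "Scripts", "python.exe")
	}
	return filepath.Join(venvDir, "bin", "python")
}

// venvInstallCommands builds (but does not run) the commands needed to
// install pkg in isolation: create the venv, then run pip with the venv's
// own interpreter so the system site-packages are never modified.
func venvInstallCommands(pythonPath, venvDir, pkg string) [][]string {
	return [][]string{
		{pythonPath, "-m", "venv", venvDir},
		{venvPython(venvDir), "-m", "pip", "install", "--quiet", pkg},
	}
}

func main() {
	// The venv location is an assumption for this example.
	for _, c := range venvInstallCommands("python3", "/tmp/hf-venv", "huggingface_hub") {
		fmt.Println(c)
	}
}
```

Each returned slice can be passed to `exec.Command(c[0], c[1:]...)`. The trade-off is that the CLI then has to remember where the environment lives and invoke its interpreter for later calls.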

Comment on lines 121 to 123
installCmd := exec.Command(pythonPath, "-m", "pip", "install", "huggingface_hub", "--user", "--quiet")
installCmd.Stdout = os.Stdout
installCmd.Stderr = os.Stderr
Collaborator

To access standard output or standard error easily, we can use the .Output method, which encapsulates this logic and handles errors properly.

Collaborator Author

Addressed it.
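
A minimal sketch of the `.Output` suggestion, using only the standard library. When `cmd.Stderr` is left unset, `Output` returns standard output and, on a non-zero exit, stores the child's standard error in the `*exec.ExitError`. The helper name `runCaptured` is hypothetical and is not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runCaptured runs the named program and returns its standard output.
// Because cmd.Stderr is left nil, Output stores the child's standard error
// in the returned *exec.ExitError when the program exits non-zero.
func runCaptured(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	if err == nil {
		return string(out), nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return string(out), fmt.Errorf("%s exited with an error: %w, stderr: %s", name, err, exitErr.Stderr)
	}
	// The program never ran (for example, it is not on PATH).
	return "", fmt.Errorf("could not run %s: %w", name, err)
}

func main() {
	// Assumes python3 is on PATH; any other command works the same way.
	out, err := runCaptured("python3", "-m", "pip", "--version")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Compared with wiring `Stdout` and `Stderr` to `os.Stdout` and `os.Stderr`, this keeps the child's output out of the user's terminal unless the caller decides to log it.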

installCmd.Stderr = os.Stderr
if err := installCmd.Run(); err != nil {
// If --user fails, try with --break-system-packages as fallback
log.Debug("User install failed, trying with --break-system-packages")
Collaborator

Collaborator Author

Addressed it.

func GetRepoKeyFromHFEndpoint() (string, error) {
endpoint := os.Getenv("HF_ENDPOINT")
if endpoint == "" {
return "", errorutils.CheckErrorf("HF_ENDPOINT environment variable is not set")
Collaborator

Is there any other fallback mechanism we can use apart from the environment variable?

Collaborator Author

No, this variable comes from Hugging Face itself.
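
To illustrate what deriving the repo key from `HF_ENDPOINT` could involve, here is a hedged sketch. It assumes the endpoint has the Artifactory form `https://<host>/artifactory/api/huggingfaceml/<repo-key>`; the PR's actual parsing in `GetRepoKeyFromHFEndpoint` is not shown in this thread, and `repoKeyFromEndpoint` is a hypothetical stand-in.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
	"strings"
)

// repoKeyFromEndpoint extracts the repository key from an endpoint of the
// ASSUMED form https://<host>/artifactory/api/huggingfaceml/<repo-key>.
func repoKeyFromEndpoint(endpoint string) (string, error) {
	if endpoint == "" {
		return "", errors.New("HF_ENDPOINT environment variable is not set")
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("invalid HF_ENDPOINT %q: %w", endpoint, err)
	}
	const marker = "api/huggingfaceml/"
	path := strings.Trim(u.Path, "/")
	idx := strings.Index(path, marker)
	if idx < 0 {
		return "", fmt.Errorf("HF_ENDPOINT %q does not contain %q", endpoint, marker)
	}
	// The repo key is the first path segment after the marker.
	key := strings.SplitN(path[idx+len(marker):], "/", 2)[0]
	if key == "" {
		return "", fmt.Errorf("HF_ENDPOINT %q has an empty repository key", endpoint)
	}
	return key, nil
}

func main() {
	key, err := repoKeyFromEndpoint(os.Getenv("HF_ENDPOINT"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("repo key:", key)
}
```

Taking the endpoint as a parameter rather than reading the environment inside the function keeps the parsing testable without touching process state.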

// Fall back to Python module mode
pythonPath, pythonErr := GetPythonPath()
if pythonErr != nil {
return "", nil, errorutils.CheckErrorf("neither huggingface-cli nor hf found in PATH, and Python is not available: %w", pythonErr)
Collaborator

Can we improve the log? We can start by reporting that huggingface-cli and hf were not found and that we are searching for Python; if that also fails, we can log that the Python executable was not found.

Collaborator Author

Addressed it.
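
The staged logging requested above might look roughly like the sketch below. It uses the standard `log` package rather than the project's logger, and `findOnPath` and `resolveHFRunner` are hypothetical names; the PR's real function signature differs (it returns three values including a slice).

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// findOnPath adapts exec.LookPath to a (path, found) pair so the resolver
// can be exercised with a fake lookup.
func findOnPath(name string) (string, bool) {
	p, err := exec.LookPath(name)
	return p, err == nil
}

// resolveHFRunner looks for the CLI binaries first, then falls back to a
// Python interpreter, logging each stage. viaPython reports whether the
// caller must run the CLI as a Python module.
func resolveHFRunner(lookup func(string) (string, bool)) (path string, viaPython bool, err error) {
	for _, name := range []string{"huggingface-cli", "hf"} {
		if p, ok := lookup(name); ok {
			return p, false, nil
		}
	}
	log.Println("huggingface-cli and hf not found in PATH, searching for Python")
	for _, name := range []string{"python3", "python"} {
		if p, ok := lookup(name); ok {
			log.Printf("falling back to Python module mode using %s", p)
			return p, true, nil
		}
	}
	log.Println("Python executable not found")
	return "", false, fmt.Errorf("neither huggingface-cli nor hf found in PATH, and Python is not available")
}

func main() {
	path, viaPython, err := resolveHFRunner(findOnPath)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path, viaPython)
}
```

Logging before each fallback step means a user reading the debug output can tell which lookup failed, instead of seeing a single combined error at the end.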

Comment on lines 70 to 72
cmd := exec.Command(hfCliPath, cmdArgs...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
Collaborator

Can we remove the standard output and standard error assignments?

Collaborator Author

Addressed it.

Comment on lines 87 to 114
if hfd.buildConfiguration == nil {
return nil
}
isCollectBuildInfo, err := hfd.buildConfiguration.IsCollectBuildInfo()
if err != nil {
return errorutils.CheckError(err)
}
args := map[string]interface{}{
"repo_id": hfd.repoId,
"revision": hfd.revision,
"etag_timeout": hfd.etagTimeout,
if !isCollectBuildInfo {
return nil
}
if hfd.repoType != "" {
args["repo_type"] = hfd.repoType
} else {
args["repo_type"] = "model"
log.Info("Collecting build info for executed huggingface ", hfd.name, "command")
buildName, err := hfd.buildConfiguration.GetBuildName()
if err != nil {
return errorutils.CheckError(err)
}
argsJSON, err := json.Marshal(args)
buildNumber, err := hfd.buildConfiguration.GetBuildNumber()
if err != nil {
return errorutils.CheckErrorf("failed to marshal arguments to JSON: %w", err)
}
pythonCmd := BuildPythonDownloadCmd(string(argsJSON))
log.Debug("Executing Python function to download ", args["repo_type"], ": ", hfd.repoId)
cmd := exec.Command(pythonPath, "-c", pythonCmd)
cmd.Dir = scriptDir
output, err := cmd.CombinedOutput()
if len(output) == 0 {
if err != nil {
return errorutils.CheckErrorf("Python script produced no output and exited with error: %w", err)
}
return errorutils.CheckErrorf("Python script produced no output. The script may not be executing correctly.")
return errorutils.CheckError(err)
}
var result Response
if jsonErr := json.Unmarshal(output, &result); jsonErr != nil {
if err != nil {
return errorutils.CheckErrorf("failed to execute Python script: %w, output: %s", err, string(output))
}
return errorutils.CheckErrorf("failed to parse Python script output: %w, output: %s", jsonErr, string(output))
project := hfd.buildConfiguration.GetProject()
buildInfoService := buildUtils.CreateBuildInfoService()
build, err := buildInfoService.GetOrCreateBuildWithProject(buildName, buildNumber, project)
if err != nil {
return fmt.Errorf("failed to create build info: %w", err)
}
if !result.Success {
return errorutils.CheckErrorf("%s", result.Error)
buildInfo, err := build.ToBuildInfo()
if err != nil {
return fmt.Errorf("failed to build info: %w", err)
Collaborator

This seems to repeat code from huggingface_upload.py.

Collaborator Author

Addressed it.

revisionPattern = hfd.revision + "_*"
multipleDirsInSearchResults = true
}
aqlQuery := fmt.Sprintf(`{"repo": "%s", "path": {"$match": "%s/%s/%s/*"}}`,
Collaborator

Are these files not available locally?

Aql: servicesUtils.Aql{ItemsFind: aqlQuery},
},
}
reader, err := serviceManager.SearchFiles(searchParams)
Collaborator

Sha256: resultItem.Sha256,
},
})
if latestCreatedDir != resultItem.Path {
Collaborator

What does this condition signify?

Collaborator Author

A Hugging Face repo can contain multiple folders with the same name that differ only by timestamp. We need to pick the latest one, so this condition breaks out of the loop at the first item from the second folder.
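
The selection logic described in this reply can be sketched as follows. It assumes the search results arrive sorted so that items from the newest timestamped folder come first (the sort order is not shown in this thread), and `item` and `latestDirItems` are hypothetical stand-ins for the PR's result type and loop.

```go
package main

import "fmt"

// item is a hypothetical, trimmed-down stand-in for one search result.
type item struct {
	Path string
	Name string
}

// latestDirItems ASSUMES results are ordered newest folder first. It keeps
// every item that shares the first item's path and stops at the first item
// that belongs to a different (older) folder.
func latestDirItems(results []item) []item {
	var kept []item
	latestDir := ""
	for i, r := range results {
		if i == 0 {
			latestDir = r.Path
		}
		if r.Path != latestDir {
			break // first item of the second folder: everything after is older
		}
		kept = append(kept, r)
	}
	return kept
}

func main() {
	results := []item{
		{Path: "models/org/m/main_20260215", Name: "config.json"},
		{Path: "models/org/m/main_20260215", Name: "model.safetensors"},
		{Path: "models/org/m/main_20260101", Name: "config.json"},
	}
	for _, it := range latestDirItems(results) {
		fmt.Println(it.Path + "/" + it.Name)
	}
}
```

If the results were not sorted by folder, the early `break` would drop valid items, so the ordering assumption is worth stating next to the condition in the real code.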

@github-actions
Contributor

👍 Frogbot scanned this pull request and did not find any new security issues.

