Improve Safetensors packaging #607

ilopezluna · 2026-01-28T15:03:24Z

This pull request introduces a new directory-based model packaging builder and updates the unpacking logic to support a more flexible layer-per-file model format (V0.2). The changes significantly improve support for complex model directory structures.

Directory-based model packaging and exclusion logic:

- Added `FromDirectory` and related options in `pkg/distribution/builder/from_directory.go`, enabling recursive packaging of model directories with full structure preservation and customizable exclusion patterns (by name, glob, or path).
- Implemented the `shouldExclude` helper function for flexible exclusion matching, supporting directory names, file names, globs, and specific paths.
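
To make the exclusion matching described above concrete, here is a minimal sketch of how a helper of this kind could behave. It is not the PR's implementation: the signature and the matching rules (a pattern containing `/` is treated as a specific path, anything else is compared against each path component literally or as a glob) are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// shouldExclude reports whether relPath (slash-separated, relative to the
// model root) matches any exclusion pattern.
// Hypothetical rules for this sketch, not taken from the PR:
//   - a pattern containing "/" is a specific path; it matches that path and
//     everything beneath it;
//   - any other pattern is compared against every path component, first
//     literally and then as a glob (path.Match syntax).
func shouldExclude(relPath string, patterns []string) bool {
	relPath = path.Clean(relPath)
	for _, p := range patterns {
		p = strings.TrimSuffix(p, "/")
		if p == "" {
			continue
		}
		if strings.Contains(p, "/") {
			p = path.Clean(p)
			if relPath == p || strings.HasPrefix(relPath, p+"/") {
				return true
			}
			continue
		}
		for _, part := range strings.Split(relPath, "/") {
			if part == p {
				return true
			}
			// A malformed glob is ignored rather than treated as a match.
			if ok, err := path.Match(p, part); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns := []string{".git", "*.log", "checkpoints/tmp"}
	for _, f := range []string{
		"model.safetensors",
		".git/config",
		"logs/train.log",
		"checkpoints/tmp/step1.bin",
		"checkpoints/final.bin",
	} {
		fmt.Printf("%-28s excluded=%v\n", f, shouldExclude(f, patterns))
	}
}
```

Matching by path component means a name such as `.git` excludes the directory and everything inside it without requiring a separate recursive rule.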

Testing and validation:

Added extensive tests in pkg/distribution/builder/from_directory_test.go to verify directory packaging and exclusion logic, covering various exclusion scenarios and edge cases.

Unpacking logic improvements:

- Updated `pkg/distribution/internal/bundle/unpack.go` to auto-detect and unpack models using the new V0.2 layer-per-file format, falling back to legacy logic for older formats. Added `isV02Model` detection and refactored runtime config unpacking for interface compatibility.
- Added support for unpacking generic file layers, ensuring all files are extracted correctly for V0.2 models.

Documentation:

Added pkg/distribution/MODEL_TYPES.md to document model types, interfaces, helper functions, concrete types, supported formats, media types, and design rationale for multiple interfaces.

gemini-code-assist · 2026-01-28T15:03:50Z

Summary of Changes

This pull request significantly enhances the model packaging and unpacking capabilities by introducing a new directory-based builder. This allows for a more faithful representation of complex model structures, such as those found in Hugging Face, by preserving nested file paths through OCI layer annotations. The changes also include a robust mechanism for excluding unwanted files during packaging and a smarter unpacking process that automatically adapts to different model formats. Furthermore, critical security improvements have been made to prevent directory traversal vulnerabilities during file extraction.

Highlights

- New `FromDirectory` Builder: Introduced a new builder function that allows packaging AI models directly from a directory, preserving the original nested file structure.
- Exclusion Patterns: Added support for excluding specific files or directories during the packaging process using `WithExclusions` options.
- V0.2 Packaging Format: Implemented a new model configuration media type (`MediaTypeModelConfigV02`) that uses layer-per-file packaging with filepath annotations to maintain directory hierarchy.
- Auto-detection and Unpacking: The `Unpack` function now distinguishes between the legacy (V0.1) and the new V0.2 packaging formats, using `UnpackFromLayers` for the latter to restore the directory structure.
- Enhanced Path Validation: Strengthened the `validatePathWithinDirectory` function with additional security checks to prevent directory traversal attacks during unpacking.
- Renamed Packaging Module: The `safetensors` packaging module was renamed to `directory` to reflect its broader capability of handling general directory structures, not just safetensors files.

gemini-code-assist

Code Review

This pull request introduces a significant improvement to model packaging by allowing the creation of model artifacts directly from a directory structure, preserving nested file paths. This is achieved through a new builder.FromDirectory function and a corresponding V0.2 model config format. The unpacking logic has been updated to handle this new format while maintaining backward compatibility with V0.1 models. The changes are well-structured and include thorough testing. My review focuses on improving the robustness and maintainability of the new unpacking logic, particularly regarding the handling of different layer types and reducing code duplication.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

- The new layer-based unpacking paths (`UnpackFromLayers`, `unpackGenericFileLayers`, `unpackSafetensorsWithAnnotations`) duplicate very similar `descriptorProvider` / `pathProvider` patterns and annotation handling logic; consider extracting shared helper interfaces/functions to reduce drift and make future changes less error-prone.
- In `FromDirectory` all filesystem entries whose base name starts with `.` are unconditionally skipped, which prevents packaging intentionally hidden content (e.g., `.well-known`); you may want an option to override this behavior or to limit the auto-skip to a narrower set (like `.git`).
- Several loops over layers silently `continue` on errors from methods like `MediaType()`, `Digest()`, or `Layers()` (e.g., in `unpackSafetensorsWithAnnotations`, `UnpackFromLayers`, `unpackGenericFileLayers`), which can mask mispackaged layers; consider logging or wrapping these errors to make diagnosing bad artifacts easier while still tolerating non-critical failures if needed.

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `pkg/distribution/MODEL_TYPES.md:172-174` </location>
<code_context>
+| Format | Constant | Description |
+|--------|----------|-------------|
+| GGUF | `FormatGGUF` | llama.cpp quantized models |
+| Safetensors | `FormatSafetensors` | HuggingFace weights |
+| Diffusers | `FormatDiffusers` | Image generation models |
+
</code_context>

<issue_to_address>
**suggestion (typo):** Consider changing "HuggingFace" to the more common "Hugging Face" spelling.

Using the standard "Hugging Face" spelling will align with the official brand and common usage.

```suggestion
| Format | Constant | Description |
|--------|----------|-------------|
| GGUF | `FormatGGUF` | llama.cpp quantized models |
| Safetensors | `FormatSafetensors` | Hugging Face weights |
| Diffusers | `FormatDiffusers` | Image generation models |
```
</issue_to_address>

ilopezluna added 4 commits January 28, 2026 11:44

feat(builder): implement FromDirectory function to create Builder from model files in a directory

60f24da

feat(builder): add support for exclusions in FromDirectory function

beb05f5

feat(unpack): implement UnpackFromLayers function for layer-per-file model unpacking

3f8b58b

feat(unpack): enhance unpacking logic to support V0.2 layer-per-file packaging

6674c88

gemini-code-assist bot reviewed Jan 28, 2026

ilopezluna added 3 commits January 28, 2026 16:25

feat(unpack): streamline V0.2 model unpacking logic in Unpack function

16a8ac5

feat(docs): add MODEL_TYPES.md to document model types and interfaces

f383f26

feat(unpack): improve runtime config handling and streamline layer unpacking logic

491326d

ilopezluna force-pushed the improve-safetensors-packaging branch from adcc1d3 to 491326d Compare January 29, 2026 11:33

ilopezluna changed the title from "[WIP] Improve Safetensors packaging" to "Improve Safetensors packaging" Jan 29, 2026

ilopezluna marked this pull request as ready for review January 29, 2026 11:59

sourcery-ai bot reviewed Jan 29, 2026

Update pkg/distribution/MODEL_TYPES.md

707326b

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

ilopezluna requested a review from a team January 29, 2026 12:04
