feat: Added docling and pytorch as add on #5089

Merged 2 commits on Feb 27, 2025

Conversation

ntkathole
Contributor

What this PR does / why we need it:

Adds docling and torch as add-ons to Feast.
Integrating Docling allows users to efficiently chunk text data within On Demand Feature Views (ODFVs) during the write process, and torch allows feature vectors to be converted into PyTorch tensors.
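
A minimal sketch of the two ideas in this description, under stated assumptions: `chunk_text` is a hypothetical stdlib stand-in for a document chunker (it is not Docling's API), and `to_tensor` is a hypothetical helper showing how an optional torch install could be used to convert feature vectors. Neither function is part of Feast or this PR.

```python
from typing import List, Sequence

try:  # torch is treated as an optional add-on; fall back when it is absent
    import torch
except ImportError:
    torch = None


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Split text into chunks of at most max_words words.

    Hypothetical stand-in for a real chunker such as Docling's.
    """
    if max_words < 1:
        raise ValueError("max_words must be >= 1")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def to_tensor(vectors: Sequence[Sequence[float]]):
    """Return a float32 torch.Tensor if torch is installed, else a list of float lists."""
    if torch is None:
        return [[float(x) for x in row] for row in vectors]
    return torch.tensor(vectors, dtype=torch.float32)
```

For example, `chunk_text("a b c d e", max_words=2)` splits the input into two-word chunks, and `to_tensor([[1, 2], [3, 4]])` yields a 2x2 float structure whose concrete type depends on whether torch is installed.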

Which issue(s) this PR fixes:

#5037 #4890

@ntkathole ntkathole requested a review from a team as a code owner February 25, 2025 11:53
@ntkathole ntkathole force-pushed the docling_extra branch 2 times, most recently from a71074f to cbe4e99 Compare February 25, 2025 14:59
@ntkathole ntkathole force-pushed the docling_extra branch 4 times, most recently from d98454b to c6176b1 Compare February 25, 2025 16:48
@ntkathole
Contributor Author

It seems this is causing an increase in the image size. Do we have large runners available under our GitHub org?

@franciscojavierarceo
Member

@ntkathole this is what I see available. Anything here work?

Screenshot 2025-02-25 at 12 28 14 PM

@ntkathole ntkathole force-pushed the docling_extra branch 3 times, most recently from 5239ef1 to 4be9a30 Compare February 26, 2025 05:51
@@ -162,7 +162,7 @@ docker-build: ## Build docker image with the manager.
 ## Build feast docker image.
 .PHONY: feast-ci-dev-docker-img
 feast-ci-dev-docker-img:
-	cd ./../.. && make build-feature-server-dev
+	cd ./../.. && make build-feature-server-dev-minimal
Contributor Author

@franciscojavierarceo Since we do not have a larger runner to run kind + Feast with all dependencies, we need either self-hosted runners or an image with minimal dependencies installed for CI.
Currently, moving from multicloud/Dockerfile.dev to multicloud/Dockerfile works in CI for running the operator e2e test.

My concern is that we could miss potential breakages by using this minimal image. That's the risk here, right?

Cc @lokeshrangineni @tchughesiv @redhatHameed

Contributor Author

We will use this image only for the operator e2e test; Feast functionality will be tested via integration and unit tests with all dependencies installed.
@tchughesiv or someone more familiar with the operator work can confirm whether this is a big risk for operator testing, or whether there is a better way to handle it.

Contributor

this change needs to be reverted

Contributor

@tchughesiv tchughesiv Mar 3, 2025

@ntkathole @franciscojavierarceo this must remain `make build-feature-server-dev`

@ntkathole ntkathole force-pushed the docling_extra branch 2 times, most recently from 01d60df to b2e1769 Compare February 27, 2025 06:12
@franciscojavierarceo
Member

looks like that worked!

@franciscojavierarceo franciscojavierarceo merged commit 135342b into feast-dev:master Feb 27, 2025
24 checks passed
Comment on lines +530 to +535
build-feature-server-dev-minimal:
	docker buildx build \
		-t feastdev/feature-server:dev \
		-f sdk/python/feast/infra/feature_servers/multicloud/Dockerfile \
		--load sdk/python/feast/infra/feature_servers/multicloud

Contributor

this is just a release build... it's not a dev image build

Contributor

we'll need to remove this

franciscojavierarceo pushed a commit that referenced this pull request Mar 10, 2025
# [0.47.0](v0.46.0...v0.47.0) (2025-03-10)

* feat!: Include PUBLIC_URL in defaultProjectListPromise URL in /ui ([2f0f7b3](2f0f7b3))

### Bug Fixes

* Add transformation_service_endpoint to support Go feature server. ([#5071](#5071)) ([5627d7c](5627d7c))
* Adding extra space on the VM to kind cluster to see if this solves the issue with memory not available with operator e2e tests. ([#5102](#5102)) ([e6e928c](e6e928c))
* Allow unencrypted Snowflake key ([#5097](#5097)) ([87a7c23](87a7c23))
* Cant add different type of list types ([#5118](#5118)) ([bebd7be](bebd7be))
* Fixing transformations on writes ([#5127](#5127)) ([95ac34a](95ac34a))
* Identify s3/remote uri path correctly ([#5076](#5076)) ([93becff](93becff))
* Increase available action VM storage and reduce dev feature-server image size ([#5112](#5112)) ([75f5a90](75f5a90))
* Move Feast to pyproject.toml instead of setup.py ([#5067](#5067)) ([4231274](4231274))
* Skip refresh if already in progress or if lock is already held ([#5068](#5068)) ([f3a24de](f3a24de))

### Features

* Add an OOTB Chat UI to the Feature Server to support RAG demo ([#5106](#5106)) ([40ea7a9](40ea7a9))
* Add Couchbase Columnar as an Offline Store ([#5025](#5025)) ([4373cbf](4373cbf))
* Add Feast Operator RBAC example with Kubernetes Authentication … ([#5077](#5077)) ([2179fbe](2179fbe))
* Added docling and pytorch as add on ([#5089](#5089)) ([135342b](135342b))
* Feast Operator example with Postgres in TLS mode. ([#5028](#5028)) ([2c46f6a](2c46f6a))
* Operator - Add feastProjectDir section to CR with git & init options ([#5079](#5079)) ([d64f01e](d64f01e))
* Override the udf name when provided as input to an on demand transformation ([#5094](#5094)) ([8a714bb](8a714bb))
* Set value_type of entity directly in from_proto ([#5092](#5092)) ([90e7498](90e7498))
* Updating retrieve online documents v2 to work for other fields for sq… ([#5082](#5082)) ([fc121c3](fc121c3))

### BREAKING CHANGES

* The PUBLIC_URL environment variable is now taken into account by default
when fetching the projects list. This is a breaking change only if all
these points apply:

1. You're using Feast UI as a module

2. You're serving the UI files from a non-root path via the PUBLIC_URL
   environment variable

3. You're serving the project list from the root path

4. You're not passing the `feastUIConfigs.projectListPromise` prop to
   the FeastUI component

In this case, you need to explicitly fetch the project list from the
root path via the `feastUIConfigs.projectListPromise` prop:

```diff
 const root = createRoot(document.getElementById("root")!);
 root.render(
   <React.StrictMode>
-    <FeastUI />
+    <FeastUI
+      feastUIConfigs={{
+        projectListPromise: fetch("/projects-list.json", {
+            headers: {
+              "Content-Type": "application/json",
+            },
+          }).then((res) => res.json())
+      }}
+    />
   </React.StrictMode>
 );
```

Signed-off-by: Harri Lehtola <[email protected]>