[Vertex AI] Add image generation support using Imagen #14236

Merged
merged 21 commits into from
Feb 5, 2025
Commits
940c05b
[Vertex AI] Add ImageGenerationInstance for input to predict call (#1…
andrewheard Dec 3, 2024
5906b14
[Vertex AI] Add ImageGenerationParameters for input to predict call (…
andrewheard Dec 3, 2024
71e9660
[Vertex AI] Add ImageGenerationResponse for decoding PredictResponse …
andrewheard Dec 6, 2024
66bf7ce
[Vertex AI] Make `ImageGenerationResponse` generic and add image type…
andrewheard Dec 6, 2024
227945c
[Vertex AI] Add `ImageGenerationRequest` for Imagen (#14225)
andrewheard Dec 6, 2024
f6b95de
[Vertex AI] Add `ImagenModel` with `generateImages` functions (#14226)
andrewheard Dec 7, 2024
86067e6
[Vertex AI] Run Imagen integration tests on cron schedule only (#14231)
andrewheard Dec 7, 2024
af33ee0
[Vertex AI] Add `ImagenGenerationConfig` to `generateImages()` (#14234)
andrewheard Dec 9, 2024
1c83b1f
[Vertex AI] Add `ImagenSafetySettings` type and param (#14237)
andrewheard Dec 10, 2024
6842840
[Vertex AI] Replace `ImagenImage` protocol with `_ImagenImage` struct…
andrewheard Dec 12, 2024
a209212
[Vertex AI] Refactor `ImagenSafetySettings` (#14307)
andrewheard Jan 7, 2025
746f176
[Vertex AI] Add `ImagenModelConfig` for model-level config params (#1…
andrewheard Jan 7, 2025
505c228
[Vertex AI] Rename `ImageGenerationResponse` to `ImagenGenerationResp…
andrewheard Jan 8, 2025
a18e0fa
[Vertex AI] Make `ImagenImageRepresentable` internal (#14341)
andrewheard Jan 13, 2025
01b2f98
[Vertex AI] Move `ImagenModelConfig` params to `ImagenGenerationConfi…
andrewheard Jan 14, 2025
d3dab8b
[Vertex AI] Update Imagen public APIs to match API proposal (#14388)
andrewheard Jan 28, 2025
368b0d0
[Vertex AI] Add Imagen integration tests for GCS and filtering (#14403)
andrewheard Jan 30, 2025
ef97e21
[Vertex AI] Add documentation for Imagen symbols (#14411)
andrewheard Feb 5, 2025
717ccad
Add CHANGELOG entry
andrewheard Feb 5, 2025
e18f8bc
Update TODOs
andrewheard Feb 5, 2025
f807fe7
Fix `filteredReasons` concatenation and update tests
andrewheard Feb 5, 2025
2 changes: 2 additions & 0 deletions .github/workflows/vertexai.yml
@@ -9,6 +9,7 @@ on:
schedule:
# Run every day at 11pm (PST) - cron uses UTC times
- cron: '0 7 * * *'
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
@@ -102,6 +103,7 @@ jobs:
needs: spm-package-resolved
env:
TEST_RUNNER_FIRAAppCheckDebugToken: ${{ secrets.VERTEXAI_INTEGRATION_FAC_DEBUG_TOKEN }}
TEST_RUNNER_VTXIntegrationImagen: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}
FIREBASECI_USE_LATEST_GOOGLEAPPMEASUREMENT: 1
secrets_passphrase: ${{ secrets.GHASecretsGPGPassphrase1 }}
steps:
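
The `TEST_RUNNER_` prefix matters here: `xcodebuild` forwards environment variables carrying that prefix to the test runner process with the prefix stripped, so the tests would see `VTXIntegrationImagen`. A minimal sketch of how a test could gate on that value, assuming it arrives as the string "true" or "false"; the helper name `isImagenIntegrationEnabled` is hypothetical and not taken from this PR:

```swift
import Foundation

/// Hypothetical helper: returns true only when the workflow set the flag to "true".
/// The environment is a parameter so the logic can be exercised without a CI run.
func isImagenIntegrationEnabled(
  environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
  // GitHub Actions renders a boolean expression as the string "true" or "false".
  return environment["VTXIntegrationImagen"] == "true"
}

// Scheduled or manually dispatched runs would pass "true".
print(isImagenIntegrationEnabled(environment: ["VTXIntegrationImagen": "true"]))
// Pull request or push runs would pass "false".
print(isImagenIntegrationEnabled(environment: ["VTXIntegrationImagen": "false"]))
```

Comparing against the literal string keeps an unset variable (for example, in a local run) on the safe side: the Imagen tests stay skipped unless explicitly enabled.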
8 changes: 8 additions & 0 deletions FirebaseVertexAI/CHANGELOG.md
@@ -1,3 +1,11 @@
# Unreleased
- [feature] **Public Preview**: Added support for generating images using the
Imagen 3 model.
<br /><br />
Note: This feature is in Public Preview, which means that it is not
subject to any SLA or deprecation policy and could change in
backwards-incompatible ways.

# 11.6.0
- [changed] The token counts from `GenerativeModel.countTokens(...)` now include
tokens from the schema for JSON output and function calling; reported token
15 changes: 14 additions & 1 deletion FirebaseVertexAI/Sources/GenerationConfig.swift
@@ -162,4 +162,17 @@ public struct GenerationConfig {
  // MARK: - Codable Conformances

  @available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
- extension GenerationConfig: Encodable {}
+ extension GenerationConfig: Encodable {
+   enum CodingKeys: String, CodingKey {
+     case temperature
+     case topP
+     case topK
+     case candidateCount
+     case maxOutputTokens
+     case presencePenalty
+     case frequencyPenalty
+     case stopSequences
+     case responseMIMEType = "responseMimeType"
+     case responseSchema
+   }
+ }
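
With explicit `CodingKeys`, the JSON key for `responseMIMEType` is spelled out in the type itself instead of being derived by the encoder. A minimal sketch under a default `JSONEncoder`, using a hypothetical stand-in type (`ExampleConfig` is not part of the SDK) with only two of the fields:

```swift
import Foundation

// Hypothetical stand-in for a config type; only two fields are shown.
struct ExampleConfig: Encodable {
  let maxOutputTokens: Int?
  let responseMIMEType: String?

  enum CodingKeys: String, CodingKey {
    case maxOutputTokens
    // The Swift property says "MIME"; the JSON key says "Mime".
    case responseMIMEType = "responseMimeType"
  }
}

func encodeToJSONString<T: Encodable>(_ value: T) throws -> String {
  let encoder = JSONEncoder()
  encoder.outputFormatting = [.sortedKeys] // deterministic key order
  let data = try encoder.encode(value)
  return String(decoding: data, as: UTF8.self)
}

let configJSON = try encodeToJSONString(
  ExampleConfig(maxOutputTokens: 256, responseMIMEType: "text/plain")
)
print(configJSON)
```

The printed object should contain the keys `maxOutputTokens` and `responseMimeType`; the Swift-side spelling `responseMIMEType` never reaches the wire.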
3 changes: 3 additions & 0 deletions FirebaseVertexAI/Sources/GenerativeAIRequest.swift
@@ -52,3 +52,6 @@ public struct RequestOptions {
self.init(timeout: timeout, apiVersion: .v1beta)
}
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension RequestOptions: Equatable {}
1 change: 0 additions & 1 deletion FirebaseVertexAI/Sources/GenerativeAIService.swift
@@ -203,7 +203,6 @@ struct GenerativeAIService {
  }

  let encoder = JSONEncoder()
- encoder.keyEncodingStrategy = .convertToSnakeCase
  urlRequest.httpBody = try encoder.encode(request)
  urlRequest.timeoutInterval = request.options.timeout

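
This hunk's single deletion is the `.convertToSnakeCase` line, so the encoder now emits keys exactly as each type's `CodingKeys` spells them. A small sketch of the difference between the two strategies, using a hypothetical `ExamplePayload` type rather than any SDK type:

```swift
import Foundation

// Hypothetical payload type, used only to compare key strategies.
struct ExamplePayload: Encodable {
  let maxOutputTokens: Int
}

func jsonString(snakeCase: Bool) throws -> String {
  let encoder = JSONEncoder()
  if snakeCase {
    // The strategy that this hunk removes from the service.
    encoder.keyEncodingStrategy = .convertToSnakeCase
  }
  let data = try encoder.encode(ExamplePayload(maxOutputTokens: 64))
  return String(decoding: data, as: UTF8.self)
}

let camelCaseJSON = try jsonString(snakeCase: false) // {"maxOutputTokens":64}
let snakeCaseJSON = try jsonString(snakeCase: true)  // {"max_output_tokens":64}
print(camelCaseJSON)
print(snakeCaseJSON)
```

Dropping the strategy is presumably why types such as `GenerationConfig` and `FileData` gain explicit `CodingKeys` in the same PR: keys like `responseMimeType` and `fileUri` must now be written out by hand.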
@@ -0,0 +1,26 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct ImageGenerationInstance {
let prompt: String
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationInstance: Equatable {}

// MARK: - Codable Conformance

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationInstance: Encodable {}
@@ -0,0 +1,29 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import Foundation

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct ImageGenerationOutputOptions {
let mimeType: String
let compressionQuality: Int?
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationOutputOptions: Equatable {}

// MARK: - Codable Conformance

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationOutputOptions: Encodable {}
@@ -0,0 +1,62 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct ImageGenerationParameters {
let sampleCount: Int?
let storageURI: String?
let negativePrompt: String?
let aspectRatio: String?
let safetyFilterLevel: String?
let personGeneration: String?
let outputOptions: ImageGenerationOutputOptions?
let addWatermark: Bool?
let includeResponsibleAIFilterReason: Bool?
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationParameters: Equatable {}

// MARK: - Codable Conformance

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImageGenerationParameters: Encodable {
enum CodingKeys: String, CodingKey {
case sampleCount
case storageURI = "storageUri"
case negativePrompt
case aspectRatio
case safetyFilterLevel = "safetySetting"
case personGeneration
case outputOptions
case addWatermark
case includeResponsibleAIFilterReason = "includeRaiReason"
}

func encode(to encoder: any Encoder) throws {
var container = encoder.container(keyedBy: CodingKeys.self)
try container.encodeIfPresent(sampleCount, forKey: .sampleCount)
try container.encodeIfPresent(storageURI, forKey: .storageURI)
try container.encodeIfPresent(negativePrompt, forKey: .negativePrompt)
try container.encodeIfPresent(aspectRatio, forKey: .aspectRatio)
try container.encodeIfPresent(safetyFilterLevel, forKey: .safetyFilterLevel)
try container.encodeIfPresent(personGeneration, forKey: .personGeneration)
try container.encodeIfPresent(outputOptions, forKey: .outputOptions)
try container.encodeIfPresent(addWatermark, forKey: .addWatermark)
try container.encodeIfPresent(
includeResponsibleAIFilterReason,
forKey: .includeResponsibleAIFilterReason
)
}
}
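
Because every field goes through `encodeIfPresent`, unset parameters are left out of the JSON body rather than sent as `null`. A minimal sketch with a trimmed-down, hypothetical copy of the struct (`ExampleParameters`, three fields only); the safety value is a placeholder string:

```swift
import Foundation

// Hypothetical, trimmed-down copy of the parameters type for illustration.
struct ExampleParameters: Encodable {
  let sampleCount: Int?
  let storageURI: String?
  let safetyFilterLevel: String?

  enum CodingKeys: String, CodingKey {
    case sampleCount
    case storageURI = "storageUri"
    case safetyFilterLevel = "safetySetting"
  }

  func encode(to encoder: any Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    // Each call writes the key only when the value is non-nil.
    try container.encodeIfPresent(sampleCount, forKey: .sampleCount)
    try container.encodeIfPresent(storageURI, forKey: .storageURI)
    try container.encodeIfPresent(safetyFilterLevel, forKey: .safetyFilterLevel)
  }
}

let exampleParameters = ExampleParameters(
  sampleCount: 2,
  storageURI: nil, // omitted from the output entirely
  safetyFilterLevel: "block_medium_and_above" // placeholder value
)

let parametersEncoder = JSONEncoder()
parametersEncoder.outputFormatting = [.sortedKeys]
let parametersJSON = String(
  decoding: try parametersEncoder.encode(exampleParameters),
  as: UTF8.self
)
print(parametersJSON)
```

The output should contain `safetySetting` and `sampleCount` but no `storageUri` key, which lets the backend apply its own defaults for anything the caller did not set.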
@@ -0,0 +1,54 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import Foundation

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct ImagenGenerationRequest<ImageType: ImagenImageRepresentable> {
let model: String
let options: RequestOptions
let instances: [ImageGenerationInstance]
let parameters: ImageGenerationParameters

init(model: String, options: RequestOptions, instances: [ImageGenerationInstance],
parameters: ImageGenerationParameters) {
self.model = model
self.options = options
self.instances = instances
self.parameters = parameters
}
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImagenGenerationRequest: GenerativeAIRequest where ImageType: Decodable {
typealias Response = ImagenGenerationResponse<ImageType>

var url: URL {
return URL(string: "\(Constants.baseURL)/\(options.apiVersion)/\(model):predict")!
}
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension ImagenGenerationRequest: Encodable {
enum CodingKeys: CodingKey {
case instances
case parameters
}

func encode(to encoder: any Encoder) throws {
var container = encoder.container(keyedBy: CodingKeys.self)
try container.encode(instances, forKey: .instances)
try container.encode(parameters, forKey: .parameters)
}
}
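
The custom `encode(to:)` keeps `model` and `options` out of the request body; they only shape the URL and the transport. A minimal sketch of the resulting body, using simplified hypothetical types (`ExampleInstance`, `ExampleRequestParameters`, `ExampleRequest`) and a placeholder model name:

```swift
import Foundation

// Hypothetical, simplified stand-ins for the request types in this PR.
struct ExampleInstance: Encodable {
  let prompt: String
}

struct ExampleRequestParameters: Encodable {
  let sampleCount: Int?
}

struct ExampleRequest: Encodable {
  let model: String // used for the URL only, never encoded
  let instances: [ExampleInstance]
  let parameters: ExampleRequestParameters

  enum CodingKeys: CodingKey {
    case instances
    case parameters
  }

  func encode(to encoder: any Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    try container.encode(instances, forKey: .instances)
    try container.encode(parameters, forKey: .parameters)
  }
}

let exampleRequest = ExampleRequest(
  model: "placeholder-model-name", // placeholder, not a real model identifier
  instances: [ExampleInstance(prompt: "A red bicycle")],
  parameters: ExampleRequestParameters(sampleCount: 1)
)

let requestEncoder = JSONEncoder()
requestEncoder.outputFormatting = [.sortedKeys]
let requestBody = String(decoding: try requestEncoder.encode(exampleRequest), as: UTF8.self)
print(requestBody)
```

The body should hold only the `instances` and `parameters` keys, which is the shape a `:predict` call expects.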
@@ -0,0 +1,25 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import Foundation

// TODO(andrewheard): Make this public when the SDK supports Imagen operations that take images as
// input (upscaling / editing).
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
protocol ImagenImageRepresentable {
/// Internal representation of the image for use with the Imagen model.
///
/// - Important: Not needed by SDK users.
var _internalImagenImage: _InternalImagenImage { get }
}
@@ -0,0 +1,29 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import Foundation

/// Internal representation of an image for the Imagen model.
///
/// - Important: For internal use by types conforming to ``ImagenImageRepresentable``; all
/// properties are `internal` and are not needed by SDK users.
///
/// TODO(andrewheard): Make this public when the SDK supports Imagen operations that take images as
/// input (upscaling / editing).
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct _InternalImagenImage {
let mimeType: String
let bytesBase64Encoded: String?
let gcsURI: String?
}
@@ -0,0 +1,25 @@
// Copyright 2024 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
struct RAIFilteredReason {
let raiFilteredReason: String
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
extension RAIFilteredReason: Decodable {
enum CodingKeys: CodingKey {
case raiFilteredReason
}
}
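
A minimal sketch of decoding one such entry, assuming the backend returns an object with a single `raiFilteredReason` string. The struct is re-declared under a hypothetical name (`ExampleFilteredReason`) and the reason text is made up for illustration:

```swift
import Foundation

// Hypothetical re-declaration of the decodable type shown above.
struct ExampleFilteredReason: Decodable {
  let raiFilteredReason: String

  enum CodingKeys: CodingKey {
    case raiFilteredReason
  }
}

// Made-up payload; real reason strings come from the service.
let filteredPayload = Data(#"{"raiFilteredReason": "Example reason text"}"#.utf8)
let filteredReason = try JSONDecoder().decode(ExampleFilteredReason.self, from: filteredPayload)
print(filteredReason.raiFilteredReason)
```

Since the property is non-optional, a prediction entry without this key would fail to decode as this type, which gives the caller a way to tell filtered entries apart from image entries.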
5 changes: 5 additions & 0 deletions FirebaseVertexAI/Sources/Types/Internal/InternalPart.swift
@@ -34,6 +34,11 @@ struct FileData: Codable, Equatable, Sendable {
self.fileURI = fileURI
self.mimeType = mimeType
}

enum CodingKeys: String, CodingKey {
case fileURI = "fileUri"
case mimeType
}
}

@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)