Gemini Flash runs very slow (1 it/s) #223

RyanMarten · 2024-12-06T06:44:14Z

API key and paid tier: https://aistudio.google.com/app/apikey
Quota (rate limits) monitoring: https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas?inv=1&invt=AbjX7w&project=bespokelabs
Usage (server response and latency) monitoring: https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/metrics?project=bespokelabs

It is only doing one request at a time.

CharlieJCJ · 2024-12-10T01:01:13Z

What happened here was that 1) we set a low default TPM (tokens per minute) of 150k, and 2) we used a conservative output token estimate, so requests are throttled before the real token capacity is exhausted.
Related issue: #206

The fix is to manually set the TPM to a higher value. I'll follow up with a higher default TPM.
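
To illustrate why a low TPM combined with a conservative token estimate caps throughput, here is a rough sketch of the arithmetic. It is not the actual rate limiter in curator: the function name `effective_requests_per_minute` and the figure of 2,500 estimated tokens per request are assumptions chosen only to show how a 150k TPM budget could come out to about 1 request per second.

```python
def effective_requests_per_minute(
    max_rpm: int, max_tpm: int, est_tokens_per_request: int
) -> int:
    """Upper bound on requests per minute when each request reserves
    an estimated token count against the per-minute token budget.

    Hypothetical helper for illustration; not part of curator.
    """
    if est_tokens_per_request <= 0:
        raise ValueError("est_tokens_per_request must be positive")
    # The token budget admits at most max_tpm // estimate requests per minute;
    # the request-rate limit caps it independently.
    return min(max_rpm, max_tpm // est_tokens_per_request)


# Assumed estimate of 2,500 tokens reserved per request (illustrative only).
low = effective_requests_per_minute(2000, 150_000, 2_500)
high = effective_requests_per_minute(2000, 4_000_000, 2_500)
print(low, low / 60)  # requests per minute, requests per second
print(high)
```

Under these assumed numbers the token budget, not the request limit, is the bottleneck at 150k TPM, which is why raising `max_tokens_per_minute` helps even though the real token usage per request may be much lower than the estimate.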

CharlieJCJ · 2024-12-10T01:19:29Z

Example script

from bespokelabs.curator import Prompter
from datasets import Dataset
import logging

logger = logging.getLogger("bespokelabs.curator")
logger.setLevel(logging.DEBUG)

dataset = Dataset.from_dict({"prompt": ["write me a poem"] * 100_000})

prompter = Prompter(
    prompt_func=lambda row: row["prompt"],
    model_name="gemini/gemini-1.5-flash-002",
    response_format=None,
    max_requests_per_minute=2000,  # explicit request rate limit
    max_tokens_per_minute=4000000,  # override the low 150k default TPM
)

dataset = prompter(dataset)
print(dataset.to_pandas())

CharlieJCJ · 2024-12-10T01:21:19Z

Also works for gemini-1.5-pro-002

CharlieJCJ · 2024-12-10T01:26:15Z

gemini-1.5-pro-002

gemini-1.5-flash-002

RyanMarten assigned CharlieJCJ Dec 6, 2024

This was referenced Dec 9, 2024

Better token consumption estimation #235

Closed

[OnlineRequestProcessor Enhancement] Better way to do output token estimation #206

Open

CharlieJCJ closed this as completed Dec 10, 2024
