Commit

Remove retry from backends (#105)
* Remove retry from backends

* add tgi

* fix ci

* fix ci

* prettier
lucasavila00 authored Apr 8, 2024
1 parent 71f68d3 commit 57bccd4
Showing 20 changed files with 466 additions and 125 deletions.
1 change: 1 addition & 0 deletions .vscode/settings.json
@@ -19,6 +19,7 @@
"mdts",
"openai",
"positionals",
"regexes",
"runpod",
"sglang",
"tailwindcss",
3 changes: 3 additions & 0 deletions docker/tgi/README.md
@@ -0,0 +1,3 @@
# TGI Docker

Experimental.
21 changes: 21 additions & 0 deletions docker/tgi/docker-compose.yml
@@ -0,0 +1,21 @@
services:
  sv:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    image: ghcr.io/huggingface/text-generation-inference:1.4.5
    command: "--model-id TheBloke/Mistral-7B-Instruct-v0.2-AWQ --quantize awq"
    ports:
      - 8080:8080
    network_mode: host
    ipc: "host"
    volumes:
      - tgi_cache:/data

volumes:
  tgi_cache:
    driver: local
2 changes: 1 addition & 1 deletion docker/vllm/docker-compose.yml
@@ -7,7 +7,7 @@ services:
- driver: nvidia
count: 1
capabilities: [gpu]
-image: vllm/vllm-openai:v0.4.0
+image: vllm/vllm-openai:v0.4.0.post1
command: "--model TheBloke/Mistral-7B-Instruct-v0.2-AWQ --gpu-memory-utilization 0.8 --max-model-len 4096 --quantization awq"
ports:
- 8000:8000
4 changes: 4 additions & 0 deletions examples/client/src/tasks/kitchen-sink.ts
@@ -72,6 +72,10 @@ const characterGen = (model: InitClient, name: string) =>
)
.gen("json_output", { maxTokens: 256, regex: characterRegex });
export const kitchenSink = async (client: InitClient) => {
// const out = await client
// .user("Tell me a joke")
// .assistant((m) => m.gen("joke"))
// .run();
const start10 = Date.now();
const { rawText: conversation10, captured: captured10 } = await xmlGeneration(client).run({
temperature: 0.0,
92 changes: 92 additions & 0 deletions examples/client/src/tasks/long-task.ts
@@ -0,0 +1,92 @@
import { InitClient } from "@lmscript/client";
import createSummary from "../generated/fabric/create_summary";

const longTask = async (client: InitClient) => {
const start = Date.now();
const { rawText: _conversation6 } = await createSummary(client, {
input: `
Edward Jones (7 April 1824 – c. 1893 or 1896), also known as "the boy Jones", was an English stalker who became notorious for breaking into Buckingham Palace several times between 1838 and 1841.
Jones was fourteen years old when he first broke into the palace in December 1838. He was found in possession of some items he had stolen, but was acquitted at his trial. He broke in again in 1840, ten days after Queen Victoria had given birth to Princess Victoria. Staff found him hiding under a sofa and he was arrested and subsequently questioned by the Privy Council—the monarch's formal body of advisers. He was sentenced to three months' hard labour at Tothill Fields Bridewell prison. He was released in March 1841 and broke back into the palace two weeks later, where he was caught stealing food from the larders. He was again arrested and sentenced to three months' hard labour at Tothill Fields.
To remove Jones from Britain, the Thames Police tried to surreptitiously coerce him into employment as a sailor. After a voyage on a merchant ship to Brazil, Jones returned to London, where he worked for a month before disappearing and signing up to the Royal Navy—again at the instigation of the Thames Police. He was a ship's boy on HMS Warspite and had further duty on Inconstant and Harlequin. He deserted twice before being allowed to leave the service in 1847. After his return to Britain, Jones was arrested in 1849 for burgling houses in Lewisham, Kent (now South London), and sentenced to transportation to Australia for ten years. He returned to Britain in late 1855 or early 1856 and was again arrested for burglary, before he returned—of his own accord—to Australia. The details of his death are not known, although it was possibly in Bairnsdale in the east of Australia on Boxing Day 1893 or in Perth, in the west of Australia in 1896.
Jones's exploits were extensively covered in the press, and several songs, ballads, poems and cartoons were created. He has been used as the basis for fictional characters and, because of the connection to Queen Victoria, is mentioned in several history books.
Biography
Early life, 1824–1838
Edward Jones was born on 7 April 1824 in Charlotte Street, Westminster, London, the eldest of seven children of Henry Jones, an impecunious tailor, and Mary (née Shores).[1][a] Mary was a 16-year-old seamstress in Henry's employ when the couple married in 1822. Edward had some basic education: he was literate and excelled at arithmetic, but left school before he was twelve.[3] He professed an interest in becoming an architect.[2] The Weekly Chronicle reported that as a child, Jones "manifested a very restless spirit, ... always inquisitive, active and thirsting for information".[4] According to his father, however, Jones was lazy, pessimistic, melancholic and reserved; Henry also said his son did not mix with his siblings and treated them with open contempt.[3]
Jones was apprenticed to two pharmacists between 1836 and 1838.[2] He was dismissed from one of the positions for demonstrating a "mischievous and restless disposition".[5] He was also apprenticed to a Thomas Griffiths, a builder with premises on Coventry Street, where he lasted for a year before being dismissed. In August 1838 he became an apprentice to a carver and gilder, but left on 11 December.[6]
Break-ins, 1838–1841
Buckingham Palace, showing Marble Arch on its left, as a ceremonial entrance.
Buckingham Palace in 1837, with Marble Arch as the front gate
December 1838
At 5:00 am on 14 December 1838 Jones was found in Buckingham Palace—the main residence of the monarch—by William Cox, one of the night porters. Among the items in Jones's possession were a regimental sword, some underwear, three pairs of trousers, some foreign coins and a likeness of Queen Victoria. She was not in residence at the time, but was staying at her country palace, Windsor Castle.[7] He was covered in bear's grease and soot, which led palace staff to think he had climbed down the chimney and tried to make his way out the same way. When he was taken to the police station, he claimed his name was Edward Cotton, the son of a tradesman. When asked where he came from, he said "I came from Hertfordshire twelve months ago, and I met a man ... who asked me to go with him to Buckingham-house. I went, and have been there ever since."[8][9][b]
Jones appeared in front of the magistrates at the police station in Queen Street on 19 December. News of a boy living secretly in Buckingham Palace had become known among the public, and the court was full of viewers and journalists. His real name and background were told to the court, and he admitted that he had lied about having lived in the house for the previous year, and had only spent "a day or two" in the palace. He was asked about the various items he had on him and said he found them on the lawn; the magistrate disbelieved his story and sent him for trial.[10][11] On 28 December 1838 Jones appeared for trial at the Westminster court of sessions. William Prendergast, Jones's solicitor, described how his client had a "warmth of spirit which ... had manifested itself in an inordinate curiosity to obtain a view of Buckingham Palace; and a very natural inclination it was".[12][13] Griffiths gave a character reference for Jones and said that he would have him back as an apprentice; after Prendergast told the court that Jones had promised he would not break into the palace again, he was acquitted.[14]
Jones was re-apprenticed to Griffiths after the trial. Several members of the public travelled to meet Jones, paying his father for the experience. The American novelist James Fenimore Cooper was one who visited, but found Jones to be a "dull, undersized runt, remarkable only for his taciturnity and obstinacy". An offer to take the boy to the US was turned down by Jones's father.[15][16] Another offer Jones received was from a theatre manager, who was planning to stage Intrusion; or a Guest Uninvited, a comedy based on Jones's exploits. Jones was to receive a salary of £4 a week for coming on stage at the end of the night to take a bow. Jones's father, concerned his son would be a laughing stock, declined the offer.[17][18][c] Jones was sacked for a second time by Griffiths and, in 1840, began working for another chemist, but his unpunctuality led to his again losing his position.[20]
November and December 1840
Albert and Victoria sitting on a sofa; Jones is present in the background, peering through a doorway.
Jones spying on Prince Albert and Queen Victoria; from the Sunday Chronicle, April 1841
On 30 November 1840 Jones scaled a wall on Constitution Hill to access the grounds of Buckingham Palace. He entered the building through a window, but there were too many people moving around and he left the way he had come.[21][22] He entered again on the following night. Victoria was in residence with her daughter, Princess Victoria, who had been born ten days previously. Just after midnight the domestic staff at the palace found Jones hiding under a sofa in an anteroom near the Queen's bedroom.[21][23] Neither the Queen nor her baby was woken by the event.[24] In her diary the following morning, Victoria recorded the following:
Albert told me ... that a man had been found, under the sofa in my sitting room. ... The audience room, and [Baroness Louise] Lehzen's were searched first and then mine, Kinnaird, looking under one corner of the sofa, on which I had been rolled into the bedroom but said nothing. Lehzen however pushed it away, and there on the ground, lay a lad who was seized and would not speak, but he was quite unarmed. After he had been taken downstairs, he said he had meant no harm, and had only come to see the Queen! We have since heard that he was in the palace once before, was half-witted, and had merely come, out of curiosity. But supposing he had come into the bedroom – how frightened I should have been.[25]
Police based at the palace arrested Jones and took him into custody at the Gardner's Lane police station; he told officers that he "sat upon the throne, saw the Queen, and heard the Princess Royal cry".[26][23] When he was arrested he did not struggle, nor had he stolen anything; he was unarmed and polite to his captors.[27] At midday on 3 December he was taken from the police station to offices of the Home Department in Whitehall where he was interrogated by the Privy Council—the monarch's formal body of advisers.[28][d]
During his questioning Jones said he would show the members where and how he entered, and he was taken to the palace, explained his route and method, and returned to the council to continue being questioned.[21] He told the council that his reason for entering the palace was because he wanted to write a book about the Queen and "wanted to know how they lived at the Palace" and that he thought "an account of the Palace, and of the disposition and arrangement of the chambers, and particularly of the dressing room of Her Majesty, would be very interesting".[29]
Jones's father was summoned to the council; he suggested his son was insane.[29] Two police doctors examined Jones and concluded that although "his head was of a most peculiar formation", they could not decide on his sanity.[21] As he had been unarmed and not stolen anything, the council decided that a summary punishment was the best course of action; Thomas Hall, the Chief Metropolitan Police Magistrate wrote a warrant to send Jones to Tothill Fields Bridewell prison for three months' hard labour.[30]
`,
}).run({
temperature: 0.1,
});

const end = Date.now();
console.log(`Time taken: ${end - start}ms`);
// console.log(conversation6);
};

import { LmScript } from "@lmscript/client";
import { VllmBackend } from "@lmscript/client/backends/vllm";

const bench = async () => {
let promptTokens = 0;
let completionTokens = 0;
const backend = new VllmBackend({
url: `http://localhost:8000`,
model: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
reportUsage: ({ promptTokens: pt, completionTokens: ct }) => {
promptTokens += pt;
completionTokens += ct;
},
template: "mistral",
});
const model = new LmScript(backend, {
temperature: 0.1,
});
// const batch = Array.from({ length: 10 }, (_, _i) =>
// longTask(model).catch((e) => {
// console.error(e);
// }),
// );

const start = Date.now();

for (let index = 0; index < 1; index++) {
await longTask(model);
}
// await Promise.all(batch);
const duration = Date.now() - start;
console.log(`Duration: ${duration}ms`);
console.log(`Prompt tokens: ${promptTokens}`);
console.log(`Completion tokens: ${completionTokens}`);
};

bench().catch(console.error);
33 changes: 33 additions & 0 deletions examples/client/src/tgi-runtime.ts
@@ -0,0 +1,33 @@
import { LmScript } from "@lmscript/client";
import { kitchenSink } from "./tasks/kitchen-sink";
import { TgiBackend } from "@lmscript/client/backends/tgi";

const bench = async () => {
let promptTokens = 0;
let completionTokens = 0;
const backend = new TgiBackend({
url: `https://suh51yx1k8f3wgk3.eu-west-1.aws.endpoints.huggingface.cloud`,
reportUsage: ({ promptTokens: pt, completionTokens: ct }) => {
promptTokens += pt;
completionTokens += ct;
},
template: "mistral",
});
const model = new LmScript(backend, {
temperature: 0.1,
});
const batch = Array.from({ length: 1 }, (_, _i) =>
kitchenSink(model).catch((e) => {
console.error(e);
}),
);

const start = Date.now();
await Promise.all(batch);
const duration = Date.now() - start;
console.log(`Duration: ${duration}ms`);
console.log(`Prompt tokens: ${promptTokens}`);
console.log(`Completion tokens: ${completionTokens}`);
};

bench().catch(console.error);
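
Both benchmark scripts above accumulate token counts through the `reportUsage` callback. The sketch below shows one way that bookkeeping could be factored out and extended with a throughput figure. `createUsageTracker` and `summary` are hypothetical helpers written for illustration and are not part of `@lmscript/client`; only the callback's argument shape is taken from the `ReportUsage` type in this commit.

```typescript
// Same argument shape as the ReportUsage type in packages/client/src/backends/abstract.ts.
type ReportUsage = (args: { promptTokens: number; completionTokens: number }) => void;

// Hypothetical helper (not part of @lmscript/client): keeps running totals
// and derives completion tokens per second from a measured duration.
function createUsageTracker() {
  let promptTokens = 0;
  let completionTokens = 0;

  // Callback to hand to a backend constructor as `reportUsage`.
  const reportUsage: ReportUsage = ({ promptTokens: pt, completionTokens: ct }) => {
    promptTokens += pt;
    completionTokens += ct;
  };

  // Snapshot of the totals; the rate is 0 when no time has elapsed.
  const summary = (durationMs: number) => ({
    promptTokens,
    completionTokens,
    completionTokensPerSecond: durationMs > 0 ? (completionTokens * 1000) / durationMs : 0,
  });

  return { reportUsage, summary };
}
```

In a script like the ones above, `tracker.reportUsage` would be passed in the backend options and `tracker.summary(Date.now() - start)` logged once the batch has finished.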
2 changes: 1 addition & 1 deletion internal-packages/mdts/package.json
@@ -7,7 +7,7 @@
"mdts": "./out/cli/main.js"
},
"scripts": {
-"test": "vitest",
+"test": "vitest --run",
"coverage": "vitest run --coverage",
"ts": "tsc --noEmit",
"static": "npm run ts",
6 changes: 6 additions & 0 deletions packages/client/package.json
@@ -42,6 +42,12 @@
"import": "./dist/esm/backends/vllm.js",
"default": "./dist/esm/backends/vllm.js"
},
"./backends/tgi": {
"types": "./dist/types/backends/tgi.d.ts",
"require": "./dist/cjs/backends/tgi.js",
"import": "./dist/esm/backends/tgi.js",
"default": "./dist/esm/backends/tgi.js"
},
"./backends/runpod-serverless-sglang": {
"types": "./dist/types/backends/runpod-serverless-sglang.d.ts",
"require": "./dist/cjs/backends/runpod-serverless-sglang.js",
46 changes: 43 additions & 3 deletions packages/client/src/backends/abstract.ts
@@ -6,15 +6,19 @@
import { Role } from "../chat-template";
import { SchemaData } from "../schema";

-export type TasksOutput = { text: string; captured: Record<string, unknown> };

/**
* Callback for capturing values from the AI model in real time.
*/
export type OnCapture = (args: { name: string; value: unknown }) => void;

/**
* Notifies about the resource usage of the AI model.
*/
export type ReportUsage = (args: { promptTokens: number; completionTokens: number }) => void;

/**
* Callbacks for the execution of the AI model.
*/
export type ExecutionCallbacks = {
onCapture: OnCapture;
};
@@ -23,14 +27,20 @@ export type ExecutionCallbacks = {
* Interface for fetching from a SGL server.
*/
export type AbstractBackend = {
-executeJSON: (data: GenerationThread, callbacks: ExecutionCallbacks) => Promise<TasksOutput>;
+executeJSON: (data: GenerationThread, callbacks: ExecutionCallbacks) => Promise<ClientState>;
};

/**
* Task that just adds text to the current state.
*/
export type AddTextTask = {
tag: "AddTextTask";
text: string;
};

/**
* Task that generates text from the AI model.
*/
export type GenerateTask = {
tag: "GenerateTask";
name: string | undefined;
@@ -39,34 +49,53 @@ export type GenerateTask = {
regex: string | undefined;
};

/**
* Task that selects a choice from a list.
*/
export type SelectTask = {
tag: "SelectTask";
name: string | undefined;
choices: string[];
};

/**
* Task that repeats previous captured text.
*/
export type RepeatTask = {
tag: "RepeatTask";
variable: string;
};

/**
* Task that matches a variable to a list of tasks.
*/
export type MatchTask = {
tag: "MatchTask";
variable: string;
choices: Record<string, Task[]>;
};

/**
* Task that generates structured data from the AI model.
*/
export type XmlTask = {
tag: "XmlTask";
name: string;
schemaKey: string | undefined;
schema: SchemaData;
};

/**
* Task that starts a role, and finishes the previous one.
*/
export type StartRoleTask = {
tag: "StartRoleTask";
role: Role;
};

/**
* List of all possible tasks supported by the AbstractBackend.
*/
export type Task =
| StartRoleTask
| AddTextTask
@@ -75,6 +104,10 @@ export type Task =
| RepeatTask
| MatchTask
| XmlTask;

/**
* Parameters for sampling from the AI model.
*/
export type FetcherSamplingParams = {
temperature: number;
top_p?: number;
@@ -83,11 +116,18 @@ export type FetcherSamplingParams = {
presence_penalty?: number;
};

/**
* Thread for generating text from the AI model.
*/
export type GenerationThread = {
sampling_params: FetcherSamplingParams;
tasks: Task[];
initial_state: ClientState;
};

/**
* Output of the AbstractBackend executeJSON method.
*/
export type ClientState = {
text: string;
captured: Record<string, unknown>;
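
The task types documented in this file form a discriminated union on `tag`. The sketch below shows roughly how a `GenerationThread` could be assembled and walked with an exhaustive `switch`. It is an illustration under stated assumptions, not the library's executor: the types are simplified local copies limited to fields visible in this diff, the `Role` values are assumed, `GenerateTask`, `MatchTask` and `XmlTask` are left out, and `describeTask` is a hypothetical helper.

```typescript
// Simplified local copies of types from abstract.ts; only fields visible in the diff.
type Role = "system" | "user" | "assistant"; // assumption: the real Role lives in ../chat-template
type StartRoleTask = { tag: "StartRoleTask"; role: Role };
type AddTextTask = { tag: "AddTextTask"; text: string };
type SelectTask = { tag: "SelectTask"; name: string | undefined; choices: string[] };
type RepeatTask = { tag: "RepeatTask"; variable: string };
type Task = StartRoleTask | AddTextTask | SelectTask | RepeatTask;
type ClientState = { text: string; captured: Record<string, unknown> };
type GenerationThread = {
  sampling_params: { temperature: number };
  tasks: Task[];
  initial_state: ClientState;
};

// A small thread: the user asks for a color, the assistant selects one and repeats it.
const thread: GenerationThread = {
  sampling_params: { temperature: 0.1 },
  tasks: [
    { tag: "StartRoleTask", role: "user" },
    { tag: "AddTextTask", text: "Pick a color." },
    { tag: "StartRoleTask", role: "assistant" },
    { tag: "SelectTask", name: "color", choices: ["red", "green", "blue"] },
    { tag: "RepeatTask", variable: "color" },
  ],
  initial_state: { text: "", captured: {} },
};

// Hypothetical helper: narrowing on `tag` gives typed access to each variant's fields.
function describeTask(task: Task): string {
  switch (task.tag) {
    case "StartRoleTask":
      return `start role ${task.role}`;
    case "AddTextTask":
      return `add text: ${task.text}`;
    case "SelectTask":
      return `select ${task.name ?? "(unnamed)"} from ${task.choices.join("|")}`;
    case "RepeatTask":
      return `repeat ${task.variable}`;
  }
}

const descriptions = thread.tasks.map(describeTask);
```

Because every variant carries a literal `tag`, the compiler can flag a `switch` that forgets to handle a newly added task type.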
21 changes: 19 additions & 2 deletions packages/client/src/backends/executor.ts
@@ -13,7 +13,6 @@ import {
GenerateTask,
ExecutionCallbacks,
Task,
-TasksOutput,
} from "./abstract";

const INTEGER = "(-)?(0|[1-9][0-9]*)";
@@ -40,6 +39,24 @@ export abstract class BaseExecutor {
this.callbacks = callbacks;
}

protected async fetchJSONWithTimeout<T>(input: RequestInfo, init?: RequestInit): Promise<T> {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 60_000);
try {
const response = await fetch(input, {
...init,
signal: controller.signal,
});
if (!response.ok) {
const text = await response.text();
throw new Error(`HTTP error: ${response.status} - ${text.slice(0, 1000)}`);
}
return await response.json();
} finally {
clearTimeout(timeout);
}
}

async #writeToPath(path: string[], captured: unknown) {
let current = this.state.captured;

@@ -284,7 +301,7 @@ export abstract class BaseExecutor {
}
}
}
-async executeJSON(): Promise<TasksOutput> {
+async executeJSON(): Promise<ClientState> {
for (const task of this.data.tasks) {
await this.#runTask(task);
}
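
The `fetchJSONWithTimeout` helper added to `BaseExecutor` in this file pairs an `AbortController` with a timer so that a stalled request fails instead of hanging. Below is a minimal standalone sketch of the same pattern for experimenting outside the class. The name `fetchJsonWithTimeout`, the `FetchLike` type, and the `fetchImpl` and `timeoutMs` parameters are assumptions made for illustration and are not part of `@lmscript/client`; the 60-second default mirrors the diff.

```typescript
// Minimal structural type for the parts of fetch this sketch uses (an assumption,
// introduced so a fake implementation can be injected in tests).
type FetchLike = (
  input: string,
  init?: { signal?: AbortSignal },
) => Promise<{
  ok: boolean;
  status: number;
  text(): Promise<string>;
  json(): Promise<unknown>;
}>;

// Hypothetical standalone variant of the helper shown in the diff.
async function fetchJsonWithTimeout<T>(
  fetchImpl: FetchLike,
  input: string,
  timeoutMs = 60_000,
): Promise<T> {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(input, { signal: controller.signal });
    if (!response.ok) {
      // Include a truncated body in the error to help with debugging.
      const text = await response.text();
      throw new Error(`HTTP error: ${response.status} - ${text.slice(0, 1000)}`);
    }
    return (await response.json()) as T;
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

Passing the global `fetch` (Node 18+ or a browser) as `fetchImpl` should give behaviour close to the helper in the diff; injecting it mainly makes the timeout and error paths testable without a network.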