FEAT: Code completions #1476

Status: Open. Wants to merge 63 commits into base branch main.

Commits:
4dd5159
add the types for code completions
mikeshi80 May 8, 2024
2d893b3
add prompt style definition for code completions.
mikeshi80 May 8, 2024
81a77cb
add code completion mixin.
mikeshi80 May 8, 2024
b3c0e3a
refactor the code prompt style and add the unit test for code prompt …
mikeshi80 May 9, 2024
30676f0
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 10, 2024
ec8c815
correct path to name function to work on windows file system. Add som…
mikeshi80 May 10, 2024
985a4d7
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 10, 2024
fa81cf8
Merge branch 'code_completions' of github.com:mikeshi80/xinference in…
mikeshi80 May 10, 2024
38200a3
make the code completion process works
mikeshi80 May 10, 2024
3f1a160
Merge branch 'refs/heads/main' into code_completions
mikeshi80 May 10, 2024
e352804
add vllm inference engine support for `deepseek-coder-base` code model.
mikeshi80 May 10, 2024
2093b65
Show code icon for code models in model cards.
mikeshi80 May 10, 2024
f353d35
add client test for code completions.
mikeshi80 May 10, 2024
0e449e0
check whether the code_prompt_style is None.
mikeshi80 May 10, 2024
4bbbb15
format the code by prettier.
mikeshi80 May 10, 2024
62c8dd9
add the function to generate prompt for code completion.
mikeshi80 May 11, 2024
d388303
fix the bug that cannot get generated prompt correctly.
mikeshi80 May 11, 2024
51390bd
fix the bug that cannot get generated prompt correctly.
mikeshi80 May 11, 2024
ec28dec
adjust the test result to make unit test pass.
mikeshi80 May 13, 2024
16db4a7
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 13, 2024
3a0840c
Merge remote-tracking branch 'origin/code_completions' into code_comp…
mikeshi80 May 13, 2024
10d90b0
basically finished code generating web client.
mikeshi80 May 14, 2024
6215454
added the repo_name and file_path for prompt file support, and add th…
mikeshi80 May 14, 2024
5819674
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 14, 2024
514e862
ignore the codespell since there are a lot of file extensions that ar…
mikeshi80 May 14, 2024
2a091aa
format ui code by prettier.
mikeshi80 May 14, 2024
c0925e5
fix the get_code_prompt missing parameter.
mikeshi80 May 14, 2024
e518e74
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 16, 2024
be8a862
Merge branch 'refs/heads/main' into code_completions
mikeshi80 May 20, 2024
f7d02fb
adapt to langchain 0.2.x, which has breaking changes
mikeshi80 May 20, 2024
85f52d7
Merge branch 'refs/heads/adapt-to-langchain-0.2.x' into code_completions
mikeshi80 May 20, 2024
6c42c71
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 23, 2024
12dc09a
add base suffix for codeqwen1.5 to diff with the official generate model
mikeshi80 May 24, 2024
4dbd3bb
add vllm support for codeqwen1.5-base
mikeshi80 May 24, 2024
02f7536
merge llm_family definition.
mikeshi80 May 28, 2024
5e3e749
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 May 28, 2024
ecbf893
modified the model names to use latest model name in definition.
mikeshi80 May 28, 2024
71843c1
modified the model names to use latest model name in definition.
mikeshi80 May 28, 2024
aa9d4bd
merge with main
mikeshi80 Jun 9, 2024
72fd615
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 Jun 13, 2024
d8e0259
add model_hub for model definition in llm_family_modelscope.json.
mikeshi80 Jun 13, 2024
e731632
merge with main
mikeshi80 Jun 17, 2024
8b4abab
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 Jun 18, 2024
b931d9d
Merge branch 'refs/heads/main' into code_completions
mikeshi80 Jun 25, 2024
4f2f8c0
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 Jun 27, 2024
df53445
Merge branch 'refs/heads/main' into code_completions
mikeshi80 Jul 1, 2024
bff135b
Merge branch 'refs/heads/main' into code_completions
mikeshi80 Jul 10, 2024
06ead5b
add the missing import module
mikeshi80 Jul 10, 2024
1901603
format the frontend code.
mikeshi80 Jul 10, 2024
b71e3b6
fix the wrong usage of fetchWrapper
mikeshi80 Jul 10, 2024
3869b30
fix wrong code_prompts get logic
mikeshi80 Jul 10, 2024
810eff3
Merge branch 'refs/heads/main' into code_completions
mikeshi80 Jul 18, 2024
efa6c47
format the frontend code
mikeshi80 Jul 18, 2024
7ae7c77
to use the right call wrapper method.
mikeshi80 Jul 18, 2024
533fcfc
to use the right call wrapper method, again
mikeshi80 Jul 18, 2024
aa724ac
reversed code to use parse_obj
mikeshi80 Jul 18, 2024
569aa23
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 Jul 31, 2024
5a31312
remove vllm disable setting check.
mikeshi80 Jul 31, 2024
28e7e3f
Merge branch 'xorbitsai:main' into code_completions
mikeshi80 Aug 5, 2024
9d5930a
merge with main branch
mikeshi80 Aug 21, 2024
0ad126a
Merge remote-tracking branch 'origin/code_completions' into code_comp…
mikeshi80 Aug 21, 2024
c6948ca
change the format of starcoder from gglmv3 to ggufv2
mikeshi80 Aug 24, 2024
80a9d99
Merge branch 'main' into code_completions
mikeshi80 Aug 24, 2024
2 changes: 1 addition & 1 deletion setup.cfg
@@ -292,7 +292,7 @@ exclude =

[codespell]
ignore-words-list = hist,rcall,fpr,ser,nd,inout,ot,Ba,ba,asend,hart,coo,splitted,datas,fro
-skip = .idea,.git,./build,./docs/build,node_modules,static,generated,*.po,*.ts,*.json,*.c,*.cpp,*.cfg,thirdparty
+skip = .idea,.git,./build,./docs/build,node_modules,static,generated,*.po,*.ts,*.json,*.c,*.cpp,*.cfg,thirdparty,xinference/model/llm/lang_utils.py

[isort]
profile = black
1 change: 1 addition & 0 deletions setup.py
@@ -73,6 +73,7 @@ class CustomDevelop(ExtraCommandMixin, develop):
class CustomSDist(ExtraCommandMixin, sdist):
    pass


class BuildWeb(Command):
"""build_web command"""

152 changes: 152 additions & 0 deletions xinference/api/restful_api.py
@@ -62,6 +62,7 @@
    ChatCompletionMessage,
    Completion,
    CreateChatCompletion,
    CreateCodeCompletion,
    CreateCompletion,
    ImageList,
    PeftModelConfig,
@@ -158,6 +159,8 @@ class BuildGradioInterfaceRequest(BaseModel):
    model_ability: List[str]
    model_description: str
    model_lang: List[str]
    infill_supported: Optional[bool]
    repo_level_supported: Optional[bool]


class BuildGradioImageInterfaceRequest(BaseModel):
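The two new request fields are plain optional booleans, so existing callers that omit them keep working. A minimal sketch of that behavior, using a hypothetical cut-down version of the model above (explicit `= None` defaults are added here so the sketch behaves the same on pydantic v1 and v2):

```python
from typing import List, Optional

from pydantic import BaseModel


# Hypothetical trimmed-down stand-in for BuildGradioInterfaceRequest.
class GradioInterfaceRequestSketch(BaseModel):
    model_lang: List[str]
    infill_supported: Optional[bool] = None
    repo_level_supported: Optional[bool] = None


# Clients that predate the code-completion feature simply omit the new fields;
# both then come through as None and the interface builder can treat the
# model as not supporting infill or repo-level completion.
req = GradioInterfaceRequestSketch(model_lang=["en"])
print(req.infill_supported)  # None
```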
@@ -258,6 +261,9 @@ async def internal_exception_handler(request: Request, exc: Exception):
        self._router.add_api_route(
            "/v1/models/prompts", self._get_builtin_prompts, methods=["GET"]
        )
        self._router.add_api_route(
            "/v1/models/code_prompts", self._get_builtin_code_prompts, methods=["GET"]
        )
        self._router.add_api_route(
            "/v1/models/families", self._get_builtin_families, methods=["GET"]
        )
@@ -554,6 +560,29 @@ async def internal_exception_handler(request: Request, exc: Exception):
            ),
        )

        self._router.add_api_route(
            "/v1/code/completions",
            self.create_code_completion,
            methods=["POST"],
            response_model=Completion,
            dependencies=(
                [Security(self._auth_service, scopes=["models:read"])]
                if self.is_authenticated()
                else None
            ),
        )

        self._router.add_api_route(
            "/v1/code/prompt",
            self.get_code_prompt,
            methods=["POST"],
            dependencies=(
                [Security(self._auth_service, scopes=["models:read"])]
                if self.is_authenticated()
                else None
            ),
        )

        # for custom models
        self._router.add_api_route(
            "/v1/model_registrations/{model_type}",
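Both new routes accept POSTed JSON matching CreateCodeCompletion. A hedged client-side sketch of assembling a body for /v1/code/completions (the field names mirror the handler further down in this diff; the helper itself is hypothetical and not part of this PR):

```python
import json


def build_code_completion_payload(model, prompt, mode="completion", suffix=None):
    """Hypothetical helper: assemble a JSON body for POST /v1/code/completions."""
    # Mirror the server-side validation in create_code_completion.
    if mode not in ("completion", "infill"):
        raise ValueError("mode must be one of 'completion' or 'infill'")
    # The handler forces stream=False, so set it explicitly on the client too.
    payload = {"model": model, "prompt": prompt, "mode": mode, "stream": False}
    if suffix is not None:
        # In infill mode the model fills the gap between prompt and suffix.
        payload["suffix"] = suffix
    return payload


body = json.dumps(
    build_code_completion_payload(
        "deepseek-coder-base", "def fib(n):\n    ", mode="infill", suffix="    return a"
    )
)
```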
@@ -743,6 +772,18 @@ async def _get_builtin_prompts(self) -> JSONResponse:
            logger.error(e, exc_info=True)
            raise HTTPException(status_code=500, detail=str(e))

    async def _get_builtin_code_prompts(self) -> JSONResponse:
        """
        For internal usage
        :return:
        """
        try:
            data = await (await self._get_supervisor_ref()).get_builtin_code_prompts()
            return JSONResponse(content=data)
        except Exception as e:
            logger.error(e, exc_info=True)
            raise HTTPException(status_code=500, detail=str(e))

    async def _get_builtin_families(self) -> JSONResponse:
        """
        For internal usage
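The handlers in this file all follow the same shape: call through the supervisor ref, and map failures onto HTTP status codes (400 for bad input such as an unknown model_uid, 500 for everything else). A framework-free sketch of just that mapping (the helper is hypothetical; the status codes come from the except clauses in the handlers):

```python
def map_exception_to_status(exc: Exception) -> int:
    """Hypothetical helper mirroring the except clauses in this diff:
    ValueError (e.g. unknown model_uid) -> 400, anything else -> 500."""
    return 400 if isinstance(exc, ValueError) else 500


print(map_exception_to_status(ValueError("no such model")))  # 400
print(map_exception_to_status(RuntimeError("worker died")))  # 500
```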
@@ -1003,6 +1044,8 @@ async def build_gradio_interface(
                model_description=body.model_description,
                model_lang=body.model_lang,
                access_token=access_token,
                infill_supported=body.infill_supported,
                repo_level_supported=body.repo_level_supported,
            ).build()
            gr.mount_gradio_app(self._app, interface, f"/{model_uid}")
        except ValueError as ve:
@@ -1763,6 +1806,115 @@ async def stream_results():
                self.handle_request_limit_error(e)
                raise HTTPException(status_code=500, detail=str(e))

    async def create_code_completion(self, request: Request) -> Response:
        json_data = await request.json()

        if "mode" in json_data and json_data["mode"] not in ("completion", "infill"):
            raise HTTPException(
                status_code=400,
                detail="mode must be one of 'completion' or 'infill'",
            )

        if json_data.get("stream", False):
            json_data["stream"] = False

        body = CreateCodeCompletion.parse_obj(json_data)
        exclude = {
            "mode",
            "prompt",
            "file_path",
            "suffix",
            "repo_name",
            "files",
            "model",
            "n",
            "messages",
            "logit_bias",
            "logit_bias_type",
            "user",
        }

        kwargs = body.dict(exclude_unset=True, exclude=exclude)

        # TODO: Decide if this default value override is necessary #1061
        if body.max_tokens is None:
            kwargs["max_tokens"] = max_tokens_field.default

        if body.logit_bias is not None:
            raise HTTPException(status_code=501, detail="Not implemented")

        model_uid = body.model

        try:
            model = await (await self._get_supervisor_ref()).get_model(model_uid)
        except ValueError as ve:
            logger.error(str(ve), exc_info=True)
            await self._report_error_event(model_uid, str(ve))
            raise HTTPException(status_code=400, detail=str(ve))
        except Exception as e:
            logger.error(e, exc_info=True)
            await self._report_error_event(model_uid, str(e))
            raise HTTPException(status_code=500, detail=str(e))

        assert not body.stream

        try:
            data = await model.code_generate(
                body.mode,
                body.prompt,
                body.file_path,
                body.suffix,
                body.repo_name,
                body.files,
                kwargs,
            )
            return Response(content=data, media_type="application/json")
        except Exception as e:
            logger.error(e, exc_info=True)
            await self._report_error_event(model_uid, str(e))
            self.handle_request_limit_error(e)
            raise HTTPException(status_code=500, detail=str(e))

    async def get_code_prompt(self, request: Request) -> Response:
        json_data = await request.json()

        if "mode" in json_data and json_data["mode"] not in ("completion", "infill"):
            raise HTTPException(
                status_code=400,
                detail="mode must be one of 'completion' or 'infill'",
            )

        body = CreateCodeCompletion.parse_obj(json_data)

        model_uid = body.model

        try:
            model = await (await self._get_supervisor_ref()).get_model(model_uid)
        except ValueError as ve:
            logger.error(str(ve), exc_info=True)
            await self._report_error_event(model_uid, str(ve))
            raise HTTPException(status_code=400, detail=str(ve))
        except Exception as e:
            logger.error(e, exc_info=True)
            await self._report_error_event(model_uid, str(e))
            raise HTTPException(status_code=500, detail=str(e))

        try:
            code_prompt = await model.get_code_prompt(
                body.mode,
                body.prompt,
                body.file_path,
                body.suffix,
                body.repo_name,
                body.files,
            )
            return Response(content=code_prompt, media_type="application/json")
        except Exception as e:
            logger.error(e, exc_info=True)
            await self._report_error_event(model_uid, str(e))
            self.handle_request_limit_error(e)
            raise HTTPException(status_code=500, detail=str(e))

    async def query_engines_by_model_name(self, model_name: str) -> JSONResponse:
        try:
            content = await (
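create_code_completion forwards only the generation options the caller actually set: body.dict(exclude_unset=True, exclude=exclude) drops both the routing fields and anything left at its default. A small sketch of that filtering (the model class here is a hypothetical cut-down stand-in for CreateCodeCompletion, using the pydantic v1 .dict() API as the handler does):

```python
from typing import Optional

from pydantic import BaseModel


# Hypothetical cut-down stand-in for CreateCodeCompletion.
class CodeCompletionSketch(BaseModel):
    model: str
    prompt: str
    mode: str = "completion"
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None


body = CodeCompletionSketch(model="m", prompt="def f():", temperature=0.2)

# exclude_unset drops fields the caller never sent (mode, max_tokens);
# exclude drops fields that are routed separately instead of passed as kwargs.
kwargs = body.dict(exclude_unset=True, exclude={"model", "prompt", "mode"})
print(kwargs)  # {'temperature': 0.2}
```

Only temperature survives, which is why the handler must re-add max_tokens afterwards when the caller left it unset.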