Add Japanese language dependencies #1511

Open

wants to merge 51 commits into master

Commits (51)
4e49666
Added Japanese dependencies to mlserver-huggingface
jbauer2718 Dec 12, 2023
e6562e1
Added unit test for Japanese server
jbauer2718 Dec 12, 2023
4414ccf
linted
jbauer2718 Dec 12, 2023
399d83b
Addressed style comments; made dependencies optional
jbauer2718 Dec 14, 2023
854784f
Updated README.md and unit test
jbauer2718 Dec 15, 2023
c6f1a4f
Updated docs
jbauer2718 Dec 15, 2023
f75ee9e
Working test in docs
jbauer2718 Dec 15, 2023
936fadf
Generalized test to a list of cases
jbauer2718 Dec 16, 2023
5b3a258
updated docs
jbauer2718 Dec 16, 2023
c30f2ad
Fixed notebook cell outputs
jbauer2718 Dec 16, 2023
339527f
Fixed notebook cell outputs
jbauer2718 Dec 16, 2023
f44656d
Refactored test with pytest.pymark.parameterize. Used a smaller test …
jbauer2718 Dec 26, 2023
73c60a7
Finished an inline comment I forgot to fully type out
jbauer2718 Dec 26, 2023
2715c2b
Updated lock and docs
jbauer2718 Jan 16, 2024
2a9652b
Removed .bak.ipynb
jbauer2718 Jan 16, 2024
8992881
Updated README.ipynb to match README.md
jbauer2718 Jan 16, 2024
d1ae9b2
Synced with main branch and resolved conflicts
jbauer2718 Apr 12, 2024
e4596e0
rebuilt lock files with same version of poetry
jbauer2718 Apr 12, 2024
aa3ea7b
build(deps-dev): bump pytest-cases from 3.8.4 to 3.8.5 (#1691)
dependabot[bot] Apr 16, 2024
ed6ef38
build(deps): bump scikit-learn in /runtimes/sklearn (#1693)
dependabot[bot] Apr 16, 2024
40fcaf5
build(deps-dev): bump pytorch-lightning in /runtimes/mlflow (#1694)
dependabot[bot] Apr 16, 2024
6639484
build(deps): bump sqlparse from 0.4.4 to 0.5.0 in /runtimes/mlflow (#…
dependabot[bot] Apr 16, 2024
dc93eba
build(deps-dev): bump sqlparse from 0.4.4 to 0.5.0 (#1697)
dependabot[bot] Apr 16, 2024
c6a9c5d
build(deps): bump optimum from 1.17.1 to 1.18.1 in /runtimes/huggingf…
dependabot[bot] Apr 16, 2024
28e4e75
build(deps): bump joblib from 1.3.2 to 1.4.0 in /runtimes/sklearn (#1…
dependabot[bot] Apr 16, 2024
c9ed9ba
build(deps): bump pandas from 2.2.1 to 2.2.2 in /runtimes/lightgbm (#…
dependabot[bot] Apr 16, 2024
b1e7c14
Fixed merge conflict in README
jbauer2718 Apr 16, 2024
7f21db5
Fixed merge conflict in lock
jbauer2718 Apr 16, 2024
6df3dbe
Rebased and fixed lock
jbauer2718 Apr 16, 2024
32d16e8
Added unit test for Japanese server
jbauer2718 Dec 12, 2023
f6a3901
linted
jbauer2718 Dec 12, 2023
47bd2da
Addressed style comments; made dependencies optional
jbauer2718 Dec 14, 2023
188b369
Updated README.md and unit test
jbauer2718 Dec 15, 2023
7fa5d0c
Updated docs
jbauer2718 Dec 15, 2023
d755990
Working test in docs
jbauer2718 Dec 15, 2023
bf4d412
Generalized test to a list of cases
jbauer2718 Dec 16, 2023
6174e00
updated docs
jbauer2718 Dec 16, 2023
5f8c8e3
Fixed notebook cell outputs
jbauer2718 Dec 16, 2023
3fa63e0
Fixed notebook cell outputs
jbauer2718 Dec 16, 2023
aafa4d1
Refactored test with pytest.pymark.parameterize. Used a smaller test …
jbauer2718 Dec 26, 2023
549f244
Finished an inline comment I forgot to fully type out
jbauer2718 Dec 26, 2023
0d61961
Updated lock and docs
jbauer2718 Jan 16, 2024
827329f
Removed .bak.ipynb
jbauer2718 Jan 16, 2024
c9372a5
Updated README.ipynb to match README.md
jbauer2718 Jan 16, 2024
e137c05
rebuilt lock files with same version of poetry
jbauer2718 Apr 12, 2024
d9211bf
Fixed merge conflict in README
jbauer2718 Apr 16, 2024
4a9c1c6
Fixed merge conflict in lock
jbauer2718 Apr 16, 2024
4c63c86
Fixed merge conflicts
jbauer2718 Apr 16, 2024
85bddf1
Update README.md
jbauer2718 Apr 16, 2024
616d3a9
removed files for bert-japanese
jbauer2718 Apr 16, 2024
cf4ec15
Removed model setting
jbauer2718 Apr 16, 2024
100 changes: 90 additions & 10 deletions docs/examples/huggingface/README.ipynb
@@ -23,6 +23,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install requests\n",
"# Import required dependencies\n",
"import requests"
]
@@ -437,6 +438,93 @@
").json()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f1d4b24a-4c09-4743-a086-6f8b143711ad",
"metadata": {},
"source": [
"### Masked Language Modeling (Optional Japanese Language Example)\n",
"\n",
"We can also serve a masked language model. In the following example, we also build the `huggingface` runtime with the `-E japanese` flag to enable support for Japanese tokenizers. For example, after running the normal project build from the root directory with `make install-dev`, we can install the optional Japanese dependencies in dev mode:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "63e368ee-a5ef-44b7-aab8-cafd30ab227a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting ./model-settings.json\n"
]
}
],
"source": [
"%%writefile ./model-settings.json\n",
"{\n",
" \"name\": \"transformer\",\n",
" \"implementation\": \"mlserver_huggingface.runtime.HuggingFaceRuntime\",\n",
" \"parameters\": {\n",
" \"extra\": {\n",
" \"task\": \"fill-mask\",\n",
" \"pretrained_model\": \"cl-tohoku/bert-base-japanese\",\n",
" \"pretrained_tokenizer\": \"cl-tohoku/bert-base-japanese\"\n",
" }\n",
" }"
]
},
{
"cell_type": "markdown",
"id": "72b05244-f52f-41c0-a1bc-a5d11d1c75a0",
"metadata": {},
"source": [
"Using the shell to start mlserver like so,\n",
"\n",
"```shell\n",
"mlserver start .\n",
"```\n",
"we can pass inferences like this. Note the `[MASK]` token. The mask token can be different for different models, so check the HuggingFace model config for special tokens."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b619c852-f35d-4506-91e1-0a1e3bcb7b8b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'実際 に 空 が 見える の か?'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from mlserver_huggingface.codecs import HuggingfaceRequestCodec\n",
"import json\n",
"\n",
"# Test sentence: Is the sky really [MASK]?\n",
"test_sentence = \"実際に空が[MASK]のか?\"\n",
"# [MASK] = visible\n",
"expected_output = \"見える\"\n",
"\n",
"inference_request = HuggingfaceRequestCodec.encode_request(\n",
" {\"inputs\": [test_sentence]},\n",
" use_bytes=False,\n",
")\n",
"json.dumps(inference_request.dict())\n",
"response = requests.post(\"http://localhost:8080/v2/models/transformer/infer\", json=inference_request.dict()).json()\n",
"json.loads(response['outputs'][0]['data'][0])[\"sequence\"]"
]
},
{
"cell_type": "markdown",
"id": "fe6655d9",
@@ -714,19 +802,11 @@
" report \\\n",
" -type=text"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ddcb458",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -740,7 +820,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.8"
"version": "3.9.5"
}
},
"nbformat": 4,
62 changes: 54 additions & 8 deletions docs/examples/huggingface/README.md
@@ -259,21 +259,67 @@
requests.post(
"http://localhost:8080/v2/models/transformer/infer", json=inference_request
).json()
```
```
{'model_name': 'transformer',
'id': '835eabbd-daeb-4423-a64f-a7c4d7c60a9b',
'parameters': {},
'outputs': [{'name': 'output',
'shape': [1, 1],
'datatype': 'BYTES',
'parameters': {'content_type': 'hg_jsonlist'},
'data': ['{"label": "NEGATIVE", "score": 0.9996137022972107}']}]}
```
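
Because the `hg_jsonlist` content type returns each prediction as a JSON-encoded string inside `data`, the payload needs one extra decoding step. A minimal sketch, assuming the `requests.post(...)` call above is assigned to a `response` variable:

```python
import json

# Each entry in "data" is a JSON string; decode the first (and only) prediction.
prediction = json.loads(response["outputs"][0]["data"][0])
print(prediction["label"], prediction["score"])  # e.g. NEGATIVE 0.9996...
```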

### Masked Language Modeling (Optional Japanese Language Example)

We can also serve a masked language model. This example requires the `huggingface` runtime to be installed with the optional `-E japanese` extra, which enables support for Japanese tokenizers. After running the normal project build from the root directory with `make install-dev`, we can install the optional Japanese dependencies in dev mode with

`poetry install -E japanese`

from the `./runtimes/huggingface` directory.
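
As a quick check that the extra took effect, the snippet below tries to import the Japanese tokenizer packages. The package names (`fugashi`, `unidic_lite`) are assumptions about what the `japanese` extra pulls in, so adjust them to match the actual `pyproject.toml`:

```python
import importlib

# Hypothetical package names for the optional Japanese tokenizer dependencies.
for package in ("fugashi", "unidic_lite"):
    try:
        importlib.import_module(package)
        print(f"{package}: available")
    except ImportError:
        print(f"{package}: missing -- re-run `poetry install -E japanese`")
```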

```python
%%writefile ./model-settings.json
{
"name": "model",
"implementation": "mlserver_huggingface.runtime.HuggingFaceRuntime",
"parameters": {
"extra": {
"task": "fill-mask",
"pretrained_model": "cl-tohoku/bert-base-japanese",
"pretrained_tokenizer": "cl-tohoku/bert-base-japanese"
}
}
}
```
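
Before starting the server, we can optionally sanity-check the model and tokenizer outside MLServer. This is only a sketch using the `transformers` pipeline API directly, and it assumes the optional Japanese dependencies above are installed:

```python
from transformers import pipeline

# Load the same fill-mask model and tokenizer referenced in model-settings.json.
fill_mask = pipeline(
    "fill-mask",
    model="cl-tohoku/bert-base-japanese",
    tokenizer="cl-tohoku/bert-base-japanese",
)

# Print the highest-scoring completion for the masked sentence.
print(fill_mask("実際に空が[MASK]のか?")[0]["sequence"])
```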
Using the shell, start MLServer like so:
```shell
mlserver start .
```
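
The model can take a little while to download and load, so it helps to poll the readiness endpoint before sending requests. The sketch below assumes MLServer's default HTTP port (8080) and the standard V2 readiness route for the `transformer` model:

```python
import time

import requests


def wait_until_ready(url="http://localhost:8080/v2/models/transformer/ready", timeout=60.0):
    """Poll the readiness endpoint until it returns 200 or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url).status_code == 200:
                return True
        except requests.ConnectionError:
            pass  # server not accepting connections yet
        time.sleep(1)
    return False


print("model ready:", wait_until_ready())
```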
We can then send an inference request like the one below. Note the `[MASK]` token: the mask token differs between models, so check the Hugging Face model config for its special tokens.
```python
from mlserver_huggingface.codecs import HuggingfaceRequestCodec
import json
import requests

# Test sentence: Is the sky really [MASK]?
test_sentence = "実際に空が[MASK]のか?"
# [MASK] = visible
expected_output = "見える"

inference_request = HuggingfaceRequestCodec.encode_request(
{"inputs": [test_sentence]},
use_bytes=False,
)
# Send the request to the running MLServer instance and decode the top prediction
response = requests.post(
    "http://localhost:8080/v2/models/transformer/infer", json=inference_request.dict()
).json()
json.loads(response["outputs"][0]["data"][0])["sequence"]
```
```
Response:
{'model_name': 'transformer', 'id': '9e966d8d-b43d-4ab4-8d47-90e367196233', 'parameters': {}, 'outputs': [{'name': 'output', 'shape': [5, 1], 'datatype': 'BYTES', 'parameters': {'content_type': 'hg_jsonlist'}, 'data': ['{"score": 0.3277095854282379, "token": 11819, "token_str": "\\u3042\\u308b", "sequence": "\\u5b9f\\u969b \\u306b \\u7a7a \\u304c \\u3042\\u308b \\u306e \\u304b?"}', '{"score": 0.10271108895540237, "token": 14656, "token_str": "\\u898b\\u3048\\u308b", "sequence": "\\u5b9f\\u969b \\u306b \\u7a7a \\u304c \\u898b\\u3048\\u308b \\u306e \\u304b?"}', '{"score": 0.08325661718845367, "token": 11835, "token_str": "\\u306a\\u3044", "sequence": "\\u5b9f\\u969b \\u306b \\u7a7a \\u304c \\u306a\\u3044 \\u306e \\u304b?"}', '{"score": 0.036131054162979126, "token": 18413, "token_str": "\\u6b63\\u3057\\u3044", "sequence": "\\u5b9f\\u969b \\u306b \\u7a7a \\u304c \\u6b63\\u3057\\u3044 \\u306e \\u304b?"}', '{"score": 0.029351236298680305, "token": 11820, "token_str": "\\u3044\\u308b", "sequence": "\\u5b9f\\u969b \\u306b \\u7a7a \\u304c \\u3044\\u308b \\u306e \\u304b?"}']}]}
Data:
{'score': 0.3277095854282379, 'token': 11819, 'token_str': 'ある', 'sequence': '実際 に 空 が ある の か?'}
```
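
To inspect more than the top sequence, we can decode the full candidate list from the same response. A short sketch reusing the `response`, `expected_output`, and `json` names defined above:

```python
# Decode every candidate returned for the [MASK] position and rank them by score.
candidates = [json.loads(item) for item in response["outputs"][0]["data"]]
for candidate in sorted(candidates, key=lambda c: c["score"], reverse=True):
    print(f'{candidate["token_str"]}\t{candidate["score"]:.4f}\t{candidate["sequence"]}')

# Check whether the expected completion appears among the candidates.
print(expected_output in {c["token_str"] for c in candidates})
```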
## GPU Acceleration

We can also evaluate GPU acceleration by comparing inference speed on CPU vs. GPU using the following parameters