Add tutorial and README badges #18

Merged
merged 11 commits into from
Nov 25, 2024

SagiPolaczek
Contributor

No description provided.

"1. Data Module - A Lightning data module class where we load and process the data for the task.\n",
"2. `data_preprocessing()` function. Which responsible for formatting the input prompt.\n",
"3. `process_model_output()` function. Which takes the raw output of the model and translate it into a human-level answers."
"## The big picture\n",
Collaborator

"The Big Picture" is typically written with a capital B and P.
See https://www.merriam-webster.com/dictionary/big%20picture

Collaborator

@matanninio matanninio left a comment

LGTM. Waiting for the final PR to approve.

"## The big picture\n",
"In order to implement a downstream task for the MAMMAL framework, one should implement a `MammalTask` class instance ([source](https://github.com/BiomedSciAI/biomed-multi-alignment/blob/e56a03e0e9f69e42f919a96def739b78e50a47e5/mammal/task.py#L15)). A `MammalTask` class consists of three main components:\n",
"1. Data Module - A Lightning data module (`LightningDataModule`) where we load and process the data of the task.\n",
"2. `data_preprocessing()` method. Which responsible for formatting the input prompt to the model's expected format.\n",
Collaborator

This should mention that the data preprocessing includes translating known entities into the lexicon/standard the model was trained with.
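
To make the suggestion concrete, here is a minimal sketch of such a translation step. It is an illustration only: the three-letter-to-one-letter amino acid mapping is the standard one, but the sample layout (`sample["residues"]`), the helper `to_model_lexicon()`, and the `<PROTEIN>...<END>` prompt template are hypothetical placeholders, not the MAMMAL prompt format or API.

```python
# Standard three-letter -> one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_model_lexicon(residues):
    """Translate three-letter residue names into one-letter codes.

    Raises KeyError on an unknown residue, so unmapped entities are
    caught early instead of being passed silently to the model.
    """
    return "".join(THREE_TO_ONE[name] for name in residues)


def data_preprocessing(sample):
    """Hypothetical stand-in for a task's data_preprocessing() step."""
    sequence = to_model_lexicon(sample["residues"])
    # Placeholder template: NOT the model's real prompt format.
    prompt = f"<PROTEIN>{sequence}<END>"
    # Return a new dict so the raw sample is left untouched.
    return {**sample, "prompt": prompt}
```

Failing loudly on an unmapped entity is a design choice in this sketch; a real task might instead map unknown entities to a dedicated token.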

@@ -412,7 +413,12 @@
{
"name": "stdout",
"output_type": "stream",
"text": []
"text": [
"The OrderedVocab you are attempting to save contains holes for indices [314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500], your vocabulary could be corrupted !\n",
Contributor

delete

"name": "stderr",
"output_type": "stream",
"text": [
"\n",
Contributor

delete

"name": "stderr",
"output_type": "stream",
"text": [
"/dccstor/mm_hcls/usr/sagi/envs/mammal_env/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.\n",
Contributor

delete

"evalue": "name 'exit' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
Contributor

delete

Contributor

@mosheraboh mosheraboh left a comment

Looks good.
Please delete some of the output in the notebooks.
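
One possible way to do this programmatically, sketched under assumptions: the notebook is nbformat v4 JSON, and "some of the output" means the stderr streams and error tracebacks flagged above. The function name `drop_noisy_outputs()` and the file name in the usage comment are hypothetical. Other outputs, such as the flagged stdout warning, would still need to be removed by hand.

```python
import json


def drop_noisy_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook with stderr streams and error outputs removed.

    Assumes nbformat v4 JSON, where each code cell has an "outputs" list and
    each output has an "output_type" ("stream", "error", ...).
    """
    # Round-trip through JSON as a cheap deep copy of JSON-compatible data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        cell["outputs"] = [
            out
            for out in cell.get("outputs", [])
            if out.get("output_type") != "error"
            and not (
                out.get("output_type") == "stream"
                and out.get("name") == "stderr"
            )
        ]
    return cleaned


# Usage sketch (hypothetical file name):
#   with open("tutorial.ipynb") as f:
#       nb = json.load(f)
#   with open("tutorial.ipynb", "w") as f:
#       json.dump(drop_noisy_outputs(nb), f, indent=1)
```

To clear every output instead, `jupyter nbconvert --clear-output --inplace <notebook>` is a simpler route.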

Contributor

@mosheraboh mosheraboh left a comment

LGTM

@SagiPolaczek SagiPolaczek merged commit 5bb1204 into main Nov 25, 2024
2 checks passed
3 participants