
Fixing bug 857 (regression from 0.1.14 to 0.1.15) #858

Open
wants to merge 8 commits into main

Conversation

FoxBuchele

Fixing regression issue in 0.1.15 where capturing text would cause the LM to slice it incorrectly due to stop and start tokens.
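For reference, a minimal sketch of the kind of code path affected (the model name and prompt here are assumptions, not taken from the original report in #857):

from guidance import models, system, user, assistant, gen

# Any chat model that wraps generations in role blocks; the model name is an assumption.
lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")

with system():
    lm += "Answer with a single word."
with user():
    lm += "What is the capital of France?"
with assistant():
    lm += gen("answer", max_tokens=5)

# On 0.1.14 this returns only the generated word; under the 0.1.15 regression the
# captured slice can be offset by the role start/stop tokens.
print(lm["answer"])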

@FoxBuchele (Author)

Converted to draft because I noticed some errors when writing tests (specifically, guidance functions within f-strings were not converted properly).

Looking into fixing it the 'right' way, and will re-open the pull request then, unless someone else gets to it first.

FoxBuchele marked this pull request as ready for review on May 25, 2024 at 20:10.
@riedgar-ms (Collaborator)

Could you add a test which fails without your fix?

@codecov-commenter commented May 28, 2024

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.25%. Comparing base (4ae2ea6) to head (cf2bf6f).
Report is 1 commit behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #858      +/-   ##
==========================================
+ Coverage   56.43%   61.25%   +4.81%     
==========================================
  Files          63       63              
  Lines        4791     4798       +7     
==========================================
+ Hits         2704     2939     +235     
+ Misses       2087     1859     -228     

☔ View full report in Codecov by Sentry.

@FoxBuchele (Author)

I can definitely try!

The fix is only relevant/used in models that use role blocks (with system(), with assistant(), etc) that also have closing tags whenever the role is completed, so I'll have to dig into the test suite to determine how to write a test properly that only runs on some types of LMs.
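As a rough illustration of what those closing tags mean for a capture (ChatML-style tags are used here purely as an example; actual templates vary by model):

# Role blocks surround the generation with opening and closing tags.
opening = "<|im_start|>assistant\n"
generated = "Paris"            # what gen() actually produced
closing = "<|im_end|>"         # added when the role block is closed

full_text = opening + generated + closing
# If the capture is sliced out of full_text without accounting for these tags,
# it can come back shifted, or with the closing tag still attached.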

@riedgar-ms (Collaborator)

> I can definitely try!
>
> The fix is only relevant/used in models that use role blocks (with system(), with assistant(), etc) that also have closing tags whenever the role is completed, so I'll have to dig into the test suite to determine how to write a test properly that only runs on some types of LMs.

Look at the selected_model_name and selected_model fixtures for that. You can see one approach here:

def llamacpp_model(selected_model, selected_model_name):

@FoxBuchele (Author)

> I can definitely try!
> The fix is only relevant/used in models that use role blocks (with system(), with assistant(), etc) that also have closing tags whenever the role is completed, so I'll have to dig into the test suite to determine how to write a test properly that only runs on some types of LMs.
>
> Look at the selected_model_name and selected_model fixtures for that. You can see one approach here:
>
> def llamacpp_model(selected_model, selected_model_name):

Thanks!! That was very helpful.

I added a single basic test that ensures if you capture text within a role block, the text captured is sliced from the model at the appropriate location.


from ..utils import get_model


@pytest.fixture(scope="module")
def instruct_model(selected_model, selected_model_name):
    if selected_model_name in ["transformers_phi3cpu_mini_4k_instruct"]:
Collaborator


Is this the only model for which the test works? I thought that some of the others supported the role tags? Perhaps move the fixture to conf.py and call it something like model_with_role_tags?

Author


Ah, it appears so! It looks like there are both transformers and non-transformers versions of Phi-3 mini instruct 4k, as well as a couple of other instruct versions I recognized.

Good idea to move it to the conftest.py file, though; I can definitely see situations where we could use more tests specifically for models that utilize roles.
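A sketch of how the fixture and test could look once moved to conftest.py, using the model_with_role_tags name suggested above (the model list and the final assertion are assumptions, not the PR's actual code):

# conftest.py (sketch; assumes the existing selected_model / selected_model_name fixtures)
import pytest

ROLE_TAG_MODELS = [
    "transformers_phi3cpu_mini_4k_instruct",
    # ...plus whichever other instruct/chat models emit role tags
]

@pytest.fixture(scope="module")
def model_with_role_tags(selected_model, selected_model_name):
    if selected_model_name in ROLE_TAG_MODELS:
        return selected_model
    pytest.skip(f"{selected_model_name} does not use role tags")

# test sketch: a capture inside a role block should contain only the generated text
from guidance import assistant, gen

def test_capture_inside_role_block(model_with_role_tags):
    lm = model_with_role_tags
    with assistant():
        lm += gen("captured", max_tokens=10)
    assert "<|" not in lm["captured"]  # illustrative check: no role/format tags leak in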

@riedgar-ms (Collaborator)

@FoxBuchele there seems to be a test failure related to this?

@@ -1,6 +1,9 @@
from .._guidance import guidance
from .._grammar import capture as grammar_capture, GrammarFunction

# Adapted from active_role_end in _model.py, functionality should be shared probably?
import re
format_pattern = re.compile(r"<\|\|_.*?_\|\|>", flags=re.DOTALL)
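For a quick sense of what that pattern strips, a standalone example (the tag contents here are made up; these markers are guidance-internal formatting tags rather than model output):

import re

format_pattern = re.compile(r"<\|\|_.*?_\|\|>", flags=re.DOTALL)

raw_capture = "The answer is Paris<||_html:</span>_||>"
clean = format_pattern.sub("", raw_capture)
print(clean)  # -> The answer is Paris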
Collaborator


Slightly concerned that this appears to be relying on ChatML tags, which not all models use.
