BiLSTM-CRF NERLearner - Skipping batch of size=1 #18

Open
gohjiayi opened this issue Mar 13, 2022 · 1 comment
gohjiayi commented Mar 13, 2022

Hi there, I've been trying to better understand the BiLSTM-CRF NER model, more specifically the NERLearner class in bilstm_crf_ner/model/ner_learner.py.

To run the NER model, I ran the generate_data and build_data scripts, then moved on to the train and test scripts. However, I noticed that during training (and also when running test.py), the line 'Skipping batch of size=1' is logged many times due to the following snippet of code (present in both the train and test functions).

https://github.com/smitkiri/ehr-relation-extraction/blob/master/bilstm_crf_ner/model/ner_learner.py#L220-L222

# Skip any batch that contains only a single example
if inputs['word_ids'].shape[0] == 1:
    self.logger.info('Skipping batch of size=1')
    continue

Every item in my training and evaluation sets is caught by this if-statement and never reaches the rest of the loop. I tried removing this block for evaluation and the model could produce some prediction output, but with poor accuracy; I suspect the same skipping is hurting model performance during training.

UPDATE: I realised this was because I had set batch size = 1, which is not suited to this model. My two questions below still remain!

Can I check what this code is for, and whether removing it for training and evaluation would be okay?
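If the guard exists only to avoid degenerate size-1 batches, a common alternative to deleting the check is to drop the trailing undersized batch at the data-loading level instead (this mirrors what PyTorch's DataLoader does with drop_last=True). A minimal pure-Python sketch of that batching logic, purely for illustration and not the repository's code:

```python
def make_batches(samples, batch_size, drop_last=True):
    """Group samples into fixed-size batches; optionally drop a trailing
    batch that is smaller than batch_size (e.g. a size-1 remainder)."""
    batches = [samples[i:i + batch_size]
               for i in range(0, len(samples), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches

# 10 samples with batch_size=3 leave a remainder batch of size 1,
# which drop_last discards before it ever reaches the training loop.
print([len(b) for b in make_batches(list(range(10)), batch_size=3)])  # -> [3, 3, 3]
```

With this approach the size-1 guard in the training loop never fires, because no size-1 batch is produced in the first place.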

Another question: how did you derive the results shown in the BiLSTM-CRF README? Is there a specific script you ran to produce them?

@gohjiayi gohjiayi changed the title BiLSTM-NERLearner Train and Test, Skipping batch of size=1 BiLSTM-CRF NERLearner - Skipping batch of size=1 Mar 13, 2022
@smitkiri (Owner)

Hey! I haven't worked with the BiLSTM-CRF NER code much, but I'll try to figure this out over the weekend. From a quick glance, I think it might have something to do with how the CRF works: it might require a sequence length > 1 to compute a score. I may be wrong here, but let me know if this makes sense. I got there from this line computing the loss.
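For context on why a CRF score involves more than one position: a linear-chain CRF scores a tag sequence as per-step emission scores plus transition scores between consecutive tags, so with a single timestep the transition term is an empty sum. A toy pure-Python illustration (not the repository's implementation, and unnormalized, so it omits the partition function):

```python
def crf_sequence_score(emissions, transitions, tags):
    """Unnormalized linear-chain CRF score for one tag sequence:
    emissions[t][tag] scored at each step, plus transitions[a][b]
    for every consecutive tag pair (a, b). A length-1 sequence
    contributes no transition terms at all."""
    emission_score = sum(emissions[t][tag] for t, tag in enumerate(tags))
    transition_score = sum(transitions[tags[t]][tags[t + 1]]
                           for t in range(len(tags) - 1))
    return emission_score + transition_score

# Two timesteps, two tags: emissions[t][tag], transitions[from][to]
emissions = [[1, 2], [0, 3]]
transitions = [[0, 1], [2, 0]]
print(crf_sequence_score(emissions, transitions, [0, 1]))  # -> 5 (1 + 3 + 1)
print(crf_sequence_score([[1, 2]], transitions, [1]))      # -> 2 (emission only)
```

Whether that is actually what motivates the size-1 guard in this repo is an open question here; the guard checks the batch dimension (shape[0]) rather than sequence length, so it may equally be protecting some other batch-sensitive component.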
