
Can't reconstruct text #2

Open
Yumeka999 opened this issue Dec 29, 2020 · 1 comment

@Yumeka999

When I use run_union.py to do the predict-tokens task.

The input texts are:
1
An alien race encounters the most terrifying predator imaginable . A lone , unarmed human . The last time they 'd come ... they 'd arrived on the planet , expecting the worst . But when they arrived , they 'd been silent for a few days , and just as quickly they were gone . I mean , no one knows where they came from , or who they came from . However , in the end , they 'd never come . They 'd been at the top of the world , watching us from miles away . They were all on the planet , but we could see them , and we could see them , that 's what we wanted . We went over to the surface to find the planet we could live in , and they appeared , in the center of the planet . They were humanoid , with a large mass of black , but they looked like us . They were humanoid and seemed to be humanoid , except for at their appearance . They wore strange masks and wore strange helmets . They were humanoid , with strange attire , and wore strange garments .

2
It 's surprising that the most important person in the world has so little security . '' Said the assassin to his target . I am here , there is no better security . '' Was the casual reply . `` I 'm not a security threat , I 'm a tool , I am a tool . '' I had always wanted to be a tool , and I had always wanted to be someone else . I was the only one who was n't a tool . I was the only one who could tell me what was going on . I was the only one who could prevent the death of those who I knew would kill me . I was the only one who could stop the murder of the most important people in the world to avoid a nuclear war . I was the only one who could stop the deaths of the world 's most important people .

3
Write your heart onto your sleeve , Reddit . You 've got ta understand , I 'm a bit surprised . The stress is off , and I 'm a little worried about how much I 'm getting . I 'm a bit nervous , and I 'm not like you , I just want to tell you how much I love you . I 'm just so upset , I 'm really , really good at it , but here 's my feelings , and I love you so much . I 've been to many different schools , to be specific . I know you 're not real , and I know you 're not real . I know you 're not real , but you do n't know me . So I 'm going to let you know . I know that feeling , that ... that I 'm feeling right now . It 's kind of like that , you know ? It 's like that 's the first thing I 've ever felt .

but the predicted tokens are:
1
. [SEP] . . . . . . . . [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] . . . . . . [SEP] . . . . [SEP] . [SEP] . . . [SEP] . . [SEP] . . [SEP] [SEP] . . [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . [SEP] . . . [SEP] [SEP] . [SEP] . [SEP] . [SEP] . . [SEP] [SEP] . [SEP] . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] . . . . . [SEP] [SEP] . [SEP] . . . . [SEP] . . . [SEP] . . . [SEP] . . . . . . . . [SEP] . . [SEP] . . [SEP] . [SEP] . [SEP] [SEP] . . [SEP] . . . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] . . [SEP] [SEP] . . [SEP] . [SEP] . . [SEP] . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] . . [SEP] . . . . . . . . . [SEP] [SEP] . [SEP] . [SEP] . [SEP] . [SEP] . . [SEP] . [SEP] [SEP] . .

2
. [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] . [SEP] . [SEP] . . [SEP] . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . . . . [SEP] . . . . . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] . [SEP] . . [SEP] . [SEP] . . . . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] [SEP] . [SEP] . [SEP] [SEP] . . [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . . . [SEP] [SEP] [SEP] [SEP] [SEP] . . . . . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . . [SEP] . . . . . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] . . . [SEP] . [SEP] [SEP] . [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . . [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] . . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP]

3
[SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] . . [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . . [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] . . . [SEP] [SEP] . [SEP] . . . [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] . . . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] . [SEP] . . . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . [SEP] . . [SEP] . . [SEP] . [SEP] . . [SEP] [SEP] . [SEP] . . . [SEP] . . [SEP] . . . . [SEP] [SEP] . . [SEP] . [SEP] . . [SEP] . [SEP] [SEP] . . . . . . [SEP] [SEP] . . . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] . [SEP] . . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . . [SEP]

and these tokens are wrong.

@JianGuanTHU
Member

Thank you for your question!
Although the key motivation of the reconstruction task is to provide more specific supervision signals for recognizing errors, UNION can still produce meaningful editing results for unreasonable stories. We observe that UNION can correct lexical errors: for example, given the story "he played to play chess", UNION changed "played" to "chose". However, since UNION adopts a non-autoregressive generative framework (Equation 4), it is difficult for it to generate a grammatical story when the input has sentence-level errors, although it can still accurately recognize those errors. For example, given the repetitive story "we had a great time. we had a great time.", it generated "we had a great time. we . .". In the future, we plan to improve the design by aligning the input and output tokens and then auto-tagging them with editing operations when training with the reconstruction task.
