I gave the following input to clean_text_by_sentences:
from summa.preprocessing.textcleaner import clean_text_by_sentences as _clean_text_by_sentences
text='''Ad sales boost Time Warner profit
Quarterly profits at US media giant TimeWarner jumped 76% to $1.13bn (£600m) for the three months to December, from $639m year-earlier.The firm, which is now one of the biggest investors in Google, benefited from sales of high-speed internet connections and higher advert sales. TimeWarner said fourth quarter sales rose 2% to $11.1bn from $10.9bn. Its profits were buoyed by one-off gains which offset a profit dip at Warner Bros, and less users for AOL.
'''
This is the output I received after preprocessing. As you can see, the second unit should have been split at the full stop in 'year-earlier.The firm', but it was not. Instead, the text appears to be split only at a newline (where the Enter key was pressed) or where a full stop is followed by a space.
[Original unit: 'Ad sales boost Time Warner profit' --- Processed unit: 'ad sale boost time warner profit',
Original unit: 'Quarterly profits at US media giant TimeWarner jumped 76% to $1.13bn (£600m) for the three months to December, from $639m year-earlier.The firm, which is now one of the biggest investors in Google, benefited from sales of high-speed internet connections and higher advert sales.' --- Processed unit: 'quarter profit media giant timewarn jump bn £m month decemb m year earlier firm biggest investor googl benefit sale high speed internet connect higher advert sale',
Original unit: 'TimeWarner said fourth quarter sales rose 2% to $11.1bn from $10.9bn.' --- Processed unit: 'timewarn said fourth quarter sale rose bn bn',
Original unit: 'Its profits were buoyed by one-off gains which offset a profit dip at Warner Bros, and less users for AOL.' --- Processed unit: 'profit buoy gain offset profit dip warner bros user aol']
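One possible workaround, sketched here with the standard library only, is to insert the missing space before the text reaches the sentence splitter. This assumes the splitter only breaks at a full stop that is followed by whitespace, which is what the output above suggests but is not confirmed from summa's source. The helper name `add_space_after_full_stops` is hypothetical and is not part of summa.

```python
import re


def add_space_after_full_stops(text):
    """Insert a space after a full stop that sits directly between a
    lowercase letter and an uppercase letter, e.g. 'earlier.The'.

    Hypothetical helper, not part of summa. Decimal numbers such as
    '$1.13bn' are left alone because the pattern requires letters on
    both sides of the full stop.
    """
    return re.sub(r"(?<=[a-z])\.(?=[A-Z])", ". ", text)


sample = "from $639m year-earlier.The firm rose 2% to $11.1bn."
fixed = add_space_after_full_stops(sample)
print(fixed)
```

The repaired string can then be passed to clean_text_by_sentences in place of the raw text. The pattern is deliberately narrow, so it will also split strings where a lowercase letter, a full stop, and an uppercase letter legitimately belong together (for example some file names or identifiers); it may need adjusting for other inputs.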