I'm trying to read a .doc file in PHP (I'm using Laravel v6). To read the .doc I'm using the PHPWord library. It works fine with .doc files in English. But I'm from Slovakia, and our alphabet has characters like á, é, í, č, ň, ô. With those characters I have a problem.
My code:
// At the top of the file:
use PhpOffice\PhpWord\Element\Text;
use PhpOffice\PhpWord\IOFactory;

// Inside the class: collect the text of every top-level Text element.
protected static function doc_to_text($filename)
{
    $objReader = IOFactory::createReader('MsDoc');
    $phpWord = $objReader->load($filename); // instance of \PhpOffice\PhpWord\PhpWord
    $text = '';
    foreach ($phpWord->getSections() as $section) {
        foreach ($section->getElements() as $element) {
            if ($element instanceof Text) {
                $text .= $element->getText();
            }
        }
    }
    return $text;
}
This is the output of the function with a Slovak .doc:
"}\x01IVOTOPIS Titul, menoKontaktné údaje:Ulica, stoTelefón: 0xx/ xxx xxx Mobil: 09xx xxx xxx e-mail: \x13 HYPERLINK "mailto:[email protected]" \[email protected]\x15 Dosiahnuté vzdelanie: Vysokoa\x01kolské/stredoa\x01kolskéVzdelanie: 2000-2006 Fakulta/ univerzita1995-2000 stredná aDoplH\x01ujúce informácie o vzdelaní: 1998-2000 kurzy 1996-1997 a\x01tudijné pobytyPracovné skúsenosti: 2000-2004 zamestnávate>\x01, pozícia 2004-2006 zamestnávate> pozícia Jazykové znalosti: Anglický jazyk - aktívne 8:|<U+0094><U+009E>¶ÔÖþ\x16\vB\vZ\v<U+0086>\v
<U+008A>\v®\vä\v\x10B\x10´\x10Ö\x10\x1E\x11F\x11H\x11J\x11üøòøìøüøäøäÞäøìüøìøüøüøìøüøüøìøüøüøìøÜìøìøìøØ\x06\x16h\e\t_\x03U\x08\x01\x16hÌeh0J\x11\x0F\x03j\x16hÌehU\x08\x01\x16hÌeh0J\x10\x16hO\x1FÝ0J\x10\x06\x16hÌeh\x06\x16hO\x1FÝ-\x08\x16\x08.\x08P\x08<U+0082>\x08°\x08Ú\x08R\t¶\tÎ\t:<U+0080>¢Ö&\vF\x04\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x03\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x02\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x0F&\vF\x01\x13¤d\x14¤d[$\x01\$\x01gdÌeh\x04\x0FgdÌeh\x04\x03gdÌeh\x13PoPHP, C++ XHTML, CSS Microsoft Excel Microsoft Word Vodi preukaz: sk. C (najazdených cca 600 000km) Vlastnosti a záujmy: "
This is the output of the function with an English .doc:
"Lorem ipsum Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc ac faucibus odio. Vestibulum neque massa, scelerisque sit amet ligula eu, congue molestie mi. Praesent ut varius sem. Nullam at porttitor arcu, nec lacinia nisi. Ut ac dolor vitae odio interdum condimentum. Vivamus dapibus sodales ex, vitae malesuada ipsum cursus convallis. Maecenas sed egestas nulla, ac condimentum orci. Mauris diam felis, vulputate ac suscipit et, iaculis non est. Curabitur semper arcu ac ligula semper, nec luctus nisl blandit. Integer lacinia ante ac libero lobortis imperdiet. Nullam mollis convallis ipsum, ac accumsan nunc vehicula vitae. Nulla eget justo in felis tristique fringilla. Morbi sit amet tortor quis risus auctor condimentum. Morbi in ullamcorper elit. Nulla iaculis tellus sit amet mauris tempus fringilla.Maecenas mauris lectus, lobortis et purus mattis, blandit dictum tellus.Maecenas non lorem quis tellus placerat varius. Nulla facilisi. Aenean congue fringilla justo ut aliquam. In non mauris justo. Duis vehicula mi vel mi pretium, a viverra erat efficitur. Cras aliquam est ac eros varius, id iaculis dui auctor. Duis pretium neque ligula, et pulvinar mi placerat et. "
I have tried Google and asked my friends. I also tried https://github.com/neitanod/forceutf8. I need some ideas about what the problem might be or how to solve it.
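One possible clue, offered as a guess rather than a diagnosis: in the Slovak output, each missing letter seems to be replaced by an ASCII character followed by `\x01`. Read as little-endian 16-bit values, those byte pairs match the Unicode code points of the expected letters, which might mean the text is stored as UTF-16LE but is being read one byte at a time. The sketch below only checks that reading of the bytes. The helper name `decode_utf16le` and the `$samples` table are hypothetical, the expected letters are inferred from the surrounding Slovak words, and the `mbstring` extension is assumed to be available.

```php
<?php
// Hypothetical helper: interpret raw bytes as UTF-16LE and return UTF-8.
function decode_utf16le(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');
}

// Byte pairs copied from the garbled output, each with the letter that
// appears to belong there (an assumption based on the surrounding words).
$samples = [
    "}\x01" => 'Ž', // "}\x01IVOTOPIS"     looks like "ŽIVOTOPIS"
    "a\x01" => 'š', // "Vysokoa\x01kolské" looks like "Vysokoškolské"
    "H\x01" => 'ň', // "DoplH\x01ujúce"    looks like "Doplňujúce"
    ">\x01" => 'ľ', // "zamestnávate>\x01" looks like "zamestnávateľ"
];

foreach ($samples as $bytes => $expected) {
    // Print the raw bytes as hex next to the decoded and expected letters.
    printf("%s => %s (expected %s)\n", bin2hex($bytes), decode_utf16le($bytes), $expected);
}
```

If the decoded and expected letters agree, forcing the output to UTF-8 afterwards (as forceutf8 does) would probably not help, because the second byte of each pair would already have been treated as a separate character by then.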