Question about suitability to Web scraping. #78

deabreu · 2021-08-24T12:33:14Z

Hello all. Please forgive me if this is the wrong place to post this question.

I'm looking for a Scala alternative to Scrapy for parsing HTML documents for web scraping. I've been trying to build one on top of Jsoup, but since it is a pure Java library, adapting it to Scala has made development somewhat counterintuitive, and I'd like a more functional approach.

I've come across Pine as one such approach, but the project seems to be more focused on rendering than on building a data structure model from an existing document, which would be my main focus. If that impression is incorrect, please help me correct it.

With that in mind, could you answer the following questions about the project or its documentation?

Can Pine parse any existing HTML5-compliant document into a tree-like hierarchical structure? And can this structure be queried?
Can Pine help me parse JavaScript code for dynamic sites? If so, could you point me to an example of how to get started?
If not, could you point me to a possible way to work around this limitation?
