
Releases: extractus/article-extractor

v7.2.17

01 Jul 03:49
3e47e87
  • Merge PR #350 by @LarchLiu
  • Add agent to fetchOptions
  • Update CI to test with Node 20
  • Update dependencies
  • Update README

Example: article extraction through a proxy server, using an agent

import { extract } from '@extractus/article-extractor'

import { HttpsProxyAgent } from 'https-proxy-agent'

const proxy = 'http://abc:[email protected]:31113'

const url = 'https://www.cnbc.com/2022/09/21/what-another-major-rate-hike-by-the-federal-reserve-means-to-you.html'

const article = await extract(url, {}, {
  agent: new HttpsProxyAgent(proxy),
})
console.log('Run article-extractor with proxy:', proxy)
console.log(article)

v7.2.16

21 May 09:35
575e911
  • Fix issue #347
  • Update dependencies

v7.2.15

06 May 02:51
7a51b44
  • Merge with changes from PR #341
  • Fix unsupported package string-similarity
  • Update deps

v7.2.14

18 Apr 01:29
87a9708
  • Add support for Parsely meta tags

These tags probably come from Parse.ly. Our users found that several websites, such as The Verge, have started using these non-standard meta tags, which can break the extraction process. This release adds support for them.
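To illustrate what such tags look like, here is a minimal sketch that collects `parsely-*` meta tags from an HTML string. It is not the library's implementation: `readParselyMeta` is a hypothetical helper, the tag names in the sample (`parsely-title`, `parsely-author`) are assumed from Parse.ly's meta tag convention, and the regex-based parsing is only suitable for a quick demonstration.

```javascript
// Hypothetical helper (not part of article-extractor): collect meta tags
// whose name starts with "parsely-" into a plain object.
function readParselyMeta(html) {
  const result = {}
  // Grab every <meta ...> tag; a real parser would use a DOM library.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || []
  for (const tag of metaTags) {
    const name = /\sname\s*=\s*(["'])(.*?)\1/i.exec(tag)
    const content = /\scontent\s*=\s*(["'])(.*?)\1/i.exec(tag)
    if (name && content && name[2].toLowerCase().startsWith('parsely-')) {
      result[name[2].toLowerCase()] = content[2]
    }
  }
  return result
}

// Sample markup invented for this sketch.
const sample = `
<head>
  <meta name="parsely-title" content="Example headline">
  <meta name="parsely-author" content="Jane Doe">
  <meta property="og:title" content="Example headline (Open Graph)">
</head>`

const parselyMeta = readParselyMeta(sample)
console.log(parselyMeta)
```

The Open Graph tag is skipped because it uses `property` rather than a `parsely-` prefixed `name`.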


v7.2.13

11 Apr 16:57
2715cd9
  • Fix issue while fetching data from some websites (Deno platform only)

v7.2.12

28 Mar 08:01
db5e9ce
  • Set default user-agent
  • Avoid error if parserOptions is null
  • Update dependencies
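
If the default user-agent does not suit a particular site, it can likely be overridden per request. The sketch below assumes that the third argument of `extract()` (the fetch options, as used in the proxy example for v7.2.17) accepts a `headers` object. `buildFetchOptions` and `extractAs` are hypothetical helpers introduced only for illustration.

```javascript
// Hypothetical helper: builds fetch options carrying a custom user-agent.
// Assumes fetchOptions accepts a `headers` object.
function buildFetchOptions(userAgent) {
  return { headers: { 'user-agent': userAgent } }
}

// Hypothetical wrapper around extract().
async function extractAs(url, userAgent) {
  // Dynamic import keeps this sketch loadable from both ESM and CommonJS files.
  const { extract } = await import('@extractus/article-extractor')
  // parserOptions is deliberately null; per the notes above, this release
  // avoids an error in that case.
  return extract(url, null, buildFetchOptions(userAgent))
}
```

A call such as `await extractAs(url, 'my-crawler/1.0')` would then send the custom header instead of the default one, under the assumptions stated above.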

v7.2.11

12 Mar 04:32
a79aa7f

v7.2.10

07 Mar 05:00
f727040
  • Fix issue #331
  • Update dependencies
  • Remove unnecessary watermark

v7.2.9

20 Feb 09:14
ba62bad
  • Fix issue #329
  • Update dependencies
  • Improve unit test

v7.2.8

11 Jan 09:32
4e3debb
  • Expose new API method extractFromHtml()
  • Update dependencies
  • Change coding style (remove standardjs)

Related issues: #321, #326
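
A minimal sketch of how `extractFromHtml()` might be used when the HTML has already been obtained by other means (a headless browser, a cache, a file). The argument order `(html, url)` is an assumption based on the project README, and `extractFromString` is a hypothetical wrapper name.

```javascript
// Hypothetical wrapper: extract an article from HTML that is already in memory.
async function extractFromString(html, url) {
  // Dynamic import keeps this sketch loadable from both ESM and CommonJS files.
  const { extractFromHtml } = await import('@extractus/article-extractor')
  // The URL is assumed to be used for resolving relative links;
  // no network request for the page itself should be needed.
  return extractFromHtml(html, url)
}
```

Usage would look like `const article = await extractFromString(html, 'https://example.com/post')`, where `html` holds the page source and the URL is a placeholder.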