Below is the usage for the JLPT grammar website (https://mainichi-nonbiri.com/japanese-grammar) scraper (the default crawler).
# use pnpm
pnpm install
Run this script first:
pnpm start:link
The default scraping level is N5. You can change it in getDetail.js:
const level = "n5" // n5, n4, n3, n2, n1, n0(not categorized)
pnpm start:detail
After start:detail has finished (you don't have to scrape the details for every level from N1 to N5; choose the ones you want), run this script to combine all grammar detail files into one file per level:
pnpm start:all
Files are generated in the output/grammar folder.
All output files are in the output folder.
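As a rough illustration of what a "combine by level" step can look like, here is a minimal Node.js sketch. It is not the repository's actual start:all script: the function names `combineDetails` and `writeCombined`, and the idea that each detail is stored as its own .json file in one directory, are assumptions made for the example.

```javascript
// Sketch only: NOT the repository's start:all implementation.
// Assumption: every grammar detail is one .json file inside a single directory.
const fs = require("node:fs");
const path = require("node:path");

// Read every .json file in detailDir and collect the parsed contents into one array.
function combineDetails(detailDir) {
  return fs
    .readdirSync(detailDir)
    .filter((name) => name.endsWith(".json"))
    .sort() // keep a predictable order between runs
    .map((name) => JSON.parse(fs.readFileSync(path.join(detailDir, name), "utf8")));
}

// Write the combined array as pretty-printed JSON and return the entry count.
function writeCombined(detailDir, outFile) {
  const combined = combineDetails(detailDir);
  fs.mkdirSync(path.dirname(outFile), { recursive: true });
  fs.writeFileSync(outFile, JSON.stringify(combined, null, 2), "utf8");
  return combined.length;
}
```

Calling `writeCombined(someDetailDir, someOutputFile)` with paths of your own would produce a single array file. Check the real scripts for the actual directory layout before relying on this.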
- If the 例文 (example sentences) value in a detail JSON file is empty, the grammar detail page is a special case and you need to check it manually.
- It is recommended to use the output/grammar/all_nX.json files in your project.
For the other website (link), the files are in /nihongokyoshi-net-com. Run the matching scripts:
pnpm start:link2
pnpm start:detail2
pnpm start:all2