[GIT binary patch: base85-encoded data omitted — this hunk adds a binary file, presumably the Letter_R_blue.ico icon referenced by build.py]
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..cb67d21
--- /dev/null
+++ b/README.md
@@ -0,0 +1,2 @@
+# dlsite-doujin-renamer
+
diff --git a/build.py b/build.py
new file mode 100644
index 0000000..4d1349f
--- /dev/null
+++ b/build.py
@@ -0,0 +1,13 @@
+import PyInstaller.__main__
+
+if __name__ == '__main__':
+ PyInstaller.__main__.run([
+ 'main.py',
+ '--onefile',
+ '--windowed',
+ '--clean',
+ '--icon',
+ 'Letter_R_blue.ico', # https://icon-icons.com/icon/letter-r-blue/34893
+ '--add-data',
+ 'Letter_R_blue.ico;.',
+ ])
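
A note on the --add-data argument above: the ';' in 'Letter_R_blue.ico;.' is the source/destination separator PyInstaller expects on Windows; on macOS/Linux it is ':'. A cross-platform variant of the same argument (a sketch, assuming the pinned PyInstaller 4.x convention of using os.pathsep) would be:

    import os
    # os.pathsep is ';' on Windows and ':' elsewhere, matching the
    # separator PyInstaller expects between source and destination
    '--add-data', f'Letter_R_blue.ico{os.pathsep}.',
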
diff --git a/config_file.py b/config_file.py
new file mode 100644
index 0000000..9c9d421
--- /dev/null
+++ b/config_file.py
@@ -0,0 +1,115 @@
+import json
+import os
+import re
+from typing import TypedDict, Final, Optional
+
+from scraper import Locale
+
+
+class Config(TypedDict):
+ scaner_max_depth: int
+ scraper_locale: str
+ scraper_connect_timeout: int
+ scraper_read_timeout: int
+ scraper_sleep_interval: int
+ scraper_http_proxy: Optional[str]
+ renamer_template: str
+ renamer_exclude_square_brackets_in_work_name_flag: bool
+
+
+class ConfigFile(object):
+ DEFAULT_CONFIG: Final[Config] = {
+ 'scaner_max_depth': 5,
+ 'scraper_locale': 'zh_cn',
+ 'scraper_connect_timeout': 10,
+ 'scraper_read_timeout': 10,
+ 'scraper_sleep_interval': 3,
+ 'scraper_http_proxy': None,
+ 'renamer_template': '[maker_name][rjcode] work_name cv_list_str',
+ 'renamer_exclude_square_brackets_in_work_name_flag': False,
+ }
+
+ def __init__(self, file_path: str):
+ self.__file_path = file_path
+ if not os.path.isfile(file_path):
+ self.save_config(ConfigFile.DEFAULT_CONFIG)
+
+ def load_config(self):
+ """
+ Load the configuration from the config file
+ """
+ with open(self.__file_path, encoding='UTF-8') as file:
+ config_dict: Config = json.load(file)
+ return config_dict
+
+ def save_config(self, config_dict: Config):
+ """
+ Save the configuration to the file
+ """
+ with open(self.__file_path, 'w', encoding='UTF-8') as file:
+ json.dump(config_dict, file, indent=2, ensure_ascii=False)
+
+ @property
+ def file_path(self):
+ return self.__file_path
+
+ @staticmethod
+ def verify_config(config_dict: Config):
+ """
+ Validate that the configuration is sensible
+ """
+ scaner_max_depth = config_dict.get('scaner_max_depth', None)
+ scraper_locale = config_dict.get('scraper_locale', None)
+ scraper_connect_timeout = config_dict.get('scraper_connect_timeout', None)
+ scraper_read_timeout = config_dict.get('scraper_read_timeout', None)
+ scraper_http_proxy = config_dict.get('scraper_http_proxy', None)
+ scraper_sleep_interval = config_dict.get('scraper_sleep_interval', None)
+ renamer_template = config_dict.get('renamer_template', None)
+ renamer_exclude_square_brackets_in_work_name_flag = \
+ config_dict.get('renamer_exclude_square_brackets_in_work_name_flag', None)
+
+ strerror_list = []
+
+ # Check scaner_max_depth
+ if not isinstance(scaner_max_depth, int) or scaner_max_depth < 0:
+ strerror_list.append('scaner_max_depth 应是一个大于等于 0 的整数')
+
+ # Check scraper_locale
+ locale_name_list: list[str] = []
+ for locale in Locale:
+ locale_name_list.append(locale.name)
+ if scraper_locale not in locale_name_list:
+ strerror_list.append(f'scraper_locale 应是 {locale_name_list} 中的一个')
+
+ # Check scraper_connect_timeout
+ if not isinstance(scraper_connect_timeout, int) or scraper_connect_timeout <= 0:
+ strerror_list.append('scraper_connect_timeout 应是一个大于 0 的整数')
+
+ # Check scraper_read_timeout
+ if not isinstance(scraper_read_timeout, int) or scraper_read_timeout <= 0:
+ strerror_list.append('scraper_read_timeout 应是一个大于 0 的整数')
+
+ # Check scraper_sleep_interval
+ if not isinstance(scraper_sleep_interval, int) or scraper_sleep_interval < 0:
+ strerror_list.append('scraper_sleep_interval 应是一个大于等于 0 的整数')
+
+ # Check scraper_http_proxy
+ # The pattern must include the scheme, or the documented form "http://127.0.0.1:7890" would never pass fullmatch
+ http_proxy_pattern = re.compile(r"http://\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+")
+ str_error_http_proxy = f'scraper_http_proxy 应是形如 "http://127.0.0.1:7890" 的 http 代理;或设为 null,这将使用系统代理'
+ if isinstance(scraper_http_proxy, str):
+ if not http_proxy_pattern.fullmatch(scraper_http_proxy):
+ # A string, but it does not match the pattern
+ strerror_list.append(str_error_http_proxy)
+ elif scraper_http_proxy is not None:
+ # Neither a string nor None
+ strerror_list.append(str_error_http_proxy)
+
+ # Check renamer_template
+ if not isinstance(renamer_template, str) or 'rjcode' not in renamer_template:
+ strerror_list.append('renamer_template 应是一个包含 "rjcode" 的字符串')
+
+ # Check renamer_exclude_square_brackets_in_work_name_flag
+ if not isinstance(renamer_exclude_square_brackets_in_work_name_flag, bool):
+ strerror_list.append('renamer_exclude_square_brackets_in_work_name_flag 应是一个布尔值')
+
+ return strerror_list
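
For reference, on first run ConfigFile writes DEFAULT_CONFIG to config.json (json.dump with indent=2, ensure_ascii=False), producing:

    {
      "scaner_max_depth": 5,
      "scraper_locale": "zh_cn",
      "scraper_connect_timeout": 10,
      "scraper_read_timeout": 10,
      "scraper_sleep_interval": 3,
      "scraper_http_proxy": null,
      "renamer_template": "[maker_name][rjcode] work_name cv_list_str",
      "renamer_exclude_square_brackets_in_work_name_flag": false
    }
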
diff --git a/main.py b/main.py
new file mode 100644
index 0000000..c36cf8b
--- /dev/null
+++ b/main.py
@@ -0,0 +1,197 @@
+import logging
+import os
+import sys
+from json import JSONDecodeError
+from threading import Thread
+from typing import Optional, Callable
+
+import wx
+
+from config_file import ConfigFile
+from renamer import Renamer
+from scaner import Scaner
+from scraper import Locale, CachedScraper
+from my_frame import MyFrame
+from wx_log_handler import EVT_WX_LOG_EVENT, WxLogHandler
+
+VERSION = '0.1.0'
+
+
+class MyFileDropTarget(wx.FileDropTarget):
+ def __init__(self, window):
+ wx.FileDropTarget.__init__(self)
+ self.window = window
+
+ def OnDropFiles(self, x, y, filenames):
+ """
+ Run the renamer on the folders dropped by the user
+ """
+ dirname_list = [filename for filename in filenames if os.path.isdir(filename)]
+ self.window.thread_it(self.window.run_renamer, dirname_list)
+ return True
+
+
+class AppFrame(MyFrame):
+ def __init__(self, parent):
+ MyFrame.__init__(self, parent)
+
+ # Allow files to be dragged onto the wx.TextCtrl widget
+ drop_target = MyFileDropTarget(self)
+ self.text_ctrl.SetDropTarget(drop_target)
+
+ # Config file
+ config_file_path = 'config.json'
+ self.__config_file = ConfigFile(config_file_path)
+
+ # Worker thread. Long-running tasks should run here so they don't block the GUI thread
+ self.__worker_thread: Optional[Thread] = None
+
+ # Attach a WxLogHandler to the logger
+ self.text_ctrl.Bind(EVT_WX_LOG_EVENT, self.on_log_event)
+ wx_log_handler = WxLogHandler(self.text_ctrl)
+ wx_log_handler.setLevel(logging.INFO)
+ wx_log_handler.setFormatter(logging.Formatter('%(asctime)s - %(message)s'))
+ Renamer.logger.addHandler(wx_log_handler)
+
+ def thread_it(self, func: Callable, *args):
+ """
+ Run the function in a worker thread
+ """
+ if self.__worker_thread and self.__worker_thread.is_alive():
+ return
+ self.__worker_thread = Thread(target=func, args=args)
+ self.__worker_thread.start()
+
+ def on_log_event(self, event):
+ """
+ Forward log records to the wx.TextCtrl widget
+ """
+ if event.levelno <= logging.INFO:
+ text_color = wx.BLACK
+ elif event.levelno <= logging.WARNING:
+ text_color = wx.BLUE
+ else:
+ text_color = wx.RED
+ self.text_ctrl.SetDefaultStyle(wx.TextAttr(text_color))
+ msg = event.message.strip("\r") + "\n"
+ self.text_ctrl.AppendText(msg)
+ event.Skip()
+
+ def on_dir_changed_event(self, event):
+ """
+ Run the renamer on the folder picked via the wx.DirPickerCtrl widget
+ """
+ root_path = self.dir_picker.GetPath()
+ self.thread_it(self.run_renamer, [root_path])
+
+ def __print_info(self, message: str):
+ self.text_ctrl.SetDefaultStyle(wx.TextAttr(wx.BLACK))
+ self.text_ctrl.AppendText(message + '\n')
+
+ def __print_warning(self, message: str):
+ self.text_ctrl.SetDefaultStyle(wx.TextAttr(wx.BLUE))
+ self.text_ctrl.AppendText(message + '\n')
+
+ def __print_error(self, message: str):
+ self.text_ctrl.SetDefaultStyle(wx.TextAttr(wx.RED))
+ self.text_ctrl.AppendText(message + '\n')
+
+ def __before_worker_thread_start(self):
+ thread_id = self.__worker_thread.native_id # thread ID
+ self.__print_info(f'******************************运行开始({thread_id})******************************')
+ self.dir_picker.Disable() # disable the [Browse] button
+ self.text_ctrl.SetDropTarget(None) # disable drag-and-drop
+
+ def __before_worker_thread_end(self):
+ self.dir_picker.Enable() # re-enable the [Browse] button
+ self.text_ctrl.SetDropTarget(MyFileDropTarget(self)) # re-enable drag-and-drop
+ thread_id = self.__worker_thread.native_id # thread ID
+ self.__print_info(f'******************************运行结束({thread_id})******************************\n\n')
+
+ def run_renamer(self, root_path_list: list[str]):
+ self.__before_worker_thread_start()
+
+ try:
+ config = self.__config_file.load_config() # load config from the config file
+ except JSONDecodeError as err:
+ self.__print_error(f'配置文件解析失败:"{os.path.normpath(self.__config_file.file_path)}"')
+ self.__print_error(f'JSONDecodeError: {str(err)}')
+ self.__before_worker_thread_end()
+ return
+ except FileNotFoundError as err:
+ self.__print_error(f'配置文件加载失败:"{os.path.normpath(self.__config_file.file_path)}"')
+ self.__print_error(f'FileNotFoundError: {err.strerror}')
+ self.__before_worker_thread_end()
+ return
+
+ # Validate the config
+ strerror_list = ConfigFile.verify_config(config)
+ if len(strerror_list) > 0:
+ self.__print_error(f'配置文件验证失败:"{os.path.normpath(self.__config_file.file_path)}"')
+ for strerror in strerror_list:
+ self.__print_error(strerror)
+ self.__before_worker_thread_end()
+ return
+
+ # Set up the scaner
+ scaner_max_depth = config['scaner_max_depth']
+ scaner = Scaner(max_depth=scaner_max_depth)
+
+ # Set up the scraper
+ scraper_locale = config['scraper_locale']
+ scraper_http_proxy = config['scraper_http_proxy']
+ if scraper_http_proxy:
+ proxies = {
+ 'http': scraper_http_proxy,
+ 'https': scraper_http_proxy
+ }
+ else:
+ proxies = None
+ scraper_connect_timeout = config['scraper_connect_timeout']
+ scraper_read_timeout = config['scraper_read_timeout']
+ scraper_sleep_interval = config['scraper_sleep_interval']
+ cached_scraper = CachedScraper(
+ locale=Locale[scraper_locale],
+ connect_timeout=scraper_connect_timeout,
+ read_timeout=scraper_read_timeout,
+ sleep_interval=scraper_sleep_interval,
+ proxies=proxies)
+
+ # Set up the renamer
+ renamer = Renamer(
+ scaner=scaner,
+ scraper=cached_scraper,
+ template=config['renamer_template'],
+ exclude_square_brackets_in_work_name_flag=config['renamer_exclude_square_brackets_in_work_name_flag'])
+
+ # Run the renaming
+ for root_path in root_path_list:
+ renamer.rename(root_path)
+
+ self.__before_worker_thread_end()
+
+
+def get_application_path():
+ """
+ https://pyinstaller.readthedocs.io/en/stable/runtime-information.html#run-time-information
+ """
+ if getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS'):
+ # running in a PyInstaller bundle
+ application_path = sys._MEIPASS
+ else:
+ # running in a normal Python process
+ application_path = os.path.dirname(__file__)
+ return application_path
+
+
+if __name__ == '__main__':
+ app_path = get_application_path()
+ icon_path = os.path.join(app_path, 'Letter_R_blue.ico')
+
+ app = wx.App(False)
+ frame = AppFrame(None)
+ frame.SetIcon(wx.Icon(icon_path))
+ frame.SetTitle(f'DLSite 同人作品重命名工具 v{VERSION}')
+ frame.Show(True)
+ # start the application
+ app.MainLoop()
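
run_renamer also documents the wiring needed to drive the pipeline without the GUI; a minimal headless sketch using the same classes (the root folder path is hypothetical):

    from renamer import Renamer
    from scaner import Scaner
    from scraper import CachedScraper, Locale

    scaner = Scaner(max_depth=5)
    scraper = CachedScraper(locale=Locale.zh_cn, connect_timeout=10,
                            read_timeout=10, sleep_interval=3)
    renamer = Renamer(scaner=scaner, scraper=scraper,
                      template='[maker_name][rjcode] work_name cv_list_str')
    renamer.rename(r'D:\ASMR')  # hypothetical root folder
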
diff --git a/my_frame.py b/my_frame.py
new file mode 100644
index 0000000..8a468ab
--- /dev/null
+++ b/my_frame.py
@@ -0,0 +1,55 @@
+# -*- coding: utf-8 -*-
+
+###########################################################################
+## Python code generated with wxFormBuilder (version 3.10.1-0-g8feb16b3)
+## http://www.wxformbuilder.org/
+##
+## PLEASE DO *NOT* EDIT THIS FILE!
+###########################################################################
+
+import wx
+import wx.xrc
+
+
+###########################################################################
+## Class MyFrame
+###########################################################################
+
+class MyFrame(wx.Frame):
+
+ def __init__(self, parent):
+ wx.Frame.__init__(self, parent, id=wx.ID_ANY, title=wx.EmptyString, pos=wx.DefaultPosition,
+ size=wx.Size(500, 300), style=wx.DEFAULT_FRAME_STYLE | wx.TAB_TRAVERSAL)
+
+ self.SetSizeHints(wx.DefaultSize, wx.DefaultSize)
+
+ box_sizer = wx.BoxSizer(wx.VERTICAL)
+
+ self.static_text = wx.StaticText(self, wx.ID_ANY, u"Tip:手动选择文件夹或拖拽文件夹到软件窗口", wx.DefaultPosition, wx.DefaultSize,
+ 0)
+ self.static_text.Wrap(-1)
+ box_sizer.Add(self.static_text, 0, wx.ALL | wx.EXPAND, 5)
+
+ self.text_ctrl = wx.TextCtrl(self, wx.ID_ANY, wx.EmptyString, wx.DefaultPosition, wx.DefaultSize,
+ wx.TE_MULTILINE | wx.TE_READONLY | wx.TE_RICH | wx.TE_RICH2 | wx.HSCROLL | wx.TE_AUTO_URL)
+ box_sizer.Add(self.text_ctrl, 1, wx.ALL | wx.EXPAND, 5)
+
+ self.dir_picker = wx.DirPickerCtrl(self, wx.ID_ANY, '',
+ u"Select a folder", wx.DefaultPosition, wx.DefaultSize,
+ wx.DIRP_DIR_MUST_EXIST)
+ box_sizer.Add(self.dir_picker, 0, wx.ALL | wx.ALIGN_CENTER_HORIZONTAL, 5)
+
+ self.SetSizer(box_sizer)
+ self.Layout()
+
+ self.Centre(wx.BOTH)
+
+ # Connect Events
+ self.dir_picker.Bind(wx.EVT_DIRPICKER_CHANGED, self.on_dir_changed_event)
+
+ def __del__(self):
+ pass
+
+ # Virtual event handlers, override them in your derived class
+ def on_dir_changed_event(self, event):
+ event.Skip()
diff --git a/renamer.py b/renamer.py
new file mode 100644
index 0000000..1ce934b
--- /dev/null
+++ b/renamer.py
@@ -0,0 +1,118 @@
+import logging
+import os
+import re
+
+from requests.exceptions import RequestException, ConnectionError, HTTPError, Timeout
+
+from scaner import Scaner
+from scraper import WorkMetadata, Scraper
+
+# Characters reserved by the Windows file system
+# https://docs.microsoft.com/zh-cn/windows/win32/fileio/naming-a-file
+# < (less than)
+# > (greater than)
+# : (colon)
+# " (double quote)
+# / (forward slash)
+# \ (backslash)
+# | (vertical bar / pipe)
+# ? (question mark)
+# * (asterisk)
+WINDOWS_RESERVED_CHARACTER_PATTERN = re.compile(r'[\\/*?:"<>|]')
+
+
+def _get_logger():
+ # create logger
+ logger = logging.getLogger('Renamer')
+ logger.setLevel(logging.DEBUG)
+
+ # create console handler and set level to debug
+ ch = logging.StreamHandler()
+ ch.setLevel(logging.DEBUG)
+ # create formatter
+ formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
+ # add formatter to ch
+ ch.setFormatter(formatter)
+
+ # add ch to logger
+ logger.addHandler(ch)
+
+ return logger
+
+
+class Renamer(object):
+ logger = _get_logger()
+
+ def __init__(
+ self,
+ scaner: Scaner,
+ scraper: Scraper,
+ template: str = '[maker_name][rjcode] work_name cv_list_str', # naming template
+ exclude_square_brackets_in_work_name_flag: bool = False, # when True, strip 【】 and the text between them from work_name
+ ):
+ if 'rjcode' not in template:
+ raise ValueError('template must contain "rjcode"') # the new name must keep the rjcode
+ self.__scaner = scaner
+ self.__scraper = scraper
+ self.__template = template
+ self.__exclude_square_brackets_in_work_name_flag = exclude_square_brackets_in_work_name_flag
+
+ def __compile_new_name(self, metadata: WorkMetadata):
+ """
+ Build the new folder name from the work's metadata
+ """
+ work_name = re.sub(r'【.*?】', '', metadata['work_name']).strip() \
+ if self.__exclude_square_brackets_in_work_name_flag \
+ else metadata['work_name']
+
+ template = self.__template
+ new_name = template.replace('rjcode', metadata['rjcode'])
+ new_name = new_name.replace('work_name', work_name)
+ new_name = new_name.replace('maker_id', metadata['maker_id'])
+ new_name = new_name.replace('maker_name', metadata['maker_name'])
+ new_name = new_name.replace('release_date', metadata['release_date'])
+
+ cv_list = metadata['cvs']
+ cv_list_str = '(' + ' '.join(cv_list) + ')' if len(cv_list) > 0 else ''
+ new_name = new_name.replace('cv_list_str', cv_list_str)
+
+ # File names must not contain characters reserved by Windows
+ new_name = WINDOWS_RESERVED_CHARACTER_PATTERN.sub('', new_name)
+
+ return new_name.strip()
+
+ def rename(self, root_path: str):
+ work_folders = self.__scaner.scan(root_path)
+ for rjcode, folder_path in work_folders:
+ Renamer.logger.info(f'[{rjcode}] -> 发现 RJ 文件夹:"{os.path.normpath(folder_path)}"')
+ dirname, basename = os.path.split(folder_path)
+ try:
+ metadata = self.__scraper.scrape_metadata(rjcode)
+ except Timeout:
+ # request timed out
+ Renamer.logger.warning(f'[{rjcode}] -> 重命名失败:dlsite.com 请求超时!\n')
+ continue
+ except ConnectionError as err:
+ # other network problems (e.g. DNS failure, connection refused)
+ Renamer.logger.warning(f'[{rjcode}] -> 重命名失败:{str(err)}\n')
+ continue
+ except HTTPError as err:
+ # the HTTP request returned an unsuccessful status code
+ Renamer.logger.warning(f'[{rjcode}] -> 重命名失败:{err.response.status_code} {err.response.reason}\n')
+ continue
+ except RequestException as err:
+ # other exceptions raised by requests
+ Renamer.logger.error(f'[{rjcode}] -> 重命名失败:{str(err)}\n')
+ continue
+
+ new_basename = self.__compile_new_name(metadata)
+ new_folder_path = os.path.join(dirname, new_basename)
+ try:
+ os.rename(folder_path, new_folder_path)
+ Renamer.logger.info(f'[{rjcode}] -> 重命名成功:"{os.path.normpath(new_folder_path)}"\n')
+ except FileExistsError as err:
+ filename = os.path.normpath(err.filename)
+ filename2 = os.path.normpath(err.filename2)
+ Renamer.logger.warning(f'[{rjcode}] -> 重命名失败:{err.strerror}:"{filename}" -> "{filename2}"\n')
+ except OSError as err:
+ Renamer.logger.error(f'[{rjcode}] -> 重命名失败:{str(err)}\n')
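
__compile_new_name is plain keyword substitution over the template, followed by stripping Windows-reserved characters. With the default template and hypothetical metadata — maker_name 'Some Circle', rjcode 'RJ123456', work_name 'Some Work', cvs ['Alice', 'Bob'] — the folder becomes:

    [Some Circle][RJ123456] Some Work (Alice Bob)

With no CVs, cv_list_str expands to an empty string and the trailing space is removed by the final strip().
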
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..8932b43
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,19 @@
+altgraph==0.17.2
+certifi==2021.10.8
+charset-normalizer==2.0.10
+cssselect==1.1.0
+future==0.18.2
+idna==3.3
+lxml==4.7.1
+numpy==1.22.0
+peewee==3.14.8
+pefile==2021.9.3
+Pillow==9.0.0
+pyinstaller==4.7
+pyinstaller-hooks-contrib==2021.4
+pyquery==1.4.3
+pywin32-ctypes==0.2.0
+requests==2.27.1
+six==1.16.0
+urllib3==1.26.7
+wxPython==4.1.1
diff --git a/scaner/__init__.py b/scaner/__init__.py
new file mode 100644
index 0000000..2122ecf
--- /dev/null
+++ b/scaner/__init__.py
@@ -0,0 +1 @@
+from scaner.scaner import Scaner
diff --git a/scaner/scaner.py b/scaner/scaner.py
new file mode 100644
index 0000000..90bad98
--- /dev/null
+++ b/scaner/scaner.py
@@ -0,0 +1,23 @@
+import os
+
+from scraper import Dlsite
+
+
+class Scaner(object):
+ def __init__(self, max_depth=5):
+ self.__max_depth = max_depth
+
+ def scan(self, root_path: str, _depth=0):
+ """
+ Generator. Recursively walks root_path and yields every folder whose name contains an rjcode
+ """
+ if os.path.isdir(root_path): # check that it is a folder
+ folder = os.path.basename(root_path)
+ rjcode = Dlsite.parse_rjcode(folder)
+ if rjcode: # check whether the folder name contains an RJ code
+ yield rjcode, root_path
+ elif _depth < self.__max_depth:
+ dir_list = os.listdir(root_path)
+ for folder in dir_list:
+ folder_path = os.path.join(root_path, folder)
+ yield from self.scan(folder_path, _depth + 1)
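
scan yields a folder as soon as its name carries an RJ code and does not descend into it any further; recursion stops at max_depth. Usage (folder names hypothetical):

    scaner = Scaner(max_depth=2)
    for rjcode, folder_path in scaner.scan(r'D:\ASMR'):
        print(rjcode, folder_path)  # e.g. RJ123456 D:\ASMR\RJ123456
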
diff --git a/scraper/__init__.py b/scraper/__init__.py
new file mode 100644
index 0000000..0ed8e6c
--- /dev/null
+++ b/scraper/__init__.py
@@ -0,0 +1,5 @@
+from scraper.cached_scraper import CachedScraper
+from scraper.dlsite import Dlsite
+from scraper.locale import Locale
+from scraper.scraper import Scraper
+from scraper.work_metadata import WorkMetadata
diff --git a/scraper/cached_scraper.py b/scraper/cached_scraper.py
new file mode 100644
index 0000000..d1d9229
--- /dev/null
+++ b/scraper/cached_scraper.py
@@ -0,0 +1,29 @@
+import json
+
+from scraper.db import db, WorkMetadataCache
+from scraper.locale import Locale
+from scraper.scraper import Scraper
+from scraper.work_metadata import WorkMetadata
+
+
+class CachedScraper(Scraper):
+ def __init__(self, locale: Locale, proxies=None, connect_timeout: int = 10, read_timeout: int = 10, sleep_interval=3):
+ super().__init__(locale, proxies, connect_timeout, read_timeout, sleep_interval)
+ db.connect()
+ db.create_tables([WorkMetadataCache])
+
+ def __del__(self):
+ db.close()
+
+ def scrape_metadata(self, rjcode: str):
+ # Look it up in the database first
+ metadata_cache = WorkMetadataCache.get_or_none(WorkMetadataCache.rjcode == rjcode)
+ if metadata_cache:
+ # cache hit: return the metadata cached in the database
+ metadata: WorkMetadata = json.loads(metadata_cache.metadata)
+ return metadata
+ else:
+ # cache miss: scrape the metadata and cache it in the database
+ metadata = super().scrape_metadata(rjcode)
+ WorkMetadataCache.create(rjcode=rjcode, metadata=json.dumps(metadata, indent=2, ensure_ascii=False))
+ return metadata
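
Cache entries never expire, so stale metadata persists until removed by hand. Since WorkMetadataCache is a plain peewee model, the cache can be cleared with standard peewee calls (a sketch; the RJ code is hypothetical):

    from scraper.db import WorkMetadataCache

    WorkMetadataCache.delete_by_id('RJ123456')   # drop one cached work
    WorkMetadataCache.delete().execute()         # or drop the whole cache
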
diff --git a/scraper/db.py b/scraper/db.py
new file mode 100644
index 0000000..f2c27e0
--- /dev/null
+++ b/scraper/db.py
@@ -0,0 +1,11 @@
+from peewee import *
+
+db = SqliteDatabase('cache.db')
+
+
+class WorkMetadataCache(Model):
+ rjcode = CharField(primary_key=True)
+ metadata = TextField()
+
+ class Meta:
+ database = db # This model uses the "cache.db" database.
diff --git a/scraper/dlsite.py b/scraper/dlsite.py
new file mode 100644
index 0000000..4b79fca
--- /dev/null
+++ b/scraper/dlsite.py
@@ -0,0 +1,54 @@
+import re
+from typing import Final
+from urllib.parse import unquote
+
+from scraper.locale import Locale
+from scraper.langs import EN_US, JA_JP, KO_KR, ZH_CN, ZH_TW
+from scraper.translation import Translation
+
+
+# Load the translation tables
+def _load_translations():
+ translations: dict[Locale, Translation] = {
+ Locale.en_us: EN_US,
+ Locale.ja_jp: JA_JP,
+ Locale.ko_kr: KO_KR,
+ Locale.zh_cn: ZH_CN,
+ Locale.zh_tw: ZH_TW,
+ }
+ return translations
+
+
+class Dlsite(object):
+ TRANSLATIONS: Final = _load_translations()
+ RJCODE_PATTERN: Final = re.compile(r'RJ(\d{6})(?!\d+)')
+ RGCODE_PATTERN: Final = re.compile(r'RG(\d{5})(?!\d+)')
+ SRICODE_PATTERN: Final = re.compile(r'SRI(\d{10})(?!\d+)')
+
+ # Extract the rjcode from a string
+ @staticmethod
+ def parse_rjcode(string: str):
+ match = Dlsite.RJCODE_PATTERN.search(string.upper())
+ if match:
+ return match.group()
+ else:
+ return None
+
+ # Build the doujin work page URL from an rjcode
+ @staticmethod
+ def compile_work_page_url(rjcode: str):
+ return f'https://www.dlsite.com/maniax/work/=/product_id/{rjcode}.html'
+
+ # Parse the parameters carried in a DLSite URL (the DLSite server uses mod_rewrite for SEO-friendly URLs)
+ @staticmethod
+ def parse_url_params(url: str):
+ url = unquote(url)
+ split_url = url.split(r'/=/', 1)
+ params_str = split_url[1] if len(split_url) == 2 else ''
+ params_str_1 = re.sub(r'\?.*$', '', params_str, count=1)
+ params_str_2 = re.sub(r'(\.html)?/?$', '', params_str_1) # strip the trailing .html/ from the url
+ params_list = params_str_2.split('/')
+ params = {}
+ for i in range(0, len(params_list), 2):
+ params[params_list[i]] = params_list[i + 1] if i + 1 < len(params_list) else ''
+ return params
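
parse_url_params treats everything after '/=/' as alternating keys and values. For the maker links this project consumes in scraper.py:

    url = 'https://www.dlsite.com/maniax/circle/profile/=/maker_id/RG12345.html'
    Dlsite.parse_url_params(url)  # -> {'maker_id': 'RG12345'}
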
diff --git a/scraper/langs/__init__.py b/scraper/langs/__init__.py
new file mode 100644
index 0000000..8330f45
--- /dev/null
+++ b/scraper/langs/__init__.py
@@ -0,0 +1,5 @@
+from scraper.langs.en_us import EN_US
+from scraper.langs.ja_jp import JA_JP
+from scraper.langs.ko_kr import KO_KR
+from scraper.langs.zh_cn import ZH_CN
+from scraper.langs.zh_tw import ZH_TW
diff --git a/scraper/langs/en_us.py b/scraper/langs/en_us.py
new file mode 100644
index 0000000..13da2e8
--- /dev/null
+++ b/scraper/langs/en_us.py
@@ -0,0 +1,13 @@
+EN_US = {
+ "AGE": "Age",
+ "GENRE": "Genre",
+ "RELEASE_DATE": "Release date",
+ "SERIES_NAME": "Series name",
+ "PRODUCT_FORMAT": "Product format",
+ "EVENT": "Event",
+ "AUTHOR": "Author",
+ "SCENARIO": "Scenario",
+ "ILLUSTRATION": "Illustration",
+ "MUSIC": "Music",
+ "VOICE_ACTOR": "Voice Actor"
+}
diff --git a/scraper/langs/ja_jp.py b/scraper/langs/ja_jp.py
new file mode 100644
index 0000000..038ab55
--- /dev/null
+++ b/scraper/langs/ja_jp.py
@@ -0,0 +1,13 @@
+JA_JP = {
+ "AGE": "年齢指定",
+ "GENRE": "ジャンル",
+ "RELEASE_DATE": "販売日",
+ "SERIES_NAME": "シリーズ名",
+ "PRODUCT_FORMAT": "作品形式",
+ "EVENT": "イベント",
+ "AUTHOR": "作者",
+ "SCENARIO": "シナリオ",
+ "ILLUSTRATION": "イラスト",
+ "MUSIC": "音楽",
+ "VOICE_ACTOR": "声優"
+}
diff --git a/scraper/langs/ko_kr.py b/scraper/langs/ko_kr.py
new file mode 100644
index 0000000..9beacf9
--- /dev/null
+++ b/scraper/langs/ko_kr.py
@@ -0,0 +1,13 @@
+KO_KR = {
+ "AGE": "연령 지정",
+ "GENRE": "장르",
+ "RELEASE_DATE": "판매일",
+ "SERIES_NAME": "시리즈명",
+ "PRODUCT_FORMAT": "작품 형식",
+ "EVENT": "이벤트",
+ "AUTHOR": "저자",
+ "SCENARIO": "시나리오",
+ "ILLUSTRATION": "일러스트",
+ "MUSIC": "음악",
+ "VOICE_ACTOR": "성우"
+}
diff --git a/scraper/langs/zh_cn.py b/scraper/langs/zh_cn.py
new file mode 100644
index 0000000..c5510f9
--- /dev/null
+++ b/scraper/langs/zh_cn.py
@@ -0,0 +1,13 @@
+ZH_CN = {
+ "AGE": "年龄指定",
+ "GENRE": "分类",
+ "RELEASE_DATE": "贩卖日",
+ "SERIES_NAME": "系列名",
+ "PRODUCT_FORMAT": "作品类型",
+ "EVENT": "活动",
+ "AUTHOR": "作者",
+ "SCENARIO": "剧情",
+ "ILLUSTRATION": "插画",
+ "MUSIC": "音乐",
+ "VOICE_ACTOR": "声优"
+}
diff --git a/scraper/langs/zh_tw.py b/scraper/langs/zh_tw.py
new file mode 100644
index 0000000..3cfa1d8
--- /dev/null
+++ b/scraper/langs/zh_tw.py
@@ -0,0 +1,13 @@
+ZH_TW = {
+ "AGE": "年齡指定",
+ "GENRE": "分類",
+ "RELEASE_DATE": "販賣日",
+ "SERIES_NAME": "系列名",
+ "PRODUCT_FORMAT": "作品形式",
+ "EVENT": "活動",
+ "AUTHOR": "作者",
+ "SCENARIO": "劇本",
+ "ILLUSTRATION": "插畫",
+ "MUSIC": "音樂",
+ "VOICE_ACTOR": "聲優"
+}
diff --git a/scraper/locale.py b/scraper/locale.py
new file mode 100644
index 0000000..0135137
--- /dev/null
+++ b/scraper/locale.py
@@ -0,0 +1,10 @@
+from enum import Enum
+
+
+# Enum of the locales supported by the scraper
+class Locale(Enum):
+ en_us = 'en_us'
+ ja_jp = 'ja_jp'
+ ko_kr = 'ko_kr'
+ zh_cn = 'zh_cn'
+ zh_tw = 'zh_tw'
diff --git a/scraper/scraper.py b/scraper/scraper.py
new file mode 100644
index 0000000..491ae24
--- /dev/null
+++ b/scraper/scraper.py
@@ -0,0 +1,140 @@
+import re
+import time
+from urllib.request import getproxies
+
+import requests
+from pyquery import PyQuery as pq
+
+from scraper.dlsite import Dlsite
+from scraper.locale import Locale
+from scraper.work_metadata import WorkMetadata
+
+
+def _getproxies():
+ """
+ Get the system proxy settings
+ """
+ proxies = getproxies()
+ # https://github.com/psf/requests/issues/5943
+ https_proxy = proxies.get('https', None)
+ http_proxy = proxies.get('http', None)
+ if https_proxy and https_proxy.startswith(r'https://'):
+ proxies['https'] = http_proxy
+ return proxies
+
+
+class Scraper(object):
+ def __init__(self, locale: Locale, proxies=None, connect_timeout: int = 10, read_timeout: int = 10, sleep_interval=3):
+ self.__locale = locale
+ self.__connect_timeout = connect_timeout
+ self.__read_timeout = read_timeout
+ self.__sleep_interval = sleep_interval
+ if not proxies:
+ # fall back to the system proxies
+ proxies = _getproxies()
+ self.__proxies = proxies
+
+ def __request_work_page(self, rjcode: str):
+ url = Dlsite.compile_work_page_url(rjcode)
+ params = {'locale': self.__locale.name}
+ response = requests.get(url,
+ params,
+ timeout=(self.__connect_timeout, self.__read_timeout),
+ proxies=self.__proxies)
+ response.raise_for_status() # raises HTTPError if the response carries an unsuccessful status code
+ html = response.text
+ time.sleep(self.__sleep_interval)
+ return html
+
+ def __parse_metadata(self, html: str, rjcode: str):
+ d = pq(html)
+ metadata: WorkMetadata = {
+ 'rjcode': rjcode,
+ 'work_name': '',
+ 'maker_id': '',
+ 'maker_name': '',
+ 'release_date': '',
+ 'series_name': '',
+ 'series_id': '',
+ 'age_category': '',
+ 'tags': [],
+ 'cvs': []
+ }
+
+ # parse work_name
+ work_name = d('#work_name').text()
+ metadata['work_name'] = work_name
+
+ # parse maker_name
+ maker_anchor_element = d('span.maker_name > a')
+ maker_name = maker_anchor_element.text()
+ metadata['maker_name'] = maker_name
+
+ # parse maker_id
+ maker_url = maker_anchor_element.attr('href')
+ maker_id = Dlsite.parse_url_params(maker_url).get('maker_id', '')
+ if Dlsite.RGCODE_PATTERN.fullmatch(maker_id):
+ metadata['maker_id'] = maker_id
+
+ translation = Dlsite.TRANSLATIONS[self.__locale]
+ table_rows = d('#work_outline > tr').items()
+ for table_row in table_rows:
+ table_header: str = table_row.children('th').text()
+ table_data = table_row.children('td')
+ # parse release_date
+ if table_header == translation['RELEASE_DATE']:
+ release_url = table_data.children('a').attr('href')
+ if release_url is not None:
+ parse_result = Dlsite.parse_url_params(release_url)
+ year = parse_result.get('year', '')
+ mon = parse_result.get('mon', '')
+ day = parse_result.get('day', '')
+ if year and mon and day:
+ release_date = f'{year}-{mon}-{day}'
+ metadata['release_date'] = release_date
+ # parse series_id & series_name
+ elif table_header == translation['SERIES_NAME']:
+ series_anchor_element = table_data.children('a')
+ series_name = series_anchor_element.text()
+ series_url = series_anchor_element.attr('href')
+ metadata['series_name'] = series_name
+ keyword_work_name = Dlsite.parse_url_params(series_url).get('keyword_work_name', '')
+ split_keyword_work_name = keyword_work_name.split('+')
+ if len(split_keyword_work_name) == 2:
+ series_id = split_keyword_work_name[1]
+ if Dlsite.SRICODE_PATTERN.fullmatch(series_id):
+ metadata['series_id'] = series_id
+ # parse age_category
+ elif table_header == translation['AGE']:
+ age_icon_element = table_data.find('span')
+ age_icon_name = age_icon_element.attr('class')
+ if age_icon_name and re.fullmatch(r'icon_(GEN|R15|ADL)', age_icon_name): # attr() may return None
+ if age_icon_name == 'icon_GEN':
+ metadata['age_category'] = 'GEN'
+ elif age_icon_name == 'icon_R15':
+ metadata['age_category'] = 'R15'
+ else:
+ metadata['age_category'] = 'ADL'
+ # parse tags
+ elif table_header == translation['GENRE']:
+ genre_anchor_elements = table_data.children('div.main_genre').children('a').items()
+ for genre_anchor_element in genre_anchor_elements:
+ genre_name = genre_anchor_element.text()
+ if genre_name:
+ metadata['tags'].append(genre_name)
+ # parse cvs
+ elif table_header == translation['VOICE_ACTOR']:
+ cv_anchor_elements = table_data.children('a').items()
+ for cv_anchor_element in cv_anchor_elements:
+ cv_name = cv_anchor_element.text()
+ if cv_name:
+ metadata['cvs'].append(cv_name)
+ return metadata
+
+ def scrape_metadata(self, rjcode: str):
+ rjcode = rjcode.upper()
+ if not Dlsite.RJCODE_PATTERN.fullmatch(rjcode):
+ raise ValueError
+ html = self.__request_work_page(rjcode)
+ metadata = self.__parse_metadata(html, rjcode)
+ return metadata
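
End to end, scrape_metadata validates the RJ code, fetches the work page in the configured locale, and parses it into a WorkMetadata dict; a sketch (the RJ code is hypothetical and must exist on dlsite.com):

    from scraper import Locale, Scraper

    scraper = Scraper(locale=Locale.zh_cn)
    metadata = scraper.scrape_metadata('RJ123456')
    print(metadata['work_name'], metadata['maker_name'], metadata['cvs'])
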
diff --git a/scraper/translation.py b/scraper/translation.py
new file mode 100644
index 0000000..89576be
--- /dev/null
+++ b/scraper/translation.py
@@ -0,0 +1,16 @@
+from typing import TypedDict
+
+
+# Translations of the field labels on a work page
+class Translation(TypedDict):
+ AGE: str # age rating
+ GENRE: str # genre
+ RELEASE_DATE: str # release date
+ SERIES_NAME: str # series name
+ PRODUCT_FORMAT: str # product format
+ EVENT: str # event
+ AUTHOR: str # author
+ SCENARIO: str # scenario
+ ILLUSTRATION: str # illustration
+ MUSIC: str # music
+ VOICE_ACTOR: str # voice actor
diff --git a/scraper/work_metadata.py b/scraper/work_metadata.py
new file mode 100644
index 0000000..80e549c
--- /dev/null
+++ b/scraper/work_metadata.py
@@ -0,0 +1,15 @@
+from typing import TypedDict
+
+
+# Metadata of a doujin work
+class WorkMetadata(TypedDict):
+ rjcode: str
+ work_name: str
+ maker_id: str
+ maker_name: str
+ release_date: str
+ series_id: str
+ series_name: str
+ age_category: str
+ tags: list[str]
+ cvs: list[str]
diff --git a/wx_log_handler.py b/wx_log_handler.py
new file mode 100644
index 0000000..1598299
--- /dev/null
+++ b/wx_log_handler.py
@@ -0,0 +1,41 @@
+import logging
+
+import wx
+import wx.lib.newevent
+
+# create event type
+wxLogEvent, EVT_WX_LOG_EVENT = wx.lib.newevent.NewEvent()
+
+
+class WxLogHandler(logging.Handler):
+ """
+ A handler class which sends log strings to a wx object
+ https://stackoverflow.com/a/2820928
+ """
+
+ def __init__(self, wx_dest: wx.Window):
+ """
+ Initialize the handler
+ @param wx_dest: the destination object to post the event to
+ """
+ logging.Handler.__init__(self)
+ self.__wxDest = wx_dest
+ self.level = logging.DEBUG
+
+ def flush(self):
+ """
+ does nothing for this handler
+ """
+
+ def emit(self, record):
+ """
+ Emit a record.
+ """
+ try:
+ msg = self.format(record)
+ evt = wxLogEvent(message=msg, levelno=record.levelno)
+ wx.PostEvent(self.__wxDest, evt)
+ except (KeyboardInterrupt, SystemExit) as err:
+ raise err
+ except Exception:
+ self.handleError(record)
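
The handler only posts events; rendering happens wherever EVT_WX_LOG_EVENT is bound, as main.py does. A sketch of the wiring (text_ctrl and on_log_event as defined in main.py):

    import logging

    handler = WxLogHandler(text_ctrl)  # any wx.Window destination
    handler.setFormatter(logging.Formatter('%(asctime)s - %(message)s'))
    logging.getLogger('Renamer').addHandler(handler)
    text_ctrl.Bind(EVT_WX_LOG_EVENT, on_log_event)  # reads event.message / event.levelno
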