
Mounting Quark drive: memory blows up when deleting many files #7088

Closed · 4 tasks done
PHCSJC opened this issue Aug 27, 2024 · 24 comments · Fixed by #7123
Labels
bug Something isn't working

Comments

@PHCSJC

PHCSJC commented Aug 27, 2024

Please make sure of the following things

  • I have read the documentation.

  • I'm sure there are no duplicate issues or discussions.

  • I'm sure it's due to AList and not something else (such as network, dependencies, or operations).

  • I'm sure this issue is not fixed in the latest version.

AList Version / AList 版本

v3.36.0

Driver used / 使用的存储驱动

Quark

Describe the bug / 问题描述

With Quark mounted, deleting a large number of files blows up memory: usage climbs past 2 GB until the host freezes.

Reproduction / 复现链接

Upload or save 100+ small files to Quark drive, mount AList with any WebDAV client, select all of them, and delete. Watch the memory usage: even if it doesn't blow up, the extra memory is never reclaimed, and it climbs again on the next operation until it finally does.

Config / 配置

{
"force": false,
"site_url": "",
"cdn": "",
"jwt_secret": "aaaaaaaa",
"token_expires_in": 48,
"database": {
"type": "sqlite3",
"host": "",
"port": 0,
"user": "",
"password": "",
"name": "",
"db_file": "data/data.db",
"table_prefix": "x_",
"ssl_mode": "",
"dsn": ""
},
"meilisearch": {
"host": "http://localhost:7700",
"api_key": "",
"index_prefix": ""
},
"scheme": {
"address": "0.0.0.0",
"http_port": 5244,
"https_port": -1,
"force_https": false,
"cert_file": "",
"key_file": "",
"unix_file": "",
"unix_file_perm": ""
},
"temp_dir": "data/temp",
"bleve_dir": "data/bleve",
"dist_dir": "",
"log": {
"enable": true,
"name": "data/log/log.log",
"max_size": 50,
"max_backups": 30,
"max_age": 28,
"compress": false
},
"delayed_start": 0,
"max_connections": 0,
"tls_insecure_skip_verify": true,
"tasks": {
"download": {
"workers": 5,
"max_retry": 1
},
"transfer": {
"workers": 5,
"max_retry": 2
},
"upload": {
"workers": 5,
"max_retry": 0
},
"copy": {
"workers": 5,
"max_retry": 2
}
},
"cors": {
"allow_origins": [
""
],
"allow_methods": [
"*"
],
"allow_headers": [
"*"
]
},
"s3": {
"enable": false,
"port": 5246,
"ssl": false
}
}

Logs / 日志

No response

@PHCSJC PHCSJC added the bug Something isn't working label Aug 27, 2024

welcome bot commented Aug 27, 2024

Thanks for opening your first issue here! Be sure to follow the issue template!

@PHCSJC
Author

PHCSJC commented Aug 30, 2024

Follow-up: after rolling back from v3.36.0 to v3.25.1, everything unexpectedly works. With Docker memory limited to 700 MB, actual usage never exceeds 350 MB. No configuration changes were made.

@PHCSJC
Author

PHCSJC commented Aug 31, 2024

Another follow-up: memory is normal on v3.25.1, but Quark video playback now takes 2-3 seconds to buffer, probably because the driver is too old. So I walked back from the latest v3.36.0 version by version and found that at v3.34.0 both memory and Quark speed are normal. With the cache expiration set to 1 minute, memory basically stays under 300 MB.

{
  "force": false,
  "site_url": "",
  "cdn": "",
  "jwt_secret": "xxxxxxxxxxx",
  "token_expires_in": 48,
  "database": {
    "type": "sqlite3",
    "host": "",
    "port": 0,
    "user": "",
    "password": "",
    "name": "",
    "db_file": "data/data.db",
    "table_prefix": "x_",
    "ssl_mode": "",
    "dsn": ""
  },
  "meilisearch": {
    "host": "http://localhost:7700",
    "api_key": "",
    "index_prefix": ""
  },
  "scheme": {
    "address": "0.0.0.0",
    "http_port": 5244,
    "https_port": -1,
    "force_https": false,
    "cert_file": "",
    "key_file": "",
    "unix_file": "",
    "unix_file_perm": ""
  },
  "temp_dir": "data/temp",
  "bleve_dir": "data/bleve",
  "dist_dir": "",
  "log": {
    "enable": true,
    "name": "data/log/log.log",
    "max_size": 50,
    "max_backups": 30,
    "max_age": 28,
    "compress": false
  },
  "delayed_start": 0,
  "max_connections": 0,
  "tls_insecure_skip_verify": true,
  "tasks": {
    "download": {
      "workers": 5,
      "max_retry": 1
    },
    "transfer": {
      "workers": 5,
      "max_retry": 2
    },
    "upload": {
      "workers": 5,
      "max_retry": 0
    },
    "copy": {
      "workers": 5,
      "max_retry": 2
    }
  },
  "cors": {
    "allow_origins": [
      "*"
    ],
    "allow_methods": [
      "*"
    ],
    "allow_headers": [
      "*"
    ]
  },
  "s3": {
    "enable": false,
    "port": 5246,
    "ssl": false
  }
}

@PHCSJC
Author

PHCSJC commented Aug 31, 2024

v3.34.0 is remarkable: after sitting idle for a while, memory usage drops to just 39 MB.
(screenshot)

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

Based on the above, after a quick read of the code I suspect one of:

  • a go-resty memory leak
  • a sqlite driver memory leak
  • a memory leak somewhere in the HTTP server
  • no leak at all, just heavy memory use under highly concurrent operations

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

I'm trying to reproduce locally. Environment: Linux amd64, k3s v1.30.4+k3s1 (containerd), image: mmx233/alist:v3.36.0-beta2-ffmpeg (test image for PR #7073), database: MySQL 8.2.0, driver: Quark-UC, monitor: Netdata latest.

Used dd to generate 100 files of 2 MB each, copied them from local to Quark, and ran the delete six minutes later.

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

Could not reproduce the problem described in this issue in this environment.

Upload:

(screenshot)

Delete:

(screenshot)

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

Created a dedicated test container and switched to database: sqlite, image: xhofe/alist:v3.36.0, configs: all default.

Listing files through the Quark driver has issues, so the bulk delete could not be completed cleanly, but it is fairly safe to say that no unreasonable memory usage occurred.

(screenshot)

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

The metric shown in the monitoring charts above is the sum of RES (the process's resident physical memory) and SHR (shared memory).

(screenshot)

@Mmx233
Contributor

Mmx233 commented Aug 31, 2024

As a supplement, mounting WebDAV on Windows and deleting 100 files from the Local storage driver (test environment 1, MySQL) also failed to reproduce the problem:

(screenshot)

None of the tests above reproduced the high memory usage or the unreclaimed memory. Please provide more details or local test results.

@PHCSJC
Author

PHCSJC commented Sep 1, 2024

Thanks for the reply. It's probably an environment difference. On my side, v3.36.0 blows up the 700 MB Docker limit within 2 minutes after just a few refreshes, while v3.34.0 really is fine. My reproduction steps in detail:

1. The system is Debian 11, with AList installed via Docker:

docker run -d \
--restart=unless-stopped \
--net=macnet --ip=192.168.2.5 \
-v /opt/alist:/opt/alist/data \
-e PUID=0 \
-e PGID=0 \
-e UMASK=022 \
--name="alist" \
xhofe/alist:latest

Then limit memory to 700 MB:

docker update alist --memory-swap -1 -m 700M

2. After installation, mount Quark drive at /quark with local proxy, fill in the cookie, and leave everything else at defaults.

3. Save any TV series to the Quark drive, e.g. one with 100 episodes.

4. On a phone, install the 流舟文件 app (free), mount AList over WebDAV (I log in as admin directly), open /quark, enter the series folder, and pull down to refresh. Thumbnail generation starts, and AList's memory keeps climbing and is basically never reclaimed. If it doesn't blow up, close the app and repeat; within about 2 minutes memory reaches 700 MB and blows up, and the AList process restarts at a few tens of MB.

@Mmx233
Contributor

Mmx233 commented Sep 1, 2024

AList itself only generates thumbnails locally for local storage. If thumbnailing is a feature of the 流舟文件 app, that is very likely the main cause of the high memory usage.

@PHCSJC
Author

PHCSJC commented Sep 1, 2024

@Mmx233 Other operations blow up memory too; that was just one example. For instance, selecting 100 files and deleting them together also blows up, while everything is normal on v3.34.0.

@Mmx233
Contributor

Mmx233 commented Sep 1, 2024

Since I still can't reproduce this locally, please debug on your side:

  1. When creating the Docker container, specify the command and args: docker run <image_name> <command> <args>, i.e. docker run <image_name> /opt/alist/alist server --no-prefix --debug
  2. Try to open the pprof allocs debug page: http(s)://YOUR_DOMAIN/debug/pprof/allocs?debug=1. If you see the AList UI instead, step 1 did not take effect.
  3. Perform some operations so that memory spikes without crashing the process.
  4. Refresh the page from step 2 and save its full contents to a txt file.
  5. Open http(s)://YOUR_DOMAIN/debug/gc; the page should show "ok".
  6. Repeat step 4.
  7. Name the two txt files after their step numbers and upload them to this issue.

@PHCSJC
Author

PHCSJC commented Sep 1, 2024

@Mmx233 The first txt was saved when memory was at 500+ MB, the second at 800+ MB.
1.txt
2.txt

@Mmx233
Contributor

Mmx233 commented Sep 1, 2024

According to the profile, 80% of the memory is consumed by net.NewBuffer in the WebDAV HTTP server.

Try mmx233/alist:v3.36.0-gamma1-ffmpeg and see whether it still leaks. The image is built from Mmx233@52d9023.

@PHCSJC
Author

PHCSJC commented Sep 1, 2024

@Mmx233 Just tried it; it still leaks.

@Mmx233
Contributor

Mmx233 commented Sep 1, 2024

High memory usage and memory not being reclaimed after more than five minutes are two separate problems. Please confirm whether both are still present.

@Muione
Contributor

Muione commented Sep 1, 2024

4. On a phone, install the 流舟文件 app (free), mount AList over WebDAV (I log in as admin directly), open /quark, enter the series folder, and pull down to refresh. Thumbnail generation starts, and AList's memory keeps climbing and is basically never reclaimed. If it doesn't blow up, close the app and repeat; within about 2 minutes memory reaches 700 MB and blows up, and the AList process restarts at a few tens of MB.

From this description, the 流舟文件 app apparently downloads the files automatically and generates the thumbnails on the phone.

  • Are you reverse-proxying AList with nginx? (Without the Range header handled correctly, AList downloads the entire file and holds it in memory.)

@PHCSJC
Author

PHCSJC commented Sep 2, 2024

@Mmx233 With mmx233/alist:v3.36.0-gamma1-ffmpeg, memory reached 400+ MB and was still at 400+ MB after 12 hours of idling; nothing was reclaimed. On v3.34.0 it would drop back below 50 MB.

@Muione No reverse proxy; I access port 5244 directly. Opening the 流舟文件 app, I can watch the traffic: AList is indeed downloading from Quark, and memory spikes. The point is that v3.34.0 has none of these problems: memory never climbs unbounded, basically stays under 300 MB, and is reclaimed to under 50 MB after a few dozen minutes of idling.

@Mmx233
Contributor

Mmx233 commented Sep 2, 2024

Sorry, I pasted the wrong image tag earlier. The images with the fix patch are v3.36.0-gamma2 and v3.36.0-gamma2-ffmpeg.

@PHCSJC
Author

PHCSJC commented Sep 2, 2024

@PHCSJC Tried mmx233/alist:v3.36.0-gamma2: it's normal now, almost identical to v3.34.0. Memory no longer climbs continuously and is automatically reclaimed to under 50 MB.

@Mmx233
Contributor

Mmx233 commented Sep 2, 2024

Is peak memory usage also low? I can see that the server buffer's size and reuse still have room for optimization.

@PHCSJC
Author

PHCSJC commented Sep 2, 2024

@Mmx233 Peak never exceeded 300 MB.

xhofe pushed a commit that referenced this issue Sep 3, 2024
* chore(webdav): fix warnings in HttpServe

* fix(webdav): HttpServe memory leak
Three-taile-dragon pushed a commit to loognsss/blist that referenced this issue Sep 26, 2024
* chore(webdav): fix warnings in HttpServe

* fix(webdav): HttpServe memory leak
9 participants
@PHCSJC @Mmx233 @Muione and others