Active probing weakness found in the Xray implementation of Shadowsocks #625
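Solution. First of all, thank you for your work. This is a protocol boundary probing problem that we already know about, and for this class of problems my plan is a global error->drain. I previously found several vulnerabilities in the current VMess AEAD protocol and reported them to the v2fly team, together with the proposal for a global error->drain: v2fly/v2ray-core#940 (comment). About the Shadowsocks protocol: Xray-core's Shadowsocks implementation was originally inherited from v2ray-core, and on top of that I made:
The fourth point is still in progress, which is also why I made the following proposals at shadowsocks-org:
In addition, as is well known, I also found two vulnerabilities / weaknesses in the current Shadowsocks & AEAD:
Overall, I think Shadowsocks is already a protocol riddled with holes. Fundamentally, it lacks modern forward secrecy and cannot fully defend against replay attacks. (In fact, Xray-core's Shadowsocks has no bloom filter either, because that approach treats the symptoms rather than the cause; Victoria Raymond presumably saw this too. Also, SS is looking into whether to remove the bloom filter: shadowsocks/shadowsocks-rust#556.) Furthermore, the design that runs through all of Shadowsocks, in which the traffic in the two directions is independent, has led to a pile of avoidable vulnerabilities, and 0-RTT brings a pile of side effects that leave one in a dilemma. It also has to be said that running unknown traffic over TCP and UDP on the same port can be identified precisely. The Shadowsocks protocol has historical significance, so Xray-core will continue to stay compatible with it as far as possible. In the long run, however, I lean toward designing a Securesocks as a replacement.
I hope to see these improvements in your Shadowsocks implementation soon.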
Dear GFW Report, Thanks for your work. After a simple code review, I think this problem is caused by my wrong code. I will do some tests and try to fix this problem in my free time. Yours sincerely, |
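On one of my servers, the Shadowsocks-libev port that uses a weak password has not been blocked, while the port where Shadowsocks is served by Xray-core has been blocked. This seems to be indirect evidence of the impact of this problem. Next, I will replace Xray-core on all my servers with the version patched by #629.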
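@wc7086 More information is needed, such as how heavily it is used. @gfw-report The problem may have been caused by the original "drain based on the first user's information" behavior being removed when the API for dynamically adding and removing users was added to SS (I previously thought the problem lay further on). Could you please test whether #629 fixes it? Thanks. @maker2002 Waiting for testing.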
@gfw-report BTW, if needed, I can add an active-probing behavior recorder to Xray-core's Shadowsocks and VMess protocols, with byte-level timing data.
It seems that the problem has not been fixed completely. We will be more than happy to help with more testing.
That would be great for monitoring active probing against Xray-core in the future. The following design may be sufficient to alert us when active probing occurs: ./xray --enable-probe-logging probe.csv --whitelist 1.1.1.1 2.2.2.2/24 where … The priority of this feature can be …
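The proposed logging behavior can be sketched in a few lines of Python. This is only an illustration of the idea, not Xray code: the flags in the command above are a proposal, and the function names, CSV columns, and the `reaction` label below are all assumptions made for the example.

```python
import csv
import io
import ipaddress
import time


def parse_whitelist(entries):
    """Turn strings such as '1.1.1.1' or '2.2.2.2/24' into network objects."""
    # strict=False accepts entries with host bits set, e.g. 2.2.2.2/24.
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_whitelisted(remote_ip, whitelist):
    """Return True when remote_ip falls inside any whitelisted network."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in whitelist)


def log_probe(writer, whitelist, remote_ip, payload_len, reaction, now=None):
    """Write one CSV row for a rejected connection unless the peer is whitelisted.

    The column layout (timestamp, ip, payload length, reaction) is hypothetical.
    Returns True when a row was written.
    """
    if is_whitelisted(remote_ip, whitelist):
        return False
    timestamp = time.time() if now is None else now
    writer.writerow([f"{timestamp:.3f}", remote_ip, payload_len, reaction])
    return True
```

Whitelisting the operator's own client addresses keeps ordinary failed logins out of the log, so any remaining rows are more likely to be probes.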
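That server's 300 GB monthly traffic allowance is almost entirely used up every month. The traffic going through xray-core's SS is no more than one tenth of the month's total, and port 443 was blocked before the xray-core SS port was. At least eight people use ss-libev. Before May, only the ss-libev and SSH ports on that server were open to the outside, and ss-libev had been running without interruption on the same port with the same password for nearly two years.
It seems I should stop using Xray-core's Shadowsocks implementation.
@wc7086 What encryption method does ss-libev use?
Both use Chacha20-ietf-poly1305.
@RPRX Is there any progress on a fix for this issue? Thanks!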
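Recently, while looking for material online, I stumbled upon this issue. It seems to say that xray's SS has a vulnerability, and that the SS servers some people set up have been blocked (port blocking). I don't really understand code, packet capture, or debugging; I'm just an ordinary user who sets up servers to get around the firewall.

I recently set up SS on xray (behind WARP plus HAProxy load balancing). It has run stably for 5 days and handled more than 43 GB of data. In the HAProxy logs I can see active probes from GFW IPs (not many, perhaps two or three every few hours), yet it has held up all these days without a problem. So I'd like to know: has the GFW suddenly shown mercy and spared my server? Can it not detect the encryption I use? Or has this way of setting up xray + SS accidentally defeated the GFW's detection? (I have searched all over and have not seen anyone set it up this way.) I can't read packets or code, so if my "accidental" theory makes you laugh, please forgive me; I don't understand it, so I'm bringing it up for discussion.

This is the configuration of the SS I set up in xray: encryption aes-256-gcm; password: 11 digits; no obfuscation; no plugins; TCP only; xray 1.42.

This is how I set it up. I have 6 servers (all foreign VPSes, no domestic relay). All 6 run xray + SS with the same configuration, and all are behind CF WARP. On one server (Server A) I opened port 9999 to take the hits. HAProxy listens on 9999 with load balancing (balance roundrobin) and forwards data arriving on 9999 to SS port 60000 on the 6 servers behind it (including Server A's localhost). On every server, iptables is configured so that port 60000 accepts no connections except from the CF WARP range 8.0.0.0/8. Port 9999 has no restriction at all (my IP is dynamic, so if I restricted it, even my own client could not reach the server from outside).

My current situation: I connect with SS to Server A:9999, browsing works normally, and the speed is not slow. After five days there has been no interference or port blocking, and the load balancing really works (6 IPs in rotation).

Because all my servers are behind WARP and the SS port only accepts WARP IPs, the client IPs logged on each server are all 8.x.x.x. I have seen GFW IPs scanning me. (I identified them by taking the times of the errors in the xray log and finding, in the HAProxy log for the same period, the IPs that were not my own.) Whenever a GFW IP scans me, one of these three errors appears:

2021/07/30 15:21:04 8.37.43.15:55436 rejected proxy/shadowsocks: failed to match an user > cipher: message authentication failed
2021/07/30 14:49:44 8.37.43.15:59740 rejected proxy/shadowsocks: failed to read 50 bytes > read tcp xxx.xxx.xxx.xxx:60000->8.37.43.15:59740: i/o timeout
2021/07/30 23:17:22 8.39.127.139:50422 rejected proxy/shadowsocks: failed to read 50 bytes > unexpected EOF

These three errors are spread across the 6 servers; all 6 have them, but such scans are few and infrequent.

Could it be that the server I use to take the hits (Server A) has survived for 5 days because WARP disrupted the GFW's active probing?

By "looking for material" I mean that I wanted to find out whether anyone online had configured SS the same way and, if so, whether it resists blocking, but I could not find a setup like mine. Of course my "blocking resistance" may be wishful thinking, and Server A might be blocked tomorrow. I don't understand this, so I have raised a silly topic for discussion; again, please don't laugh.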
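For anyone who wants to count such rejections rather than read them by eye, here is a small Python sketch that classifies log lines of the form quoted above. It assumes the timestamped line layout shown in this comment; the category names are made up for the example and are not Xray terminology.

```python
import re

# Matches lines such as:
# 2021/07/30 23:17:22 8.39.127.139:50422 rejected proxy/shadowsocks: failed to read 50 bytes > unexpected EOF
LINE_RE = re.compile(
    r"^(?P<when>\S+ \S+) (?P<ip>\S+):(?P<port>\d+) rejected\s+"
    r"proxy/shadowsocks: (?P<detail>.*)$"
)


def classify_rejection(line):
    """Return (client_ip, category) for a rejected-connection line, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    detail = match.group("detail")
    if "message authentication failed" in detail:
        category = "auth-failed"      # enough bytes arrived, but no user matched
    elif "i/o timeout" in detail:
        category = "read-timeout"     # too few bytes, server waited until timeout
    elif "unexpected EOF" in detail:
        category = "early-eof"        # peer closed before sending enough bytes
    else:
        category = "other"
    return match.group("ip"), category
```

Note that behind WARP the logged client address is the WARP address, so the source of a probe still has to be recovered from the HAProxy log as described above.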
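@cwyin7788 Your approach does nothing to help resist protocol boundary probing…
Thanks for the reply. I saw an article saying that running SS with an IP whitelist can effectively stop the GFW's active probing, but since I have a dynamic IP and no relay, this was the only method I could think of. I don't know the underlying principles in depth, so I came here to ask. I hope my SS keeps holding up. 🙏🙏🙏🙏🙏🙏
During the period of my previous reply, ports were being blocked almost every day, and each time a port was blocked I changed the IP. The blocked IPs and ports were always concentrated on a few servers (these servers may have a fixed user base, possibly mostly phone and mobile users). Later I saw the clash16.1 vulnerability: https://github.com/Dreamacro/clash/issues/1468, and then asked all users to upgrade their client versions. Over the last 2-3 weeks the chance of a port being blocked has dropped a lot, to roughly once a week (so I suspect that being probed is also related to the client?). I hope my findings give @RPRX @gfw-report @AkinoKaede some directions and ideas for troubleshooting.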
v2ray-core's Shadowsocks implementation has fixed some issues. Could xray consider merging some of those updates as well? @AkinoKaede
Thanks again for your hard work!
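@moranno A bloom filter can prevent replay attacks, but as a large amount of data is added, the probability of false positives gradually rises. shadowsocks-rust has now removed its bloom filter, and V2Ray does not enable this feature by default either.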
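To make the false-positive growth concrete, the standard approximation for a bloom filter with m bits, k hash functions, and n inserted items is (1 - e^(-kn/m))^k. The Python sketch below evaluates it; the filter sizes in the usage note are arbitrary illustrative numbers, not the parameters of any Shadowsocks implementation.

```python
import math


def bloom_false_positive_rate(n_items, m_bits, k_hashes):
    """Approximate false-positive probability of a standard bloom filter.

    Uses the textbook estimate (1 - exp(-k * n / m)) ** k, which assumes
    independent, uniformly distributed hash functions.
    """
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes
```

For example, with a hypothetical filter of 1,000,000 bits and 4 hash functions, the rate after 500,000 inserted salts is far higher than after 100,000, which is the effect described above: a long-running server that keeps inserting salts will eventually reject legitimate connections as replays.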
@moranno The SS built into xray should not have this problem; the one with the problem should be ss rust.
@Johnny256Dawson I don't know whether it has been fixed by now.
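While the SS ports behind my front machine were all configured to accept connections only from WARP IPs (8.x.x.x), the setup ran for what must be more than a month without trouble. But when I deleted this "WARP IPs only" rule from iptables (so that any IP could connect), purely as a test, the SS receiving port on the front machine was blocked in less than a day. After I re-enabled the rule on the back-end machines, things were fine again. I am continuing to observe with the rule in place, but I don't dare try running SS without it again.
May I ask which client you use? I have not restricted incoming IPs and have not had a port blocked within 1 day; the timing of my port blocks is very irregular. You can see my blocking situation in this reply: #625 (comment)
shadowrocket, qx, v2ray-ng, and ssr+ on openwrt as well. I have had this setup for almost 2 months: the first month was normal, then I removed the rule and the port was blocked in about a day; I have now put the rule back and it has run for almost 2 weeks without trouble. Being blocked once a week is still quite frequent. I suggest you use aes-256-gcm; I have heard that chacha has become somewhat unreliable.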
Congratulations, #629 has finally been merged. @gfw-report Could you please test it again to see whether the problem has been solved?
Outline needs to read 50 bytes to authenticate users, too.
Thank you @AkinoKaede for spending time and effort trying to fix this problem. Our testing shows that the problem has not been fixed completely as of commit 63d0cb1. Specifically, the server still reacts inconsistently, as reported in #629 (comment). Below is how we tested it; you can try to reproduce it yourself.

Open the first terminal to build and run Xray:

git clone https://github.com/XTLS/Xray-core.git
cd Xray-core
go build -o xray -trimpath -ldflags "-s -w -buildid=" ./main
./xray < config.json

Open the second terminal to capture traffic:

sudo tcpdump -i lo port 12345

Open the third terminal to send our own probes.

Case 1: After receiving 1 byte of invalid data, the server waits for 60 seconds and then times out, sending a FIN+ACK to close the connection.

(python3 -c "print('a' * 1, end='')"; cat) | ncat -v localhost 12345

The server log shows:

[::1]:45826 rejected proxy/shadowsocks: failed to read 50 bytes > read tcp [::1]:12345->[::1]:45826: i/o timeout
[Info] [2353414061] app/proxyman/inbound: connection ends > proxy/shadowsocks: failed to create request from: [::1]:45826 > proxy/shadowsocks: failed to read 50 bytes > read tcp [::1]:12345->[::1]:45826: i/o timeout

Case 2: After receiving 50 bytes of invalid data, the server waits for 60 seconds and then times out, sending a FIN+ACK to close the connection:

(python3 -c "print('a' * 50, end='')"; cat) | ncat -v localhost 12345

The server log shows:

[Info] [1350332059] app/proxyman/inbound: connection ends > proxy/shadowsocks: failed to create request from: [::1]:45856 > proxy/shadowsocks: failed to match an user > proxy/shadowsocks: Not Found
[::1]:45856 rejected proxy/shadowsocks: failed to match an user > proxy/shadowsocks: Not Found

Note that in Cases 1 and 2, sending more bytes during the following 60 seconds does not refresh the server's timeout. This is good, because it rules out a new active probing attack based on a refreshed timeout.

Case 3: When receiving 500 bytes, the server will, with some probability, either 1) close the connection immediately with a FIN+ACK; or 2) wait until the 60-second timeout:

(python3 -c "print('a' * 500, end='')"; cat) | ncat -v localhost 12345

In both cases, the server log is the same:

[Info] [3382172859] app/proxyman/inbound: connection ends > proxy/shadowsocks: failed to create request from: [::1]:45864 > proxy/shadowsocks: failed to match an user > proxy/shadowsocks: Not Found
[::1]:45864 rejected proxy/shadowsocks: failed to match an user > proxy/shadowsocks: Not Found

Don't be discouraged that it is not fixed completely yet, @AkinoKaede: the problem can be hard to fix, and you have already done a great job of changing the active probing fingerprints. We would suggest making a new release. Although the active probing vulnerability is not fully fixed, it usually takes time for the censor to adapt to a new fingerprint, so a release will buy us more time to produce a complete fix. What do you think, @yuhan6665 ?
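The three cases can also be driven from a script instead of ncat. The following Python sketch sends a payload and reports how the connection ended and after how long; it is a rough stand-in for the manual procedure above, and the reaction labels ("fin", "rst", "no-close") are names chosen for this example. It cannot distinguish details that tcpdump shows, such as the exact flags on the closing segment.

```python
import socket
import time


def probe(host, port, payload, wait_seconds=70.0):
    """Send payload, then watch how the server ends the connection.

    Returns (reaction, elapsed_seconds), where reaction is "fin" (orderly
    close), "rst" (reset), or "no-close" (still open after wait_seconds).
    """
    start = time.monotonic()
    deadline = start + wait_seconds
    with socket.create_connection((host, port), timeout=wait_seconds) as sock:
        try:
            sock.sendall(payload)
            while True:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return "no-close", time.monotonic() - start
                sock.settimeout(remaining)
                if not sock.recv(4096):  # an empty read means the peer closed
                    return "fin", time.monotonic() - start
        except socket.timeout:
            return "no-close", time.monotonic() - start
        except (ConnectionResetError, BrokenPipeError):
            return "rst", time.monotonic() - start
```

Against the test server above, `probe("localhost", 12345, b"a" * 500)` corresponds to Case 3; running it repeatedly and comparing the elapsed times is one way to see whether the immediate-close and 60-second behaviors both occur.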
@gfw-report Thanks for your testing and detailed explanation. Will continue learning and try some work on the issue. Waiting release from @badO1a5A90. |
Hmm, I think the timeout is caused by default policy config. |
You are probably right that the timeout is controlled by the config. The timeout in Cases 1 and 2 is not the problem. In fact, our goal is to have the server always time out when it receives a probe. The problem is that the server behaves inconsistently in Case 3. We don't want the server to sometimes close the connection immediately and sometimes time out; we want it to always time out.
Thanks. Although I think this is the expected result, I try to make the server always timeout. https://github.com/AkinoKaede/Xray-core/tree/feat-shadowsocks-unlimited-drain |
I tested it on my computer and I think it works. |
We confirm that, as of commit https://github.com/AkinoKaede/Xray-core/commit/c136ff0bf554c6d7ad9b0f8f6e06ea4783b51529, the server always times out when receiving random probes of varied length. Specifically, we tested by sending random probes with lengths varying from 1 byte to 1500 bytes to the server, and the server sent a FIN/ACK to close each connection after 60 seconds. Thank you so much for your time and effort, @AkinoKaede! You made an awesome contribution to the community!
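A length sweep like this one is easy to summarize programmatically. The Python sketch below is one possible way to generate random payloads and check that every length produced the same kind of reaction. It assumes each measurement has already been recorded as a (payload length, reaction, elapsed seconds) tuple, with reaction being a label such as "fin" or "rst"; that tuple layout and the 1-second threshold are choices made for this example, not part of the test described above.

```python
import os


def random_probes(lengths):
    """Yield (length, payload) pairs of random bytes, one per requested length."""
    for n in lengths:
        yield n, os.urandom(n)


def summarize(results, immediate_threshold=1.0):
    """Group probe lengths by whether the server closed early or waited.

    results holds (payload_len, reaction, elapsed_seconds) tuples. A probe
    counts as "immediate-close" when the server closed or reset the connection
    in under immediate_threshold seconds; everything else counts as "waited".
    """
    groups = {"immediate-close": [], "waited": []}
    for length, reaction, elapsed in results:
        closed_early = reaction in ("fin", "rst") and elapsed < immediate_threshold
        groups["immediate-close" if closed_early else "waited"].append(length)
    # Consistent means every probe landed in the same group.
    groups["consistent"] = not (groups["immediate-close"] and groups["waited"])
    return groups
```

A server that always times out should put every length from 1 to 1500 in the "waited" group; any length appearing under "immediate-close" marks a threshold a prober could use as a fingerprint.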
@AkinoKaede thanks for your work! I'd like to merge your code, but I have some questions. It seems your new drain method has no end of its own. Will this approach open up other attack opportunities, or could it simply enable Denial-of-Service by creating many connections with random characters?
In fact, VMess has the same problem. |
I did some tests with xray and v2fly using the method mentioned above Xray - 63d0cb1 - VMESS
Xray - 63d0cb1 - shadowsocks
v2fly v4.43.0 - VMESS
v2fly v4.43.0 - shadowsocks
Does it mean we have broad problems with xray/v2fly's drain logic? @gfw-report @AkinoKaede |
That is why I said it is the expected result. I think the boundary between closing the connection and waiting is random, and that is the purpose of the design. Certainly, read-forever may be a better method.
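For readers unfamiliar with the term, a read-forever reaction can be sketched as follows. This Python function is an illustration of the general idea only, not a port of the Xray or v2fly drain code: after authentication fails, the server keeps reading and discarding bytes, never writes anything, and lets go only when the peer closes or a fixed deadline passes. The function name and the return labels are assumptions for the example.

```python
import socket
import time


def drain_until_close(conn, timeout=60.0):
    """Read and discard data until the peer closes or the deadline passes.

    The deadline is fixed when the call starts, so extra bytes from the peer
    do not extend it. Returns (bytes_discarded, reason).
    """
    deadline = time.monotonic() + timeout
    discarded = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return discarded, "timeout"
        conn.settimeout(remaining)
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            return discarded, "timeout"
        except ConnectionResetError:
            return discarded, "peer-reset"
        if not chunk:
            return discarded, "peer-closed"
        discarded += len(chunk)  # the data itself is thrown away
```

Because the reaction no longer depends on how many bytes arrived, probes of different lengths all look the same from outside. The cost is the one raised earlier in this thread: each bad connection stays open until the deadline, so a deployment would likely need a cap on concurrent drained connections.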
Could you update your code to read-forever? Thanks for your extraordinary work! |
Hi @AkinoKaede , could you PR this fix, thanks! |
This requires the maintainer to decide whether to modify the code. |
Our archaeological team has discovered this relic. May I ask if the ancient cache of technology in this relic are applicable to modern stuff |
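I once discussed with @yuhan6665 whether to drop all the anti-active-probing measures from the fully encrypted protocols, since the GFW can now block them accurately just by inspecting traffic. Moreover, looking fully random is an inherent characteristic of fully encrypted protocols and there is no way around it, unlike TLS, where one can still work on packet lengths and timing characteristics; so the fallback mechanism is useful.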
Hi @Fangliding, thank you for reporting this issue. You did a great job on testing and discovering it. Hi @RPRX,
I agree with you that the defense against active probing is still necessary. As emphasized by Wu et al. (https://gfw.report/publications/usenixsecurity23/zh/#sec:active-probing):
I'm curious that by saying "fall back", what type of services/logic do you have in mind to handle the active probing? What do you think of the prior conclusion by Frolov et al. that using "reading forever" as the reaction to any active probing as it is "[t]he most popular behavior" for hosts on the Internet? (See Fig. 13: https://censorbib.nymity.ch/pdf/Frolov2020a.pdf#page=11)
Speaking of the high entropy of the fully encrypted traffic, what do you think of a design that lowers the entropy like this patch (post and patch)? |
@gfw-report |
Ironically, despite REALITY's advanced techniques, its servers are blocked in Iran within four days, while Shadowsocks with a clean IP often lasts much longer. :)) |
I must clarify that, apart from some additional cryptographic security and secure forged SNI, REALITY is no different from regular TLS, including traffic characteristics. There are some seemingly foolish but effective methods to fool local firewall, which is not the topic of this discussion. |
Shadowsocks traffic can be identified; the server goes unblocked only when the government allows you to use this method to bypass the firewall.
Hi @Fangliding, thank you for letting us know.
We totally understand that Xray may find better ways to spend its time and effort. We just want to clarify that this issue is fixable, not unavoidable. As summarized in net4people/bbs#26 (comment), many tools have adopted the "reading forever" strategy, which is very effective against this type of length-based active probing.
Thank you for the introduction! |
I think the new SS2022 may have already solved this problem (see https://github.com/Shadowsocks-NET/shadowsocks-specs/blob/main/2022-1-shadowsocks-2022-edition.md ). We may not update the old Shadowsocks.
@Fangliding |
You have the comment, if I can't reproduce the problem, there is nothing I can do |
Dear Xray-core developers,
@Maker2002 reported that:
While we have no evidence that the blocking is caused by the active probing attack, our initial testing indeed suggests that the Xray implementation of Shadowsocks has active probing weaknesses. These active probing attacks have been previously proposed by Frolov et al. and found being used by the GFW (Alice et al.).
We document how we spotted the weakness because similar active probing weaknesses may be quickly spotted in other parts of the Xray or in other circumvention tools.
First, we get the latest xray binary (v1.4.2):
Second, we save the following configurations to config.json. The server listens on port 12345, and uses aes-128-gcm:
Third, we started the Xray using the configuration above: ./xray < config.json. One can also open another terminal to monitor the traffic with: sudo tcpdump -i lo port 12345 -v.
Fourth, we open another terminal and send random/invalid bytes to its listening port 12345:
The Xray implementation of Shadowsocks using aes-128-gcm exhibits the following fingerprint:
… FIN/ACK.
… FIN/ACK; or read-forever;
The thresholds vary when different encryption methods are used. And the thresholds can be more complex when different encryption methods are used on the same port (this feature is supported by xray).
We understand that Xray has already had some mitigation attempts, like read-forever (https://github.com/XTLS/Xray-core/blob/main/proxy/shadowsocks/protocol.go#L162-L165) and varied drain size (https://github.com/XTLS/Xray-core/blob/main/proxy/shadowsocks/protocol.go#L61-L67). However, it seems that more efforts are required to eliminate these distinguishable fingerprints demonstrated above.
You may find the following links helpful: