[RFC] Replace Java Security Manager (JSM) #1687
@nknize suggested we remove security manager in 2.0, labelling issue as such - once we have agreed here on what to do for this issue let's open a campaign parent issue in https://github.com/opensearch-project/opensearch-plugins/ |
@dblock would you mind if I submit a small patch for 1.3.x+ so it could be run on JDK 18? Thank you PS: To clarify why, JDK 18 is scheduled to be released in March, right around 1.4.x (planned) release, I suspect a number of people may give it a try. The change is only adding the command line property, non breaking. |
I'm A-OK with anything non-breaking on 1.x. |
I suspect tests will blow up since the test infrastructure leverages a custom SecurityManager via |
I think the issue is written up correctly. You'll want to set Lucene uses a custom security manager too, no issues on JDK18. we just initialize it differently than opensearch, right at JVM startup time: But in your case here, it is a little different because system starts up with no security manager, then parses some config files and maybe does a few evil things on startup, then it installs security manager via |
Separately, as far as alternatives, I can suggest a few things:
I don't recommend going directly down the LSM route (AppArmor, SELinux, etc.). There's a lot of complexity to those, and it's very system-specific which of them, if any, are even available. I'd start with systemd, which is basically universal now on Linux systems, and it gets you the biggest wins anyway (e.g. filtering filesystem access and so on). |
Another win for stuff like but that strategy won't work for all the code: There's no one-size/fits-all solution. For example, things like analysis modules/plugins are extremely performance sensitive, and really need to just be passed to IndexWriter. At the same time, these plugins have less security risk (compared to e.g. Tika or scripting languages), so it's not a huge deal: they are just exposing lucene analyzers :) |
Thank you very much, @rmuir
That is right. |
I've also made my opinion loudly clear on twitter that removing SecurityManager without replacement is a bad idea for java right now. At least providing a "replacement" first (ideally enabled by default), to help protect server-side apps against the worst vulnerabilities, is really needed. Java is filled with security landmines. Doubt anything will change on the java side, but I tried. I don't have the resources/energy to write up JEP proposals or anything to try to make real change here though, sorry. |
Thanks @rmuir , I think the large part with respect to "what the replacement should be" is still unknown, as it is dictated by Project Loom that is not there yet. But I do 💯 agree on the point: removing |
if you think of the entire internet (not just opensearch), i really do feel that something similar to the openbsd but there's also the separate problem that java includes insecure functionality like JNDI ("landmines"), by default. Besides sandboxing, we need to get good secure defaults here and disable dangerous crap by default. it is a multi-pronged approach. |
@Pallavi-AWS the recent (one of many) discussions on OpenJDK mailing list hint there won't be replacements for [1] https://mail.openjdk.java.net/pipermail/security-dev/2022-April/029643.html |
i recommend to keep using it until it completely stops working. why would you voluntarily disable a security feature unless you have to? |
It's already deprecated in the jdk and can be found in the build logs:
This is still being worked and there are already some great suggestions on this issue. In the meantime, we planned to keep using it until it stops working and will converge on a plan before upgrading to a jdk that removes it completely. |
Use of the SecurityManager and AccessController has been deprecated, and they will be removed in Java versions after 17. While this is an issue, it's also one that will take a concerted effort to resolve. These warning messages make discovering build errors and other warnings more difficult; hence adding this suppression logic. For tracking the effort to replace these components look into opensearch-project/OpenSearch#1687 Signed-off-by: Peter Nied <[email protected]>
@reta just an update from my conversation with GraalVM folks on their Slack channel. Sandboxing in GraalVM is not yet supported for Java. It is on their roadmap, but we don't have any dates for when/if it would be delivered. Slack discussion (GraalVM public channel): https://graalvm.slack.com/archives/CPSD12R71/p1731769241953729 |
In case anyone is wondering: SM API Compatibility across all Java Platforms: We can no longer call System::getSecurityManager or System::setSecurityManager, many permission checks call System::getSecurityManager, but don't have to:
Use checkGuard instead:
Continue using AccessController::doPrivileged and Subject::doAs methods. Use -Djava.security.manager=default to set a SecurityManager on supported platforms. This will allow your software to support all Java platforms. |
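The snippets from the comment above did not survive in this copy of the thread. As a minimal sketch of the idea only (not the commenter's original code), a permission check can go through `Permission.checkGuard` instead of fetching the manager with `System.getSecurityManager()`. The class name `GuardCheck` and the chosen permission are illustrative assumptions. The outcome depends on the runtime: with no manager installed on older JDKs the call is a no-op, and stock JDK 24+ changed these methods under JEP 486, so the sketch reports the result rather than assuming one.

```java
import java.security.Permission;
import java.util.PropertyPermission;

// Hypothetical helper, for illustration only.
final class GuardCheck {
    private GuardCheck() {}

    /**
     * Performs the permission check through the Guard interface that
     * java.security.Permission implements, without ever calling
     * System.getSecurityManager() directly.
     */
    static String check(Permission permission) {
        try {
            permission.checkGuard(null);
            return "permitted";
        } catch (SecurityException e) {
            // Thrown when the runtime denies (or no longer supports) the check.
            return "denied";
        }
    }

    public static void main(String[] args) {
        // The permission below is an arbitrary example.
        System.out.println(check(new PropertyPermission("user.home", "read")));
    }
}
```

Because the caller never touches `System.getSecurityManager()`, the same source compiles against platforms with and without a working Security Manager, which is the compatibility point the comment makes.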
Thanks @pfirmstone very correct (and the same applies to |
One possible strategy might be to update OpenSearch to provide binary compatibility with Java platforms prior to and following 24, while security restoration options are explored, with the caveat that anyone running without SM do so at their own risk, this way, developers can commence testing on Java 24, with a view to support readiness at some later point once new security mechanisms are in place. |
Hi folks, I think one follow up I would like to have is whether there is any point in maintaining the effort with the Java SM. And perhaps it’s better to just remove it before upgrading to future Java versions.
They are also making additional important points, but the one above is aligned with our experience as well. As we enable plugins, our main concerns were really around the areas described above, which the SM didn't help prevent in any way. |
I'm sure you mean well, and it's always good to explore options and try to consider other perspectives. In this case, OpenSearch is attempting to address the dangerous issues SM did address, and has been investigating all options, including using agents as per OpenJDK advice. Even if there was only one dangerous issue, that would be justification enough. OpenJDK just doesn't want the expense of maintaining a feature that's not commonly used and is delegating that burden back onto developers. Criticisms in JEP411 apply to implementation code in OpenJDK, but many of those issues are easily addressed and had been addressed outside of OpenJDK for a long time. Perhaps if 20 years ago, we had good tooling to manage policy, things would be different today... I've been working on making significant improvements to SM, to increase the number of vulnerabilities it can intercept:
Discussion on OpenJDK lists revealed that Oracle company policy didn't allow public collaboration on security issues, however OpenJDK had no objection to the community maintaining it, which is what I'm doing, although obtaining a TCK license doesn't appear likely, unless one of the existing licensees are willing to assist. |
Actually, what's interesting, since OpenJDK removed Authorization, it's up 6 points from 24 to 18, improper privilege management is up 7 from 22 to 15 and code injection is up 12 points from 23 to 11, exposure of sensitive information to an unauthorized actor is up 13, from 30 to 17. https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html |
Really funny, since the base test class here, OpenSearchTestCase, subclasses LuceneTestCase and uses randomized-runner, and the test setup is similar. When you have thousands of tests you need such isolation just to maintain the test suite. SecurityManager stops the problems before you see them at "operating the service" and allows your tests to safely run in parallel without stomping on each other's files, binding to each other's ports, etc. It fails on such problems before they get merged. It fails on shenanigans from third-party libraries at test-time before they get merged and cause chaos in CI or maybe elsewhere. I remember how much "fun" CI builds were before this was there: tests doing exactly these things and meddling with each other. You can't even fix the tests as fast as developers add new ones doing new crazy things. And developers might use Windows or MacOS, not some AWS environment with no multicast, etc. It's important to fail on them early in the development lifecycle (e.g. on their machine), and to be able to reproduce failures from CI. If you are a security guy looking at this like "oh I've never seen this thing stop me from getting owned", you are looking at the problem wrong. Sure, security manager sucks, security guys don't understand it, developers don't understand it, it's this complex beast in no-man's land. But the guarantees that it gives in the test process alone are not easily replaced. |
Link to very simple policy used by lucene to keep 16000+ tests in order: https://github.com/apache/lucene/blob/main/gradle/testing/randomization/policies/tests.policy Similar stuff happening here in opensearch, the setup is just more complex here, so going thru that much simpler lucene tests policy file is easy to reason about, when thinking about sandboxing test suite and preventing trouble from entering the codebase in the first place: You can consider using systemd sandboxing for test VM execution, it may help contain the filesystem at least. might not be so terrible now that IDEs have widespread devcontainer support. But you have to implement such devcontainer setup and force everyone to use it and adjust gradle test execution to run each jvm with separate namespaces and so on. Maybe even with some fancy seccomp setup you can prevent the tests from binding to anything except localhost ephemeral ports, too. But fancy devcontainer setup still won't solve problems such as preventing tests from messing with things like environment variables and system properties, these have side effects for other tests, for stuff like that, security manager is good. If you don't stop it, developers will do it. |
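To make the point about system properties concrete, here is a minimal sketch of what a test harness can do without a security manager: detect (not prevent) a test that leaves modified system properties behind, and restore the snapshot afterwards. The class name `SystemPropertyLeakCheck` and the property name in the usage are hypothetical; this is not how Lucene or OpenSearch implement it.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class SystemPropertyLeakCheck {
    private SystemPropertyLeakCheck() {}

    /**
     * Runs the test body, reports whether it left the system properties in a
     * different state, and restores the snapshot taken before the run.
     */
    static boolean leaksSystemProperties(Runnable testBody) {
        // Shallow copy of the current properties, taken before the test runs.
        Properties before = (Properties) System.getProperties().clone();
        try {
            testBody.run();
            return !before.equals(System.getProperties());
        } finally {
            // Undo whatever the test changed so later tests see a clean state.
            System.setProperties(before);
        }
    }

    public static void main(String[] args) {
        boolean leaked = leaksSystemProperties(() -> System.setProperty("demo.leak", "1"));
        System.out.println("leaked=" + leaked);
    }
}
```

Unlike a policy that denies the write, this only notices the damage after the fact, so a test running in parallel in the same JVM can still observe the modified value.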
Security manager is not flawed for what it does, it's flawed for how it does its job. If you look at GraalVM (Oracle) polyglot sandboxing policies, the concepts are similar to what security manager does, but it does it in a more modern and cleaner way.
I can't remember any recent attack as nasty as the Log4j remote code execution vulnerability. OpenSearch was protected from that attack, thanks to security manager. Security is built in layers; no single protection mechanism can fully protect against all sorts of attack vectors. A simple example: is IAM sufficient for security in the cloud? The answer is a big NO. We (OpenSearch) may not be able to find a full alternative, nor do we really want to find a full replacement (to keep some things simple), but we are clear that we need to strengthen OpenSearch in the absence of SM. |
We use SM for Authorization, but we don't just use it for code as JEP411 authors assume. We use it to grant permission to users using specific code; often the code or user alone doesn't have the permission, so the user can't use the permission with foreign code and the code doesn't have permission with a different user. Parsed data comes from users, who should be authenticated. OpenJDK itself bases permissions around code and often uses AllPermission; end points often don't run with the authenticated user's Subject, e.g. RMI doesn't. It is unfortunate that SM has roots that go deep into the JDK; support for permissions is also implemented in C++ code, not just Java code. Currently OpenJDK doesn't prevent loading of untrusted code and has no mechanism to do so: anyone who can find a way to inject a URL into a string that's passed to URLClassLoader will be capable of injecting code. This will be blamed on the code that didn't parse input properly; also, there's a lot of library code and no one audits everything. OpenJDK developers are assuming that server code is static and audited, and that external data input is properly checked during parsing. This assumption eliminates the possibility of using dynamic class loading safely. https://www.exploit-db.com/papers/45517 One lesson from history is that attackers use privileged context to set SecurityManager to null to disable it; this was the last step in many gadget chain attacks. This could have been easily addressed simply by throwing an IllegalArgumentException in System::setSecurityManager if sm is null. Injection attacks always focused on obtaining privileged context, so we limit privileged context, but now OpenJDK has made everything privileged context, and it's going to be much harder to defend against gadget attacks.
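The "user together with specific code" rule described above can be sketched as a plain data model. This is a hypothetical illustration of the reasoning, not the SecurityManager or JGDMS API: the class `UserCodePolicy`, its `Grant` type and the strings used in the usage are all assumptions.

```java
import java.util.List;

// Hypothetical model, for illustration only.
final class UserCodePolicy {
    /** One grant: the permission applies only when this user runs this code. */
    static final class Grant {
        final String user;
        final String codeSource;
        final String permission;

        Grant(String user, String codeSource, String permission) {
            this.user = user;
            this.codeSource = codeSource;
            this.permission = permission;
        }
    }

    private final List<Grant> grants;

    UserCodePolicy(List<Grant> grants) {
        this.grants = List.copyOf(grants);
    }

    /** True only if a single grant matches the user AND the code AND the permission. */
    boolean implies(String user, String codeSource, String permission) {
        for (Grant grant : grants) {
            if (grant.user.equals(user)
                    && grant.codeSource.equals(codeSource)
                    && grant.permission.equals(permission)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UserCodePolicy policy = new UserCodePolicy(List.of(
                new Grant("alice", "file:/plugins/reporting.jar", "read:/data/reports")));
        System.out.println(policy.implies("alice", "file:/plugins/reporting.jar", "read:/data/reports"));
        System.out.println(policy.implies("alice", "file:/plugins/other.jar", "read:/data/reports"));
    }
}
```

The same user with foreign code, or the same code with a different user, gets nothing, which is the property the comment says a purely code-based permission model loses.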
Historically, attacks on Java's sandbox have done a lot of good in hardening Java. It's an arms race, and now OpenJDK has given up that arms race. They've lost the client market, thanks to flaws in Java Serialization's design; this occurred during Sun Microsystems' final days, prior to Oracle, when funding was limited. I reimplemented Java Serialization over a decade ago, when I needed to secure it. I had to give up circular object graphs, used a standard constructor signature and isolated parameters from each class within their inheritance hierarchy. It reads ahead to ensure parameter types are correct before instantiation and has limit checks to defend against billion laugh style attacks. When I presented it to OpenJDK and offered to donate it... https://github.com/pfirmstone/JGDMS/wiki#atomic-serialization-example I think too much emphasis was placed on backward compatibility over security, and too little too late was done to fix Java Serialization; it's the gift that keeps giving. Jdk-with-authorization is more than just preserving SecurityManager, it's about improving security, making it simpler to reason about and taking advantage of the historical security hardening developed over decades, while taking advantage of modern features in recent Java releases. Recently I refactored Permission for immutability and PermissionCollection classes to use generics. I addressed race conditions in Permission implementations, as their specification requires them to be immutable and threadsafe, but many weren't. Permission and PermissionCollection are no longer Serializable; changes in implementation and support for old serial form meant the implementations couldn't be both immutable and support Serialization. OpenJDK chose to sacrifice the safety and security provided by immutability and thread safety, to preserve backward compatibility with Serialization.
I suspect this is why the default SecurityManager and Policy provider didn't perform, had OpenJDK developers made them non-blocking and performant, they would have had to deal with the race conditions. In JGDMS we called methods that initialized fields in Permission instances before publishing them to other threads. |
Summarizing our next steps and plan of action for the 3.0 release.
Goal
We try to answer the meta questions below:
Ideally, we want the latest and greatest version of Java to be used in OpenSearch. We would like to use JDK-24 for the 3.0 release of OpenSearch, expected to land in April 2025. Based on the known alternatives and their protection domain, we will take a call on which options are sufficient to place us in a confident state to live without security manager.
Do we need a replacement?
The open-source distribution heavily depends on security manager acting as a first line of security defense. Hence we must find a replacement for security manager. Again, we will not look for a full replacement. Until we are convinced by the newly available security posture, we cannot upgrade OpenSearch core and plugins to JDK-24; obviously, we don't want to remain pinned to an older JDK version while a new (better) version is available.
Requirements
Before diving into alternatives to the security manager, let's first examine the types of protections it currently provides in OpenSearch. These will serve as the baseline requirements for identifying suitable alternatives.
Priority A
Priority B(not a blocker for 3.0)
Alternatives
1 Systemd sandboxing [GH issue: https://github.com//issues/16729]
Systemd provides security features that can be used to isolate processes from each other as well as from the underlying operating system. In other words, it allows you to set up privilege separation between the different components of the OS. Today, there already exists a systemd setup which you can optionally use to start your OpenSearch process. Moving ahead, we will suggest starting your OpenSearch process with systemd as the preferred and most secure way. Most importantly, it requires no extra infrastructure to set up on Linux systems, which makes distribution and usage straightforward. While there are a whole lot of configs out there for building a highly secure sandboxed environment, we will discuss the ones that are relevant to our requirements, along with their usage (for clarity). In fact, some of the available configs could bring in more protection than security manager.
Overall, this option does a great job of securing the OpenSearch process against common side effects of vulnerabilities and untrusted code disrupting the OpenSearch process. Limitations:
2 GraalVM sandboxing [GH issue: https://github.com//issues/16861]
Oracle GraalVM is a high-performance JDK that enhances Java and JVM-based applications through its Ahead-Of-Time (AOT) compiler. Beyond performance improvements, GraalVM also offers a sandboxing mechanism, which is particularly relevant for securely executing guest code within a host application.
This isolation ensures that guest code executes in a restricted and controlled environment, separate from the host's privileges. However, as of now, GraalVM supports JavaScript as a guest language, while full support for Java as a guest language is WIP (refer to [GR-49729] [Espresso] Support running without native access). While full guest Java support is still under development, GraalVM's existing features (Espresso) can be used to:
The overall idea is to spawn a guest GraalVM JVM with security manager enabled, with guest and host sharing their objects via the low-level GraalVM interoperability API. Next, let's see some high-level steps to achieve this. You can also refer to the PoC for a better understanding: #16863
Proposal
Host Environment:
Guest Environment:
A GraalVM Engine :
Limitations
Take-aways: Overall, this approach allows us to move forward with Java versions (JDK-24 and beyond) while preserving usage of security manager as it is used today.
3 Plugin level systemd [GH issue: https://github.com//issues/16753]
Earlier we proposed to strengthen the OpenSearch core security model via additional systemd configs, such as limiting access to sockets and files. An extension of such sandboxing would be to run (some) plugins as separate systemd units (i.e. separate processes), each with its own restrictive systemd config. This is akin to security manager having plugin-level security policies. It will also allow some plugins to run with elevated privileges without elevating the privileges of core. The overall idea would be to expose a secure REST server within OpenSearch core, where plugin ↔ core interactions happen over secure, fast, bidirectional IPC. Such IPC could run over Unix domain sockets, which are fast and lightweight and can use POSIX permissions to lock down access to the file descriptor (FD). This idea overlaps with work proposed as part of Project Extensions, which is currently halted for
4 JDK fork (not preferred)
The idea is to maintain a fork of the JDK that preserves the security manager in JDK-24 and beyond. However, this approach is not ideal, as it would introduce significant overhead in maintaining the fork, particularly in porting bug fixes and updates from the upstream JDK. This solution should only be considered as a last resort if none of the previously discussed alternatives prove to be viable.
Conclusion
Assuming 3.0 lands in April 2025 with JDK-24, we are left with around three months to pick alternatives that make us comfortable living without security manager. While this doc discussed multiple overlapping alternatives, not all of them necessarily need to be implemented for the 3.0 release. [1] Systemd sandboxing alone is very powerful and covers a lot of what security manager already does today. It will protect OpenSearch from most security risks. This will become our first line of defence.
I would say we are 90% covered with just [1]. It's low-hanging fruit, and even if we are not able to ship 3.0 with JDK-24, we would still like to ship [1]. When it comes to [2] GraalVM sandboxing, it essentially means continuing usage of security manager even with JDK-24. The hardest part of the integration with GraalVM was already done by Andriy in his PoC (#16861), and we would now assume that the integration could be delivered by March 2025.
Callouts for [2]:
[1] Plugins which are run in the sandbox JVM can only be upgraded to JDK-23. Once we have the full sandboxing available in Graal (oracle/graal#10239), these plugins can be upgraded to JDK-24 or beyond.
[2] Not all plugins actually need to be instantiated within the Graal-based forked sandbox; plugins which are tightly coupled with OpenSearch core, trusted or performance-critical can continue to run in the host JVM without sandboxing on >=JDK-24.
We believe that [1] and [2] provide enough confidence to proceed with upgrading to JDK 24, with delivery expected by mid-March 2025. Once [2] evolves into a fully developed sandboxing environment (anticipated in Q2 2025), we plan to treat [1] and [2] collectively as a replacement for the Security Manager. We are temporarily setting aside [3], as it represents a significant amount of work, and meeting the April deadline seems unlikely. If GraalVM sandboxing integration proves problematic (e.g., harder debugging, unexpected bugs, perf issues etc.) within our ecosystem, we will revisit [3]. However, the GraalVM community is very supportive and it has been smooth working with them. On the other hand, if GraalVM integration aligns well with our needs, we may reconsider using Extensions on GraalVM. This presents a major potential advantage, making the risk of GraalVM integration worthwhile. |
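For the plugin ↔ core IPC idea in option 3, a minimal sketch of a Unix domain socket locked down with POSIX permissions might look like the following. It assumes Java 16 or later (for `UnixDomainSocketAddress`) and a POSIX file system; the class name `UdsEcho`, the socket file name and the single-threaded round trip are illustrative assumptions, not the proposed OpenSearch design.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical demo class, for illustration only.
final class UdsEcho {
    private UdsEcho() {}

    /** Sends a message from a "plugin" channel to a "core" channel over a Unix domain socket. */
    static String roundTrip(String message) {
        try {
            Path dir = Files.createTempDirectory("uds-demo");
            Path socketPath = dir.resolve("core.sock"); // hypothetical socket name
            UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketPath);
            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(address);
                // Only the owning user may connect to the socket file.
                Files.setPosixFilePermissions(socketPath, PosixFilePermissions.fromString("rw-------"));
                try (SocketChannel plugin = SocketChannel.open(address);
                     SocketChannel core = server.accept()) {
                    plugin.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                    plugin.shutdownOutput(); // signals end of message to the reader
                    ByteBuffer buffer = ByteBuffer.allocate(1024);
                    while (buffer.hasRemaining() && core.read(buffer) != -1) {
                        // keep reading until end of stream or the buffer is full
                    }
                    buffer.flip();
                    return StandardCharsets.UTF_8.decode(buffer).toString();
                }
            } finally {
                Files.deleteIfExists(socketPath);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

In a real deployment the two ends would live in separate processes (separate systemd units in the proposal), and the file permissions on the socket path would be what restricts which processes can reach core.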
A few thoughts / questions: Is there a way to avoid needing SecurityManager in the Graal guest environment? In JGDMS there's a declared @AtomicSerial API for serialization / deserialization, for use with any protocol, I was working on support for ASN.1, but halted work after JEP411, until a solution was found for SM. This API is hardened against gadget attacks by failure atomicity and provides utility methods for input validation. JGDMS also has JERI (Jini Extensible Remote Invocation), which was designed by the people who designed RMI to address the pitfalls with RMI. If someone wanted, these features could be copied from JGDMS (AL2.0 license), and stripped down to their bare minimum, to use for communications between Host and Guests. I can provide guidance on how it works. As an aside, the fork of OpenJDK I'm currently maintaining with SM, contains significant performance enhancements and security improvements, if people would like to test and provide performance comparisons and feedback, that would be greatly appreciated. The maintenance cost has been less than expected and I've been able to make significant SM improvements in a short space of time. Whether I continue to maintain a fork is dependent on community interest and viability of other possible solutions. Recent build artifacts based on fork of OpenJDK 25, master branch: Linux x64: https://github.com/pfirmstone/jdk-with-authorization/actions/runs/12497991476/artifacts/2362229379 There's also a OpenJDK 24 fork branch here: The use of a hybrid Graal Systemd solution is compelling. If the guest is to use encryption over network connections, I think that might need to be performed by the host, for the guest, as it's not safe for the guest to have access to encryption keys, etc. On second thoughts, maybe independent truststore/ keystore's could be provided for each guest? |
Just documenting my forking strategy here in case it has been misunderstood:
There were a large number of merge conflicts during JEP 486, not unexpected. Release branches follow the same strategy, so that all upstream fixes and patches are included with weekly merges. Permission checks were like shotgun surgery, as they were spread throughout OpenJDK, it was a big job to remove them. We have a discord channel if anyone wants to become involved, let me know. The largest maintenance task isn't merging from upstream; it's looking at new JEP features and determining how they need to be protected by new permission checks. Some recent fixes: |
I wanted to sync with you on the outcome of the PoC before including it here. I was not clear if the PoC was finally working end-to-end. Secondly, I wanted an opinion if we'd need it if we had the Graal integration. |
@pfirmstone (going to answer some of your comments and will come back to others later)
this is a temporary hack. It won't be needed once oracle/graal#10239 is addressed.
that's the biggest concern for in-proc communication between plugins and core (discussed as con in Option 3).
I don't think we/I misunderstood the intentions here. We understand the dedication and amount of work you have put in to get this working. The challenge with a fork is not only maintainability. A. This is not a long term solution; if we have a long term solution (GraalVM), we would like to pursue it. B. Cloud providers (such as AWS) or other organizations consuming a fork have to be convinced to use a forked JDK, given that OpenJDK states that security manager is not the right tooling for securing Java applications (although we know how useful security manager is). In general, we want to move away from what is deprecated and use more modern tools (if available). If an alternative is not available, we will stick with it. GraalVM usage with security manager is a small step to help us migrate to JDK-24. When Java sandboxing is available in GraalVM, we will remove usage of security manager. That's the long term goal. That step is risky too, because GraalVM is very new, so we also don't want to overcommit and would rather take baby steps. |
I think I may know a solution for that, but it requires modification to suit your use case. Currently it depends on SecurityManager for authentication and authorization. But I don't think you need encryption, authorization and authentication for inter-process communications. It implements a subset of Java serialization (using a common constructor signature), without support for circular object graphs (billion laughs attacks). It has defensive mechanisms that expect periodic stream resets, plus array and stream size limits. It doesn't serialize collections; instead it uses serializers that serialize an unmodifiable copy (not entirely true, as it is array based, so could be modified in stream), and it has API tooling to assist developers to perform type and input validation, such as checking that collections contain the correct types before copying their contents to a new collection. The API also allows invariant checks between subclass and superclasses, prior to calling a superclass, and each class in an object has its own namespace for constructor arguments. https://github.com/pfirmstone/JGDMS/tree/trunk/JGDMS/jgdms-jeri IMHO Java serialization vulnerabilities destroyed the client Java market. A lot more could have been done sooner to address it, but I think timing and limited resources had a lot to do with it. SM is battle hardened, so I'm just basically leveraging that and addressing well documented published issues by security researchers (low hanging fruit). I have made some breaking changes: Permissions are no longer Serializable and it's no longer possible to set SM to null (usually the last trick in a gadget attack); I removed static permissions granted by code (prevents URL injection attacks) and reduced the size of the trusted platform to the java.base module. But it's also possibly an interim measure until something better comes along.
It's also possible nothing better will come along, as security needs to be designed in at a language level, so it could become a long term interim measure. OpenJDK was very fast moving from deprecation to removal. It seems they've bet the farm on virtual threads, the asynchronous concurrency features hide valuable debugging information, so it makes sense they want to address that, however these aren't needed for high scalability, immutability, thread confinement, garbage collection, safe publication and NIO are more than sufficient for most, I suspect virtual threads will be a fizzer, I could be wrong, but I think they're trying to find a solution for a non-problem, but then there are some very promising, like the foreign function api, future possibilities such as reified generics. I still use primitive types, bit-shift operations etc, when I need performance and nothing else will cut it. Some of the tricks used in pooling threads in the past was to reduce their assigned memory, smaller object headers, there's plenty of good stuff in the pipeline. |
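The array and stream size limits and the type validation mentioned above are features of the JGDMS serialization API, which is not shown here. As a rough analogue only, the stock JDK's `ObjectInputFilter` (Java 9+) can enforce similar limits and an allow-list on standard Java deserialization. The class name `FilteredDeserialization`, the limit values and the allow-list below are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
final class FilteredDeserialization {
    private FilteredDeserialization() {}

    // Illustrative limits: small arrays, shallow graphs, java.lang classes only.
    private static final ObjectInputFilter FILTER = ObjectInputFilter.Config.createFilter(
            "maxarray=16;maxdepth=4;maxbytes=4096;java.lang.*;!*");

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(FILTER); // checked before objects are instantiated
            return in.readObject();
        }
    }

    /** Round-trips a value and reports whether the filter rejected it. */
    static boolean isRejected(Object value) {
        try {
            deserialize(serialize(value));
            return false;
        } catch (InvalidClassException e) {
            return true; // the stream reports a filter rejection this way
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Integer rejected: " + isRejected(42));
        System.out.println("int[100] rejected: " + isRejected(new int[100]));
    }
}
```

This only filters what the standard stream will instantiate; it does not provide the failure atomicity or per-class constructor validation that the comment describes.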
@kumargu I would like to see your efforts succeed. |
Yes, it is working end-to-end (for the socket connection as PoC), thanks @kumargu |
@kumargu It appears Graal doesn't use marshalling, it appears to be using memory access to java object structures... |
I think that is true, only if you use GraalVM building a native image. We are not going to use the native image, we just leverage sandboxing. |
Is your feature request related to a problem? Please describe.
It was announced a while ago that SecurityManager is going to be phased out from the JDK. The first step, the deprecation of the SecurityManager (JEP-411), landed in JDK 17 and issues the following warnings on OpenSearch builds or server startup:
JDK 18 pushes it even further and now fails on startup (please see https://bugs.openjdk.java.net/browse/JDK-8270380); running OpenSearch builds or the server on JDK 18 EA fails with:
It now requires a JVM command line option to enable it explicitly (please see [1]):
Describe the solution you'd like
There is no alternative or replacement for the SecurityManager (to understand why, Project Loom is to "blame"), please see [2]. One of the options is to just drop it; it sounds risky, but combined with Plugin Sandbox (please see [3], [4]) it may be a viable option. Other options include (but are not limited to): bytecode instrumentation, java agent, custom classloader.
Describe alternatives you've considered
We could keep it as long as we can, but once removed from the JDK, it will be a problem.
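The version-dependent behaviour described in this issue (deprecation in JDK 17, explicit opt-in from JDK 18, permanent disablement in JDK 24) can be summarized as a small decision function. This is a sketch of my understanding of the documented `java.security.manager` property semantics on JDK 12 and later, not code from OpenSearch; the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class SecurityManagerSupport {
    private SecurityManagerSupport() {}

    /**
     * Predicts whether System.setSecurityManager(...) may be called at runtime,
     * given the JDK feature version and the value of the java.security.manager
     * system property (null when the property is not set).
     * Assumes JDK 12+, where the "allow" and "disallow" tokens exist.
     */
    static boolean canInstallAtRuntime(int featureVersion, String property) {
        if (featureVersion >= 24) {
            return false; // JDK 24 disables the Security Manager permanently
        }
        if ("disallow".equals(property)) {
            return false; // explicit opt-out
        }
        if (featureVersion >= 18) {
            // From JDK 18 an unset property behaves like "disallow".
            return property != null;
        }
        return true; // up to JDK 17 an unset property behaves like "allow"
    }

    public static void main(String[] args) {
        int version = Runtime.version().feature();
        String property = System.getProperty("java.security.manager");
        System.out.println("JDK " + version + ", java.security.manager=" + property
                + ", runtime install expected to work: " + canInstallAtRuntime(version, property));
    }
}
```

A startup script could use a check like this to decide whether to pass the opt-in property or to skip installing a manager altogether; the actual runtime remains the authority on what is allowed.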
Additional context
The upcoming JDK-24 release disables SecurityManager permanently [6]. Please see the links below.
[1] https://inside.java/2021/12/06/quality-heads-up/
[2] https://inside.java/2021/04/23/security-and-sandboxing-post-securitymanager/
[3] #1572
[4] #1422
[5] A possible JEP to replace SecurityManager after JEP 411
[6] openjdk/jdk#21498