{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":21663285,"defaultBranch":"main","name":"chapel","ownerLogin":"chapel-lang","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2014-07-09T18:15:54.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/7597261?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1721405193.0","currentOid":""},"activityList":{"items":[{"before":"3f36e349f4246c68228ff25e9f1c975e989579e9","after":"9c0f91cee4ffe4ae50d2be3e0a19ded4a535bdb6","ref":"refs/heads/main","pushedAt":"2024-09-12T03:29:46.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"DanilaFe","name":"Daniel","path":"/DanilaFe","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4361282?s=80&v=4"},"commit":{"message":"Add support for qualified interface access (#25924)\n\nCloses https://github.com/chapel-lang/chapel/issues/25829.\r\n\r\nStrangely, even though we have an `isym` when building the class\r\nhierarchy, I chose to use the symbol's name to construct an `implements`\r\nstatement. This works when the name is available in scope without\r\nqualification, but not when qualified access is used. The principled\r\nsolution is to allow interface statements to be constructed using a\r\nknown interface symbol. 
This PR does that, and switches the class\r\nhierarchy logic to avoid using a name.\r\n\r\nReviewed by @jabraham17 -- thanks!\r\n\r\n## Testing\r\n- [x] paratest","shortMessageHtmlLink":"Add support for qualified interface access (#25924)"}},{"before":"e4c1b6f399b3b2e24f9ee85158d1a77bb3f18f86","after":"3f36e349f4246c68228ff25e9f1c975e989579e9","ref":"refs/heads/main","pushedAt":"2024-09-12T02:32:08.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"bradcray","name":"Brad Chamberlain","path":"/bradcray","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7536222?s=80&v=4"},"commit":{"message":"Revert #25754 (improvements to localAccess() calls) (#25936)\n\n[trivial, not reviewed]\r\n\r\nWhile #25754 did a nice job of improving the error messages for local\r\naccesses, its implementation was sufficiently heavyweight that it\r\nresulted in new timeouts in testing for tests that made a lot of use of\r\nlocalAccesses, primarily noticed in configurations that were already\r\nslow (valgrindexe, memleaks, baseline).\r\n\r\nOn one hand, I didn't fully appreciate how much overhead the new\r\nimplementation added; but more importantly, I incorrectly thought it\r\nwould only come into play rarely—for codes that explicitly used\r\nlocalAccess() (with checks on), which I think of as being an uncommon\r\noccurrence. What I forgot is that our auto-local-access (ALA)\r\noptimization would also cause it to fire for many codes that don't\r\ncontain explicit localAccess calls. 
As a particularly bad example, Engin\r\nfound that test/studies/hpcc/PTRANS/old/PTRANS.chpl went from 35 to 62\r\nseconds on his laptop despite not containing any localAccess calls.\r\n\r\nWe could just turn off the new checks by default for this release, but\r\nit felt simpler/saner to me to just revert the PR for now given that\r\nit's code-freeze day and that this wasn't a release-critical change.\r\n\r\nI think the way to improve this overhead going forward is probably to\r\nchange the logic from the current approach of:\r\n\r\n* gather all local subdomains a locale owns\r\n - see if the index we're accessing is within any of them\r\n - print the subdomains out in the error message if it's not\r\n\r\nto:\r\n\r\n* ask the distribution itself whether the index is stored locally\r\n- this is trivial for most distributions; slightly less-so for the\r\nStencil distribution due to fluff\r\n* only gather all local subdomains if we get a locality error and need\r\nto print out which indices the local locale owns (or maybe just have the\r\ndistribution print that out itself?)\r\n\r\nEven in that case, we could improve on the current logic by doing more\r\nspecialization for the (common) \"locale only owns a single sublocale\"\r\ncase than I did in the current approach. 
However, I think it's arguably\r\nfine to spend more time gathering the subdomains when we're about to\r\nprint an error and exit the program anyway.","shortMessageHtmlLink":"Revert #25754 (improvements to localAccess() calls) (#25936)"}},{"before":"56a7c7512e3a908dd7f72eb695c1ab1cd9845c1f","after":"e4c1b6f399b3b2e24f9ee85158d1a77bb3f18f86","ref":"refs/heads/main","pushedAt":"2024-09-11T23:45:12.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Prevent offset access with block arrays to thwart dynamic ALA (#25933)\n\nThis PR makes sure that ALA is applicable for:\r\n\r\n```chpl\r\nvar A = blockDist.createArray(...)\r\nconst InnerDomain = A.domain.expand(-1);\r\n\r\nforall i in InnerDomain {\r\n A[i] = A[i-1];\r\n}\r\n```\r\n\r\nThings of note:\r\n\r\n- this is a dynamically optimized loop, because the compiler can't\r\nstatically determine that `InnerDomain` and `A.domain` are aligned.\r\n- the body contains both regular (`A[i]`) and offset (`A[i-1]`) access,\r\nwhere the latter is not optimizable for a block-distributed array\r\n\r\nThe problem is, https://github.com/chapel-lang/chapel/pull/25712\r\nmodified the dynamic checks in scenarios like this. After that PR, the\r\ngenerated optimization looked like\r\n\r\n```chpl\r\nparam staticRegularFlag = alaStaticallySupported(A); // added for A[i] (true)\r\nparam staticOffsetFlag = alaOffsetStaticallySupported(A); // added for A[i-1] (false)\r\n\r\nif ( (!staticRegularFlag || alaDynamicallySupported(A)) &&\r\n alaOffsetCheck(A, 1)) {\r\n // optimized loop\r\n}\r\nelse {\r\n // unoptimized loop\r\n}\r\n```\r\n\r\nNote that ALA has static override (`!staticRegularFlag ||` part) which\r\nallows optimization and the dynamic checks to be reverted in case a\r\nstatic check fails. This allows some accesses to be optimized while some\r\nothers fails. 
However, the same override is not there for the offset\r\ncheck. This PR adds a similar `!staticOffsetFlag ||` for similar\r\npurposes. This allows `A[i]` to be optimized as before while `A[i-1]` is\r\nnot.\r\n\r\n[Reviewed by @stonea]\r\n\r\nTest:\r\n- [x] performance goes back to where we were before\r\n- [x] gasnet \r\n- [x] linux64","shortMessageHtmlLink":"Prevent offset access with block arrays to thwart dynamic ALA (#25933)"}},{"before":"958af5ed7328f7ef2ea0e4112a93f58ab2d8a8ea","after":"56a7c7512e3a908dd7f72eb695c1ab1cd9845c1f","ref":"refs/heads/main","pushedAt":"2024-09-11T23:28:08.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"jabraham17","name":"Jade Abraham","path":"/jabraham17","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15747900?s=80&v=4"},"commit":{"message":"Remove future for bug that has been fixed (#25935)\n\nRemoves `test/classes/deinitializers/deinit-from-throws.chpl` which now\r\npasses on main after https://github.com/chapel-lang/chapel/pull/25919.\r\n\r\nThis PR removes the test, because\r\nhttps://github.com/chapel-lang/chapel/pull/25919 actually added a\r\nduplicate test\r\n\r\n[Not reviewed - trivial]","shortMessageHtmlLink":"Remove future for bug that has been fixed (#25935)"}},{"before":"3d23b8e59eef6a1dec9361417459ed101407fd24","after":"958af5ed7328f7ef2ea0e4112a93f58ab2d8a8ea","ref":"refs/heads/main","pushedAt":"2024-09-11T23:23:32.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Be more selective in terms of clang version when using a flag in CUB compilation (#25930)\n\nhttps://github.com/chapel-lang/chapel/pull/25918 added\r\n`-Wno-error=deprecated-builtins` when compiling the runtime's CUB\r\nwrappers. 
However, that flag is not supported in some older clangs.\r\nSpecifically, we added that to enable builds with clang 18, but clang 14\r\ndoesn't have that flag. This PR adds the problematic flag only if the\r\nclang version supports it.\r\n\r\n[Reviewed by @jabraham17]\r\n\r\nTest:\r\n- [x] build proceeds on a system with the same issue","shortMessageHtmlLink":"Be more selective in terms of clang version when using a flag in CUB …"}},{"before":"c22e376419e0ac316bd6919a46483231a8bf3379","after":"3d23b8e59eef6a1dec9361417459ed101407fd24","ref":"refs/heads/main","pushedAt":"2024-09-11T22:31:43.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"jhh67","name":"John H. Hartman","path":"/jhh67","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3190372?s=80&v=4"},"commit":{"message":"Unload cce prior to loading PrgEnv-gnu (#25934)\n\nThe cce module appears to cause a problem when used with the\r\nPrgEnv-gnu and libfabric modules loaded, as the `hwloc`\r\nconfiguration fails because -lfabric is not found. Unload the cce\r\nmodule prior to loading the PrgEnv-gnu module and let it reload cce\r\nshould it want to, but this way cce isn't loaded if PrgEnv-gnu\r\ndoesn't need it.","shortMessageHtmlLink":"Unload cce prior to loading PrgEnv-gnu (#25934)"}},{"before":"4b989050b60e9815df38e03807463cafaaf8e25e","after":"c22e376419e0ac316bd6919a46483231a8bf3379","ref":"refs/heads/main","pushedAt":"2024-09-11T18:02:18.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Remove an unused variable in the runtime causing errors (#25929)\n\nThe variable in question slipped by as a copy/paste leftover from the\r\nNVIDIA implementation. 
This PR removes the variable.\r\n\r\n[Trivial nightly fix, not reviewed]\r\n\r\nTest:\r\n- [x] amd","shortMessageHtmlLink":"Remove an unused variable in the runtime causing errors (#25929)"}},{"before":"cbaea6091b210616e5f0447e026a4723f33daf0b","after":"4b989050b60e9815df38e03807463cafaaf8e25e","ref":"refs/heads/main","pushedAt":"2024-09-11T17:37:36.000Z","pushType":"pr_merge","commitsCount":6,"pusher":{"login":"stonea","name":"Andy Stone","path":"/stonea","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2591890?s=80&v=4"},"commit":{"message":"separate 1-node and 16-node ex perf jobs (#25910)\n\nOur currently nightly EX perf testing isn't passing -perflabel ml so\r\neven though we call it a \"16 node\" job, in actuality it's only running\r\nour single locale tests.\r\n\r\nThis PR fixes this so the job passes the perf label and we have another\r\n1-node ex perf test.\r\n\r\n[reviewed-by: @jabraham17]","shortMessageHtmlLink":"separate 1-node and 16-node ex perf jobs (#25910)"}},{"before":"5c4e3cf572c2da01910692784b95b97d88c2fae5","after":"cbaea6091b210616e5f0447e026a4723f33daf0b","ref":"refs/heads/main","pushedAt":"2024-09-11T00:10:25.000Z","pushType":"pr_merge","commitsCount":10,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Remove c_sublocid_any and adjust direct on logic (#25898)\n\nThis PR removes `c_sublocid_any` and replaces all occurances with\r\n`c_sublocid_none`. With that change, it also adjusts the `on` statement\r\nlogic for the GPU locale model.\r\n\r\nResolves https://github.com/chapel-lang/chapel/issues/24259\r\n\r\n### Why remove `c_sublocid_any`?\r\n\r\nThe notion of \"any sublocale\" doesn't apply to the current state we are\r\nin. It goes back to the days of the NUMA locale model, where tasks can\r\nprobably be assigned to sublocales more freely. 
Today, there is no NUMA\r\nlocale model and the \"sublocale\" concept is only relevant for the GPU\r\nlocale model. In the GPU locale model, tasks must have a strict\r\nsublocale. Otherwise, they are not targetting any device.\r\n\r\nAs we inherited some of the GPU locale implementation from NUMA (and\r\nAPU) locale models, the existence of `c_sublocid_any` has created some\r\nconfusion, leading to the bug reported in #24259. In the GPU locale\r\nmodel, we want to use `c_sublocid_none` to indicate the task being run\r\non the host, however, the runtime used `c_sublocid_any` almost\r\nexclusively, further contributing to the confusion. Thus, we are\r\nremoving `c_sublocid_any` and using `c_sublocid_none` in places where it\r\nwas used before.\r\n\r\n[Reviewed by @jabraham17]\r\n\r\nTest:\r\n- [x] linux64\r\n- [x] gasnet\r\n- [x] nvidia","shortMessageHtmlLink":"Remove c_sublocid_any and adjust direct on logic (#25898)"}},{"before":"dd457a827eb960df776a797834734510cb47b62a","after":"5c4e3cf572c2da01910692784b95b97d88c2fae5","ref":"refs/heads/main","pushedAt":"2024-09-10T22:46:32.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Adjust flags to suppress/disable warnings from CUB 11 with newer Clangs (#25918)\n\nThis PR suppresses some warnings coming from CUB 11. We discovered these\r\nwhen on one of the test machines was upgraded from clang 14 to 18. I am\r\nnot sure at which clang version we start to see the warnings in\r\nquestion. So, the changes do not check for the clang version. 
It'd be\r\ngreat if we could check for the CUDA version, but unfortunately that's\r\nnot wired in right now.\r\n\r\n[Reviewed by @jabraham17]\r\n\r\nTest:\r\n- [x] clean build in a local environment where I was able to repro the\r\nissue\r\n- [x] standard\r\n- [x] nvidia\r\n- [x] amd","shortMessageHtmlLink":"Adjust flags to suppress/disable warnings from CUB 11 with newer Clan…"}},{"before":"15e0d6efe6d12674e51a9f54c30e97a563bb730f","after":"dd457a827eb960df776a797834734510cb47b62a","ref":"refs/heads/main","pushedAt":"2024-09-10T22:22:09.000Z","pushType":"pr_merge","commitsCount":8,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Workaround a bug that prevents deallocating GPU memory from CPU (#24300)\n\nThis works around a bug that @ShreyasKhandekar discovered while working\r\non Arkouda.\r\n\r\nWhen a GPU-allocated piece of memory is stored in a CPU-based data\r\nstructure like a `map`, that GPU-allocated memory needs to be freed by\r\nthe CPU (outside of the GPU sublocale that it was initially allocated\r\non). Our runtime implementation was not ready for that. 
This PR fixes\r\nthat.\r\n\r\n[Reviewed by @ShreyasKhandekar]\r\n\r\nTest:\r\n- [x] ~the arkouda case works fine~ we currently need this fix for\r\nsomething else and it helps\r\n- [x] simpler reproducer on nvidia\r\n- [x] simpler reproducer on amd \r\n- [x] full nvidia\r\n- [x] full amd","shortMessageHtmlLink":"Workaround a bug that prevents deallocating GPU memory from CPU (#24300)"}},{"before":"24de4d2ad6c55f5394e172f2f97fd84a824d0a03","after":"15e0d6efe6d12674e51a9f54c30e97a563bb730f","ref":"refs/heads/main","pushedAt":"2024-09-10T22:04:11.000Z","pushType":"pr_merge","commitsCount":21,"pusher":{"login":"bradcray","name":"Brad Chamberlain","path":"/bradcray","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7536222?s=80&v=4"},"commit":{"message":"Add/improve errors for `.localAccess()` calls to non-local elements (#25754)\n\n[reviewed by @e-kayrakli ]\r\n\r\nThis PR adds and improves errors for `.localAccess()` calls that access\r\nremote elements.\r\n\r\nThe original motivation for the PR was to fix the case noted in issue\r\n#25747 in which `.localAccess()` on a default rectangular array would\r\nalways pass, whether that array was stored locally or on a remote locale\r\n(captured in `test/arrays/locality/localAccess/localAccess.chpl`). It\r\nalso improves the error message for `.localAccess()` violations on block\r\narrays by expressing the error as a locality error rather than a simple\r\nout-of-bounds error, as before (also reported in #25747, captured in\r\n`test/arrays/locality/localAccess/localAccess-block*.chpl`). These new\r\ntests make sure that these errors are generated for read and write cases\r\nas well as for POD vs. non-POD arrays, since our modules have distinct\r\ndsiAccess overloads for these cases.\r\n\r\nThe check itself is implemented as an additional/optional check in\r\n`checkAccess()` which is called for all array accesses when checking is\r\non. 
It now takes an optional `ensureLocal` param that is false by\r\ndefault, but set to true for all `.localAccess()` code paths to enable\r\nthe bounds checking. The logic itself generates different error messages\r\ndepending on whether the array is completely local to a single (remote)\r\nlocale, distributed but the current locale owns nothing, or distributed\r\nbut the index is outside the current locale's bounds.\r\n\r\nThe implementation approach I originally pursued to determine whether an\r\naccess was in-bounds was to use the array's `localSubdomains()` call to\r\ncheck that the localAccess's indices were in-bounds. However, this\r\ncaused problems for the Stencil distribution since its localSubdomain(s)\r\n(correctly) only refer to the indices that the locale truly owns, but\r\nnot ones that are part of its \"fluff\" (cached ghost cells). Yet, we also\r\nwant that fluff to be accessible using localAccess() since that's how we\r\nget performance when the compiler can't optimize them automatically,\r\nrequiring a slightly more sophisticated approach described in the next\r\nparagraph. New test `test/distributions/stencil/localAccess.chpl` locks\r\nin a very simple case of this Stencil distribution pattern, with some\r\naccesses passing due to true ownership, some due to fluff ownership, and\r\none generating an error due to a remote access.\r\n\r\nThe Stencil distribution challenge led to the approach taken here, which\r\nis to introduce a new internal iterator called\r\n`chpl__localStoredSubdomains` whose purpose is to yield a locale's\r\nsubdomains, including indices that it doesn't truly own, like the fluff\r\nof the stencil distribution. This is implemented using an optional call\r\non the array, `doiLocalStoredSubdomains()`, and if not supported, we\r\nfall back to using the `localSubdomains()` call instead. 
I then\r\nimplemented `doiLocalStoredSubdomains()` on Stencil distributions.\r\n\r\nAlong the way, I found that `dsiLocalSubdomains()` calls weren't\r\nimplemented for Block, Cyclic, and Stencil, so added those as well.\r\n\r\nThen, while testing this, I found that our current tests that lock in\r\nsupport for using oversubscribed block/cyclic/stencil distributions\r\n(that is, ones in which a single locale appears in targetLocales\r\nmultiple times) were getting away with using `.localAccess()` to access\r\nelements that weren't actually local, so I updated them to only update\r\nlocal array elements. The reason for this is that the `dsiAccess()`\r\nroutine on these distributions simply falls back to a normal\r\n`dsiAccess()` in the event of oversubscription—meaning that we're also\r\nlosing some performance there in addition to not actually doing a local\r\naccess... I also found that the stencil variant of this test was never\r\nset up to run with multiple locales, so updated it to do so.\r\n\r\nNote that much of the code in this effort may not be optimal\r\nperformance-wise (e.g., iterating over local subdomains when the common\r\ncase is that a locale only has a single subdomain), but that this seemed\r\nacceptable—at least for the time being—since this code is only called\r\nwhen checks are on.\r\n\r\nTwo bad files `test/expressions/if-expr/*.bad` had to be updated due to\r\nshifting IDs in the AST… :(\r\n\r\nResolves #25747.","shortMessageHtmlLink":"Add/improve errors for .localAccess() calls to non-local elements (#…"}},{"before":"eb45d7425de43fc9595a0e2a9f4ea474392d5d73","after":"24de4d2ad6c55f5394e172f2f97fd84a824d0a03","ref":"refs/heads/main","pushedAt":"2024-09-10T18:09:46.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"DanilaFe","name":"Daniel","path":"/DanilaFe","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4361282?s=80&v=4"},"commit":{"message":"Ensure variable declarations at the end of try blocks are cleaned up 
(#25919)\n\nCloses https://github.com/chapel-lang/chapel/issues/25548, in which\r\n`try! whatever()` caused the result of `whatever()` to not be cleaned\r\nup, even if no exception occurred.\r\n\r\nTo see why this happens, consider the following (simplified) code\r\ndesugaring `try! expr`:\r\n\r\n```Chapel\r\nvar x1 = ... \r\nvar x2 = ...\r\nvar x3 = ...\r\nif (error) {\r\n goto error handling;\r\n}\r\n```\r\n\r\n`x3 = …` is presumably the call that creates the error, so `x3` may be\r\nuninitialized at the time we do `if (error)`. When traversing this code\r\nlinearly and inserting autodestroys, to avoid processing `x3` (which may\r\nbe uninit'ed), we go right into the `if (error)` just before we visit\r\n`x3` (going out of order). This way, when we insert auto-destroys for\r\nerror handling (which happens to auto-destroy all variables in scope,\r\nsince we are unwinding), we don’t auto-destroy `x3`, which may be\r\nuninit’ed.\r\n\r\nHowever, we use the same code to do this early visit into `if (error)`\r\nas we do for any other statement. And other logic in that code says\r\n“well if this statement is the last mention of any variables, insert\r\nauto-destroys”. Thus, we insert auto-destroys while doing this early\r\nprocessing of `if (error)`, and mark all variables as having been\r\ndeinited… except that we haven’t marked `x3` as inited yet (previous\r\nparagraph), so it doesn’t get an auto-destroy, so it leaks. We mark it\r\nas deinited without having inserted the deinitialization code.\r\n\r\ninterestingly we visit `if (error)` again, normally, after `x3`, and it\r\ntries to do “insert auto-destroys” as well, but it’s a nop-op since\r\nthey’ve already been inserted and all the variables have been marked\r\nalready-uninitialized. This PR just makes the early `if (error)` not do\r\nthe cleanup (since it will miss `x3`), and lets that cleanup fall\r\nthrough to the non-early if (error) handling. 
The early `if (error)`\r\nstill inserts auto-destroys for unwinding (as one would expect).\r\n\r\nReviewed by @jabraham17 -- thanks!\r\n\r\n## Testing\r\n- [x] paratest\r\n- [x] paratest (memleaks)","shortMessageHtmlLink":"Ensure variable declarations at the end of try blocks are cleaned up (#…"}},{"before":"64b8a6d12cbd564be0a504231b9b5f051454be1b","after":"eb45d7425de43fc9595a0e2a9f4ea474392d5d73","ref":"refs/heads/main","pushedAt":"2024-09-10T18:09:00.000Z","pushType":"pr_merge","commitsCount":5,"pusher":{"login":"mppf","name":"Michael Ferguson","path":"/mppf","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3653132?s=80&v=4"},"commit":{"message":"Avoid array allocation for moving to equal domain (#25896)\n\nResolves #25741\r\n\r\n`chpl__coerceMove(_array)` already has the ability to omit an array copy\r\nwhen the source and destination arrays refer to the same domain.\r\n\r\nHowever, when the domains have equal value but are separate instances,\r\n`chpl__coerceMove` would allocate a separate array and then move\r\nelements into it from the original array.\r\n\r\nThis leads to unnecessary memory usage in some cases; for example, a\r\nrecursive function that returns an array and so declares the return\r\ntype.\r\n\r\nThis commit adds a mechanism for array implementations to provide a way\r\nto steal the data buffers from another array which can be used in this\r\nscenario. In some ways this is similar to `doiOptimizedSwap`. 
Note that\r\n`doiOptimizedSwap` cannot be used here because we want to avoid\r\nallocation for one of the arrays.\r\n\r\nFuture Work:\r\n* add an implementation of `doiBuildArrayMoving` for more array types\r\n(most notably, Block, Cyclic, and Stencil arrays)\r\n* figure out how to get this optimization working for arrays-of-arrays.\r\nCurrently, they are disabled, because something is going wrong with\r\ncorrectly setting up the domain for the nested arrays.\r\n\r\nReviewed by @benharsh - thanks!\r\n\r\n- [x] primers pass with valgrind\r\n- [x] full comm=none testing\r\n- [x] full comm=gasnet oversubscribed testing","shortMessageHtmlLink":"Avoid array allocation for moving to equal domain (#25896)"}},{"before":"943c49c08f8565668a2c12a515c07fa8b612bfdf","after":"64b8a6d12cbd564be0a504231b9b5f051454be1b","ref":"refs/heads/main","pushedAt":"2024-09-10T16:36:38.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"jhh67","name":"John H. Hartman","path":"/jhh67","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3190372?s=80&v=4"},"commit":{"message":"Remove FI_DELIVERY_COMPLETE in message-order-fence mode (#25921)\n\nThe Cassini-1 NIC does not support FI_DELIVERY_COMPLETE and specifying\r\nit can cause hangs. 
See Cray/chapel-private#1661 and\r\nCray/chapel-private#6677 for details.\r\n\r\n[Reviewed by @jabraham17, thank you.]","shortMessageHtmlLink":"Remove FI_DELIVERY_COMPLETE in message-order-fence mode (#25921)"}},{"before":"ca4ae0fa05eafc6fe95f227286cf31382c8eaece","after":"943c49c08f8565668a2c12a515c07fa8b612bfdf","ref":"refs/heads/main","pushedAt":"2024-09-10T16:28:09.000Z","pushType":"pr_merge","commitsCount":6,"pusher":{"login":"mppf","name":"Michael Ferguson","path":"/mppf","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3653132?s=80&v=4"},"commit":{"message":"Add CHPL_LLVM_GCC_INSTALL_DIR as an alternative to CHPL_LLVM_GCC_PREFIX (#25913)\n\nThis PR adds `CHPL_LLVM_GCC_INSTALL_DIR` as an alternative to\r\n`CHPL_LLVM_GCC_PREFIX`.\r\n\r\nThis PR is motivated by the discussion in\r\nhttps://chapel.discourse.group/t/cannot-make-gpu-enabled-chapel/37046/12\r\n.\r\n\r\n`CHPL_LLVM_GCC_PREFIX` allows one to select a GCC installation. The\r\ntrouble is, on some systems (E.g. Ubuntu 24.04), there can be multiple\r\nversions of GCC installed at the same time, with the same prefix (`/usr`\r\nin this case).\r\n\r\n`clang` actually supports `--gcc-install-dir` as of `clang` 16. That\r\nflag can be used to request a particular version of GCC on Ubuntu.\r\n\r\nTo understand what one might want to include:\r\n* run `clang++ -v hello.cc` where `hello.cc` might just contain `int\r\nmain() { return 0; }`\r\n * you can see the GCC installations it considers at the beginning, e.g.\r\n ```\r\nFound candidate GCC installation:\r\n/usr/bin/../lib/gcc/x86_64-linux-gnu/11\r\nFound candidate GCC installation:\r\n/usr/bin/../lib/gcc/x86_64-linux-gnu/12\r\nFound candidate GCC installation:\r\n/usr/bin/../lib/gcc/x86_64-linux-gnu/13\r\nFound candidate GCC installation:\r\n/usr/bin/../lib/gcc/x86_64-linux-gnu/14\r\n Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/14\r\n ```\r\n* you can try to change which version should be used, e.g. 
with `clang++\r\n-v hello.cc --gcc-install-dir=/usr/bin/../lib/gcc/x86_64-linux-gnu/13 `\r\n\r\nReviewed by @e-kayrakli - thanks!\r\n\r\n- [x] able to build a GPU-enabled runtime on Ubuntu 24.04 with GCC 14\r\ninstalled\r\n- [x] full comm=none testing","shortMessageHtmlLink":"Add CHPL_LLVM_GCC_INSTALL_DIR as an alternative to CHPL_LLVM_GCC_PREF…"}},{"before":"f74a0a0acfb6f5965addd3b40b0fba0c9f599fa2","after":"ca4ae0fa05eafc6fe95f227286cf31382c8eaece","ref":"refs/heads/main","pushedAt":"2024-09-10T16:17:03.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"jhh67","name":"John H. Hartman","path":"/jhh67","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3190372?s=80&v=4"},"commit":{"message":"Revert \"Require FI_ORDER_RMA_WAW with message-order-fence MCM\" (#25920)\n\nThis reverts commit 9575c8567aa72e65af46f7c6364c76f1a11f4981.\r\n\r\nFI_ORDER_RMA_WAW is not required in message-order-fence because the\r\ncompiler does not issue non-blocking PUTs, and even it if did\r\nchpl_comm_put_nb is currently implemented using blocking PUTs. So\r\nthere is no opportunity for PUTs by a single task to be re-ordered.\r\n\r\n[Reviewed by @jabraham17, thank you.]","shortMessageHtmlLink":"Revert \"Require FI_ORDER_RMA_WAW with message-order-fence MCM\" (#25920)"}},{"before":"6ed664aebafbab3c930f3611c900365a6924c421","after":"f74a0a0acfb6f5965addd3b40b0fba0c9f599fa2","ref":"refs/heads/main","pushedAt":"2024-09-09T23:56:11.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"jhh67","name":"John H. 
Hartman","path":"/jhh67","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3190372?s=80&v=4"},"commit":{"message":"Fix GASNet co-locale performance test scripts (#25917)\n\nFix GASNet co-locale performance test scripts.\r\n\r\n[Reviewed by @jabraham17, thank you.]","shortMessageHtmlLink":"Fix GASNet co-locale performance test scripts (#25917)"}},{"before":"f4edb336ae469191aa052ff0b813ade3849a3e07","after":"6ed664aebafbab3c930f3611c900365a6924c421","ref":"refs/heads/main","pushedAt":"2024-09-09T22:39:32.000Z","pushType":"pr_merge","commitsCount":4,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Improve \"Can't find nvidia/amd toolkit\" error (#25912)\n\n1. Suggested by @mppf, this PR adds `Try setting CHPL_CUDA_PATH to the\r\ncuda installation path` to the error message. Before, `CHPL_CUDA_PATH`\r\nwas not mentioned in the error message.\r\n2. The error says \"nvidia toolkit\" or \"amd toolkit\". Those are not real\r\nthings. We need \"cuda toolkit\" or \"rocm toolkit\". This PR adjusts for\r\nthat. Capitalizations are still not perfect, but I don't want to wire a\r\nnew variable in on these scripts at this point.\r\n\r\n\r\nTested on a system with no GPUs that:\r\n\r\n```\r\n> export CHPL_GPU=nvidia\r\n> printchplenv\r\n\r\nError: Can't find cuda toolkit. Try setting CHPL_CUDA_PATH to the cuda installation path. To avoid this issue, you can have GPU code run on the CPU by setting 'CHPL_GPU=cpu'. To turn this error into a warning set CHPLENV_GPU_REQ_ERRS_AS_WARNINGS.\r\n\r\n> export CHPL_GPU=amd\r\n> printchplenv\r\n\r\nError: Can't find rocm toolkit. Try setting CHPL_ROCM_PATH to the rocm installation path. To avoid this issue, you can have GPU code run on the CPU by setting 'CHPL_GPU=cpu'. 
To turn this error into a warning set CHPLENV_GPU_REQ_ERRS_AS_WARNINGS.\r\n```\r\n\r\nScripts continue to function normally on a system with a GPU.\r\n\r\n[Reviewed by @vasslitvinov]","shortMessageHtmlLink":"Improve \"Can't find nvidia/amd toolkit\" error (#25912)"}},{"before":"3fe6f70bfd2a8caba06ea51ae3a9436995414183","after":"f4edb336ae469191aa052ff0b813ade3849a3e07","ref":"refs/heads/main","pushedAt":"2024-09-09T22:22:49.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"e-kayrakli","name":"Engin Kayraklioglu","path":"/e-kayrakli","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4141670?s=80&v=4"},"commit":{"message":"Use the correct environment to detect the backend in CHAMPS testing (#25897)\n\nCHAMPS uses `CHPL_TARGET_COMPILER` to set the actual backend. To my\r\nsurprise, it runs with `CHPL_LLVM=system` even with the C backend. I am\r\nnot sure if it has any implications. In any case, this PR adjusts the\r\nrecent fix to skip a problematic example to use the correct environment\r\nvariable.\r\n\r\n[Trivial fix for a nightly config, not reviewed]","shortMessageHtmlLink":"Use the correct environment to detect the backend in CHAMPS testing (#…"}},{"before":"4a3d3445dc199ccbb5bd14c80f02e1d7f83f1216","after":"3fe6f70bfd2a8caba06ea51ae3a9436995414183","ref":"refs/heads/main","pushedAt":"2024-09-09T22:11:43.000Z","pushType":"pr_merge","commitsCount":4,"pusher":{"login":"jabraham17","name":"Jade Abraham","path":"/jabraham17","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15747900?s=80&v=4"},"commit":{"message":"Fix scoping for interfaces inside of functions (#25916)\n\nFixes an issue where `record R: I` was not being correctly resolved, due\r\nto an issue with how the `ImplementsStmt` AST node was being created.\r\n\r\nResolves https://github.com/chapel-lang/chapel/issues/25838 and\r\nhttps://github.com/chapel-lang/chapel/issues/25911\r\n\r\n- [x] tested with a full paratest with/without comm\r\n\r\n[Reviewed by 
@DanilaFe]","shortMessageHtmlLink":"Fix scoping for interfaces inside of functions (#25916)"}},{"before":"1ce054d43e0edfd964529e8aea7eade5af5c509b","after":"4a3d3445dc199ccbb5bd14c80f02e1d7f83f1216","ref":"refs/heads/main","pushedAt":"2024-09-09T22:08:59.000Z","pushType":"pr_merge","commitsCount":5,"pusher":{"login":"DanilaFe","name":"Daniel","path":"/DanilaFe","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/4361282?s=80&v=4"},"commit":{"message":"Fix index variable leak when forall body is empty (#25915)\n\nCloses https://github.com/chapel-lang/chapel/issues/25893, reverts\r\nhttps://github.com/chapel-lang/chapel/pull/25894.\r\n\r\nI was happy to find logic that notes down loop index variables as\r\nsomething that needs to be auto-destroyed in a scope. However, while\r\nstepping through in the debugger, I noticed that the cleanup only\r\nhappens \"per statement\", which means that loops with empty bodies over\r\niterators do not have their bounds cleaned up:\r\n\r\n```Chapel\r\n[x in myiter()] ;; // does not clean up var\r\n```\r\n\r\nOne of the reasons was that an \"anchor statement\" was needed to know\r\nwhere to put all the auto-destroys. 
This PR fixes the problem by\r\ndetecting the situation (empty block statement but need to clean up\r\nvariables), inserting a dummy `noop`, and using that as anchor.\r\n\r\nReviewed by @jabraham17 -- thanks!\r\n\r\n## Testing\r\n- [x] paratest","shortMessageHtmlLink":"Fix index variable leak when forall body is empty (#25915)"}},{"before":"28a9b6897d6baf7199192ad6ab8c61970be791aa","after":"1ce054d43e0edfd964529e8aea7eade5af5c509b","ref":"refs/heads/main","pushedAt":"2024-09-09T20:20:39.000Z","pushType":"pr_merge","commitsCount":4,"pusher":{"login":"vasslitvinov","name":"Vass Litvinov","path":"/vasslitvinov","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8039635?s=80&v=4"},"commit":{"message":"Fix gpu num threads (#25909)\n\nThis fixes a bug introduced in #25855 where the attribute\r\n @gpu.itersPerThread\r\nwas ignored when calculating the number of threads used to set up\r\na GPU kernel. The number of threads is calculated correctly now.\r\n\r\nr: @DanilaFe","shortMessageHtmlLink":"Fix gpu num threads (#25909)"}},{"before":"54fd90dc7829dd8f9e75cea8a41eba7b202a6edf","after":"28a9b6897d6baf7199192ad6ab8c61970be791aa","ref":"refs/heads/main","pushedAt":"2024-09-09T18:57:50.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"mppf","name":"Michael Ferguson","path":"/mppf","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3653132?s=80&v=4"},"commit":{"message":"Two tiny cleanups to convert-uast (#25906)\n\nThis PR makes two tiny changes to convert-uast.cpp:\r\n 1. It removes an unused function that is local to this file.\r\n2. It switches from using a query to decide if scope resolution should\r\nbe attempted to checking the current `modTag`. (Note: we are currently\r\nusing `MOD_STANDARD` for package modules, and as a result, only\r\n`MOD_USER` code is not in a bundled module. 
See also this code that sets\r\n`modTag`\r\nhttps://github.com/chapel-lang/chapel/blob/08f4df28ccbc1428b35d1b96b3cbe79fb997947e/compiler/passes/parseAndConvert.cpp#L299-L309\r\n\r\nNo behavior changes in this PR.\r\n\r\nReviewed by @arezaii - thanks!\r\n\r\n- [x] full comm=none testing","shortMessageHtmlLink":"Two tiny cleanups to convert-uast (#25906)"}},{"before":"08f4df28ccbc1428b35d1b96b3cbe79fb997947e","after":"54fd90dc7829dd8f9e75cea8a41eba7b202a6edf","ref":"refs/heads/main","pushedAt":"2024-09-09T16:58:13.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"ShreyasKhandekar","name":"Shreyas Khandekar","path":"/ShreyasKhandekar","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/60454060?s=80&v=4"},"commit":{"message":"Unstable anon script for 2.2: special case & <command line> (#25892)\n\nThis adds a few things to the unstable warning anonymizer script in\r\norder to update it for the 2.2 release.\r\n\r\n- Add handling for a warning starting with the word ``:\r\npreviously, we were assuming that a warning either starts with a Chapel\r\nfile name (like `foo.chpl`) or with `` but a new\r\nunstable warning was added this release which starts with `` and it also doesn't have a number following it (the other two\r\nhave numbers following them which indicate a line number of argument\r\nnumber)\r\n- Add special case handling for the ambiguous modules warning (which is\r\nwhat also required the change from the 1st bullet) since it exposes\r\nimplementation details (module names), we scrub these names now.\r\n- Add testing for the special case\r\n\r\n[Reviewed by @lydia-duncan]","shortMessageHtmlLink":"Unstable anon script for 2.2: special case & <command line> (#25892)"}},{"before":"3284b94298461acb1ef0bd7c712243d0d86125c5","after":"08f4df28ccbc1428b35d1b96b3cbe79fb997947e","ref":"refs/heads/main","pushedAt":"2024-09-09T14:08:18.000Z","pushType":"pr_merge","commitsCount":4,"pusher":{"login":"mppf","name":"Michael 
Ferguson","path":"/mppf","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3653132?s=80&v=4"},"commit":{"message":"Fix error with certain `include module` patterns (#25888)\n\nResolves #25569\r\n\r\nThis PR fixes a problem with using `import super.something` from within\r\na submodule that was stored in a different file using `include module`.\r\n\r\nReviewed by @DanilaFe - thanks!\r\n\r\n- [x] full comm=none testing","shortMessageHtmlLink":"Fix error with certain include module patterns (#25888)"}},{"before":"922d7c1904c13169e045a10ff3429796fdaed4dc","after":"3284b94298461acb1ef0bd7c712243d0d86125c5","ref":"refs/heads/main","pushedAt":"2024-09-06T05:39:49.000Z","pushType":"pr_merge","commitsCount":6,"pusher":{"login":"vasslitvinov","name":"Vass Litvinov","path":"/vasslitvinov","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8039635?s=80&v=4"},"commit":{"message":"Add a test for @gpu.itersPerThread (#25895)\n\nAdd a test for the @gpu.itersPerThread feature added in #25855.\r\n\r\nWhile there, minor fixes to ensure that itersPerThread-attributed kernels\r\nrun on GPUs:\r\n\r\n* Instead of adding a CForLoop, add a WhileDoLoop, to avoid calls to chpl_error\r\n that would be inserted incleanupForeachLoopsGuaranteedToRunOnCpu().\r\n\r\n* Instead of PRIM_ASSIGN use PRIM_MOVE, as normalizing the former\r\n introduces new temps.\r\n\r\nTrivial, not reviewed.","shortMessageHtmlLink":"Add a test for @gpu.itersPerThread (#25895)"}},{"before":"ad2cd657b3086fffdfff729e96ccf75e352ef984","after":"922d7c1904c13169e045a10ff3429796fdaed4dc","ref":"refs/heads/main","pushedAt":"2024-09-06T00:05:45.000Z","pushType":"pr_merge","commitsCount":3,"pusher":{"login":"jabraham17","name":"Jade Abraham","path":"/jabraham17","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15747900?s=80&v=4"},"commit":{"message":"Workaround memory leak bug in sort test (#25894)\n\nCaptures the result of `sorted` in a dummy variable to avoid memory\r\nleaks due to 
https://github.com/chapel-lang/chapel/issues/25893\r\n\r\nTested that `start_test\r\ntest/library/standard/Sort/errors/sortDomainArray.chpl --memLeaks`\r\npasses\r\n\r\n[Reviewed by @DanilaFe]","shortMessageHtmlLink":"Workaround memory leak bug in sort test (#25894)"}},{"before":"17117caeffe6863d8f2e31ca41e0fda0800c2f47","after":"ad2cd657b3086fffdfff729e96ccf75e352ef984","ref":"refs/heads/main","pushedAt":"2024-09-05T22:53:15.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"stonea","name":"Andy Stone","path":"/stonea","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2591890?s=80&v=4"},"commit":{"message":"Arkouda annotations for array transfer performance (#25890)\n\nAnnotations impacting array transfer performance:\r\n\r\n```\r\n8/19/24:\r\n - Array transfer perf fix (Bears-R-Us/arkouda#3671)\r\n8/27/24:\r\n - Fix performance regression in to_ndarray (Bears-R-Us/arkouda#3697)\r\n```","shortMessageHtmlLink":"Arkouda annotations for array transfer performance (#25890)"}},{"before":"20de6bb75e9cbc2a677ce67bd8c6f21be84d135b","after":"17117caeffe6863d8f2e31ca41e0fda0800c2f47","ref":"refs/heads/main","pushedAt":"2024-09-05T22:15:46.000Z","pushType":"pr_merge","commitsCount":2,"pusher":{"login":"mppf","name":"Michael Ferguson","path":"/mppf","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3653132?s=80&v=4"},"commit":{"message":"Work around a GCC 13 error (#25889)\n\nFollow-up to #25853. This PR avoids errors from GCC 13 when building\r\nwith `make DEBUG=0 WARNINGS=1 ASSERTS=0 OPTIMIZE=1 compiler`. 
The\r\nwarnings look like this:\r\n\r\n```\r\n/usr/include/c++/13/bits/stl_algobase.h:437:30: error: ‘void* __builtin_memmove(void*, const void*, long unsigned int)’ forming offset 40 is out of the bounds [0, 40] of object ‘<anonymous>’ with type ‘chpl::resolution::MatchingIdsWithName’ [-Werror=array-bounds=]\r\n 437 | __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);\r\n | ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n/home/mppf/w/main/frontend/lib/resolution/scope-queries.cpp: In function ‘const bool& chpl::resolution::emitMultipleDefinedSymbolErrorsQuery(chpl::Context*, const Scope*)’:\r\n/home/mppf/w/main/frontend/lib/resolution/scope-queries.cpp:3585:35: note: ‘<anonymous>’ declared here\r\n 3585 | v = lookupNameInScopeTracing(context, scope, { }, name, config,\r\n | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n 3586 | traceResult);\r\n | ~~~~~~~~~~~~\r\n```\r\n\r\nIt is currently unclear to me if the error is a false positive. However,\r\nI figured out a workaround by introducing a new variable to store the\r\nresult of `lookupNameInScopeTracing` here.\r\n\r\nTrivial and not reviewed.\r\n\r\n- [x] full comm=none testing","shortMessageHtmlLink":"Work around a GCC 13 error (#25889)"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xMlQwMzoyOTo0Ni4wMDAwMDBazwAAAASz5C4Q","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xMlQwMzoyOTo0Ni4wMDAwMDBazwAAAASz5C4Q","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0wNVQyMjoxNTo0Ni4wMDAwMDBazwAAAASuV9XI"}},"title":"Activity · chapel-lang/chapel"}