vsoch commented on issue skypilot-org/skypilot#3777.
I think @romilbhardwaj gave the presentation, and I was talking to him - maybe he has a good suggestion for a next step. I do think the larger design idea of having Kubernetes as a base is a good one, regardless of flux!…
vsoch commented on issue hanwen/go-fuse#543.
That’s perfect! …
vsoch commented on issue singularityhub/singularity-hpc#676.
As long as they are public and open, PRs to add them are welcome and encouraged!…
vsoch commented on issue hanwen/go-fuse#543.
I’d need to be able to pass the wrapper, but it won’t let me do that: …
vsoch commented on issue hanwen/go-fuse#543.
> you have to take care that a Flush operation cannot occur in parallel to any other operation that involves the Fd, or you risk a race condition (after close, the Fd may be reused for a different file.) …
vsoch commented on issue flux-framework/flux-core#6437.
Oh no! But sbatch works? I
vsoch commented on issue flux-framework/flux-docs#288.
I might not have the best advice (and our flux devs will step in during work hours) but I would check flux resource list to see if you actually have two nodes. It could be that running flux start a la carte like that is just getting one node. So I might try an srun in the allocation for flux start that ensures the command goes across the nodes, e.g., here is what I’m doing in a development environment with slurm (and flux installed): …
vsoch commented on issue flux-framework/flux-core#6437.
@wadudmiah I don’t see that you tried an srun command targeted at two nodes after you launched the allocation - I just see you trying an ssh command, and flux start in absence of a launching or bootstrap mechanism. Please do salloc with two nodes, and then an srun that also targets two nodes and runs flux start. The example that I provided (simplified): …
vsoch commented on issue flux-framework/flux-core#6411.
Thanks!…
vsoch commented on issue singularityhub/docker2singularity#135.
Sure thing! We can try adding the arm build, but they tend to be very slow on GHA, and using singularity directly is the preferred solution I think….
vsoch commented on issue hpc-social/jobs#23.
Not that I’m aware of - as far as I understand it, their API is concerned with logging in (as a user) and then posting or sharing. https://developer.linkedin.com/product-catalog. It’s also not a good candidate for scraping because of the design of the pages, and needing to be a logged in member of the group to see said jobs. …
vsoch commented on issue converged-computing/performance-study#68.
Done….
vsoch commented on issue converged-computing/flux-views#10.
Note: this will eventually be OK to merge, but spack isn’t building a newer version of flux, it’s pinned at 0.61.2 even in noble. Likely we need to wait for the spack bases to update….
vsoch commented on issue etetoolkit/ete#762.
Awesome, thank you!…
vsoch commented on issue flux-framework/flux-core#6411.
> Ooh, sorry that wasn’t clear. Thanks for making it better for next intrepid explorer! …
vsoch commented on issue kubernetes/enhancements#3370.
I’m bad at following all these issues - what is the current (summary) of the status here?…
vsoch commented on issue flux-framework/spack#245.
@trws I’m disabling all automation for flux-core - it’s a huge number of notifications I get daily and I don’t know how to fix. Flux core won’t be automatically updated in spack, but the others will continue to be….
vsoch commented on issue flux-framework/flux-core#6394.
Gotcha, will move there. I got it working without, but likely this would be nice to have in the future. Thanks @grondo !…
vsoch commented on issue flux-framework/flux-core#4569.
That’s perfect - thank you!…
vsoch commented on issue flux-framework/flux-core#4569.
Adding a note here (and apologies if I missed a detail, there is a lot in this thread!) but I’d like to be able to add arbitrary events that are rpc (not necessarily just the job journal) to be triggered with handle.reactor_run() and then allow me to trigger a callback. For example, for a custom heartbeat, I can subscribe to it and receive the message, but I have to do that synchronously and do some loop that blocks (and prevents me from running my one call to the reactor). My solution now is not to use flux and create a separate thread, but I’d ideally like everything (events wise) coming from one source, and from flux….
vsoch commented on issue hanwen/go-fuse#536.
Yes it’s resolved - sorry I didn’t say that explicitly. I am happy to close the issue. Thanks!…
vsoch commented on issue hanwen/go-fuse#536.
woo!! It totally worked!! …
vsoch commented on issue hanwen/go-fuse#536.
I didn’t have time this week but I should be able to make time this weekend (and point at code). Stay tuned and thanks again for the help….
vsoch commented on issue pydicom/deid#271.
Thanks @wetzelj !…
vsoch commented on issue oras-project/oras-py#164.
The bug is here @tarilabs - the prefix is expected to be for the registry, but with the refactor it was removed. It was originally derived here: https://github.com/oras-project/oras-py/blob/36ef98afb6036eb4e3b70890aa941a8236937613/oras/provider.py#L69 and so an easy fix is to move it, or pass forward….
vsoch commented on issue kubeflow/training-operator#1906.
We probably don’t want to close this….
vsoch commented on issue hanwen/go-fuse#536.
Thank you @hanwen ! I think there was something wrong with my environment - I restarted and was able to run the command. Now what happens is that I get a permission denied (even though the context is my PWD) and when I remove the read only, it freezes and I need to manually unmount. I’m wondering now (haven’t tested yet) if the issue was placing the PWD at the mount root (where I wouldn’t be allowed to run a command, even if it requires write)? I’m out and AFK and I can test things when I get back, if you have suggestions or an example to point to! …
vsoch commented on issue flux-framework/spack#245.
Same bug, so wasn’t that!…
vsoch commented on issue opencontainers/wg-image-compatibility#21.
It looks like in some prototypes we just have key value pairs - OK with me :laughing: …
vsoch commented on issue hanwen/go-fuse#536.
It looks like it should be doing that here: …
vsoch commented on issue flux-framework/spack#245.
I’ve been burned by that for two other projects this week! I don’t have any ideas either, so I’ll give that a shot. The ubuntu images are definitely latest so we are getting a newer Python. I can’t reproduce the error locally so having trouble debugging. I’ll try installing setuptools and report back….
vsoch commented on issue hanwen/go-fuse#535.
The first thing I tried is to try and inherit the functions, e.g., …
vsoch commented on issue expfactory/experiments#61.
It looks great! :space_invader: …
vsoch commented on issue converged-computing/fluxnetes#20.
Note that we had the same bug in fluence: https://github.com/flux-framework/flux-k8s/pull/85…
vsoch commented on issue urlstechie/urlchecker-python#93.
Thank you @mabraham!…
vsoch commented on issue expfactory/experiments#61.
Excellent! See if you can add some notes to the README, and then deploy the site to GitHub pages and put the URL in the description so a visitor can easily preview it….
vsoch commented on issue urlstechie/urlchecker-python#92.
You can also just mimic what we do in our own docker image: …
vsoch commented on issue singularityhub/singularity-hpc#673.
- If force is not used in install (outside of the view), I would do the implementation without force for now, and then, come back to do work just on force - adding/remove where we think makes sense. …
vsoch commented on issue oras-project/oras-py#162.
Thank you! Much appreciated….
vsoch commented on issue flux-framework/flux-core#6352.
I think we are good to close here - we can always come back and embellish or change the example. Thanks again @wihobbs @grondo @garlick it was a good thread….
vsoch commented on issue urlstechie/urlchecker-python#92.
When it crashes like that, it’s a mismatch between the chrome you have and the driver. …
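As a rough illustration of that kind of mismatch, the sketch below compares the major version of a browser and a driver. It assumes that matching major versions is the relevant check, and the version strings are made up for the example rather than taken from the issue.

```python
def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """True when the browser and the driver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)


# Hypothetical version strings, for illustration only
print(versions_match("114.0.5735.90", "114.0.5735.16"))
print(versions_match("115.0.5790.102", "114.0.5735.90"))
```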
vsoch commented on issue singularityhub/singularity-hpc#673.
Sounds like we have a path forward! Let’s start with: …
vsoch commented on issue protocolbuffers/protobuf#18096.
This was really helpful! My full workflow (after doing the curl / unzip install above) is to run the grpc one in python first, then protoc, and then I still need to do some sed fu to get the imports right. …
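The exact sed commands are not shown, so the following is only a sketch of one common fix, written in Python instead of sed: rewriting the absolute `_pb2` imports in generated code into package-relative imports. The module name `service_pb2` is hypothetical, and the assumption that relative imports are the desired result is mine.

```python
import re

# Matches generated top-level imports such as: import service_pb2 as service__pb2
PB2_IMPORT = re.compile(r"^import (\w+_pb2)( as \w+)?$", re.MULTILINE)


def make_imports_relative(source: str) -> str:
    """Rewrite absolute *_pb2 imports so they resolve inside a package."""
    return PB2_IMPORT.sub(r"from . import \1\2", source)


# Hypothetical snippet of generated code
generated = "import grpc\nimport service_pb2 as service__pb2\n"
print(make_imports_relative(generated))
```

Imports that do not end in `_pb2` (such as `import grpc`) are left untouched.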
vsoch commented on issue oras-project/oras-py#161.
We can move the “starts with” to be later, and yes other order is what I meant. Apologies - I keep switching state between deep programming and trying to be articulate - doesn’t always work super well. …
vsoch commented on issue flux-framework/flux-core#6352.
> Interesting question. It sounds like you are looking to do different things based on the state of the event, @vsoch? …
vsoch commented on issue flux-framework/flux-python#11.
Not at all: https://pypi.org/project/flux-python/0.66.0/…
vsoch commented on issue eksctl-io/eksctl#6870.
Thank you @cPu1 , rawr! :t-rex: …
vsoch commented on issue awslabs/soci-snapshotter#1389.
Oh no, I figured this all out, sorry didn’t follow up! Here is the post: https://vsoch.github.io/2024/container-pulling/ and my solution: https://github.com/converged-computing/soci-installer. Thanks!…
vsoch commented on issue awslabs/soci-snapshotter#1389.
I got it working! But I need to figure out if I can automate it, will report back….
vsoch commented on issue singularityhub/singularity-hpc#673.
hmm :/ @marcodelapierre what do you think?…
vsoch commented on issue singularityhub/singularity-hpc#673.
Let’s start with just the install set - I’m not sure about the latter. For this one: …
vsoch commented on issue pydicom/deid#268.
another idea is to have a set of helper scripts - not integrated into the main client, but optional to install (and support this extra functionality)….
vsoch commented on issue spack/spack#46703.
Not it! :point_right: :nose: …
vsoch commented on issue singularityhub/singularity-hpc#673.
I don’t think I’m convinced by needing this addition, especially if (as you say) the same can be accomplished with an shpc install. This would add a lot of confusion for not a lot of new functionality. If this is a problem: …
vsoch commented on issue pydicom/deid#266.
The change / fix seems pretty small - change read_file to dcmread. Would you care to PR to update the library here? We will need to do another more major bump in terms of version as well to indicate it is a breaking change….
vsoch commented on issue singularityhub/singularity-hpc#673.
Thanks @Ausbeth ! This introduces a confusing point for the user - having “update” and “upgrade” (that have different interactions). What exactly is upgrade doing that update is not, and did you think about a way to consolidate the need? …
vsoch commented on issue regclient/regclient#775.
I did some thinking tonight about how this might fit into containerd and then Kubernetes - let’s talk about our plan for testing with containerd (and then we can build into a testing Kubernetes setup) soon….
vsoch commented on issue eksctl-io/eksctl#6870.
Thank you! And no worries about stale bot - it can be very helpful. I’m subscribed to the thread and am good to ping when it needs to be reopened….
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1845.
> Considering the danymic situation, extending scheduler with plugin to validate the image compatibility and node features seems to be more viable. …
vsoch commented on issue expfactory/experiments#61.
All set! https://github.com/expfactory-experiments/space_adventure_pd. You can take a look at the other repositories to get a sense of the organization. In summary - most of the experiment files you are adding here should go there, and then only the metadata is added in this PR. Ping me in an issue there if you have questions. And please expect my responses to not be immediate - I’m on travel until early October….
vsoch commented on issue hpc-social/hpc-social.github.io#78.
This would be great! You can open a PR and add it to this page: …
vsoch commented on issue converged-computing/performance-study#59.
Ping @asarkar-parsys this is your :baby: ! Oh no wait, no it’s not. The other other kind of baby. The one you throw out with the bath water. Just kidding :P :water_polo: …
vsoch commented on issue hpc-social/hpc-social.github.io#77.
> I can’t seem to open a pull request, so I’ll revise the content here. …
vsoch commented on issue oras-project/oras-py#157.
Do you mean passing the auth backend directly instead of calling get_auth_backend? Yes that would be OK - likely we can have: …
vsoch commented on issue kubernetes-sigs/scheduler-plugins#722.
@Huang-Wei the bot closed the issue, but did you ever get to test this?…
vsoch commented on issue singularityhub/singularity-cli#227.
If you
vsoch commented on issue singularityhub/singularity-cli#227.
We definitely don’t have support for that. How would a custom user make sense in a Singularity container? You’ll always be running as your user….
vsoch commented on issue oras-project/oras-py#156.
That would work - could you please open a PR? I didn’t do a release for that change so we should be able to still tweak and not be worried it went into a release….
vsoch commented on issue converged-computing/performance-study#51.
@amarathe84 @asarkar-parsys @milroy configs are added: …
vsoch commented on issue converged-computing/performance-study#47.
I also want to clarify there is no urgency here - I am just documenting what I find as I go, because I won’t remember. Maybe a colored spreadsheet would do better so I don’t open issues for each one….
vsoch commented on issue oras-project/oras-py#155.
@xarses you will need to sign the DCO to merge this fully - click on the little “Details” next to the Robot that says “DCO” in the checks section….
vsoch commented on issue kubernetes-sigs/jobset#672.
No I don
vsoch commented on issue spack/spack#46192.
Done with https://github.com/spack/spack/pull/46215#event-14236382997…
vsoch commented on issue spack/spack#46215.
Looks like it’s fixed!…
vsoch commented on issue kubernetes-sigs/jobset#672.
> My concern is where do we stop. You add this feature now, then someone comes and asks “what if we add X”, and suddenly we have a workflow tool. …
vsoch commented on issue flux-framework/flux-k8s#84.
All set @cmisale for another review we are green! It was a change to flux-sched. I think they are trying to reduce the number of strings that are kept in the graph structure, and moving a subsystem key here was one of them. For our future selves: https://github.com/flux-framework/flux-sched/commit/d8265240dd63e5a67ba2598b919bf6e7a7f3d9a2….
vsoch commented on issue vsoch/pytest-github-report#1.
That would be logical - please feel free to open a PR to suggest a change for this case….
vsoch commented on issue nextflow-io/nextflow#5229.
This now LGTM! :+1: …
vsoch commented on issue flux-framework/flux-k8s#83.
oh fluence, you are becoming very hard to maintain….
vsoch commented on issue oras-project/oras-py#147.
@tarilabs I don’t have bandwidth at the moment to work on these components, but I can fully support you and provide review for what you are able to contribute. Thank you!…
vsoch commented on issue kubernetes-sigs/jobset#672.
This would be great! I agree to not try to create a workflow tool, but allowing this basic dependency structure is something that workflow tools can use. +1 from me….
vsoch commented on issue conda-forge/oras-py-feedstock#28.
@wolfv what would you suggest here - adding setuptools akin to how the bot suggests, or something else?…
vsoch commented on issue spack/spack#45625.
This is now set with https://github.com/spack/spack/pull/46191. Closing here, thanks!…
vsoch commented on issue nextflow-io/nextflow#5229.
@ewels could you please address my questions/comments so we can move the PR forward?…
vsoch commented on issue kubernetes-sigs/kubebuilder#4130.
Assuming all the CI passes, this looks very helpful! Thanks @camilamacedo86 …
vsoch commented on issue flux-framework/flux-k8s#82.
I think what is happening are upstream changes (many in the last 2 months) https://github.com/kubernetes-sigs/scheduler-plugins/commits/master/ so fluence builds but isn’t functioning as expected. I’m too zonked to debug now or soon, but if someone wants to jump on it, please do!…
vsoch commented on issue oras-project/oras-py#150.
Ah ok. Please keep the tests in GitHub the same as what you are doing (and what is working) locally and let’s try making more space on the builder - these first three lines before the tests to cleanup and add space should be sufficient: …
vsoch commented on issue converged-computing/performance-study#10.
@amarathe84 is this ready for review?…
vsoch commented on issue oras-project/oras-py#150.
Here is what I see: …
vsoch commented on issue hpc-social/hpc-social.github.io#77.
We don’t have current support for parsing RSS (but someone could jump on it). For the time being you can open a pull request and update the page content here https://github.com/hpc-social/hpc-social.github.io/blob/main/_projects/podcasts.md. I’d suggest providing links to the most common podcast venues (spotify, apple) and let people click to get the latest episodes….
vsoch commented on issue singularityhub/singularity-cli#225.
This looks good - if you’ve verified it is working as you like, could you please add a note to CHANGELOG.md and bump the corresponding version in spython/version.py?…
vsoch commented on issue flux-framework/flux-core#6226.
> I’d like to get more feedback if this example should just go in an EXAMPLES section of flux-submit(1) directly or something. …
vsoch commented on issue singularityhub/singularity-python#102.
I think you meant to post on https://github.com/singularityhub/singularity-cli/?…
vsoch commented on issue singularityhub/singularity-cli#223.
An ENV is converted to the environment section, which we see: …
vsoch commented on issue oras-project/oras-py#150.
Nice! Let’s run these tests now….
vsoch commented on issue flux-framework/Tutorials#43.
@ilumsden and @hariharan-devarajan thank you for your work on this! I’ll try to review it before the end of the week….
vsoch commented on issue eksctl-io/eksctl#6870.
Updated to include the same 2023 files. I also tested this today (this evening) and it fixed the issue I posted above - my cluster came up with all efa nodes. I’ll need to try the experiments for the two clusters that failed tonight, but with less funds now, tomorrow. Thanks for the help and TBA speedy review!…
vsoch commented on issue singularityhub/singularity-cli#222.
As suggested, you need to write into a script. The library here uses subprocess, which expects a list of commands (using shlex split) and here is what is happening: …
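To illustrate the point about subprocess and shlex, here is a minimal sketch using only the Python standard library (it is not the library's actual code, and the command and script name are hypothetical). Because no shell is involved, an operator such as a pipe ends up as a literal argument instead of being interpreted.

```python
import shlex

# Hypothetical command string containing a shell pipe
command = "echo hello | tr a-z A-Z"

# shlex.split tokenizes like a POSIX shell but does not interpret operators
args = shlex.split(command)
print(args)

# Passing this list to subprocess.run(args) would execute `echo` with the
# remaining tokens (including "|") as plain arguments; no pipe is created.
# Wrapping the pipeline in a script gives subprocess a single program to run:
script_args = shlex.split("/bin/sh ./pipeline.sh")
print(script_args)
```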
vsoch commented on issue oras-project/oras-py#150.
@isinyaaa please see my previous review comment - we need explicit tests for the chunked upload….
vsoch commented on issue nextflow-io/nextflow#5229.
And this title is not correct: …
vsoch commented on issue kubernetes/kubernetes#125852.
Interesting! This is unrelated to the issue here, but should the flux operator be setting equivalent resources to the init containers?…
vsoch commented on issue flux-framework/spack#223.
@trws I updated to ubuntu-latest but it looks like it’s still selecting gcc 10x. Is this possibly still using focal somewhere and I should look deeper?…
vsoch commented on issue flux-framework/flux-operator#232.
I think that was a failed CI run - take a look here: https://github.com/flux-framework/flux-operator/releases/tag/0.2.1 and let me know if that works!…
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#71.
@johanneskoester that coder rabbit is so cool!!!…
vsoch commented on issue singularityhub/singularity-cli#222.
It’s probably something to do with how the command is executed - perhaps try running that via a script instead?…
vsoch commented on issue converged-computing/oras-csi#32.
Not currently, but that would be a fairly easy addition. You would want to retrieve the secret via likely a secret -> environment variable and then add to the requests here. https://github.com/converged-computing/oras-csi/blob/c83555fee4bccbe8287b781a059bd4bafd935a4b/pkg/oras/blob.go#L13 PRs welcome!…
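The linked code is Go, so the following is only a Python sketch of the idea: read a credential from an environment variable (which a Kubernetes secret could populate) and attach it to the outgoing request. The variable name `REGISTRY_TOKEN`, the example URL, and the use of a bearer token are all assumptions for illustration.

```python
import os
import urllib.request

# Hypothetical variable name; a Kubernetes secret could be exposed under it
TOKEN_ENV_VAR = "REGISTRY_TOKEN"


def build_blob_request(url: str) -> urllib.request.Request:
    """Create a request, adding an Authorization header if a token is set."""
    request = urllib.request.Request(url)
    token = os.environ.get(TOKEN_ENV_VAR)
    if token:
        # Assumes the registry accepts a bearer token
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

When the variable is unset, the request is built without credentials, which keeps anonymous pulls working as before.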
vsoch commented on issue spack/spack#44678.
I think you did! Thank you!…
vsoch commented on issue flux-framework/Tutorials#30.
This update was already done: https://github.com/flux-framework/Tutorials/blob/05b0c50494ec750f2147138a4ba72e500ca9f5f7/2024-RADIUSS-AWS/JupyterNotebook/docker/Dockerfile.spawn#L1 so I’m going to rename the issue to get DYAD working in this environment….
vsoch commented on issue spack/spack#45625.
@alecbcs sounds great! I saw your ping - let’s definitely sync up….
vsoch commented on issue spack/spack#45625.
@bernhardkaindl is there a container or means to reproduce the CI environment? I think step 1 is reproducing that error. If we can’t reproduce it’s hard if not impossible to debug. From some searching, it seems like this problem has come up before: …
vsoch commented on issue singularityhub/singularity-cli#221.
These are the options available to you: …
vsoch commented on issue kubernetes/kubernetes#125852.
@haircommander we have details and our testing of the above, and @lisejolicoeur will be posting notes on Monday. Thanks again and have a good weekend!…
vsoch commented on issue flux-framework/spack#217.
I opened this PR here this morning https://github.com/spack/spack/pull/45619 because I didn’t see it. That merge will close here….
vsoch commented on issue kubernetes/kubernetes#125852.
Thanks for the insights @haircommander - we did not test SizeMemoryBackedVolumes and did not have swap enabled, but will do another round of testing tomorrow and report back with all the details you need. Stay tuned!…
vsoch commented on issue schemaorg/schemaorg#2061.
I am happy to pick up work / development (and support you to contribute) if there is interest! In terms of actually using it, I haven’t found a good use case for it so largely have not. …
vsoch commented on issue dask/dask-jobqueue#605.
Gotcha - this PR is old enough that it would be worth trying again. I’ll start fresh and open a new one if I have any success. Thanks for keeping on top of this @jacobtomlinson….
vsoch commented on issue bpfman/bpfman#1143.
hey @anfredette ! I wound up stepping back and I’m writing little eBPF programs on a VM to monitor containers - I am planning to start that way and then move to Kubernetes. Feel free to close the issue if you like - I’ll get back to this but not for some time. I can always open a new one….
vsoch commented on issue flux-framework/flux-sched#1138.
And @trws one more question for you I was thinking about just now - does qmanager hold some memory of jobs out until eternity? I know I can easily see finished job stats, but I don’t see that logic with qmanager, so I’m guessing that belongs somewhere with kvs / job manager? Apologies I haven’t read the code yet - been doing nothing most of today and kind of digging that too :P Happy Sunday!…
vsoch commented on issue riverqueue/river#495.
Yep that was it - the queue name was wrong :upside_down_face: …
vsoch commented on issue kubeflow/training-operator#2171.
One last question - any reason to differentiate schedulerName and GangScheduler? Can we not just call them the same thing? To use a gang scheduler you should still use schedulerName. If we use the same term in both places we can simplify the abstractions we describe, and not limit the scope of what the type means for the future (in case a different term or kind of scheduler arises appropriate to be used as a plugin here). …
vsoch commented on issue opencontainers/image-spec#751.
Nah, nobody cares about ontologies, really….
vsoch commented on issue riverqueue/river#488.
Ah - I think I need a way to keep firing the waitForJob function. In the example https://github.com/riverqueue/river/blob/master/example_subscription_test.go we just need to wait for one job each, and I think that’s what I might have missed (or at least this is my current hypothesis)! …
vsoch commented on issue flux-framework/flux-sched#1113.
I am planning to play around more later too - been running a chonky build on my laptop (with browser, slack, and VSCode open) so I can’t afford to open another VSCode session!…
vsoch commented on issue flux-framework/flux-sched#1118.
This was written before we had that I think - December 2023, and added February this year! …
vsoch commented on issue singularityhub/docker2singularity#133.
Wonderful! Thank you for reporting the issue! I’m going to close it, and please ping again if something arises….
vsoch commented on issue flux-framework/Tutorials#40.
A few nits preparing for future review: …
vsoch commented on issue converged-computing/oras-csi#31.
Closing issue unless there is something else, thanks!…
vsoch commented on issue singularityhub/docker2singularity#133.
@Soratake-HirotakaYajima this morning I built a new release: quay.io/singularity/docker2singularity:latest and here is my suggestion. That image has singularity 4.1.4 and I also added the development headers of that library mentioned. Give that a try. If it doesn’t work, please shell into the container interactively and try doing something else with Singularity (a singularity run of a docker uri would suffice) to see if you can reproduce the issue. If you can, then please try doing the same run in one of the images here: https://quay.io/repository/singularity/singularity. The main difference is that they are built with an ubuntu base (vs. here is alpine), so we can rule that out. Hopefully some of the above investigation can help shed some light on the issue. Thanks!…
vsoch commented on issue singularityhub/docker2singularity#133.
If you want to convert docker to singularity, you no longer need this tool. Just do: …
vsoch commented on issue flux-framework/Tutorials#31.
> Oh, is that the deadline? I can definitely help out then. I just won’t be able to do a ton before I’m back in Knoxville on August 11. But, once I’m back, I could easily put this at the top of my to-do list …
vsoch commented on issue flux-framework/Tutorials#31.
@jacobtkeio the updates are done, the testing should be the same as before: …
vsoch commented on issue urlstechie/urlchecker-action#110.
A post-deploy suggestion (and one that might actually make sense if you don’t deploy frequently, but a link can still go 404) is to have the check done after a download of the html for the live site. If you can find a tool that would download your rendered pages (or some subset) you could run a static checker in CI….
vsoch commented on issue riverqueue/river#469.
Oh hmm looking at the bottom of https://github.com/riverqueue/river/blob/bf61772d4693ed498f66a7e1697819554bf3f0cb/client.go#L652. Do I just call Start and give it the context that I have (that I showed above)?…
vsoch commented on issue skypilot-org/skypilot#3751.
@romilbhardwaj I opened a PR #3777 with some things to discuss! Let me know the best venue for this - I am thinking a combination of the PR discussion plus (if it’s allowed) a small amount of time for the next batch-wg meeting, e.g., if we want to talk more about the idea to generalize a “Kubernetes Cloud.” I ask now because I’d want to put it on the agenda….
vsoch commented on issue skypilot-org/skypilot#3751.
hey @romilbhardwaj! I’m pretty far along and getting ssh to work. It’s saying there is an invalid host identification string, which seems to be debian specific: …
vsoch commented on issue vsoch/scif#74.
@pgierz try inserting an IPython embed around line 90 there and walking through the logic. I don’t think scif was intended to receive input and that’s the issue you are running into. If you have a suggestion for a PR / change I’d be happy to review it!…
vsoch commented on issue spack/spack#45344.
Thanks @teaguesterling I really appreciate it!…
vsoch commented on issue vsoch/scif#74.
This is a concerning issue first: …
vsoch commented on issue spack/spack#45344.
Did you try installing flux-security on the PR branch? …
vsoch commented on issue kubernetes/kubernetes#125852.
It also seems like there is a hint of the comment above here, …
vsoch commented on issue oras-project/oras-py#149.
@ccronca I can run the CI, but you’ll need to sign the DCO (click the link in the DCO check above)….
vsoch commented on issue kubernetes/kubernetes#125852.
For some more detail, I think by default /dev/shm is a tmpfs type that (when params aren’t set) defaults to half the size of actual RAM, otherwise you can get OOM. I’m reading here: https://docs.kernel.org/filesystems/tmpfs.html and specifically this table: …
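The default described above can be written down as a small sketch. The helper names are made up for illustration, and `mount_size_bytes` assumes a Unix-like system where `os.statvfs` is available; comparing its result for /dev/shm against half of physical RAM is one way to confirm whether an explicit size was set.

```python
import os


def default_tmpfs_size(physical_ram_bytes: int) -> int:
    """Size limit tmpfs uses when no size= option is given: half of RAM."""
    return physical_ram_bytes // 2


def mount_size_bytes(path: str) -> int:
    """Total size in bytes of the filesystem mounted at path (Unix only)."""
    stats = os.statvfs(path)
    return stats.f_frsize * stats.f_blocks


# Example: a node with 16 GiB of RAM gets an 8 GiB default limit
print(default_tmpfs_size(16 * 1024**3) // 1024**3)
```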
vsoch commented on issue skypilot-org/skypilot#3751.
@romilbhardwaj gotcha - so would the assumption be that Flux is already running somewhere, and then skypilot would submit jobs to it? Would sky pilot also be deploying the flux cluster? And if so - what methods are typically done to do that (we have been using a combination of operators, cloud SDKs, and Terraform). For some of those we require custom base image builds (which we have private which would need to be made public)….
vsoch commented on issue flux-framework/flux-sched#1243.
> AFAIK all fluxion commands are prefixed with ion- so that the beginning of the command reads “fluxion” …
vsoch commented on issue spack/spackbot#95.
I would say maybe rename the second command (and then the first to be consistent) to make it clear it’s not an instruction to merge into the main branch, something like: …
vsoch commented on issue flux-framework/flux-python#11.
Apologies didn’t post a follow up - this finished building yesterday. https://pypi.org/project/flux-python/0.63.0/…
vsoch commented on issue flux-framework/flux-python#11.
Coming right up!…
vsoch commented on issue singularityhub/singularity-compose-simple#3.
You very likely want to check the version, defined here: https://github.com/singularityhub/singularity-compose-simple/blob/fe27a04df6134a3240bda8d9495d37f579c8bcaf/app/Singularity#L41 against the install.sh script that uses it here: https://github.com/singularityhub/singularity-compose-simple/blob/master/app/nginx/install.sh and try updating things….
vsoch commented on issue eksctl-io/eksctl#6743.
All set!…
vsoch commented on issue singularityhub/singularity-compose-simple#3.
Singularity compose is just running singularity under the hood, so please try singularity-compose --debug build to get the singularity build command being issued, and we can debug from there….
vsoch commented on issue flux-framework/flux-k8s#71.
> hm I have to say I don’t remember that well how to define jobspecs.. I was much better before lol. …
vsoch commented on issue hpc-social/jobs#22.
Just so you know, there is a Google form that backs this entire thing, so we can delete here for an immediate fix, but the larger fix is deleting from that Google Sheet (which I’ve done just now). Hopefully that will remove the job entirely….
vsoch commented on issue flux-framework/flux-sched#1230.
yep this definitely looks to be the old version: …
vsoch commented on issue apptainer/singularity#847.
I would say that Singularity 2.6.1 is so old it can’t be supported anymore, but I’m not a developer here….
vsoch commented on issue oras-project/oras-py#147.
Thank you! And if there is a specific registry I can access as well, I can definitely help to reproduce (and then see if I can do a fix). …
vsoch commented on issue oras-project/oras-py#147.
This is great - thank you! For each of the cases that don’t work, it’s a bit arduous but what I do is test in an interactive IPython terminal, so when it fails I have a catch in the code that does: …
vsoch commented on issue opencontainers/oci-conformance#106.
But the reference to $value has me concerned it’s some turducken combination of multiple things… :grimacing: …
vsoch commented on issue flux-framework/flux-k8s#71.
@cmisale @milroy I’m working on the second bullet above, and wanted to have discussion about the format that we want. We currently do something like (and please correct me if I’m wrong - I get this confused with jobspec nextgen): …
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
I’m going to unsubscribe from notifications here - good luck @markcmiller86 !…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#51.
In that case, we just need @johanneskoester blessing….
vsoch commented on issue flux-framework/flux-k8s#78.
I’ll try to keep them as small / scoped as possible - that first one was… :exploding_head: …
vsoch commented on issue oras-project/oras-py#144.
@tarilabs let’s pick up discussion here https://github.com/oras-project/oras-py/issues/147…
vsoch commented on issue flux-framework/flux-k8s#78.
No rush to review these @milroy - please take care of the :fire: first. …
vsoch commented on issue c0mm4nd/go-hwloc#1.
@0xffffa are you planning to contribute via PR here so we can use the upstream for our projects? Or are the changes not acceptable for adding here?…
vsoch commented on issue oras-project/oras-py#145.
These changes look good! Is there any reason not to add a test for it here as well (as opposed to just locally)?…
vsoch commented on issue flux-framework/flux-python#11.
Give this a try: https://pypi.org/project/flux-python/0.62.0/…
vsoch commented on issue spack/spack#44775.
This change is saying that if flux is provided as an external, it’s up to the external provider to set the LUA paths, which I think is logical. The issue is that adding it as an external is going to look for spec with lua, and that likely is not present if lua was only a dependent for flux. It would still not work to add it, because it might be an actual mismatch in version. My only critique might be to check for external and just return early, and then not indent the rest of the function. It would also be good to add a comment above that, explaining the logic for a future package.py reader….
vsoch commented on issue oras-project/oras-py#142.
In case you missed the message above, we already have the docker client included and “login” is called from it: …
vsoch commented on issue eksctl-io/eksctl#6743.
Hi @cPu1 could you please give feedback to the CI errors? I’m seeing them show up in other PRs and it looks to be that an incorrect function signature is being used, for example: …
vsoch commented on issue vsoch/pull-request-action#105.
I would sanity check a similar run against a public repository. I’m guessing the 404 error is correct in that the token you have in the GitHub action runner cannot “see” the repository. The thing to try after that would be using a PAT instead….
vsoch commented on issue spack/spack#44682.
Thanks to you both! My testing run worked - I’ll close the issue tonight when our automated builds are successful too….
vsoch commented on issue CognitiveAtlas/cogat-python#14.
I’m not exactly sure why install_aliases was there in the first place, but I suspect it’s not needed. I left Poldracklab in 2016 and don’t maintain this anymore so I defer to @rwblair….
vsoch commented on issue vsoch/pull-request-action#105.
I don’t see a repository under ingka-group-digital/resolutions. Is this regular GitHub (not enterprise), and do you have write access? I suspect there is a permissions / visibility issue somewhere….
vsoch commented on issue oras-project/oras-py#143.
This would be good to add to the repository, perhaps as an example or helper script? …
vsoch commented on issue Parsl/parsl#2713.
Nice!…
vsoch commented on issue spack/spack#44602.
Thanks @alecbcs!…
vsoch commented on issue oras-project/oras-py#139.
This looks good @tarilabs - not sure why the DCO isn’t turning green but that’s the one thing we are waiting on….
vsoch commented on issue hpc-social/hpc-social.github.io#73.
@alansill if you can rebase with the main branch this should pass / preview now….
vsoch commented on issue converged-computing/oras-csi#29.
@nstielau could you please rebase? Apologies for the break / delay - it was a test that seems to work locally but not in the environment here - need to look into it further….
vsoch commented on issue compspec/compspec#18.
This issue is deprecated - the proposals are moving forward but the extent to which we can follow them exactly is not clear to me. Arguably, we are going to be making our own jobspec nextgen format for compatibility, and then use JGF, so I’m not sure any of this applies directly. Going to close for now until I rethink it….
vsoch commented on issue volcano-sh/volcano#3503.
That’s pretty neat! Thanks for answering - we can definitely close the issue (and label as a question if needed)….
vsoch commented on issue converged-computing/flux-usernetes#8.
I’m going to use a reference to the repository for now, but let’s make a Zenodo record if/when you can (and we can do during a next Hackathon if that is easier)!…
vsoch commented on issue urlstechie/urlchecker-python#91.
My quick read is that SARIF is for static analysis tools relevant for security (e.g., code scanning) and I’m not sure a broken link detected falls under that scope. The RFC linked is for this page https://www.rfc-editor.org/rfc/rfc3986 which has nothing to do with an invalid URL, it’s just that it is down. …
vsoch commented on issue lima-vm/lima#2351.
I think that was probably just my mistake to not use sudo, thanks for the tips @afbjorklund….
vsoch commented on issue spack/spack#44205.
@trws @alalazo neither of those worked so I will need to ask for your help again - what should we try next?…
vsoch commented on issue flux-framework/flux-k8s#76.
@cmisale and @milroy - the changes here will tweak the Flux JGF a tiny bit so each has a containment path with (I think) correct indices. There could of course be bugs, so I’d like to look it over together (and myself again). I tried my best to keep the commits scoped, but in practice there is one large one and a few smaller ones. I will hopefully get better at this. …
vsoch commented on issue flux-framework/flux-k8s#4.
I think this was probably answered and is safe to close. @cmisale please reopen if you want to continue discussion!…
vsoch commented on issue rootless-containers/usernetes#322.
So if we use the libfabric API that connects directly to the device (hardware) I’m guessing we are bypassing all of that….
vsoch commented on issue spack/spack#44205.
@alalazo looks like we have the same issue - let me know what you’d like to do….
vsoch commented on issue spack/spack#44205.
Tested locally with the gcc pin - let’s see if that changes clang too. If not, I need some guidance on what to set for that….
vsoch commented on issue pydicom/deid#261.
You could also just check to see that the recipe is not None on the class. …
vsoch commented on issue hpc-social/jobs#20.
@ndusek we ask that you please post jobs directly to our form - https://hpc.social/jobs/about/ …
vsoch commented on issue flux-framework/flux-k8s#69.
To update from my post in slack (which will be lost into the black hole of segfaults): …
vsoch commented on issue oras-project/oras-py#131.
You’ll want to finish up by: …
vsoch commented on issue bpfman/bpfman#1143.
Making progress! It was able to load, and now this appears to be issues with my code (not surprising) :laughing: …
vsoch commented on issue bpfman/bpfman#1143.
Done! …
vsoch commented on issue lima-vm/lima#2351.
Actually this looks to be the same one I started from, albeit many months ago. :cry: I’ll start fresh anyway in case something is different….
vsoch commented on issue spack/spack#44205.
Ping @trws and @grondo - what do you think about pinning a C++ version? Can we do that?…
vsoch commented on issue eksctl-io/eksctl#6752.
hey folks! My PR to add this support was again closed. Can we get this added soon, or do we need to continue to use custom builds?…
vsoch commented on issue flux-framework/rfc#414.
@wihobbs you’ve probably already exceeded most of us in knowledge of jobtap plugins, but in case you are interested we are having a little (hopefully fun) hackathon this week on Thursday - grandmaster @trws is going to show us the ropes for making plugins, and :exclamation: I think there might be rust involved! If you are interested, even for the fun or sharing your expertise, I can send along the invite….
vsoch commented on issue flux-framework/flux-coral2#163.
And a question I have is if your APIException type is going to be considered the same as the one that watch would raise. For example (this is probably a dumb example, but similar in concept): …
vsoch commented on issue flux-framework/flux-k8s#69.
@milroy commit with response to review is here: https://github.com/flux-framework/flux-k8s/commit/cbeffceb04502a22396da984f620e8f9cd9ff99a …
vsoch commented on issue snakemake/snakemake#2500.
That’s OK with me, we can likely close this - it’s …
vsoch commented on issue flux-framework/spack#169.
This was fixed by installing libsodium from GitHub, but now I’m seeing timeouts for xz (sourceforge): …
vsoch commented on issue singularityhub/singularity-compose-examples#8.
All set: https://pypi.org/project/singularity-compose/0.1.19/ thanks!…
vsoch commented on issue singularityhub/singularity-compose-examples#8.
Please see (and test) https://github.com/singularityhub/singularity-compose/pull/72. You should be able to add network -> type with bridge to get the functionality you were describing….
vsoch commented on issue spack/spack#19365.
It didn’t seem to - I did try that variant with the double equals. …
vsoch commented on issue oras-project/oras-py#133.
What we probably need is to separate the auth flow into modules - so you can select a module that has a particular behavior. I’d be open to a PR for that - I won’t have time myself imminently soon….
vsoch commented on issue flux-framework/flux-k8s#69.
> I’m in the process of reviewing this PR, but it is taking quite a while. There’s a lot of churn in the commits, e.g., fluence.go is updated 17 times across commits. …
vsoch commented on issue flux-framework/flux-core#5917.
I really just want to have differently named jobs (that I’m controlling everything for) and say “Mr. Job B, you depend on A.” So just …
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#31.
I don’t understand the question - I think there is already support for downloading a directory? https://github.com/snakemake/snakemake-storage-plugin-gcs/blob/30175aa758c654c438b62f6c6384059ff6a8bc7c/snakemake_storage_plugin_gcs/init.py#L366-L367. Is there a specific example or use case not working?…
vsoch commented on issue singularityhub/singularity-compose-examples#8.
Ah gotcha. So this is just needed when fakeroot is set? This must be newer (or at least wasn’t an option when it was first developed). Do you want to open a PR to make the change?…
vsoch commented on issue singularityhub/shpc-registry#227.
-B is a bind request (the other form of that flag is --bind), and those commands show binding the environment file from the wrapper directory to the shpc environment. That isn’t the actual command, because the actual command has a full path to that. If you do an exec you do need to provide a command. If you use shell you should shell in, and run will hit the container entrypoint. …
vsoch commented on issue oras-project/oras-py#133.
Sounds like we need to keep the basic auth then and this registry does not have support for Bearer? Would that fix the issue?…
vsoch commented on issue oras-project/oras-py#133.
So azure just wants to keep the basic auth, or is it just missing the scope? …
vsoch commented on issue flux-framework/flux-k8s#69.
@milroy I went through and changed most of the short variables (e.g., pgMgr) to be fully written out. I left a few that are convention for operators (reconcilers) like “mgr” (manager) and “req” (request). The one I didn’t change from the original code was “ps” primarily because I don’t know what that means. Let me know if there are others you’d like expanded. Thanks for starting to take a look!…
vsoch commented on issue kubeflow/mpi-operator#610.
> https://github.com/microsoft/DeepSpeedExamples/tree/master/training/HelloDeepSpeed, it do not involve the communication process. The communication setup by pdsh with the hostfile provided by the mpi-operator. …
vsoch commented on issue pydicom/deid#260.
Your best bet (and for your learning) is to look into the code, specifically at the DicomParser that get_identifiers uses, and understand what it is doing (and if you want to change it). The save as will fall back to using pydicom, so any issues there should be asked to that project….
vsoch commented on issue flux-framework/flux-core#5917.
…
vsoch commented on issue LLNL/Kripke#54.
That worked great - thank you!…
vsoch commented on issue kuzudb/kuzu#3406.
That would be great! I don
vsoch commented on issue kubernetes-sigs/kueue#2093.
I think the administrative use case is good, but it seems much smaller than what I hoped for with respect to this tool - a way to manage and understand the running workloads (the user case, for which I think there are many more than administrators). Looking forward (hoping) to see the latter….
vsoch commented on issue kubeflow/training-operator#2091.
Ah - this looks more promising. https://github.com/kubeflow/mpi-operator/pull/567/files…
vsoch commented on issue kubeflow/mpi-operator#610.
It looks like it defaults to CPU, but it’s not clear to me how communication is set up. Is it just using a shared volume at /workspace? If that’s the case, what’s the point of an operator that supports MPI?…
vsoch commented on issue flux-framework/flux-core#5917.
@grondo if we put together a PR to flux-core, could it be considered? I took a look at the design this weekend, and it could be that we add a job-manager/plugins/dependency-name.c but it could also work to add an entry for dependency.name to dependency-after.c …
vsoch commented on issue flux-framework/flux-core#5917.
I can try writing one! And I understand this point: …
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#41.
Thanks @jjjermiah ! You probably want to remove the __pycache__ stuff….
vsoch commented on issue memgraph/memgraph#1975.
Thanks! So if I understand correctly, I’d need to have everything represented in key-value pairs associated with nodes. For example, if a physical node resource is scheduled, it might have scheduled=true, or (because that is too simple for a need to find a time into the future) more likely, a timestamp when the node will next be free that increases as it is scheduled. …
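The timestamp idea above can be pictured with a minimal sketch in plain Python (not memgraph or Cypher). The names nodes, free_at, and schedule are hypothetical and only for illustration: each node carries a property for when it will next be free, and placing a job on the earliest-available node pushes that value forward.

```python
# Hypothetical node properties: "free_at" is the time the node is next free.
nodes = {
    "node-a": {"free_at": 0},
    "node-b": {"free_at": 0},
}


def schedule(nodes, now, duration):
    """Place a job on the node that frees up first and advance its free_at."""
    name = min(nodes, key=lambda n: nodes[n]["free_at"])
    # A job cannot start before the node is free, or before the current time
    start = max(now, nodes[name]["free_at"])
    nodes[name]["free_at"] = start + duration
    return name, start


first = schedule(nodes, now=0, duration=10)
second = schedule(nodes, now=0, duration=5)
third = schedule(nodes, now=0, duration=5)
```

A boolean scheduled=true could only say that a node is busy right now, while the increasing timestamp also answers when it could next take work.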
vsoch commented on issue memgraph/memgraph#1975.
> Hi @vsoch, not sure what exactly is the context of representing the state, are you looking to store it in a property or is there other solution you’re looking for? …
vsoch commented on issue kubernetes/community#7647.
We have a pretty good design going, but won’t have something good to share until the latest PR is merged (with quite a lot of changes). I’m following this issue so I can come back if/when we do. …
vsoch commented on issue urlstechie/urlchecker-action#108.
That
vsoch commented on issue flux-framework/flux-docs#269.
The main devs can comment, but what I think is confusing (that maybe you are getting at) is that the default changes depending on being within or outside of a slot. It seems that outside of a task slot (unless otherwise specified) it is false …
vsoch commented on issue urlstechie/urlchecker-action#108.
Does it reproduce locally?…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
@johanneskoester I’m going to bed, but if you see some hint about the error in the cloud logs that would help me to debug. Goodnight!…
vsoch commented on issue kubernetes-sigs/kueue#487.
I originally couldn’t do it because we didn’t have a proper CLA, but we do now, in case help is still wanted. Let me know!…
vsoch commented on issue urlstechie/urlchecker-action#108.
Is this new / were they OK before? Could it be ephemeral?…
vsoch commented on issue opencontainers/wg-image-compatibility#15.
We had four +1 reviews - is this one good to merge?…
vsoch commented on issue flux-framework/flux-sched#1178.
:partying_face: …
vsoch commented on issue GoogleCloudPlatform/hpc-tools#3.
Thanks for the fix!…
vsoch commented on issue spack/spack#32312.
oh man, protobom! I love protocol buffers so this is :pinched_fingers: …
vsoch commented on issue kubernetes-sigs/scheduler-plugins#722.
Great! Here is the automation for what we are running - I’m building a tool to collect data about scheduler decisions to add to this, but that should minimally reproduce (and you can change the timeout or look at earlier runs (the directory names) to find the initial bug). https://github.com/converged-computing/operator-experiments/tree/main/google/scheduler/run10#coscheduling…
vsoch commented on issue kubernetes-sigs/scheduler-plugins#722.
> 120 seems too much as a general default value IMO. Actually in additional to plugin-level config, it also honors PodGroup-level config, which can be specified in the PodGroup spec, and it takes precedence over the plugin-level one: …
vsoch commented on issue kubernetes-sigs/kueue#2001.
Thanks, they look good! Hopefully the automation will work next time so you don’t have to do manual work….
vsoch commented on issue spack/spack#43331.
This is still failing almost all our builds and updates, almost every night, reliably. :cry: …
vsoch commented on issue kubernetes-sigs/kueue#2001.
hmm the rebase won’t work because it adds my user to all the previous commits. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
Ping @johanneskoester can you review again?…
vsoch commented on issue flux-framework/flux-k8s#74.
Failure due to controller entry point change from 5 days ago. Hopefully won
vsoch commented on issue flux-framework/flux-sched#1178.
Woot!! Ping @trws :green_circle: …
vsoch commented on issue opencontainers/wg-image-compatibility#15.
To be clear for this comment: https://github.com/opencontainers/wg-image-compatibility/pull/15#discussion_r1555142734 …
vsoch commented on issue flux-framework/flux-python#11.
Sure thing, thanks for the notice! The flux versions are moving very quickly these days. Going to close the issue - please re-open or comment if something else comes up….
vsoch commented on issue flux-framework/flux-sched#1169.
There is also something called a key value flyweight, I wonder if we need to use that for some of the subsystem maps? https://www.boost.org/doc/libs/1_79_0/libs/flyweight/example/key_value.cpp. I also don’t know the difference between when they show: …
vsoch commented on issue flux-framework/flux-sched#1169.
Some tiny progress! Thanks to @milroy for seeing this. Here is the first failure to build (this is for the focal build) …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
huh, but if it works for you that’s great! Let’s get @johanneskoester to try it out for another test….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#46.
Also if you are new to rebasing, I made a very dumb video a few years ago, haha. https://youtu.be/9F4RE2_yn6I …
vsoch commented on issue flux-framework/flux-core#5862.
> Additionally, if there are any other tips or workarounds for building and testing Flux within a Singularity container without sudo privileges, I’d like to hear them. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
That
vsoch commented on issue oras-project/oras-py#129.
@my5cents looks like you just need one more run of black and we’re good (take note of the version)….
vsoch commented on issue flux-framework/flux-docs#267.
Thanks @garlick ! I’ll get started on these changes and ping you when they are ready for a second review. I really appreciate it!…
vsoch commented on issue sustainable-computing-io/peaks#9.
Perfect, thank you! Is there a link to that somewhere (prominently) here?…
vsoch commented on issue sustainable-computing-io/peaks#9.
Hi! Is work still underway here? I am interested in the project idea but I don’t see any custom scheduler plugin code (is it somewhere else)? Thanks! …
vsoch commented on issue spack/spack-infrastructure#795.
Thank you!…
vsoch commented on issue singularityhub/singularity-cli#220.
I think it
vsoch commented on issue spack/spack#43331.
Again tonight. …
vsoch commented on issue opencontainers/wg-image-compatibility#13.
> Damn, I just got called a “no one”
vsoch commented on issue go-hep/hep#1010.
hey @sbinet I’m trying to bring life back to the project, and (if nothing but a learning exercise) it’s been really fun so far! I was able to implement the sampler-to-sink example, here and I was wondering if I could ask for help with the request reply? I’m following (what I perceive to be) the logic in the FairMQ example, but my message is never received by the server. If you might be able to take a look and give me some hints, I’m hoping to get this one working, then try the router/dealer pattern, and my ultimate goal is to have something that can send pair to pair messages between nodes (if that is possible). And apologies for my naivete - I’m new to developing with these. Thank you!…
vsoch commented on issue singularityhub/singularity-hpc#672.
There are over 8K containers in the registry, and they are added in an automated fashion, and indeed we don’t check for that. If you’d like to PR to the registry to remove this tag and choose a better one, or just select another one, please feel free….
vsoch commented on issue go-hep/hep#1010.
These are fantastic! It may seem like a tiny thing, but I will definitely try them out. It’s unfortunate there isn’t more work of this type with Go. Understandably most folks like MPI for HPC, but I think Go has a lot of interesting scientific use cases (and especially for distributed). Anyway, I really appreciate your insights….
vsoch commented on issue Parsl/parsl#3259.
Sorry can
vsoch commented on issue chrislusf/gleam#203.
Excellent! I’m looking for an example where there is a main leader that sends pieces of work to workers, and they send back, for example some result value (that might fill in one pixel of an image). Is there a particular example I should look at to get me started?…
vsoch commented on issue vsoch/watchme#72.
hey @samhodge-aiml ! This seems like a cool idea (and simple to implement) but I’m not sure I’ll have time to work on it soon - too many cool things going on <3…
vsoch commented on issue singularityhub/shpc-registry#118.
I do believe the Oras cli in go has that, so if you can find the underlying call (e.g url and params) that should be enough for me to fix here. Thank you!…
vsoch commented on issue kubernetes-sigs/noderesourcetopology-api#1.
He’s leaaaavin’ on a jet plane… :airplane: …
vsoch commented on issue vsoch/pull-request-action#101.
All set and merged / released, version 1.1.1. Thanks!…
vsoch commented on issue opencontainers/wg-image-compatibility#14.
…
vsoch commented on issue vsoch/pull-request-action#101.
Sorry not sure I can help with advice for how to test enterprise. Maybe use a free account?…
vsoch commented on issue snakemake/snakemake-executor-plugin-flux#8.
Let me know if there are substantial changes that would warrant a deeper look….
vsoch commented on issue rootless-containers/usernetes#318.
Wow time flies - thanks again for your help on this @AkihiroSuda ! I thought of it because I’m running this again, just on slightly larger / better infrastructure (network and scale wise). To return to our last correspondence, for those interested in the talk, it’s the Bare Metal Bros and was really fun to do - we are hoping to extend this to a reproducible setup for others to use (actually I’m mostly done with that as of this week, just tidying it up for our own experiments). AWS uses the elastic fiber adapter, and getting that working with EFA and usernetes took some elbow grease! …
vsoch commented on issue prefix-dev/pixi#634.
Absolutely! And I’ll try them out the next time around. Here are the images in GitHub packages: https://github.com/prefix-dev/pixi-docker/pkgs/container/pixi. …
vsoch commented on issue vsoch/pull-request-action#101.
If you try the PR branch I linked it should print the one that is None. Thank you!…
vsoch commented on issue converged-computing/oras-csi#28.
This looks like a bot. If you have a real issue, please report it. Closing….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
Thank you!
vsoch commented on issue rootless-containers/usernetes#322.
Ah, got it working! I will post the full update tomorrow - basically I needed to hack the daemonset a bit, and then add the correct annotations for it to bind to the pod (of the job). Then I could run a sleep job, shell in, install fi_info for libfabric, and see efa and run the tests. …
vsoch commented on issue flux-framework/flux-python#9.
okay these should be good to go - https://pypi.org/project/flux-python/#history. Let me know if there are any other issues or if we are good to close here….
vsoch commented on issue acrlabs/simkube#105.
Thanks!…
vsoch commented on issue NixOS/nixpkgs#198721.
I
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#23.
@jeffhsu3 the easiest thing to do is poetry run black <root> and poetry run flake8 <root> in case you aren’t doing that. I don’t use poetry a lot so I struggle a bit, but that’s what works for me….
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#57.
I would be OK to accept if it fixes an issue, but I think @johanneskoester needs to approve / merge this one….
vsoch commented on issue opencontainers/wg-image-compatibility#13.
If there is still going to be an external artifact, then I can remain indifferent! But if this is just it, this is probably far worse for all the use cases I care about….
vsoch commented on issue acrlabs/simkube#104.
Thanks @drmorr0 ! I got everything running last night, but for some reason the pods never go out of pending: …
vsoch commented on issue spack/spack#42985.
That’s indeed what I was trying to do, yes. :laughing: …
vsoch commented on issue kubernetes/kube-openapi#461.
Perhaps it’s an issue with using it to generate the python files? I see: …
vsoch commented on issue flux-framework/flux-operator#218.
This is done with #219 …
vsoch commented on issue converged-computing/rainbow#15.
This will be closed with #14…
vsoch commented on issue converged-computing/cloud-select#37.
That’s actually OK, this was done. Thanks stale bot :) …
vsoch commented on issue rseng/zenodo-release#14.
- The html url would be another variable for the action …
vsoch commented on issue kubernetes/kops#16066.
Please don’t close - nobody has responded. Thanks!…
vsoch commented on issue flux-framework/flux-restful-api#60.
Here is a different approach to try (what I would do). Since we already have a database setup, get rid of the basic auth and needing to cache things in the browser and use JWT tokens, and then you add that to depends for each view. https://medium.com/@chnarsimha986/fastapi-login-logout-changepassword-4c12e92d41e2. …
vsoch commented on issue flux-framework/flux-coral2#131.
@jameshcorbett I was thinking in the context of a jobspec that is able to request storage - it sounds like if the request is too small we would run into trouble, and maybe not every cluster has a behind the scenes solution to set a minimum in that case. So when we check to see if a jobspec can be satisfied (and is valid) we need to ensure it’s not too small. And maybe if it is, and it’s a global truth (meaning it would be too small for any cluster) we could tweak the jobspec on our own, kind of like a mutating …
vsoch commented on issue flux-framework/flux-coral2#130.
There is too much wabbit storage in this test! :rabbit: :rabbit: :carrot: :carrot: …
vsoch commented on issue converged-computing/rainbow#14.
Small update - more detailed output for checking the attributes, and I added a recursive function call to recurse into the jobspec (which was not properly happening before). Go doesn’t have the ability to do “while (for) there are objects in this array” because the length of the array is pre-determined. So this recursive approach is probably best. …
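The recursive approach described above can be sketched as follows. This is written in Python rather than Go, and it is not the code from the repository: the walk function and the example values are assumptions for illustration, with children nested under a "with" key in the style of a Flux jobspec resources section.

```python
# A small jobspec-like structure; the nesting under "with" follows the
# Flux jobspec convention, but the values here are made up for illustration.
jobspec = {
    "resources": [
        {
            "type": "node",
            "count": 1,
            "with": [
                {
                    "type": "slot",
                    "count": 2,
                    "with": [{"type": "core", "count": 4}],
                }
            ],
        }
    ]
}


def walk(resources, depth=0):
    """Yield (depth, type, count) for every resource, recursing into "with"."""
    for resource in resources:
        yield depth, resource["type"], resource.get("count", 1)
        # The depth of nesting is not known up front, so recurse into children
        yield from walk(resource.get("with", []), depth + 1)


visited = list(walk(jobspec["resources"]))
```

Each level is visited before its children, so attribute checks can be applied to every resource no matter how deeply it is nested.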
vsoch commented on issue c0mm4nd/go-hwloc#2.
I’m generally interested in getting the equivalent output that I might get for lstopo or in the xml file generated - I’m not a C programmer so the “it’s easy” part is probably not applicable here! :laughing: Do you have examples / know someone that has used your library that might help me get started?…
vsoch commented on issue rseng/zenodo-release#11.
> which is probably not right anyhow and I need to convert the tarball to zip although perhaps I will try .tar.gz and see if Zenodo preview that. …
vsoch commented on issue flux-framework/flux-restful-api#60.
Thanks @khoing0810 ! I should have time this weekend to review….
vsoch commented on issue converged-computing/rainbow#12.
This tool appears to be deprecated so I wound up using standard runtime and then manual GC, and it does look like we are cleaning up. I’ll keep this in mind moving forward. …
vsoch commented on issue compspec/jobspec-go#1.
And this should come with associated functions to generate them, and since they are versioned, we would want the Version field to be populated appropriately….
vsoch commented on issue c0mm4nd/go-hwloc#2.
That worked! I’ve updated the PR here. Quick question for you - what is a “hello world” example for using this? I’ve been trying to do a basic init and then “print something out I can see” - my first effort the print segfaulted, and then I followed a pattern in a test and it doesn’t segfault (but I don’t see anything). I’m new to using hwloc outside of the command line tools so apologies for my naivete. Here is what I am testing: …
vsoch commented on issue kubernetes/kube-openapi#461.
I’m probably going to just pin it to the old version - I got a variant working: …
vsoch commented on issue compspec/compspec-go#29.
Closed with #30. I also made the extraction more lenient within a section - if --allow-fail is set this is not just applied to the top level extractor, but to sections within. …
vsoch commented on issue c0mm4nd/go-hwloc#2.
Ah I think I see the issues you were probably running into? Go doesn’t like the static bit for a module: …
vsoch commented on issue urlstechie/urlchecker-python#90.
I think we are good here and figured it out in https://github.com/urlstechie/urlchecker-python/pull/89. Thanks for your help here @SuperKogito !…
vsoch commented on issue kubernetes-sigs/jobset#291.
I haven’t seen it again, so happy to close (and reopen if I do). Thanks!…
vsoch commented on issue singularityhub/singularity-compose#68.
Could you please test build and install of a wheel directly? I can release that alongside if needed. Otherwise, please make a specific suggestion for what you’d like changed. Thanks!…
vsoch commented on issue openjournals/joss-reviews#6374.
I can’t offer my time now, but thank you for thinking of me!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1599.
It could be that hwloc is a better fit for this - I found a library in go but it has a bug so I opened an issue….
vsoch commented on issue hpc-social/good-first-issues#11.
fixed…
vsoch commented on issue hpc-social/good-first-issues#11.
okay just kidding, just threw it together, will close this after it runs. I am doing it once a week because I don’t think I want to review this every day….
vsoch commented on issue flux-framework/fluxion-go#7.
I realize we will need not just an ability to convert JGF for rainbow, but also the grow functionality in flux-sched. I think @zekemorton is working on the guts of that? And I think a part of it might be here? https://github.com/flux-framework/flux-sched/pull/1061 …
vsoch commented on issue converged-computing/jsongraph-go#9.
This is fixed - it was just serializing at the wrong level!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1599.
I’m also not seeing basics about number of physical vs logical cores - that seems obvious like it should be here?…
vsoch commented on issue spack/spack#42605.
> Do you still have any interest in snakemake and spack? Any chance you want to help inform the refactoring of spack ci generate by contributing a real snakemake generator following your original idea? Or if not, how about providing guidance for me or someone else to do it? …
vsoch commented on issue singularityhub/shpc-registry#202.
Have you verified this installs? There are no tokens, etc. needed?…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1581.
Hi folks! To keep this discussion moving (and not block our projects) I made a prototype of a “source only” node-feature-discovery, here https://github.com/converged-computing/nfd-source and you can see the number of dependencies we could nix by way of it here https://github.com/compspec/compspec-go/pull/22/files and that includes not needing to bump up to 1.21. It would be great to continue discussion, and I’d be happy to prototype something here to test out! I’m not familiar with the code base but I pick things up fairly quickly so not worried about that….
vsoch commented on issue spack/spack#42635.
Thank you!
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#19.
@w8jcik this plugin isn’t properly working yet, and I’m not familiar enough with the refactored interface to work on it, so any contribution you might make would be appreciated!…
vsoch commented on issue singularityhub/sregistry#447.
Normally you need the collection to exist first….
vsoch commented on issue flux-framework/flux-sched#1142.
We can have discussion here https://github.com/flux-framework/fluxion-go/discussions/3…
vsoch commented on issue flux-framework/flux-core#5709.
Yes - thank you @grondo ! Closing….
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#18.
Looks like it’s missing a format string! I was never able to understand the logic of these plugins (and get this one to work myself) so we might need to tag team with @johanneskoester. I’ve been using aws s3 instead….
vsoch commented on issue spack/spack#38037.
That’s a great idea!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1581.
Also there is so much here, I seriously love this library <3 …
vsoch commented on issue krator-rs/krator#76.
Thank you for sharing these! I’m sorry it’s abandoned / about to be archived, it’s really cool work. I minimally appreciate it for the learning….
vsoch commented on issue flux-framework/flux-core#5728.
> flux filename …
vsoch commented on issue kubernetes-sigs/jobset#380.
I will ask! To be clear - “using it” meaning for development and prototyping or in production? We do not have a production Kubernetes cluster. That’s what we are working towards….
vsoch commented on issue flux-framework/flux-sched#1133.
And to discuss some planning, I’m hoping to start work on moving the go bindings out of tree after we are merged here, and after that continue work on fluence. I’m going to start from the same git history and be very careful to rename / move so that the original authors / commits are preserved. It would be really great if we could finish this one up today (and then I could work over the weekend) but if that’s not possible it’s ok too….
vsoch commented on issue converged-computing/aws-tofu#1.
Note this is replaced with https://github.com/converged-computing/flux-usernetes…
vsoch commented on issue singularityhub/shpc-registry#195.
Also your interest in this container led me to find a bug with tag parsing - nvidia recently added vex and sbom (as tags) and we weren’t filtering them out! Now we are. So thank you!…
vsoch commented on issue opencontainers/wg-image-compatibility#10.
I am as well, and I’m not hard set on my current implementation. I needed a prototype to prove the concept, and I’m planning to update it with the design that we decide upon (I hope it’s a good one)!…
vsoch commented on issue flux-framework/flux-sched#1120.
…
vsoch commented on issue singularityhub/shpc-registry#195.
If you clone that PR branch and change the registry entry in your settings.yaml to that clone path root it will install from it. It
vsoch commented on issue kubernetes-sigs/jobset#380.
It depends on if you think it’s really free of errors and potential issues, or not. I don’t see any harm in doing v1beta1 and then having that wiggle room, but it’s up to you!…
vsoch commented on issue flux-framework/flux-pmix#97.
Thank you!…
vsoch commented on issue LLNL/maestrowf#434.
> So, bundling these two as they’re closely related. First, just to clarify and make sure we’re talking about the same thing, I was really asking about what the ‘cmd’ block in the maestro spec looks like in this mode. Currently they’re all bash, so question was aimed really at does the step definition in the maestro spec change appreciably for this mode, or is the only real difference being that the step’s cmd/script is just executing in a container vs by an hpc scheduler? …
vsoch commented on issue panoptes-organization/panoptes#142.
I can
vsoch commented on issue flux-framework/flux-python#9.
This issue probably needs to be transferred to flux-core - any changes that are needed can be made there and trickle down here. …
vsoch commented on issue zekemorton/flux-sched#1.
Thank you for working on this!! It’s definitely badly needed and I’m looking forward to using it!…
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
okay, happy not to do any more work then! :laughing: …
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
Hi @bartlettroscoe @markcmiller86 I wanted to follow up here - realizing that probably I’m best oriented to work on this, I put aside the work I wanted to do for this afternoon and tackled this issue. I have a new release of urlchecker-python (0.0.35) and a branch with the action that you can test. Importantly: …
vsoch commented on issue zekemorton/flux-sched#1.
For final touches, you’ll want to convert the original int into the Go types we’ve added, there is an example in the code I can show again here: …
vsoch commented on issue zekemorton/flux-sched#1.
And @zekemorton let me know if you’d like me to rebase this - from our discussion in the other thread I grepped you didn’t plan to merge and integrate into your branch. It would kill a few birds with one stone if you wanted to, but I’m happy to follow up with another PR….
vsoch commented on issue urlstechie/urlchecker-python#90.
@SuperKogito that just means that there is an error within the task. The way to debug is to run the same in serial, likely on a local machine so you can IPython.embed() and test why there are no match results for Urls (there should be) and then figure out how to update the regexes….
vsoch commented on issue urlstechie/urlchecker-action#105.
Also double check you installed ca-certificates in the container, and try using --network=host too. Likely that won’t fix it (I am terrible with Macs and know they are terrible with docker) but just a suggestion!…
vsoch commented on issue flux-framework/flux-python#9.
okay give 0.58.0 a shot. https://pypi.org/project/flux-python/ You can sanity check what flux sees with flux --version and make sure the versions match, and also import flux to see flux.__file__ …
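A minimal sketch of that sanity check in Python, for anyone who wants to script it. The helper name describe_module is hypothetical, and the distribution name flux-python is taken from the PyPI link above; the helper is written generically so it also runs against any other importable module.

```python
import importlib
import importlib.metadata


def describe_module(name, distribution=None):
    """Report where a module is imported from and its installed version, if known.

    `distribution` is the package name on PyPI when it differs from the
    import name (assumed here: import `flux`, distribution `flux-python`).
    """
    module = importlib.import_module(name)
    try:
        version = importlib.metadata.version(distribution or name)
    except importlib.metadata.PackageNotFoundError:
        # Not installed as a distribution under that name
        version = None
    return {"file": getattr(module, "__file__", None), "version": version}


if __name__ == "__main__":
    # Swap in describe_module("flux", distribution="flux-python") on a system with flux
    print(describe_module("json"))
```

Comparing the reported version against the output of flux --version shows whether the bindings and the installed flux match.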
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
Ah gotcha! If you want to make new additions or changes speedier, then I’d recommend changed files: https://github.com/marketplace/actions/changed-files. I use that for container matrices so I only build containers with an updated Dockerfile. If you are concerned about existing links breaking (across many files) a dumb thing I do is to segment a list of things (e.g., paths) into equal lists by matching hashes to calendar month days, then run for the day (a small subset) each night. https://github.com/vsoch/split-list-action. You probably don’t want to be checking everything on every PR, every time!…
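The hash-to-day idea can be sketched roughly like this. This is an illustration under assumptions, not necessarily how split-list-action does it: the function names, the choice of SHA-256, and the 28 buckets are all hypothetical, and hashing gives roughly equal (not exactly equal) lists.

```python
import hashlib
from datetime import date


def bucket_for(item, buckets=28):
    """Map an item to a stable bucket in 1..buckets using a content hash."""
    digest = hashlib.sha256(item.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets + 1


def items_for_day(items, day=None, buckets=28):
    """Return the subset of items assigned to the given day of the month."""
    if day is None:
        day = date.today().day
    # Days past the bucket count (29-31) wrap around to the first buckets
    slot = (day - 1) % buckets + 1
    return [item for item in items if bucket_for(item, buckets) == slot]


if __name__ == "__main__":
    paths = [f"docs/page-{i}.md" for i in range(100)]
    # A nightly job would check only today's subset
    print(items_for_day(paths))
```

Because the hash depends only on the path, each file lands on the same day every month, and every file is checked at least once over the first 28 days.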
vsoch commented on issue urlstechie/urlchecker-action#105.
A pdf file is not a text readable file, so you should not ask the checker to parse it (or add to ignore)….
vsoch commented on issue snakemake/snakemake-executor-plugin-kueue#8.
@alculquicondor if anything jumps out at you here, your insights would be appreciated. I’ve tried a few times now to get this running - I’m installing the latest and pairing with queue, and each time the launcher continually crashes and in about half a second so I don’t have time to see what is going on. I assume I can’t debug the worker if the launcher isn’t working, so not sure how to debug this. I started with the vanilla example in the mpi operator repository and then moved to the example my team was using for canopie-22 (which was working) but no luck either way. …
vsoch commented on issue snakemake/snakemake-executor-plugin-kueue#2.
This is handled with a remote (e.g., AWS). I still haven’t gotten this working on the MPI Operator - if the design means the output is generated elsewhere (and it’s expected to be on the launcher) we might run into an issue - will try testing again….
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#56.
@Feelx234 possibly a dumb question - why not use two wildcards instead of putting two variables into one? Can you better explain / show me the use case - I’m assuming it’s a separator of some type? Or content that would go into a csv file that you want to put in a variable instead?…
vsoch commented on issue flux-framework/flux-sched#1133.
@zekemorton @trws @milroy I put together a quick prototype for what I think might be a good pattern for adding the faux enum, and also testing it: https://github.com/zekemorton/flux-sched/pull/1 …
vsoch commented on issue flux-framework/flux-docs#260.
Gotcha. So I think if we are working on docs here, we can assume many readers will come with some (even minor) knowledge of what fair share is. Either we will let them make assumptions about our derivative and scope, or (what I think might be best) is we call that out, and then very clearly define the two and explain the difference….
vsoch commented on issue LLNL/maestrowf#434.
I’m running out the door for a quick run, but some quick answers (and can follow up with more detail where needed) …
vsoch commented on issue flux-framework/flux-docs#260.
And I’m worried when you say “the algorithm” that’s the only one, oh no. :laughing: …
vsoch commented on issue apptainer/singularity#380.
> Idk why the team is reluctant to add this trivially seeming functionality via a flag. What are your thoughts? …
vsoch commented on issue flux-framework/flux-sched#1134.
That said, the python bindings for flux core are pretty essential, so I think they belong in tree. Also, I realized just now this discussion is in the wrong place - I’m going to link it to the issue I opened https://github.com/flux-framework/flux-core/issues/5709. If we have more discussion let’s pick up there….
vsoch commented on issue flux-framework/flux-core#5709.
Note that discussion started here: https://github.com/flux-framework/flux-sched/issues/1134#issuecomment-1913664132…
vsoch commented on issue eksctl-io/eksctl#6869.
This was the PR for this particular issue - closed. https://github.com/eksctl-io/eksctl/pull/6870…
vsoch commented on issue kubernetes-sigs/jobset#146.
Actually I just realized @ahg-g was literally sitting on the stage for my talk, so he saw (some of the high level) description about the need for easy to deploy common patterns for jobs. Really excited that you are going to work on this, and please include me if you are able….
vsoch commented on issue flux-framework/flux-restful-api#55.
I added a simple variable to control mode #65, and it separates the flux auth mode (which was previously tangled). I’ll update this issue to be about different kinds of auth, since we just have the basic auth that feeds into token with a signed payload. I’m not sure (for the work we are doing) we need much more than that right now. …
vsoch commented on issue flux-framework/flux-operator#214.
This is also done….
vsoch commented on issue apptainer/singularity#380.
And can you give me a concrete example of such an application that requires this knowledge? Just as an FYI, you can still set the present working directory at runtime with --pwd …
vsoch commented on issue kubernetes-sigs/jobset#146.
That’s awesome! That last paragraph is literally the design of the metrics-operator - the idea was to use JobSet like legos, adding on whatever additional volumes / commands/ features are needed. And it worked pretty well, although I probably went a bit too deep in terms of wanting to play with an interface design. Please ping me if there is something fun to collaborate on….
vsoch commented on issue kubernetes-sigs/jobset#381.
> I think it is better to handle concrete use cases than discussing hypothetical or abstract scenarios, so lets list the user stories that you would like to be taken into account. …
vsoch commented on issue vsoch/scif#67.
Closed with #68 …
vsoch commented on issue kubernetes-sigs/kueue#487.
Hey is there a status update here? I made my first plugin last weekend and it was fairly simple so I could jump in to help if the current assignee is low on bandwidth….
vsoch commented on issue vsoch/scif#67.
I maintain hundreds of projects, so I will work on them on an as-needs basis. I’m glad you are using SCIF!…
vsoch commented on issue supercontainers/compspec#2.
I like the idea of having them be a graph that is flattened out - and I do think what we have here will ultimately change! Let’s keep it dumb and simple for now, share these ideas with the working group when it comes up, and then we will do another sweep / update for what the group eventually decides. …
vsoch commented on issue kubernetes-sigs/jobset#133.
Regardless of the issue I’ll add this to my master TODO as a reminder for me….
vsoch commented on issue kubernetes/sample-cli-plugin#6.
Working! I was being an idiot and using the wrong server address. :upside_down_face: …
vsoch commented on issue singularityhub/singularity-cli#218.
yes….
vsoch commented on issue flux-framework/flux-core#5691.
Thank you!…
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#55.
Agree, I would instead do: …
vsoch commented on issue singularityhub/install-singularity#3.
Thank you!…
vsoch commented on issue opencontainers/wg-image-compatibility#8.
> I think we have adress two different topics with the different specs. One topics is what we want to read and the other how we read it. …
vsoch commented on issue kubernetes/community#7684.
I’m so excited to see this move through - thank you for your support @alculquicondor ! :raised_hands: …
vsoch commented on issue expfactory/expfactory#177.
It would be possible as long as you have a hook to save data (sending to the server) and then proceed to a next task. I’m not familiar with psychojs but can imagine it working similarly to the most common framework we use here, jsPsych. This is really the most important part: https://github.com/expfactory-experiments/nback-10min-animals/blob/e2b88886097bb02eb1e70a1a5e42e287e9716b18/index.html#L48-L67…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
Also with the fix for the gs, the jobs are (finally) green! I won’t show you how many red / failed there are, let alone that it takes 7 minutes per one step run for a hello world… :grimacing: …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
ah! Just found it in an environment locally - will test removing here and seeing if that somehow (magically) removes it from the remote. That doesn’t make sense, but if snakemake is getting the plugins from my local call, it actually would! Will report back….
vsoch commented on issue apache/airflow#22253.
I understood that. The high level pattern is adding developer churn to the process of review, regardless of the specific details….
vsoch commented on issue apache/airflow#22253.
Going to agree with you @bolkedebruin. I had similar issues about 5 years ago, and while I think the intention is best, when it results in a PR being open on the order of years and not a more reasonable few months, it gives me pause. I can’t comment on the details of the community here because it’s been too long since I contributed, but it seems to me that something is off when the PR time is on the order of years. My heart goes out to you @potiuk, I hope you find the right balance, and seek feedback if/when you determine that things might not be working….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
I used the exact logic from the previous snakemake life sciences plugin, so if that is the case, it wasn’t updated to support this design. Can you point me to a plugin that is using preemptible correctly? …
vsoch commented on issue singularityhub/sregistry#444.
I don’t know off the top of my head, but generally speaking if someone has an issue they open it….
vsoch commented on issue spack/spack#42001.
Agree that does appear to be the issue - and it’s working again. https://github.com/flux-framework/spack/actions. …
vsoch commented on issue snakemake/snakemake#2598.
It looks like #2599 is closed, so I’m going to close the issue here. Let us know if there is anything else we can help with @musicinmybrain!…
vsoch commented on issue snakemake/snakemake#2598.
Hi @johanneskoester @musicinmybrain! …
vsoch commented on issue rootless-containers/usernetes#319.
@AkihiroSuda could it be mtu related if we are also using the network with mtu 1500 for the comparison case? For example, see the middle one here: …
vsoch commented on issue prefix-dev/pixi#634.
The issue is that you need the lock file in the outside of the container to copy to it to ensure the same thing builds….
vsoch commented on issue kubernetes/enhancements#3371.
hey @denkensk is there any reason you stopped working on this? We would find this highly useful for one of our needs….
vsoch commented on issue kubernetes/community#7659.
> We usually sync with the open source community at night, so it’s usually hard to get up early in the morning. And 9-10 am we usually on our way to the office in China.
vsoch commented on issue eksctl-io/eksctl#6743.
Please re-open again, thank you!…
vsoch commented on issue vsoch/pe-predictive#1.
You could try pulling the old image and then doing a pip freeze to see them. I suspect even with the pinned versions it might not work, but I’m happy if you want to share them and we could try a build. I think better would be to update the code with newer versions of things (and then pin those)….
vsoch commented on issue opencontainers/wg-image-compatibility#3.
@sudo-bmitch do we have a template for this so I can assess the different pieces in that context?…
vsoch commented on issue converged-computing/kubescaler#18.
This is underway, and likely mostly done - will finish up this afternoon after a run. https://github.com/converged-computing/metrics-operator-experiments/tree/main/google/spot-instances/run0/test…
vsoch commented on issue snakemake/snakemake#2174.
This is replaced by the kueue plugin, which now has support for using the flux operator (and MiniClusters) https://github.com/snakemake/snakemake-executor-plugin-kueue…
vsoch commented on issue flux-framework/spack#151.
This will be closed by https://github.com/spack/spack/pull/41974…
vsoch commented on issue kubernetes/community#7659.
Again: …
vsoch commented on issue flux-framework/flux-coral2#120.
The wabbits are down, you say? Did you try giving them carrwots? :carrot: :carrot: …
vsoch commented on issue oras-project/oras#1224.
Here is the full error I was getting (sorry didn’t realize it wasn’t here, maybe I chose the wrong issue)!: …
vsoch commented on issue oras-project/oras#1224.
I just hit this issue for a push, and downgrading worked for me! Specifically: …
vsoch commented on issue kubernetes/community#7659.
@alculquicondor and others, do you want another doodle / way to vote? What would be best to help choose this?…
vsoch commented on issue kubeflow/mpi-operator#611.
You could also just use a different MPI flavor that will use the DNS names and call it a day :)…
vsoch commented on issue flux-framework/flux-k8s#47.
> Got it, I missed that detail. Is there any advantage to dr := &pb.CancelResponse{JobID: in.JobID, Error: 0} instead of dr := &pb.CancelResponse{JobID: in.JobID}? …
vsoch commented on issue kubeflow/mpi-operator#611.
This would be handled well by the flux operator, which uses zeromq to bootstrap and if a pod (follower broker) goes down flux would see the node as down, and we could schedule to another node (possibly newly added, which would join the cluster and then be seen as going from down to up). Nodes going up and down happens all the time in HPC so our workload managers are used to handling that, and for the flux operator you essentially get your own scheduler within the indexed job. We do, however, use a headless service and not the host file. …
vsoch commented on issue kubeflow/mpi-operator#611.
> we can always add a simple sidecar or something that adds the hostname IP entries to the /etc/hosts file in the Pods …
vsoch commented on issue oras-project/oras-go#644.
@Wwwsylvia it’s probably good to open up to the community - I could have made time over break but now that we are back to work, I would be able to make time akin to others that might help….
vsoch commented on issue oras-project/oras-go#644.
hey @FeynmanZhou ! I remember looking at the code when I opened the issue, and I think I could take a first shot but I need some guidance about the logic. I’ve had a hard time following it since the refactor from push/pull to the more abstract design now. If someone can provide that guidance I’m happy to take a shot, otherwise feel free to assign to someone familiar with the codebase. …
vsoch commented on issue spack/spack#38037.
I didn’t ever figure out specific details - with spack things mysteriously break and then resolve later and that’s what happened for this case. The only suggestion I can make is to look at libarchive (e.g., commit from May that mentions iconv, maybe that’s when it showed up https://github.com/spack/spack/commits/develop/var/spack/repos/builtin/packages/libarchive) and then check your package.py for flux-core - do you have at least this one? https://github.com/spack/spack/commit/10999c02836fa6a510871d24ad6548d12d2b72ae….
vsoch commented on issue rootless-containers/usernetes#318.
@AkihiroSuda my colleague had an insight that gave us (at least a solution for now) that allows us to ping the hostname running the pod directly! The missing piece was defining the hostPort, here is the diff for the relevant section. …
vsoch commented on issue rootless-containers/usernetes#318.
oh i see the issue - that parameter is for when it starts so I snuffed out the lower value! Going to try again and dangerously set it to 0 (don’t worry this cluster is extremely isolated)….
vsoch commented on issue VClinic/VClinic#1.
Ack sorry - just copy pasted the one shown in the terminal! Here you go: …
vsoch commented on issue converged-computing/flex-aws-topology#1.
This is fixed!…
vsoch commented on issue NixOS/nixpkgs#198721.
Okay, rebased and squashed, and addressed the review comments that I knew how to! I opened this 14 months ago, and since then don’t have my same environment (so I don’t have a way to test) but I don’t want to just give up on the contribution. I had really big hopes for using Nix but it feels like it’s (overall) too hard….
vsoch commented on issue flux-framework/flux-coral2#117.
:raised_hands: :raised_hands: :raised_hands: …
vsoch commented on issue eksctl-io/eksctl#6869.
Please do not close the issue….
vsoch commented on issue VClinic/VClinic#1.
Here you go… …
vsoch commented on issue VClinic/VClinic#1.
I think likely we don’t want to rely on Python 2….
vsoch commented on issue NixOS/nixpkgs#198721.
@luzpaz contributing a package was so hard I think I largely gave up….
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#7.
@johanneskoester if you have bandwidth, I am done with the intel MPI example but blocked here: https://github.com/snakemake/snakemake-executor-plugin-googlebatch/issues/18. I think there might be a bug with this example that I got from snakemake (the output is generated but empty)….
vsoch commented on issue singularityhub/singularity-compose#67.
Here is the updated link - feel free to open a PR to fix! https://github.com/singularityhub/singularity-compose-examples/tree/master/v1.0/rstudio-simple …
vsoch commented on issue jsongraph/json-graph-specification#57.
In case anyone is interested, I’m working on them here: https://github.com/converged-computing/jsongraph-go. We have a few projects in mind for these (so updates likely) but for the time being I’m keeping it very simple to just define structs for the different types of graphs supported in v2. Thanks!…
vsoch commented on issue converged-computing/flux-views#8.
Thanks @rajibhossen ! I’ll test these out today and if they look good, will push the images and then merge here….
vsoch commented on issue rse-ops/flux-spack-docker#1.
opened this but not going to pursue until someone actually needs. Specifically the base container is old (and going to go away) and I’d rather build from ubuntu fresh, but we already have builds with flux for that. https://github.com/rse-ops/spack-flux-container…
vsoch commented on issue lima-vm/lima#368.
Even better! …
vsoch commented on issue kubernetes-sigs/noderesourcetopology-api#1.
hey folks! My team is interested in this work (and I could help if needed). Is there a status update / what are next steps?…
vsoch commented on issue kubernetes/enhancements#3545.
hey folks! My team is interested in this KEP - it looks like all the alpha/beta check boxes are purple (merged) and the release contender was 1.28? Was this released with 1.28 and I missed it https://kubernetes.io/blog/2023/08/15/kubernetes-v1-28-release/ or is it still TBA (and possibly not documented) because the issue here is still open? Thanks! And apologies for my naivete about this process….
vsoch commented on issue flux-framework/rfc#406.
oh neat! Is this what I was doing with flux exec (and wasn’t aware what it was called)?…
vsoch commented on issue flux-framework/flux-core#5628.
Thank you mergify bot. You work so hard. :cookie: …
vsoch commented on issue archspec/archspec-go#13.
Thank you @alalazo! I should be good for early development, but would be great to have a suggested solution in place after that. …
vsoch commented on issue rse-ops/docker-images#112.
Thank you! …
vsoch commented on issue rootless-containers/usernetes#308.
Yes! I have two variants (that are a bit simpler) that I’m using too. Thanks to you both for the help!…
vsoch commented on issue lima-vm/lima#368.
Gotcha! And super cool - thanks for giving me this nugget of info. I know about file but didn’t consider trying it here (I was going to try opening it in vim to see if there was an identifiable header, like with ELF, haha). …
vsoch commented on issue hpc-social/jobs#18.
Will do! We used to accept PRs, but they always seemed to lead to bugs. :bug: :lady_beetle: …
vsoch commented on issue flux-framework/flux-sched#1120.
Here is what we needed in fluence: https://github.com/flux-framework/flux-k8s/blob/105562e4662e86a1b8ce1e19762678a6c0dd6309/src/fluence/fluxion/fluxion.go …
vsoch commented on issue flux-framework/flux-core#920.
Is there still something we need to do here? The repository here https://github.com/flux-framework/spack/ has been working fairly well to regularly test builds and update packages in upstream spack. If we are good with that, we can close the issue here….
vsoch commented on issue brainhackorg/brainhack_cloud#61.
Thanks for the help on this last year, it was fun to try out!…
vsoch commented on issue lima-vm/lima#368.
Silly question - I saved a snapshot, e.g.,: …
vsoch commented on issue flux-framework/flux-sched#1120.
Update: library is now officially ice-cream-ized: …
vsoch commented on issue flux-framework/flux-sched#1120.
@milroy this is fantastic! :raised_hands: For a quick example of what this enables, I was (very easily) able to build a go module out of tree: …
vsoch commented on issue spack/spack#41708.
We’re good! https://github.com/converged-computing/flux-views/actions/runs/7225964629/job/19690533110 they won’t finish in the 6 hour limit (I’ll need to bring up arm instances on AWS to do these custom builds) but spack is working well. Thanks!…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
I think I see a possible error - the executable was not made executable it seems: …
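If the missing executable bit is indeed the cause, one possible fix is to set it before the step runs. A minimal sketch, assuming a POSIX filesystem; the helper name make_executable is hypothetical and is not part of snakemake or the googlebatch plugin.

```python
import os


def make_executable(path):
    """Add execute permission wherever read permission is already granted."""
    mode = os.stat(path).st_mode
    # Copy each read bit (r--) onto the matching execute bit (--x),
    # so 0o644 becomes 0o755 and 0o600 becomes 0o700
    mode |= (mode & 0o444) >> 2
    os.chmod(path, mode)
```

This mirrors chmod +x while respecting the existing read permissions, so a file that is private to its owner does not become executable by everyone.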
vsoch commented on issue kubernetes-sigs/jobset#353.
okay I figured this out. …
vsoch commented on issue hpc-social/jobs#18.
hey @msleigh ! If you’d like to add a job, please use the form here: https://hpc.social/jobs/about/ …
vsoch commented on issue prefix-dev/pixi#557.
I can try this next time! …
vsoch commented on issue flux-framework/flux-sched#1094.
@grondo I think I could give it a shot (it could just be an envar or similar I think?) but mostly wanted to get permission if there is some licensing issue or other thing I don’t know about. But I’m going back to sleep for a bit anyway so… happy to wait for the wisdom of @trws ! :raised_hands: …
vsoch commented on issue flux-framework/flux-sched#1116.
Thanks @grondo! …
vsoch commented on issue urlstechie/urlchecker-action#104.
@SuperKogito my first suggestion to @kubu4 is to try processing in batches (e.g., multiple runs on different roots, and that can be put into an action matrix). If that doesn’t work, then I think we should add some kind of support to handle that internally….
vsoch commented on issue ECP-copa/ExaMPM#56.
Thanks for the suggestion! The lammps metrics container uses mpich and the one here is openmpi, so I don’t think we could do that. I did try mpich too (with the same command) and got a non-working result….
vsoch commented on issue urlstechie/urlchecker-action#104.
You could also just target runs on separate subdirectories (one at a time or in a matrix), depending on how large your repository is….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
That worked! We made it through the compile step, and I see pi_MPI in storage. Now it looks like the mpiexec (same workflow above) is failing but I don’t see any error why: …
vsoch commented on issue pydicom/deid#259.
hey @peter-kuzmak ! I have limited bandwidth to help on this, but if you can share a dummy dicom file and the dicom.deid and the exact way to reproduce your use case, I can likely make some time over a weekend. Thanks!…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
> Can you please post the full log? I think it is related to the upload logic for local jobs. There might be a bug. …
vsoch commented on issue singularityhub/singularity-cli#215.
I would be OK with matching the convention that SingularityCE uses - if they allow dashes (and have a regex to check) we can do that too. I’m open to reviewing a PR that makes this change if you’d like it!…
vsoch commented on issue rse-ops/docker-images#111.
I’m going to pass maintainership over to @davidbeckingsale and his RSE team - I am not as much in touch with the needs of the code teams, and not officially on RADIUSS time anymore. So please do what you think is best here. Thanks!…
vsoch commented on issue aws-samples/efa-device-plugin-helm#1.
Yes of course! All set. I also ensured they were in alphabetical order….
vsoch commented on issue flux-framework/flux-sched#1112.
Yep that fixed it. TLDR: update your docker! Thanks @grondo, closing….
vsoch commented on issue vsoch/spack-package-action#10.
That
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#34.
I didn’t wind up adding it (and properly testing it) but will keep this on my radar! …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#16.
Should we put this into a discussion / as a question that can be found later?…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
@johanneskoester I’ll leave this question for the next time you are around (I pinged you to merge the finished work for the googlebatch executor for the above). I’m moving to the MPI workload and adjusting my paths to use the same s3 approach. This has an intermediate step, and it’s telling me that it cannot find an output (that should be generated from the compile) so I’m not sure why it’s doing this check: …
vsoch commented on issue singularityhub/singularity-cli#213.
No worries! So execute is supposed to mimic singularity exec. And run_command was previously a more hidden helper command to (one off) run a random command, but not inside the container (on the host system). I do think it would be more intuitive if this current run_command was an exec to the instance, and then the current was moved back to be a helper (hidden) function. I am happy to test this out and open another PR if you agree….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
When we can confirm the output (and maybe the best way for me to check) I’ll continue (finally!) with the more advanced use cases (e.g., MPI) now that this is working. This is great, and I’m so glad it’s finally running again! :partying_face: …
vsoch commented on issue singularityhub/singularity-cli#213.
Let’s start with the first issue posted above, which is fixed here https://github.com/singularityhub/singularity-cli/pull/214 (pending your testing and approval) then (from that branch) let’s update the thread here (or open a new issue) with a specific way to reproduce your second. Thanks!…
vsoch commented on issue flux-framework/flux-k8s#36.
Gotcha, thanks for the update! …
vsoch commented on issue converged-computing/metrics-operator-experiments#4.
Second update: I tested this more today and think it’s an improvement on the current. The spot API limits the unique requests I can make (e.g., changing zones) so I’m going to wait another day (or possibly two) to get another quota to do a full comparison between the two size groups. That should give me a ballpark estimate if (at least according to this algorithm) there is opportunity to continue with spot experiments. Either way, I’ll add the resulting data to this PR and then merge….
vsoch commented on issue singularityhub/sregistry#443.
Did you test on localhost first (not on EC2)?…
vsoch commented on issue rse-ops/docker-images#111.
I’d like to hear @davidbeckingsale’s thoughts. I’m not convinced that going from 3.29GB down to 2.41GB is a long-term solution - I’d suspect the problem will recur and it needs more thought. I’m also wondering if Azure Pipelines is the best one to use, since we don’t run into these issues in other places. …
vsoch commented on issue aws/aws-sdk-js-v3#4495.
I think this might be an underlying waiters problem (as opposed to a particular SDK). For the Python SDK, it will work for a while, and then entirely stop working, to the point that the waiters just wait forever (and given the nodegroup_active waiter) I can easily use the kubeconfig.yaml and kubectl to see the nodegroup has been ready for 10-20 minutes, but the waiter is… still waiting….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
Gotcha! Updated message: …
vsoch commented on issue singularityhub/sregistry#443.
Also, any reason to not just use a registry that supports pushing SIF (e.g., GitHub Packages)? It’s expensive to run your own infra (I would know, having run Singularity Hub for 5 years)!…
vsoch commented on issue flux-framework/flux-core#5475.
Nice! I’ll add that to my containers….
vsoch commented on issue vsoch/pull-request-action#98.
You can also fairly easily do it separately….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
Bugs are gone (so far)! :partying_face: …
vsoch commented on issue singularityhub/singularity-compose#66.
I would always provide a src:dest …
vsoch commented on issue eksctl-io/eksctl#6743.
Thank you!…
vsoch commented on issue singularityhub/sregistry#442.
I think those are two different entry points to auth - one is logging you into a web UI via a callback with a code (OIDC) https://www.pingidentity.com/en/resources/identity-fundamentals/authentication-authorization-standards/openid-connect.html and the command line is still hitting the Django auth system, but not going through that handshake. …
vsoch commented on issue eksctl-io/eksctl#6743.
Why was this closed?…
vsoch commented on issue aws/containers-roadmap#2225.
@tzneal can you show me an example of a launch template that updates the instances to use one thread per core (and I’m familiar with the snippets to do that)? I can figure out the logic to determine if the instance needs it, and I see that I can use it here: https://boto3.amazonaws.com/v1/documentation/api/1.26.85/reference/services/eks/client/create_nodegroup.html. I should be able to test it soon and give you feedback….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
I can get around that by setting a default here https://github.com/snakemake/snakemake-interface-executor-plugins/blob/d24c886cfd47ce21b2954f08d5383addffb4c900/snakemake_interface_executor_plugins/settings.py#L53C1-L53C1 and now I see: …
vsoch commented on issue converged-computing/metrics-operator#83.
lol, I’m such a froot loop - this is already supported! Thank you me of the past that anticipated needing this :)…
vsoch commented on issue 3dem/relion#1040.
That’s perfect - thank you! Will test tomorrow….
vsoch commented on issue singularityhub/singularity-compose#65.
Here is my suggested set of steps that I’d take: …
vsoch commented on issue ECP-copa/ExaMPM#56.
This is probably my stopping point for working on it then - I’m not sure what the problem above is (and I’m still inexperienced with MPI). For context I was going to add it to the metrics operator https://github.com/converged-computing/metrics-operator and use it for converged computing experiments on Kubernetes, but I’ll skip over it and move on to the next. Thanks!…
vsoch commented on issue vsoch/forward#45.
@akkornel could you confirm this? I’ve never seen something like it before. Thank you!…
vsoch commented on issue snakemake/snakemake#2492.
This was from October - I just updated my snakemake-interface-common and the pulled changes seem to resolve the issue! So good to close (for me)….
vsoch commented on issue flux-framework/flux-coral2#111.
> Try just removing the yum install -y python3-pip. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#14.
And my thinking so far from this discussion is that I’m not comfortable adding anything to the executor that requires enabling an extra service, and one that can accrue costs (albeit slowly) over time. It’s not transparent enough, and it adds additional complexity for the user to set up (and need to request an increase in quota or other PubSub setup). I would rather direct the user to the console and then provide a helper script to get a particular log….
vsoch commented on issue oras-project/oras-py#119.
Yes absolutely - I think that’s the direction the upstream client will go to. Would you care to do a pull request? We can add the argument, and ensure that: …
vsoch commented on issue openjournals/joss-reviews#5888.
> Hi @vsoch do you have time to review this paper? …
vsoch commented on issue lima-vm/lima#2031.
I thought it might be related to remote machines, so I chose it. …
vsoch commented on issue ECP-copa/ExaMPM#56.
Thank you! A quick follow-up question (I’m not great at debugging MPI). I can confirm that I can ping the other host and can ssh into it from my launcher, but I’m getting an error. Here are details: …
vsoch commented on issue ECP-copa/CabanaPIC#3.
OK cool, thanks for the advice, I will take a look!…
vsoch commented on issue rootless-containers/usernetes#311.
> Please try increasing 65536 there to a larger number …