vsoch commented on issue kubeflow/mpi-operator#610.
> https://github.com/microsoft/DeepSpeedExamples/tree/master/training/HelloDeepSpeed, it does not involve the communication process. The communication is set up by pdsh with the hostfile provided by the mpi-operator. …
vsoch commented on issue pydicom/deid#260.
Your best bet (and for your learning) is to look into the code, specifically at the DicomParser that get_identifiers uses, and understand what it is doing (and if you want to change it). The save as will fall back to using pydicom, so any issues there should be asked to that project….
vsoch commented on issue flux-framework/flux-core#5917.
…
vsoch commented on issue LLNL/Kripke#54.
That worked great - thank you!…
vsoch commented on issue kuzudb/kuzu#3406.
That would be great! I don
vsoch commented on issue kubernetes-sigs/kueue#2093.
I think the administrative use case is good, but it seems much smaller than what I hoped for with respect to this tool - a way to manage and understand the running workloads (the user case, for which I think there are many more than administrators). Looking forward (hoping) to see the latter….
vsoch commented on issue kubeflow/training-operator#2091.
Ah - this looks more promising. https://github.com/kubeflow/mpi-operator/pull/567/files…
vsoch commented on issue kubeflow/mpi-operator#610.
It looks like it defaults to CPU, but it’s not clear to me how communication is set up. Is it just using a shared volume at /workspace? If that’s the case, what’s the point of an operator that supports MPI?…
vsoch commented on issue flux-framework/flux-core#5917.
@grondo if we put together a PR to flux-core, could it be considered? I took a look at the design this weekend, and it could be that we add a job-manager/plugins/dependency-name.c but also could work to add an entry for dependency.name to dependency-after.c …
vsoch commented on issue flux-framework/flux-core#5917.
I can try writing one! And I understand this point: …
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#41.
Thanks @jjjermiah ! You probably want to remove the __pycache__ stuff….
vsoch commented on issue memgraph/memgraph#1975.
Thanks! So if I understand correctly, I’d need to have everything represented in key-value pairs associated with nodes. For example, if a physical node resource is scheduled, it might have scheduled=true, or (because that is too simple for a need to find a time into the future) more likely, a timestamp when the node will next be free that increases as it is scheduled. …
vsoch commented on issue memgraph/memgraph#1975.
> Hi @vsoch, not sure what exactly is the context of representing the state, are you looking to store it in a property or is there other solution you’re looking for? …
vsoch commented on issue kubernetes/community#7647.
We have a pretty good design going, but won’t have something good to share until the latest PR is merged (with quite a lot of changes). I’m following this issue so I can come back if/when we do. …
vsoch commented on issue urlstechie/urlchecker-action#108.
That
vsoch commented on issue flux-framework/flux-docs#269.
The main devs can comment, but what I think is confusing (that maybe you are getting at) is that the default changes depending on being within or outside of a slot. It seems that outside of a task slot (unless otherwise specified) it is false …
vsoch commented on issue urlstechie/urlchecker-action#108.
Does it reproduce locally?…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
@johanneskoester I’m going to bed, but if you see some hint about the error in the cloud logs that would help me to debug. Goodnight!…
vsoch commented on issue kubernetes-sigs/kueue#487.
I originally couldn’t do it because we didn’t have a proper CLA, but we do now, in case help is still wanted. Let me know!…
vsoch commented on issue urlstechie/urlchecker-action#108.
Is this new / were they OK before? Could it be ephemeral?…
vsoch commented on issue opencontainers/wg-image-compatibility#15.
We had four +1 reviews - is this one good to merge?…
vsoch commented on issue flux-framework/flux-sched#1178.
:partying_face: …
vsoch commented on issue GoogleCloudPlatform/hpc-tools#3.
Thanks for the fix!…
vsoch commented on issue spack/spack#32312.
oh man, protobom! I love protocol buffers so this is :pinched_fingers: …
vsoch commented on issue kubernetes-sigs/scheduler-plugins#722.
Great! Here is the automation for what we are running - I’m building a tool to collect data about scheduler decisions to add to this, but that should minimally reproduce (and you can change the timeout or look at earlier runs (the directory names) to find the initial bug). https://github.com/converged-computing/operator-experiments/tree/main/google/scheduler/run10#coscheduling…
vsoch commented on issue kubernetes-sigs/scheduler-plugins#722.
> 120 seems too much as a general default value IMO. Actually in addition to plugin-level config, it also honors PodGroup-level config, which can be specified in the PodGroup spec, and it takes precedence over the plugin-level one: …
vsoch commented on issue kubernetes-sigs/kueue#2001.
Thanks, they look good! Hopefully the automation will work next time so you don’t have to do manual work….
vsoch commented on issue spack/spack#43331.
This is still failing almost all our builds and updates, almost every night, reliably. :cry: …
vsoch commented on issue kubernetes-sigs/kueue#2001.
hmm the rebase won’t work because it adds my user to all the previous commits. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
Ping @johanneskoester can you review again?…
vsoch commented on issue flux-framework/flux-k8s#74.
Failure due to controller entry point change from 5 days ago. Hopefully won
vsoch commented on issue flux-framework/flux-sched#1178.
Woot!! Ping @trws :green_circle: …
vsoch commented on issue opencontainers/wg-image-compatibility#15.
To be clear for this comment: https://github.com/opencontainers/wg-image-compatibility/pull/15#discussion_r1555142734 …
vsoch commented on issue flux-framework/flux-python#11.
Sure thing, thanks for the notice! The flux versions are moving very quickly these days. Going to close the issue - please re-open or comment if something else comes up….
vsoch commented on issue flux-framework/flux-sched#1169.
There is also something called a key value flyweight, I wonder if we need to use that for some of the subsystem maps? https://www.boost.org/doc/libs/1_79_0/libs/flyweight/example/key_value.cpp. I also don’t know the difference between when they show: …
vsoch commented on issue flux-framework/flux-sched#1169.
Some tiny progress! Thanks to @milroy for seeing this. Here is the first failure to build (this is for the focal build) …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
huh, but if it works for you that’s great! Let’s get @johanneskoester to try it out for another test….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#46.
Also if you are new to rebasing, I made a very dumb video a few years ago, haha. https://youtu.be/9F4RE2_yn6I …
vsoch commented on issue flux-framework/flux-core#5862.
> Additionally, if there are any other tips or workarounds for building and testing Flux within a Singularity container without sudo privileges, I’d like to hear them. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
That
vsoch commented on issue oras-project/oras-py#129.
@my5cents looks like you just need one more run of black and we’re good (take note of the version)….
vsoch commented on issue flux-framework/flux-docs#267.
Thanks @garlick ! I’ll get started on these changes and ping you when they are ready for a second review. I really appreciate it!…
vsoch commented on issue sustainable-computing-io/peaks#9.
Perfect, thank you! Is there a link to that somewhere (prominently) here?…
vsoch commented on issue sustainable-computing-io/peaks#9.
Hi! Is work still underway here? I am interested in the project idea but I don’t see any custom scheduler plugin code (is it somewhere else)? Thanks! …
vsoch commented on issue spack/spack-infrastructure#795.
Thank you!…
vsoch commented on issue singularityhub/singularity-cli#220.
I think it
vsoch commented on issue spack/spack#43331.
Again tonight. …
vsoch commented on issue opencontainers/wg-image-compatibility#13.
> Damn, I just got called a “no one”
vsoch commented on issue go-hep/hep#1010.
hey @sbinet I’m trying to bring life back to the project, and (if nothing but a learning exercise) it’s been really fun so far! I was able to implement the sampler-to-sink example, here and I was wondering if I could ask for help with the request reply? I’m following (what I perceive to be) the logic in the FairMQ example, but my message is never received by the server. If you might be able to take a look and give me some hints, I’m hoping to get this one working, then try the router/dealer pattern, and my ultimate goal is to have something that can send pair to pair messages between nodes (if that is possible). And apologies for my naivete - I’m new to developing with these. Thank you!…
vsoch commented on issue singularityhub/singularity-hpc#672.
There are over 8K containers in the registry, and they are added in an automated fashion, and indeed we don’t check for that. If you’d like to PR to the registry to remove this tag and choose a better one, or just select another one, please feel free….
vsoch commented on issue go-hep/hep#1010.
These are fantastic! It may seem like a tiny thing, but I will definitely try them out. It’s unfortunate there isn’t more work of this type with Go. Understandably most folks like MPI for HPC, but I think Go has a lot of interesting scientific use cases (and especially for distributed). Anyway, I really appreciate your insights….
vsoch commented on issue Parsl/parsl#3259.
Sorry can
vsoch commented on issue chrislusf/gleam#203.
Excellent! I’m looking for an example where there is a main leader that sends pieces of work to workers, and they send back, for example some result value (that might fill in one pixel of an image). Is there a particular example I should look at to get me started?…
vsoch commented on issue vsoch/watchme#72.
hey @samhodge-aiml ! This seems like a cool idea (and simple to implement) but I’m not sure I’ll have time to work on it soon - too many cool things going on <3…
vsoch commented on issue singularityhub/shpc-registry#118.
I do believe the Oras cli in go has that, so if you can find the underlying call (e.g url and params) that should be enough for me to fix here. Thank you!…
vsoch commented on issue kubernetes-sigs/noderesourcetopology-api#1.
He’s leaaaavin’ on a jet plane… :airplane: …
vsoch commented on issue vsoch/pull-request-action#101.
All set and merged / released, version 1.1.1. Thanks!…
vsoch commented on issue opencontainers/wg-image-compatibility#14.
…
vsoch commented on issue vsoch/pull-request-action#101.
Sorry not sure I can help with advice for how to test enterprise. Maybe use a free account?…
vsoch commented on issue snakemake/snakemake-executor-plugin-flux#8.
Let me know if there are substantial changes that would warrant a deeper look….
vsoch commented on issue rootless-containers/usernetes#318.
Wow time flies - thanks again for your help on this @AkihiroSuda ! I thought of it because I’m running this again, just on slightly larger / better infrastructure (network and scale wise). To return to our last correspondence, for those interested in the talk, it’s the Bare Metal Bros and was really fun to do - we are hoping to extend this to a reproducible setup for others to use (actually I’m mostly done with that as of this week, just tidying it up for our own experiments). AWS uses the Elastic Fabric Adapter, and getting that working with EFA and usernetes took some elbow grease! …
vsoch commented on issue prefix-dev/pixi#634.
Absolutely! And I’ll try them out the next time around. Here are the images in GitHub packages: https://github.com/prefix-dev/pixi-docker/pkgs/container/pixi. …
vsoch commented on issue vsoch/pull-request-action#101.
If you try the PR branch I linked it should print the one that is None. Thank you!…
vsoch commented on issue converged-computing/oras-csi#28.
This looks like a bot. if you have a real issue, please report it. Closing….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#29.
Thank you!
vsoch commented on issue rootless-containers/usernetes#322.
Ah, got it working! I will post the full update tomorrow - basically I needed to hack the daemonset a bit, and then add the correct annotations for it to bind to the pod (of the job). Then I could run a sleep job, shell in, install fi_info for libfabric, and see efa and run the tests. …
vsoch commented on issue flux-framework/flux-python#9.
okay these should be good to go - https://pypi.org/project/flux-python/#history. Let me know if there are any other issues or if we are good to close here….
vsoch commented on issue acrlabs/simkube#105.
Thanks!…
vsoch commented on issue NixOS/nixpkgs#198721.
I
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#23.
@jeffhsu3 the easiest thing to do is poetry run black <root> and poetry run flake8 <root> in case you aren’t doing that. I don’t use poetry a lot so I struggle a bit, but that’s what works for me….
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#57.
I would be OK to accept if it fixes an issue, but I think @johanneskoester needs to approve / merge this one….
vsoch commented on issue opencontainers/wg-image-compatibility#13.
If there is still going to be an external artifact, then I can remain indifferent! But if this is just it, this is probably far worse for all the use cases I care about….
vsoch commented on issue acrlabs/simkube#104.
Thanks @drmorr0 ! I got everything running last night, but for some reason the pods never go out of pending: …
vsoch commented on issue spack/spack#42985.
That’s indeed what I was trying to do, yes. :laughing: …
vsoch commented on issue kubernetes/kube-openapi#461.
Perhaps it’s an issue with using it to generate the python files? I see: …
vsoch commented on issue flux-framework/flux-operator#218.
This is done with #219 …
vsoch commented on issue converged-computing/rainbow#15.
This will be closed with #14…
vsoch commented on issue converged-computing/cloud-select#37.
That’s actually OK, this was done. Thanks stale bot :) …
vsoch commented on issue rseng/zenodo-release#14.
- The html url would be another variable for the action …
vsoch commented on issue kubernetes/kops#16066.
Please don’t close; nobody has responded. Thanks!…
vsoch commented on issue flux-framework/flux-restful-api#60.
Here is a different approach to try (what I would do). Since we already have a database setup, get rid of the basic auth and needing to cache things in the browser and use JWT tokens, and then you add that to depends for each view. https://medium.com/@chnarsimha986/fastapi-login-logout-changepassword-4c12e92d41e2. …
vsoch commented on issue flux-framework/flux-coral2#131.
@jameshcorbett I was thinking in the context of a jobspec that is able to request storage - it sounds like if the request is too small we would run into trouble, and maybe not every cluster has a behind the scenes solution to set a minimum in that case. So when we check to see if a jobspec can be satisfied (and is valid) we need to ensure it’s not too small. And maybe if it is, and it’s a global truth (meaning it would be too small for any cluster) we could tweak the jobspec on our own, kind of like a mutating …
vsoch commented on issue flux-framework/flux-coral2#130.
There is too much wabbit storage in this test! :rabbit: :rabbit: :carrot: :carrot: …
vsoch commented on issue converged-computing/rainbow#14.
Small update - more detailed output for checking the attributes, and I added a recursive function call to recurse into the jobspec (which was not properly happening before). Go doesn’t have the ability to do “while (for) there are objects in this array” because the length of the array is pre-determined. So this recursive approach is probably best. …
vsoch commented on issue c0mm4nd/go-hwloc#2.
I’m generally interested in getting the equivalent output that I might get for lstopo or in the xml file generated - I’m not a C programmer so the “it’s easy” part is probably not applicable here! :laughing: Do you have examples / know someone that has used your library that might help me get started?…
vsoch commented on issue rseng/zenodo-release#11.
> which is probably not right anyhow and I need to convert the tarball to zip although perhaps I will try .tar.gz and see if Zenodo preview that. …
vsoch commented on issue flux-framework/flux-restful-api#60.
Thanks @khoing0810 ! I should have time this weekend to review….
vsoch commented on issue converged-computing/rainbow#12.
This tool appears to be deprecated so I wound up using standard runtime and then manual GC, and it does look like we are cleaning up. I’ll keep this in mind moving forward. …
vsoch commented on issue compspec/jobspec-go#1.
And this should come with associated functions to generate them, and since they are versioned, we would want the Version field to be populated appropriately….
vsoch commented on issue c0mm4nd/go-hwloc#2.
That worked! I’ve updated the PR here. Quick question for you - what is a “hello world” example for using this? I’ve been trying to do a basic init and then “print something out I can see” - my first effort the print segfaulted, and then I followed a pattern in a test and it doesn’t segfault (but I don’t see anything). I’m new to using hwloc outside of the command line tools so apologies for my naivete. Here is what I am testing: …
vsoch commented on issue kubernetes/kube-openapi#461.
I’m probably going to just pin it to the old version - I got a variant working: …
vsoch commented on issue compspec/compspec-go#29.
Closed with #30. I also made the extraction more lenient within a section - if --allow-fail is set this is not just applied to the top level extractor, but to sections within. …
vsoch commented on issue c0mm4nd/go-hwloc#2.
Ah I think I see the issues you were probably running into? Go doesn’t like the static bit for a module: …
vsoch commented on issue urlstechie/urlchecker-python#90.
I think we are good here and figured it out in https://github.com/urlstechie/urlchecker-python/pull/89. Thanks for your help here @SuperKogito !…
vsoch commented on issue kubernetes-sigs/jobset#291.
I haven’t seen it again, so happy to close (and reopen if I do). Thanks!…
vsoch commented on issue singularityhub/singularity-compose#68.
Could you please test build and install of a wheel directly? I can release that alongside if needed. Otherwise, please make a specific suggestion for what you’d like changed. Thanks!…
vsoch commented on issue openjournals/joss-reviews#6374.
I can’t offer my time now, but thank you for thinking of me!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1599.
It could be that hwloc is a better fit for this - I found a library in go but it has a bug so I opened an issue….
vsoch commented on issue hpc-social/good-first-issues#11.
fixed…
vsoch commented on issue hpc-social/good-first-issues#11.
okay just kidding, just threw it together, will close this after it runs. I am doing it once a week because I don’t think I want to review this every day….
vsoch commented on issue flux-framework/fluxion-go#7.
I realize we will need not just an ability to convert JGF for rainbow, but also the grow functionality in flux-sched. I think @zekemorton is working on the guts of that? And I think a part of it might be here? https://github.com/flux-framework/flux-sched/pull/1061 …
vsoch commented on issue converged-computing/jsongraph-go#9.
This is fixed - it was just serializing at the wrong level!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1599.
I’m also not seeing basics about number of physical vs logical cores - that seems obvious like it should be here?…
vsoch commented on issue spack/spack#42605.
> Do you still have any interest in snakemake and spack? Any chance you want to help inform the refactoring of spack ci generate by contributing a real snakemake generator following your original idea? Or if not, how about providing guidance for me or someone else to do it? …
vsoch commented on issue singularityhub/shpc-registry#202.
Have you verified this installs? There are no tokens, etc. needed?…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1581.
Hi folks! To keep this discussion moving (and not block our projects) I made a prototype of a “source only” node-feature-discovery, here https://github.com/converged-computing/nfd-source and you can see the number of dependencies we could nix by way of it here https://github.com/compspec/compspec-go/pull/22/files and that includes not needing to bump up to 1.21. It would be great to continue discussion, and I’d be happy to prototype something here to test out! I’m not familiar with the code base but I pick things up fairly quickly so not worried about that….
vsoch commented on issue spack/spack#42635.
Thank you!
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#19.
@w8jcik this plugin isn’t properly working yet, and I’m not familiar enough with the refactored interface to work on it, so any contribution you might make would be appreciated!…
vsoch commented on issue singularityhub/sregistry#447.
Normally you need the collection to exist first….
vsoch commented on issue flux-framework/flux-sched#1142.
We can have discussion here https://github.com/flux-framework/fluxion-go/discussions/3…
vsoch commented on issue flux-framework/flux-core#5709.
Yes - thank you @grondo ! Closing….
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#18.
Looks like it’s missing a format string! I was never able to understand the logic of these plugins (and get this one to work myself) so we might need to tag team with @johanneskoester. I’ve been using aws s3 instead….
vsoch commented on issue spack/spack#38037.
That’s a great idea!…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1581.
Also there is so much here, I seriously love this library <3 …
vsoch commented on issue krator-rs/krator#76.
Thank you for sharing these! I’m sorry it’s abandoned / about to be archived, it’s really cool work. I minimally appreciate it for the learning….
vsoch commented on issue flux-framework/flux-core#5728.
> flux filename …
vsoch commented on issue kubernetes-sigs/jobset#380.
I will ask! To be clear - “using it” meaning for development and prototyping or in production? We do not have a production Kubernetes cluster. That’s what we are working towards….
vsoch commented on issue flux-framework/flux-sched#1133.
And to discuss some planning, I’m hoping to start work on moving the go bindings out of tree after we are merged here, and after that continue work on fluence. I’m going to start from the same git history and be very careful to rename / move so that the original authors / commits are preserved. It would be really great if we could finish this one up today (and then I could work over the weekend) but if that’s not possible it’s ok too….
vsoch commented on issue converged-computing/aws-tofu#1.
Note this is replaced with https://github.com/converged-computing/flux-usernetes…
vsoch commented on issue singularityhub/shpc-registry#195.
Also your interest in this container led me to find a bug with tag parsing - nvidia recently added vex and sbom (as tags) and we weren’t filtering them out! Now we are. So thank you!…
vsoch commented on issue opencontainers/wg-image-compatibility#10.
I am as well, and I’m not hard set on my current implementation. I needed a prototype to prove the concept, and I’m planning to update it with the design that we decide upon (I hope it’s a good one)!…
vsoch commented on issue flux-framework/flux-sched#1120.
…
vsoch commented on issue singularityhub/shpc-registry#195.
If you clone that PR branch and change the registry entry in your settings.yaml to that clone path root it will install from it. It
vsoch commented on issue kubernetes-sigs/jobset#380.
It depends on if you think it’s really free of errors and potential issues, or not. I don’t see any harm in doing v1beta1 and then having that wiggle room, but it’s up to you!…
vsoch commented on issue flux-framework/flux-pmix#97.
Thank you!…
vsoch commented on issue LLNL/maestrowf#434.
> So, bundling these two as they’re closely related. First, just to clarify and make sure we’re talking about the same thing, I was really asking about what the ‘cmd’ block in the maestro spec looks like in this mode. Currently they’re all bash, so question was aimed really at does the step definition in the maestro spec change appreciably for this mode, or is the only real difference being that the step’s cmd/script is just executing in a container vs by an hpc scheduler? …
vsoch commented on issue panoptes-organization/panoptes#142.
I can
vsoch commented on issue flux-framework/flux-python#9.
This issue probably needs to be transferred to flux-core - any changes that are needed can be made there and trickle down here. …
vsoch commented on issue zekemorton/flux-sched#1.
Thank you for working on this!! It’s definitely badly needed and I’m looking forward to using it!…
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
okay, happy not to do any more work then! :laughing: …
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
Hi @bartlettroscoe @markcmiller86 I wanted to follow up here - realizing that probably I’m best oriented to work on this, I put aside the work I wanted to do for this afternoon and tackled this issue. I have a new release of urlchecker-python (0.0.35) and a branch with the action that you can test. Importantly: …
vsoch commented on issue zekemorton/flux-sched#1.
For final touches, you’ll want to convert the original int into the Go types we’ve added, there is an example in the code I can show again here: …
vsoch commented on issue zekemorton/flux-sched#1.
And @zekemorton let me know if you’d like me to rebase this - from our discussion in the other thread I grepped you didn’t plan to merge and integrate into your branch. It would kill a few birds with one stone if you wanted to, but I’m happy to follow up with another PR….
vsoch commented on issue urlstechie/urlchecker-python#90.
@SuperKogito that just means that there is an error within the task. The way to debug is to run the same in serial, likely on a local machine so you can IPython.embed() and test why there are no match results for Urls (there should be) and then figure out how to update the regexes….
vsoch commented on issue urlstechie/urlchecker-action#105.
Also double check you installed ca-certificates in the container, and try using --network=host too. Likely that won’t fix it (I am terrible with Macs and know they are terrible with docker) but just a suggestion!…
vsoch commented on issue flux-framework/flux-python#9.
okay give 0.58.0 a shot. https://pypi.org/project/flux-python/ You can sanity check what flux sees with flux --version and make sure the versions match, and also import flux to see flux.__file__ …
vsoch commented on issue betterscientificsoftware/bssw.io#1633.
Ah gotcha! If you want to make new additions or changes speedier, then I’d recommend changed files: https://github.com/marketplace/actions/changed-files. I use that for container matrices so I only build containers with an updated Dockerfile. If you are concerned about existing links breaking (across many files) a dumb thing I do is to segment a list of things (e.g., paths) into equal lists based on matching hashes to calendar month days, then run for the day (a small subset) each night. https://github.com/vsoch/split-list-action. You probably don’t want to be checking everything on every PR, every time!…
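As a rough illustration of the "hash to a calendar day" idea in the comment above, here is a minimal Python sketch. It is not the implementation used by split-list-action; the function names (day_bucket, items_for_day), the choice of SHA-256, the 28-day cycle, and the example paths are all assumptions made for the example.

```python
import hashlib
from datetime import date


def day_bucket(item: str, days: int = 28) -> int:
    """Map an item to a stable bucket in 1..days (hypothetical helper).

    A cryptographic hash is used only because it is stable across runs,
    unlike Python's built-in hash() for strings.
    """
    digest = hashlib.sha256(item.encode("utf-8")).hexdigest()
    return int(digest, 16) % days + 1


def items_for_day(items, day: int, days: int = 28):
    """Return the subset of items whose bucket matches the given day."""
    return [item for item in items if day_bucket(item, days) == day]


# Illustrative input: a made-up list of file paths to check.
paths = [f"docs/page-{i}.md" for i in range(100)]

# Assumption: a 28-day cycle, so days 29-31 simply repeat bucket 28.
today = min(date.today().day, 28)
tonight = items_for_day(paths, today)
print(f"Checking {len(tonight)} of {len(paths)} paths for day {today}")
```

Because each path always hashes to the same bucket, every path is visited once per cycle and each nightly run only touches a small subset. The buckets are only approximately equal in size.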
vsoch commented on issue urlstechie/urlchecker-action#105.
A pdf file is not a text-readable file, so you should not ask the checker to parse it (or add to ignore)….
vsoch commented on issue snakemake/snakemake-executor-plugin-kueue#8.
@alculquicondor if anything jumps out at you here, your insights would be appreciated. I’ve tried a few times now to get this running - I’m installing the latest and pairing with queue, and each time the launcher continually crashes and in about half a second so I don’t have time to see what is going on. I assume I can’t debug the worker if the launcher isn’t working, so not sure how to debug this. I started with the vanilla example in the mpi operator repository and then moved to the example my team was using for canopie-22 (which was working) but no luck either way. …
vsoch commented on issue snakemake/snakemake-executor-plugin-kueue#2.
This is handled with a remote (e.g., AWS). I still haven’t gotten this working on the MPI Operator - if the design means the output is generated elsewhere (and it’s expected to be on the launcher) we might run into an issue - will try testing again….
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#56.
@Feelx234 possibly a dumb question - why not use two wildcards instead of putting two variables into one? Can you better explain / show me the use case - I’m assuming it’s a separator of some type? Or content that would go into a csv file that you want to put in a variable instead?…
vsoch commented on issue flux-framework/flux-sched#1133.
@zekemorton @trws @milroy I put together a quick prototype for what I think might be a good pattern for adding the faux enum, and also testing it: https://github.com/zekemorton/flux-sched/pull/1 …
vsoch commented on issue flux-framework/flux-docs#260.
Gotcha. So I think if we are working on docs here, we can assume many readers will come with some (even minor) knowledge of what fair share is. Either we will let them make assumptions about our derivative and scope, or (what I think might be best) is we call that out, and then very clearly define the two and explain the difference….
vsoch commented on issue LLNL/maestrowf#434.
I’m running out the door for a quick run, but some quick answers (and can follow up with more detail where needed) …
vsoch commented on issue flux-framework/flux-docs#260.
And I’m worried when you say “the algorithm” that’s the only one, oh no. :laughing: …
vsoch commented on issue apptainer/singularity#380.
> Idk why the team is reluctant to add this trivially seeming functionality via a flag. What are your thoughts? …
vsoch commented on issue flux-framework/flux-sched#1134.
That said, the python bindings for flux core are pretty essential, so I think they belong in tree. Also, I realized just now this discussion is in the wrong place - I’m going to link it to the issue I opened https://github.com/flux-framework/flux-core/issues/5709. If we have more discussion let’s pick up there….
vsoch commented on issue flux-framework/flux-core#5709.
Note that discussion started here: https://github.com/flux-framework/flux-sched/issues/1134#issuecomment-1913664132…
vsoch commented on issue eksctl-io/eksctl#6869.
This was the PR for this particular issue - closed. https://github.com/eksctl-io/eksctl/pull/6870…
vsoch commented on issue kubernetes-sigs/jobset#146.
Actually I just realized @ahg-g was literally sitting on the stage for my talk, so he saw (some of the high level) description about the need for easy to deploy common patterns for jobs. Really excited that you are going to work on this, and please include me if you are able….
vsoch commented on issue flux-framework/flux-restful-api#55.
I added a simple variable to control mode #65, and it separates the flux auth mode (which was previously tangled). I’ll update this issue to be about different kinds of auth, since we just have the basic auth that feeds into token with a signed payload. I’m not sure (for the work we are doing) we need much more than that right now. …
vsoch commented on issue flux-framework/flux-operator#214.
This is also done….
vsoch commented on issue apptainer/singularity#380.
And can you give me a concrete example of such an application that requires this knowledge? Just as an FYI, you can still set the present working directory at runtime with --pwd …
vsoch commented on issue kubernetes-sigs/jobset#146.
That’s awesome! That last paragraph is literally the design of the metrics-operator - the idea was to use JobSet like legos, adding on whatever additional volumes / commands/ features are needed. And it worked pretty well, although I probably went a bit too deep in terms of wanting to play with an interface design. Please ping me if there is something fun to collaborate on….
vsoch commented on issue kubernetes-sigs/jobset#381.
> I think it is better to handle concrete use cases than discussing hypothetical or abstract scenarios, so lets list the user stories that you would like to be taken into account. …
vsoch commented on issue vsoch/scif#67.
Closed with #68 …
vsoch commented on issue kubernetes-sigs/kueue#487.
Hey is there a status update here? I made my first plugin last weekend and it was fairly simple so I could jump in to help if the current assignee is low on bandwidth….
vsoch commented on issue vsoch/scif#67.
I maintain hundreds of projects, so I will work on them on an as-needs basis. I’m glad you are using SCIF!…
vsoch commented on issue supercontainers/compspec#2.
I like the idea of having them be a graph that is flattened out - and I do think what we have here will ultimately change! Let’s keep it dumb and simple for now, share these ideas with the working group when it comes up, and then we will do another sweep / update for what the group eventually decides. …
vsoch commented on issue kubernetes-sigs/jobset#133.
Regardless of the issue I’ll add this to my master TODO as a reminder for me….
vsoch commented on issue kubernetes/sample-cli-plugin#6.
Working! I was being an idiot and using the wrong server address. :upside_down_face: …
vsoch commented on issue singularityhub/singularity-cli#218.
yes….
vsoch commented on issue flux-framework/flux-core#5691.
Thank you!…
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#55.
Agree, I would instead do: …
vsoch commented on issue singularityhub/install-singularity#3.
Thank you!…
vsoch commented on issue opencontainers/wg-image-compatibility#8.
> I think we have adress two different topics with the different specs. One topics is what we want to read and the other how we read it. …
vsoch commented on issue kubernetes/community#7684.
I’m so excited to see this move through - thank you for your support @alculquicondor ! :raised_hands: …
vsoch commented on issue expfactory/expfactory#177.
It would be possible as long as you have a hook to save data (sending to the server) and then proceed to a next task. I’m not familiar with psychojs but can imagine it working similarly to the most common framework we use here, jsPsych. This is really the most important part: https://github.com/expfactory-experiments/nback-10min-animals/blob/e2b88886097bb02eb1e70a1a5e42e287e9716b18/index.html#L48-L67…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
Also with the fix for the gs, the jobs are (finally) green! I won’t show you how many red / failed there are, let alone that it takes 7 minutes per one step run for a hello world… :grimacing: …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
ah! Just found it in an environment locally - will test removing here and seeing if that somehow (magically) removes it from the remote. That doesn’t make sense, but if snakemake is getting the plugins from my local call, it actually would! Will report back….
vsoch commented on issue apache/airflow#22253.
I understood that. The high level pattern is adding developer churn to the process of review, regardless of the specific details….
vsoch commented on issue apache/airflow#22253.
Going to agree with you @bolkedebruin. I had similar issues about 5 years ago, and while I think the intention is best, when it results in a PR being open on the order of years and not a more reasonable few months, it gives me pause. I can’t comment on the details of the community here because it’s been too long since I contributed, but it seems to me that something is off when the PR time is on the order of years. My heart goes out to you @potiuk, I hope you find the right balance, and seek feedback if/when you determine that things might not be working….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#25.
I used the exact logic from the previous snakemake life sciences plugin, so if that is the case, it wasn’t updated to support this design. Can you point me to a plugin that is using preemptible correctly? …
vsoch commented on issue singularityhub/sregistry#444.
I don’t know off the top of my head, but generally speaking if someone has an issue they open it….
vsoch commented on issue spack/spack#42001.
Agree that does appear to be the issue - and it’s working again. https://github.com/flux-framework/spack/actions. …
vsoch commented on issue snakemake/snakemake#2598.
It looks like #2599 is closed, so I’m going to close the issue here. Let us know if there is anything else we can help with @musicinmybrain!…
vsoch commented on issue snakemake/snakemake#2598.
Hi @johanneskoester @musicinmybrain! …
vsoch commented on issue rootless-containers/usernetes#319.
@AkihiroSuda could it be mtu related if we are also using the network with mtu 1500 for the comparison case? For example, see the middle one here: …
vsoch commented on issue prefix-dev/pixi#634.
The issue is that you need the lock file in the outside of the container to copy to it to ensure the same thing builds….
vsoch commented on issue kubernetes/enhancements#3371.
hey @denkensk is there any reason you stopped working on this? We would find this highly useful for one of our needs….
vsoch commented on issue kubernetes/community#7659.
> We usually sync with the open source community at night, so it’s usually hard to get up early in the morning. And 9-10 am we usually on our way to the office in China.
vsoch commented on issue eksctl-io/eksctl#6743.
Please re-open again, thank you!…
vsoch commented on issue vsoch/pe-predictive#1.
You could try pulling the old image and then doing a pip freeze to see them. I suspect even with the pinned versions it might not work, but I’m happy if you want to share them and we could try a build. I think better would be to update the code with newer versions of things (and then pin those)….
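As a rough illustration of the pip freeze step mentioned above, here is a minimal Python sketch that lists installed distributions as pinned name==version lines using the standard library's importlib.metadata. The function name pinned_requirements is hypothetical, and the output only approximates what pip freeze prints (editable or URL installs, for example, are not treated specially). It would need to be run with the interpreter inside the old image.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name")
        version = meta.get("Version")
        # Skip broken installs that lack a usable name or version.
        if not name or not version:
            continue
        pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the pins so they can be saved to a requirements file.
    print("\n".join(pinned_requirements()))
```

The resulting lines could be saved and shared as a starting point for a rebuild, with the caveat from the comment above that old pins may no longer install cleanly.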
vsoch commented on issue opencontainers/wg-image-compatibility#3.
@sudo-bmitch do we have a template for this so I can assess the different pieces in that context?…
vsoch commented on issue converged-computing/kubescaler#18.
This is underway, and likely mostly done - will finish up this afternoon after a run. https://github.com/converged-computing/metrics-operator-experiments/tree/main/google/spot-instances/run0/test…
vsoch commented on issue snakemake/snakemake#2174.
This is replaced by the kueue plugin, which now has support for using the flux operator (and MiniClusters) https://github.com/snakemake/snakemake-executor-plugin-kueue…
vsoch commented on issue flux-framework/spack#151.
This will be closed by https://github.com/spack/spack/pull/41974…
vsoch commented on issue kubernetes/community#7659.
Again: …
vsoch commented on issue flux-framework/flux-coral2#120.
The wabbits are down, you say? Did you try giving them carrwots? :carrot: :carrot: …
vsoch commented on issue oras-project/oras#1224.
Here is the full error I was getting (sorry didn’t realize it wasn’t here, maybe I chose the wrong issue)!: …
vsoch commented on issue oras-project/oras#1224.
I just hit this issue for a push, and downgrading worked for me! Specifically: …
vsoch commented on issue kubernetes/community#7659.
@alculquicondor and others, do you want another doodle / way to vote? What would be best to help choose this?…
vsoch commented on issue kubeflow/mpi-operator#611.
You could also just use a different MPI flavor that will use the DNS names and call it a day :)…
vsoch commented on issue flux-framework/flux-k8s#47.
> Got it, I missed that detail. Is there any advantage to dr := &pb.CancelResponse{JobID: in.JobID, Error: 0} instead of dr := &pb.CancelResponse{JobID: in.JobID}? …
vsoch commented on issue kubeflow/mpi-operator#611.
This would be handled well by the flux operator, which uses zeromq to bootstrap and if a pod (follower broker) goes down flux would see the node as down, and we could schedule to another node (possibly newly added, which would join the cluster and then be seen as going from down to up). Nodes going up and down happens all the time in HPC so our workload managers are used to handling that, and for the flux operator you essentially get your own scheduler within the indexed job. We do, however, use a headless service and not the host file. …
vsoch commented on issue kubeflow/mpi-operator#611.
> we can always add a simple sidecar or something that adds the hostname IP entries to the /etc/hosts file in the Pods …
vsoch commented on issue oras-project/oras-go#644.
@Wwwsylvia it’s probably good to open up to the community - I could have made time over break but now that we are back to work, I would be able to make time akin to others that might help….
vsoch commented on issue oras-project/oras-go#644.
hey @FeynmanZhou ! I remember looking at the code when I opened the issue, and I think I could take a first shot but I need some guidance about the logic. I’ve had a hard time following it since the refactor from push/pull to the more abstract design now. If someone can provide that guidance I’m happy to take a shot, otherwise feel free to assign to someone familiar with the codebase. …
vsoch commented on issue spack/spack#38037.
I didn’t ever figure out specific details - with spack things mysteriously break and then resolve later and that’s what happened for this case. The only suggestion I can make is to look at libarchive (e.g., commit from May that mentions iconv, maybe that’s when it showed up https://github.com/spack/spack/commits/develop/var/spack/repos/builtin/packages/libarchive) and then check your package.py for flux-core - do you have at least this one? https://github.com/spack/spack/commit/10999c02836fa6a510871d24ad6548d12d2b72ae….
vsoch commented on issue rootless-containers/usernetes#318.
@AkihiroSuda my colleague had an insight that gave us (at least a solution for now) that allows us to ping the hostname running the pod directly! The missing piece was defining the hostPort, here is the diff for the relevant section. …
vsoch commented on issue rootless-containers/usernetes#318.
oh i see the issue - that parameter is for when it starts so I snuffed out the lower value! Going to try again and dangerously set it to 0 (don’t worry this cluster is extremely isolated)….
vsoch commented on issue VClinic/VClinic#1.
Ack sorry - just copy pasted the one shown in the terminal! Here you go: …
vsoch commented on issue converged-computing/flex-aws-topology#1.
This is fixed!…
vsoch commented on issue NixOS/nixpkgs#198721.
Okay, rebased and squashed, and addressed the review comments that I knew how to! I opened this 14 months ago, and since then don’t have my same environment (so I don’t have a way to test) but I don’t want to just give up on the contribution. I had really big hopes for using Nix but it feels like it’s (overall) too hard….
vsoch commented on issue flux-framework/flux-coral2#117.
:raised_hands: :raised_hands: :raised_hands: …
vsoch commented on issue eksctl-io/eksctl#6869.
Please do not close the issue….
vsoch commented on issue VClinic/VClinic#1.
Here you go… …
vsoch commented on issue VClinic/VClinic#1.
I think likely we don’t want to rely on Python 2….
vsoch commented on issue NixOS/nixpkgs#198721.
@luzpaz contributing a package was so hard I think I largely gave up….
vsoch commented on issue snakemake/snakemake-storage-plugin-gcs#7.
@johanneskoester if you have bandwidth, I am done with the intel MPI example but blocked here: https://github.com/snakemake/snakemake-executor-plugin-googlebatch/issues/18. I think there might be a bug with this example that I got from snakemake (the output is generated but empty)….
vsoch commented on issue singularityhub/singularity-compose#67.
Here is the updated link - feel free to open a PR to fix! https://github.com/singularityhub/singularity-compose-examples/tree/master/v1.0/rstudio-simple …
vsoch commented on issue jsongraph/json-graph-specification#57.
In case anyone is interested, I’m working on them here: https://github.com/converged-computing/jsongraph-go. We have a few projects in mind for these (so updates likely) but for the time being I’m keeping it very simple to just define structs for the different types of graphs supported in v2. Thanks!…
vsoch commented on issue converged-computing/flux-views#8.
Thanks @rajibhossen ! I’ll test these out today and if they look good, will push the images and then merge here….
vsoch commented on issue rse-ops/flux-spack-docker#1.
opened this but not going to pursue until someone actually needs. Specifically the base container is old (and going to go away) and I’d rather build from ubuntu fresh, but we already have builds with flux for that. https://github.com/rse-ops/spack-flux-container…
vsoch commented on issue lima-vm/lima#368.
Even better! …
vsoch commented on issue kubernetes-sigs/noderesourcetopology-api#1.
hey folks! My team is interested in this work (and I could help if needed). Is there a status update / what are next steps?…
vsoch commented on issue kubernetes/enhancements#3545.
hey folks! My team is interested in this KEP - it looks like all the alpha/beta check boxes are purple (merged) and the release contender was 1.28? Was this released with 1.28 and I missed it https://kubernetes.io/blog/2023/08/15/kubernetes-v1-28-release/ or is it still TBA (and possibly not documented) because the issue here is still open? Thanks! And apologies for my naivete about this process….
vsoch commented on issue flux-framework/rfc#406.
oh neat! Is this what I was doing with flux exec (and wasn’t aware what it was called)?…
vsoch commented on issue flux-framework/flux-core#5628.
Thank you mergify bot. You work so hard. :cookie: …
vsoch commented on issue archspec/archspec-go#13.
Thank you @alalazo! I should be good for early development, but would be great to have a suggested solution in place after that. …
vsoch commented on issue rse-ops/docker-images#112.
Thank you! …
vsoch commented on issue rootless-containers/usernetes#308.
Yes! I have two variants (that are a bit simpler) that I’m using too. Thanks to you both for the help!…
vsoch commented on issue lima-vm/lima#368.
Gotcha! And super cool - thanks for giving me this nugget of info. I know about file but didn’t consider trying it here (I was going to try opening it in vim to see if there was an identifiable header, like with ELF, haha). …
vsoch commented on issue hpc-social/jobs#18.
Will do! We did used to accept PRs, but they always seem to lead to bugs. :bug: :lady_beetle: …
vsoch commented on issue flux-framework/flux-sched#1120.
Here is what we needed in fluence: https://github.com/flux-framework/flux-k8s/blob/105562e4662e86a1b8ce1e19762678a6c0dd6309/src/fluence/fluxion/fluxion.go …
vsoch commented on issue flux-framework/flux-core#920.
Is there still something we need to do here? The repository here https://github.com/flux-framework/spack/ has been working fairly well to regularly test builds and update packages in upstream spack. If we are good with that, we can close the issue here….
vsoch commented on issue brainhackorg/brainhack_cloud#61.
Thanks for the help on this last year, it was fun to try out!…
vsoch commented on issue lima-vm/lima#368.
Silly question - I saved a snapshot, e.g.,: …
vsoch commented on issue flux-framework/flux-sched#1120.
Update: library is now officially ice-cream-ized: …
vsoch commented on issue flux-framework/flux-sched#1120.
@milroy this is fantastic! :raised_hands: For a quick example of what this enables, I was (very easily) able to build a go module out of tree: …
vsoch commented on issue spack/spack#41708.
We’re good! https://github.com/converged-computing/flux-views/actions/runs/7225964629/job/19690533110 they won’t finish in the 6 hour limit (I’ll need to bring up arm instances on AWS to do these custom builds) but spack is working well. Thanks!…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
I think I see a possible error - the executable was not made executable it seems: …
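For readers unfamiliar with this kind of failure, here is a minimal Python sketch of checking and setting the executable bit on a file. It is not the plugin's code: the helper name make_executable and the demo file are hypothetical, and it assumes a POSIX filesystem.

```python
import os
import tempfile


def make_executable(path):
    """Add execute permission wherever read permission is already granted."""
    mode = os.stat(path).st_mode
    # Copy each read bit (r) down to the matching execute bit (x),
    # which has the same effect as `chmod +x` under a typical umask.
    mode |= (mode & 0o444) >> 2
    os.chmod(path, mode)
    return os.access(path, os.X_OK)


if __name__ == "__main__":
    # Demo on a throwaway script (hypothetical content).
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
        fh.write("#!/bin/sh\necho hello\n")
    os.chmod(fh.name, 0o644)
    print("executable before:", os.access(fh.name, os.X_OK))
    print("executable after:", make_executable(fh.name))
    os.remove(fh.name)
```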
vsoch commented on issue kubernetes-sigs/jobset#353.
okay I figured this out. …
vsoch commented on issue hpc-social/jobs#18.
hey @msleigh ! If you’d like to add a job, please use the form here: https://hpc.social/jobs/about/ …
vsoch commented on issue prefix-dev/pixi#557.
I can try this next time! …
vsoch commented on issue flux-framework/flux-sched#1094.
@grondo I think I could give it a shot (it could just be an envar or similar I think?) but mostly wanted to get permission if there is some licensing issue or other thing I don’t know about. But I’m going back to sleep for a bit anyway so… happy to wait for the wisdom of @trws ! :raised_hands: …
vsoch commented on issue flux-framework/flux-sched#1116.
Thanks @grondo! …
vsoch commented on issue urlstechie/urlchecker-action#104.
@SuperKogito my first suggestion to @kubu4 is to try processing in batches (e.g., multiple runs on different roots, and that can be put into an action matrix). If that doesn’t work, then I think we should add some kind of support to handle that internally….
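The batching idea can be sketched in a few lines of Python. This is only an illustration of splitting a list of roots into groups, where each group would become one run or one matrix entry. The helper batch_roots and the directory names are hypothetical and are not part of urlchecker.

```python
def batch_roots(roots, batch_size):
    """Split a list of root directories into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [roots[i:i + batch_size] for i in range(0, len(roots), batch_size)]


if __name__ == "__main__":
    # Hypothetical repository layout; each printed batch is one separate run.
    roots = ["docs", "src", "tests", "examples", "scripts"]
    for batch in batch_roots(roots, 2):
        print(batch)
```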
vsoch commented on issue ECP-copa/ExaMPM#56.
Thanks for the suggestion! The lammps metrics container uses mpich and the one here is openmpi, so I don’t think we could do that. I did try mpich too (with the same command) and got a non-working result….
vsoch commented on issue urlstechie/urlchecker-action#104.
You could also just target runs on separate subdirectories (one at a time or in a matrix), depending on how large your repository is….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
That worked! We made it through the compile step, and I see pi_MPI in storage. Now it looks like the mpiexec (same workflow above) is failing but I don’t see any error why: …
vsoch commented on issue pydicom/deid#259.
hey @peter-kuzmak ! I have limited bandwidth to help on this, but if you can share a dummy dicom file and the dicom.deid and the exact way to reproduce your use case, I can likely make some time over a weekend. Thanks!…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#17.
> Can you please post the full log? I think it is related to the upload logic for local jobs. There might be a bug. …
vsoch commented on issue singularityhub/singularity-cli#215.
I would be OK with matching the convention that SingularityCE uses - if they allow dashes (and have a regex to check) we can do that too. I’m open to reviewing a PR that makes this change if you’d like it!…
vsoch commented on issue rse-ops/docker-images#111.
I’m going to pass maintainership over to @davidbeckingsale and his RSE team - I am not as much in touch with the needs of the code teams, and not officially on RADIUSS time anymore. So please do what you think is best here. Thanks!…
vsoch commented on issue aws-samples/efa-device-plugin-helm#1.
Yes of course! All set. I also ensured they were in alphabetical order….
vsoch commented on issue flux-framework/flux-sched#1112.
Yep that fixed it. TLDR: update your docker! Thanks @grondo, closing….
vsoch commented on issue vsoch/spack-package-action#10.
That
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#34.
I didn’t wind up adding it (and properly testing it) but will keep this on my radar! …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#16.
Should we put this into a discussion / as a question that can be found later?…
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
@johanneskoester I’ll leave this question for the next time you are around (I pinged you to merge the finished work for the googlebatch executor for the above). I’m moving to the MPI workload and adjusting my paths to use the same s3 approach. This has an intermediate step, and it’s telling me that it cannot find an output (that should be generated from the compile) so I’m not sure why it’s doing this check: …
vsoch commented on issue singularityhub/singularity-cli#213.
No worries! So execute is supposed to mimic singularity exec. And run_command was previously a more hidden helper command to (one off) run a random command, but not inside the container (to the system). I do think it would be more intuitive if this current run_command was an exec to the instance, and then the current was moved back to be a helper (hidden) function. I am happy to test this out and open another PR if you agree….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
When we can confirm the output (and maybe the best way for me to check) I’ll continue (finally!) with the more advanced use cases (e.g., MPI) now that this is working. This is great, and I’m so glad it’s finally running again! :partying_face: …
vsoch commented on issue singularityhub/singularity-cli#213.
Let’s start with the first issue posted above, which is fixed here https://github.com/singularityhub/singularity-cli/pull/214 (pending your testing and approval) then (from that branch) let’s update the thread here (or open a new issue) with a specific way to reproduce your second. Thanks!…
vsoch commented on issue flux-framework/flux-k8s#36.
Gotcha, thanks for the update! …
vsoch commented on issue converged-computing/metrics-operator-experiments#4.
Second update: I tested this more today and think it’s an improvement on the current. The spot API limits the unique requests I can make (e.g., changing zones) so I’m going to wait another day (or possibly two) to get another quota to do a full comparison between the two size groups. That should give me a ballpark estimate if (at least according to this algorithm) there is opportunity to continue with spot experiments. Either way, I’ll add the resulting data to this PR and then merge….
vsoch commented on issue singularityhub/sregistry#443.
Did you test on localhost first (not on EC2)?…
vsoch commented on issue rse-ops/docker-images#111.
I’d like to hear @davidbeckingsale’s thoughts. I’m not convinced that going from 3.29GB down to 2.41GB is a long term solution - I’d suspect the problem will re-occur and it needs more thought. I’m also wondering if Azure Pipelines is the best one to use, we don’t run into these issues in other places. …
vsoch commented on issue aws/aws-sdk-js-v3#4495.
I think this might be an underlying waiters problem (as opposed to a particular SDK). For the Python SDK, it will work for a while, and then entirely stop working, to the point that the waiters just wait forever (and given the nodegroup_active waiter) I can easily use the kubeconfig.yaml and kubectl to see the nodegroup has been ready for 10-20 minutes, but the waiter is… still waiting….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
Gotcha! Updated message: …
vsoch commented on issue singularityhub/sregistry#443.
also, any reason to not just use a registry that supports pushing SIF (e.g., GitHub packages?) It’s expensive to run your own infra (I would know running Singularity Hub for 5 years)!…
vsoch commented on issue flux-framework/flux-core#5475.
Nice! I’ll add that to my containers….
vsoch commented on issue vsoch/pull-request-action#98.
You can also fairly easily do it separately….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
Bugs are gone (so far)! :partying_face: …
vsoch commented on issue singularityhub/singularity-compose#66.
I would always provide a src:dest …
vsoch commented on issue eksctl-io/eksctl#6743.
Thank you!…
vsoch commented on issue singularityhub/sregistry#442.
I think those are two different entry points to auth - one is logging you into a web UI via a callback with a code (OIDC) https://www.pingidentity.com/en/resources/identity-fundamentals/authentication-authorization-standards/openid-connect.html and the command line is still hitting the Django auth system, but not going through that handshake. …
vsoch commented on issue eksctl-io/eksctl#6743.
Why was this closed?…
vsoch commented on issue aws/containers-roadmap#2225.
@tzneal can you show me an example of a launch template that updates the instances to use one thread per core (and I’m familiar with the snippets to do that)! I can figure out the logic to determine if the instance needs it, and I see that I can use it here: https://boto3.amazonaws.com/v1/documentation/api/1.26.85/reference/services/eks/client/create_nodegroup.html. I should be able to test it soon and give you feedback….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#12.
I can get around that by setting a default here https://github.com/snakemake/snakemake-interface-executor-plugins/blob/d24c886cfd47ce21b2954f08d5383addffb4c900/snakemake_interface_executor_plugins/settings.py#L53C1-L53C1 and now I see: …
vsoch commented on issue converged-computing/metrics-operator#83.
lol, I’m such a froot loop - this is already supported! Thank you me of the past that anticipated needing this :)…
vsoch commented on issue 3dem/relion#1040.
That’s perfect - thank you! Will test tomorrow….
vsoch commented on issue singularityhub/singularity-compose#65.
Here is my suggested set of steps that I’d take: …
vsoch commented on issue ECP-copa/ExaMPM#56.
This is probably my stopping point for working on it then - I’m not sure what the problem above is (and I’m still inexperienced with MPI). For context I was going to add it to the metrics operator https://github.com/converged-computing/metrics-operator and use for converged computing experiments on Kubernetes, but I’ll skip over it and move on to the next. Thanks!…
vsoch commented on issue vsoch/forward#45.
@akkornel could you confirm this? I’ve never seen something like it before. Thank you!…
vsoch commented on issue snakemake/snakemake#2492.
This was from October - I just updated my snakemake-interface-common and the pulled changes seem to resolve the issue! So good to close (for me)….
vsoch commented on issue flux-framework/flux-coral2#111.
> Try just removing the yum install -y python3-pip. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#14.
And my thinking so far from this discussion is that I’m not comfortable adding anything to the executor that requires enabling an extra service, and one that can accrue costs (albeit slowly) over time. It’s not transparent enough, and it adds additional complexity for the user to setup (and need to request an increase in quota or other PubSub setup). I would rather direct the user to the console and then provide a helper script to get a particular log….
vsoch commented on issue oras-project/oras-py#119.
Yes absolutely - I think that’s the direction the upstream client will go to. Would you care to do a pull request? We can add the argument, and ensure that: …
vsoch commented on issue openjournals/joss-reviews#5888.
> Hi @vsoch do you have time to review this paper? …
vsoch commented on issue lima-vm/lima#2031.
I thought it might be related to remote machines, so I chose it. …
vsoch commented on issue ECP-copa/ExaMPM#56.
Thank you! A quick follow up question (I’m not great at debugging MPI). I can confirm that I can ping the other host and can ssh into it from my launcher, but I’m getting an error. Here are details: …
vsoch commented on issue ECP-copa/CabanaPIC#3.
ok cool thanks for the advice, I will take a look!…
vsoch commented on issue rootless-containers/usernetes#311.
> Please try increasing 65536 there to a larger number …
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#44.
Can you try using the main branch instead of a tag? I don’t think there has been a release yet….
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#10.
> Oh, I missed that it was actually skipped! …
vsoch commented on issue snakemake/snakemake#2409.
hey folks! For those previously using the Google Life Sciences executor (or with interest in using Google Batch) development is underway, and I think we are at a point where any interested snakemake developers can come in and start playing around and contributing to development. To be clear - this is not ready for any kind of testing - we are still fairly early on, but I wanted to be inclusive in this process and get more eyes on the work. There have been a lot of changes (and a lot of moving pieces). …
vsoch commented on issue kubeflow/training-operator#1949.
@tenzen-y we’re in business! …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#10.
For the test I think this step needs to run to setup google cloud: …
vsoch commented on issue lima-vm/lima#2021.
Also just hit a weird case where it isn’t shutting down. …
vsoch commented on issue snakemake/snakemake-executor-plugin-googlebatch#5.
That makes sense. So would an example running command look like this? …
vsoch commented on issue rse-ops/snakemake-executor-plugin-googlebatch#10.
Just catching up - we still need credentials for this to work, and the lab needs to provide them here? I’m not sure I can do that, but I can ask….
vsoch commented on issue lima-vm/lima#2012.
Thanks for the help! I now have a worker and control plane set of configs, and it’s entirely automated except for copying the join-command to the (TBA worker) directory that is mounted and available for provision, and then for actually running join I have the user do that interactively because otherwise there is an issue with containerd. But overall it’s just a few commands and very easy, and that’s great! Thanks for all the help! https://github.com/converged-computing/usernetes-lima…
vsoch commented on issue lima-vm/lima#2012.
@afbjorklund how are you handling creating the join-command here (and sharing with a worker?) I’m trying to get that working for my setup, and it seems like the mounts are not done until the provision is done (so I can’t write content there) and there doesn’t seem to be a copy directive for into the build. I found limactl copy but that assumes both are running (and I’d like the second to provision with the file) https://fig.io/manual/limactl and I found this issue https://github.com/lima-vm/lima/issues/594 but it wasn’t clear that there was a resolution. Thanks for the tips!…
vsoch commented on issue eksctl-io/eksctl#6869.
Please do not close, it’s not fixed. Thanks!…
vsoch commented on issue rootless-containers/usernetes#308.
And follow up question - how do you decide what to scope to different provision blocks (vs having one large one)?…
vsoch commented on issue flux-framework/flux-core#5564.
> What base system are you running on @vsoch? …
vsoch commented on issue flux-framework/flux-core#5564.
> It was weird that the pipe size was smaller in your environment compared to alpine docker too. Hmm. …
vsoch commented on issue spack/spack#41154.
Thank you @alecbcs !…
vsoch commented on issue rootless-containers/usernetes#308.
okay first part (root run commands) seems to be mostly working. The trick was to add a probe that watches for the kubectl binary (the last block) to exist. https://github.com/converged-computing/usernetes-lima/pull/1. Going to see if there is a way to run something in userspace next….
vsoch commented on issue lima-vm/lima#960.
Heyo! Is this something that is possible? I’m testing a VM with a few scripts and I really want to see what is going on!…
vsoch commented on issue flux-framework/flux-core#5564.
I’m going to make sure I have the same install deps in my environment, and I’ll run again….
vsoch commented on issue oras-project/oras-py#117.
okay updated - and I just removed mypy. It’s more pain than it’s worth I think….
vsoch commented on issue sylabs/singularity-userdocs#222.
No that’s great! Thanks @preminger. …
vsoch commented on issue flux-framework/flux-core#5563.
Update from me! The make -> make install works (on its own) for the first time without me needing to tweak things :partying_face: …
vsoch commented on issue sylabs/singularity#2348.
Just saw this in my Twitter DMs - sorry I missed the earlier discussion. Super cool and nice job @preminger !…
vsoch commented on issue LLNL/maestrowf#429.
I think I’m probably going to drop working on this for now, at least until someone else requests the example to work again, because it’s one workflow of (a potential) many that would be nice to have working as it did before, but shouldn’t block working on other things. Let’s leave the issue open in case anyone else has insights. Thanks for the help @jwhite242 !…
vsoch commented on issue LLNL/maestrowf#429.
okay with rebuild - things started out OK: …
vsoch commented on issue singularityhub/singularity-docker#23.
@HippocampusGirl thank you for the suggestion! Not having a latest tag was an explicit decision to make the person think about their choice of version. In that it’s a moving target, it would be dangerous for someone to pin “latest” into some pipeline and then later get a very different image. These are the current reasons we have not been providing latest….
vsoch commented on issue kubernetes-sigs/kubebuilder#959.
@Bobby-Wan that is this selector here? …
vsoch commented on issue kubernetes-sigs/kueue#1320.
Sorry - are you taking the work that I did (reference https://github.com/kubernetes-sigs/kueue/pull/977) and putting your commit on it? And asking me to help you debug your PR issues? it’s one thing to take another’s work and extend the commit, but it’s another (immoral) step to take it and tag it as your own. That is really inappropriate, imho. …
vsoch commented on issue flux-framework/flux-core#5540.
Naive question - why do we need pmi2? Couldn’t we make a simple broker config (with the two hostnames) to give to flux start and skip that extra layer?…
vsoch commented on issue rootless-containers/usernetes#301.
The hanging terminal finally timed out: …
vsoch commented on issue rootless-containers/usernetes#301.
oh neat - I am not familiar with this tool. I’ll try this out after a meeting / later this evening and give you an update!…
vsoch commented on issue flux-framework/flux-docs#255.
Are you wanting “multi-node” as in multiple containers? We have some docker-compose examples: https://github.com/rse-ops/flux-compose and also a Kubernetes operator, depending on the kind of orchestration you are looking for. https://github.com/flux-framework/flux-operator. Note that the operator (v1) is in the process of a refactor for a slightly more elegant design, but the current main should still work. …
vsoch commented on issue oras-project/oras-py#113.
That would be OK with me :) …
vsoch commented on issue kubernetes-sigs/kubebuilder#3684.
https://vsoch.github.io/2023/mutating-admission-webhook-multiple-objects/ …
vsoch commented on issue singularityhub/shpc-registry#118.
@mdehollander what about using oras? Here is the command line (in Go) tool: …
vsoch commented on issue kubernetes-sigs/kubebuilder#3684.
@camilamacedo86 I found a solution that works with Kubebuilder - I’m going to finish it up and I’ll post details here after for others!…
vsoch commented on issue kubernetes-sigs/jobset#325.
I was using 0.2.0 - so I’ll bump up to 0.2.1. Thanks @kannon92 ! Also I didn’t know you moved over to RedHat from G-Research? Congrats! …
vsoch commented on issue rse-ops/snakemake-executor-plugin-googlebatch#6.
Will try this tomorrow (going to bed now!)…
vsoch commented on issue rse-ops/snakemake-executor-plugin-googlebatch#6.
@johanneskoester my question is how do I use this (for local testing) outside of this test case?…
vsoch commented on issue snakemake/snakemake#2500.
> Ok, I will have a look as soon as possible, but I cannot guarantee that I will be able to check this tomorrow. …
vsoch commented on issue flux-framework/locator-map#1.
Closed with #2 The sheet is also updated!…
vsoch commented on issue oras-project/oras#1155.
If you find PRs are regular, you can also do a nightly or even bi-weekly build instead. That would accomplish the same I think and give you more control of the frequency….
vsoch commented on issue kubernetes-sigs/kueue#1091.
Thanks @alculquicondor ! …
vsoch commented on issue seqeralabs/wave#286.
@marcodelapierre let
vsoch commented on issue snakemake/snakemake#2500.
I can’t seem to work on the regular executor anymore either (I also pulled the updated executor base functions). Providing these via command line or environment they aren’t registered: …
vsoch commented on issue seqeralabs/wave#286.
> Evoking the one and mighty Vanessasaurus @vsoch : …
vsoch commented on issue oras-project/oras#1152.
Perhaps a develop or main (branch) tag would make sense? It should indicate
vsoch commented on issue oras-project/oras#1152.
Any possibility to get a develop or branch named container build here? I just ran into the (previous) ambiguous error and this would be very helpful! …
vsoch commented on issue eksctl-io/eksctl#6222.
Please don’t close the issue stalebot - I think a resolution would either be to fix the config here or remove the efaEnabled flag (which will not work without root)….
vsoch commented on issue pydicom/sendit#21.
Is this part of send it here? …
vsoch commented on issue eksctl-io/eksctl#6752.
Thank you! :raised_hands: I’m about to go to bed, but please let me know what is needed for review. The other show stopper for us (meaning the functionality absolutely did not work given the bug) was https://github.com/eksctl-io/eksctl/pull/6870 and the daemonSet with runAsNonRoot to true. But likely we can compromise on that one and just use the helm template. But then arguably the efaEnabled should be removed here. …
vsoch commented on issue eksctl-io/eksctl#6752.
Note that the pull request mentioned was marked stale and closed. I am going to look into using Terraform/Tofu instead….
vsoch commented on issue oras-project/oras-py#112.
That would be a good solution. Would you care to open a PR to fix this?…
vsoch commented on issue vsoch/oci-python#15.
> adding files to an image without downloading the image! I’ll take a look at getting a demo of this. …
vsoch commented on issue snakemake/snakemake-storage-plugin-gs#1.
okay found it! The mangling is happening here: https://github.com/snakemake/snakemake/blob/fc252c80227e75a4fcf869f828d3c7d5d066f794/snakemake/path_modifier.py#L111 …
vsoch commented on issue singularityhub/shpc-registry#165.
These are container modules intended to be used with singularity-hpc “shpc” https://github.com/singularityhub/singularity-hpc…
vsoch commented on issue kubernetes-sigs/node-feature-discovery#1423.
@ArangoGutierrez I think I understand the change now, and I agree this would be great for Fluence. Do you want any help?…
vsoch commented on issue singularityhub/singularity-docker#21.
ok I think they are all done: https://github.com/singularityhub/singularity-docker/actions I didn’t update the README, can do that next time….
vsoch commented on issue rse-ops/snakemake-executor-plugin-googlebatch#3.
okay I can’t figure out what commit to checkout - going to give up for now….
vsoch commented on issue snakemake/snakemake-wrappers#1875.
Update: I am currently modeling these snippets within the executor. https://github.com/rse-ops/snakemake-executor-plugin-googlebatch/pull/2. This is because the design is specific to Batch, e.g., having different sub-steps (typically for setup, barrier, then run) so it would be hard to represent as a single snippet (traditional snakemake snippet). I’ll keep this generic use case in mind in case we can still integrate here….
vsoch commented on issue kubernetes-sigs/custom-metrics-apiserver#70.
Beautiful, thank you! I will also follow….
vsoch commented on issue kubernetes/enhancements#4262.
Thank you for opening this @JorTurFer we need this functionality as well. …
vsoch commented on issue snakemake/snakemake#2486.
haha, fantastic! I should have asked before spending yesterday to implement it….
vsoch commented on issue expfactory/expfactory-experiments#626.
Gotcha. Have you played with https://expfactory.github.io/usage#participant-variables ? That would be the way to pass experiment and subject specific variables into the container, and would work in headless mode….
vsoch commented on issue eksctl-io/eksctl#6869.
This needs to be fixed, please don’t close it….
vsoch commented on issue spack/spack#40605.
:scream: …
vsoch commented on issue flux-framework/spack#118.
@trws after an epic journey through the land of caches and spack updates passed, this is ready for you to review again! …
vsoch commented on issue spack/spack#40541.
Thank you @alecbcs !…
vsoch commented on issue singularityhub/singularity-hpc#663.
This picture is epic. Wow. And I didn’t realize how long 2.7 stuck around! I always harped on the “still using python 2.x” crowd but… they weren’t being absolutely terrible? …
vsoch commented on issue kubernetes-sigs/jobset#244.
ah I think I see! So we basically can define readiness gates, and the readiness gates are references to different probes. And a probe can be https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes anything from that set. And for the kind of probe we want readiness: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes. Is that correct? …
vsoch commented on issue flux-framework/spack#118.
> Something strange seems to be going on with pkg-config and I’m curious what’s causing it. …
vsoch commented on issue converged-computing/cloud-select#28.
Sorry about that! I glanced quickly and thought you were changing the index of the example for the one off error - totally my fault / lack of reading closely enough. This is definitely a fix we need!…
vsoch commented on issue kubernetes-sigs/jobset#244.
> We can use the ReplicatedJobStatus to check that all child jobs are ready, before starting the subsequent ReplicatedJob. …
vsoch commented on issue singularityhub/singularity-hpc#663.
No, the setup.py cannot enforce that. However, it’s a standard setup classifier (e.g., the entire listing https://vsoch.github.io/pypi-classifiers/) that is rendered on pypi and wherever else the package is distributed, so it’s a convention. …
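A minimal sketch of the distinction, assuming the question was about restricting which Python versions are supported (the original question is not shown here). The package name and version numbers are hypothetical placeholders. Trove classifiers are descriptive metadata, while `python_requires` is the setuptools field that pip checks at install time.

```python
# Hypothetical setup.py metadata; the name and versions are placeholders.
SETUP_KWARGS = {
    "name": "example-package",
    "version": "0.0.1",
    # Classifiers are informational: they are rendered on PyPI but
    # are not checked when the package is installed.
    "classifiers": [
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3 :: Only",
    ],
    # This field is what pip checks against the running interpreter.
    "python_requires": ">=3.6",
}


def run_setup():
    """Call setuptools with the metadata above (what a real setup.py would do)."""
    from setuptools import setup

    setup(**SETUP_KWARGS)
```

In a real `setup.py`, `run_setup()` would be called at module level; it is left uncalled here so the snippet can be imported without triggering a build.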
vsoch commented on issue kubernetes-sigs/jobset#244.
I like that, and the one thing that isn’t clear to me is: …
vsoch commented on issue flux-framework/flux-sched#1101.
I don’t think I know enough about flux-sched to address your questions, going to ping @milroy here. I can help with any changes to the Go that are warranted when it’s decided how to handle it….
vsoch commented on issue singularityhub/singularity-hpc#663.
> Also don’t see container as an attribute of shpc.main. …
vsoch commented on issue kubeflow/pipelines#9703.
We can close - I am going to follow https://github.com/kubeflow/pipelines/issues/9768…
vsoch commented on issue converged-computing/cloud-select#29.
Please see https://github.com/converged-computing/cloud-select/pull/30….
vsoch commented on issue singularityhub/singularity-deploy#12.
That works for me, although I’m not sure if it’s worth the change (I don’t think anyone uses this action but maybe I’m wrong)….
vsoch commented on issue GoogleContainerTools/container-diff#402.
It’s very quick and ensures we control the base environment. What you pointed out was a composite action that runs directly on the host. I think my preference is to maintain what we currently have, and have another PR that follows up with a larger change if desired. Asking for more than that is a bit of scope creep imho….
vsoch commented on issue spack/spack#40455.
> You probably won’t actually see the problem IF you install gettext separately with develop updated in the last couple of days then the package you want. …
vsoch commented on issue flyteorg/flyte#3829.
I don
vsoch commented on issue GoogleContainerTools/container-diff#401.
@MPV please see: https://github.com/GoogleContainerTools/container-diff/pull/402 I’m including your commits there to give credit for your contribution, and opened a new PR mostly to skip the intermediate step of PR-ing to your PR to update here. Let me know what you think about my changes, and thanks for pinging me!…
vsoch commented on issue conda-forge/singularity-hpc-feedstock#43.
yaml again….
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#33.
Update: ran into this bug again: …
vsoch commented on issue oras-project/oras-py#111.
I’m thinking I don’t need artifactType in the layer schema as well - let me know! Reference: https://github.com/opencontainers/image-spec/blob/93f6e65855a1e046b42afbad0ad129a3d44dc971/manifest.md?plain=1#L29…
vsoch commented on issue flux-framework/spack#118.
I’ve disabled actions on this repo for now - it will fail every night….
vsoch commented on issue kellyjonbrazil/jc#441.
hey! Apologies for the wait - I just tested it out, and it looks great! …
vsoch commented on issue terraform-google-modules/terraform-google-network#482.
It’s not stale - nobody has responded to my original issue. :( …
vsoch commented on issue singularityhub/sregistry#441.
You likely want to look at your console logs to see the path that is being tried, and adjust accordingly. The roots are defined here: https://github.com/singularityhub/sregistry/blob/675fb4daaef5151132f2ef0947b85a6e255e24b7/shub/settings.py#L549-L556…
vsoch commented on issue GoogleContainerTools/container-diff#401.
@MPV I won’t have bandwidth any time soon but I can add to my TODO….
vsoch commented on issue hpc-social/hpc-social.github.io#71.
I think a heard icon would be sufficient, even just the emoji. I can’t offer to do the work, unfortunately, I’m a bit underwater….
vsoch commented on issue flux-framework/flux-sched#1094.
Oh sorry I meant to say we don
vsoch commented on issue GoogleContainerTools/container-diff#401.
@MPV I think it would be good to have (somewhere) example usage. E.g., typically actions have minimally a table with inputs and then a YAML file example, either in a README or a file that can be easily copy pasted. I’d also recommend adding a test for it….
vsoch commented on issue expfactory-experiments/angling-risk-task#1.
@NomisCiri this is for the second version of expfactory that builds into modular containers (not expfactory.org) - you are probably looking for https://github.com/expfactory/expfactory-experiments…
vsoch commented on issue snakemake/snakemake-interface-executor-plugins#31.
Bug that is breaking tests should be fixed here: https://github.com/snakemake/snakemake-interface-common/pull/25…
vsoch commented on issue flux-framework/flux-sched#1094.
fedora builders say… …
vsoch commented on issue flux-framework/flux-docs#130.
The easiest thing is, if you are comfortable building with make html, to use the site itself as your design mock area, and then we can preview it directly. I can offer to help show you how to do that if needed, probably after this week….
vsoch commented on issue kubernetes/kubernetes#117819.
Hi Google, Happy October! :fallen_leaf: …
vsoch commented on issue flux-framework/flux-sched#1092.
Testing!…
vsoch commented on issue flux-framework/flux-sched#1062.
…
vsoch commented on issue flux-framework/flux-sched#1080.
yay! I love it when the answer to the problem is… what problem? :D …
vsoch commented on issue flux-framework/flux-docs#248.
That repo is grondo, James, and Steven….
vsoch commented on issue mfem/mfem#3865.
That definitely is strange - each cleaning step should be resilient to failure so something else is going on. …
vsoch commented on issue kubernetes/client-go#89.
> For those who encounter with this issue , you can check whether your k8s version is equal to your client-go’s version . …
vsoch commented on issue kubernetes-sigs/jobset#237.
> Additionally, in the case of this project, could there be scenarios where different platforms require distinct “CRDs yaml defintions” to accommodate platform-specific requirements? …
vsoch commented on issue oras-project/oras-py#100.
Thank you @saketjajoo !…
vsoch commented on issue hpc-social/jobs#17.
> Should the %b be %m in line 4, or no? …
vsoch commented on issue snakemake/snakemake#2409.
My ETA for starting is likely early October - I got hit with two major talks and experiments, and I’m operating in full steam mode, at least until the first talk is done and the experiments are run (and I’ve started on the second one). Apologies to everyone for the delay! On my queue is refactor for the flux executor, and of course, batch….
vsoch commented on issue singularityhub/singularity-hpc#658.
Thank you @marcodelapierre ! Apologies I couldn’t jump on it - I’m in full steam work mode. I think I wrote 5k + lines of code today, I’m shoving down dinner, and am going right back to it. …
vsoch commented on issue oras-project/oras-py#100.
Yes that would be excellent - please go ahead….
vsoch commented on issue axboe/fio#1610.
Thank you - that sounds like it will be helpful! These are k8s CSI interfaces to storage so I’m not super surprised at the very slow nature of it….
vsoch commented on issue singularityhub/singularity-hpc#655.
@audreystott is this something you might be able to work on? Note that we would want to import ruamel.yaml, and there are examples here: https://yaml.readthedocs.io/en/latest/example.html…
vsoch commented on issue singularityhub/singularity-hpc#658.
Would you like me to put in a PR or do you want to take a shot?…
vsoch commented on issue mfem/mfem#3865.
Thank you @sebastiangrimberg !…
vsoch commented on issue hpc-social/jobs#12.
@alansill it’s generated with the automation action. I have it locally when I run it. …
vsoch commented on issue singularityhub/singularity-hpc#653.
Correction - I am one blind mouse, with many bad jokes….
vsoch commented on issue flux-framework/flux-sched#1062.
Going to squash commits now that I know it works……
vsoch commented on issue singularityhub/singularity-hpc#653.
Update - I think I found those bits! https://github.com/singularityhub/singularity-hpc/pull/654/commits/43c4abb6565c5b3b1ef6888e5e41255d64645155 Give it another try next time you are at keyboard, and it’s Friiidaaay so no worries if not until next weekend! …
vsoch commented on issue rootless-containers/usernetes#300.
okay tried that - no change….
vsoch commented on issue rootless-containers/usernetes#300.
Oh! I can test this too. Is it possible to change it, and if so, how?…
vsoch commented on issue hpc-social/jobs#12.
hey @hadriandjohari ! I would suggest that you put the job id alongside the title, and then the link with the URL. The reason is because this is a fairly rare format and it would be cumbersome to add an extra field just for (what comes down to) a little extra description. The job would be unique from others that are similar due to that id. If you want to try that out, I can take a look after and see how it looks. Let me know if you have any questions, and thanks for the post!…
vsoch commented on issue flux-framework/flux-sched#1062.
Two fixed (thanks @grondo !) one more to go….
vsoch commented on issue pydicom/deid#253.
yes!…
vsoch commented on issue flux-framework/flux-sched#1036.
Totally no worries! yes absolutely - please use / repurpose whatever you see fit and try it out on your cloud clusters, and let me know if you want to chat about anything….
vsoch commented on issue flux-framework/flux-sched#1062.
Golang basic testing worked for me locally. Need to figure out how to reproduce the above. …
vsoch commented on issue spack/packages.spack.io#12.
yeah! I thought it was super cool too :) Thanks @alalazo I’m off to bed!…
vsoch commented on issue snakemake/snakemake#2409.
> not sure what this means with respect to executors - is there a devel branch already? (Honestly, I have not seen one.) …
vsoch commented on issue pydicom/deid#258.
Thank you @wetzelj !!…
vsoch commented on issue kubernetes-sigs/jobset#237.
> yeah, we can support that, do you want to send a PR that follows the Kueue pattern? …
vsoch commented on issue flux-framework/flux-docs#193.
And here is the repository - https://github.com/flux-framework/cheat-sheet/ - it’s already designed so that data (from yaml) renders into the UI, automation via GitHub pages, so we could do a tweak to allow more than one cheat sheet (e.g., specific to a system) and then have links to all of them somewhere. …
vsoch commented on issue singularityhub/singularity-hpc#653.
Awesome! I say we merge this then. @marcodelapierre I’ll wait for the AOK (or request for more tests, changes, etc.) from your review….
vsoch commented on issue kubernetes-sigs/jobset#237.
Yes! I’ve done this for two operators now, if you want another pattern to look at: …
vsoch commented on issue It4innovations/hyperqueue#609.
oh! I just remembered (right after I posted of course, lol). So we also needed JobSet for arm, but I haven’t heard anything on that issue. https://github.com/kubernetes-sigs/jobset/issues/237. I think I was able to bring up the arm images but not actually get everything running. I’ll ping them again now….
vsoch commented on issue singularityhub/singularity-hpc#653.
> The wrapper scripts are looking for the file in the wrappers directory but it is actually located in the modules directory. …
vsoch commented on issue rootless-containers/usernetes#287.
And @AkihiroSuda we will make sure to plan another one that is on our Thursday which I am realizing is your Friday morning next time. Apologies for the oversight!…
vsoch commented on issue kubernetes-sigs/jobset#291.
:point_right: https://github.com/converged-computing/metrics-operator/actions/runs/6124911952…
vsoch commented on issue flux-framework/flux-sched#1067.
@trws all set! I kept the direct install of cmake because we use that exact version in other places, and it might be safer to ensure they are the same….
vsoch commented on issue aws/aws-parallelcluster-cookbook#2429.
Woohoo - thank you @enrico-usai ! How would you suggest someone new to this setup start to work on adding to it? Any tips for a new cookbook developer?…
vsoch commented on issue spack/spack#39855.
Thanks @tldahlgren !…
vsoch commented on issue rootless-containers/usernetes#300.
yeah, not getting through either of these steps now with this change. :/ …
vsoch commented on issue rootless-containers/usernetes#287.
I don’t think so - this is from the same host: …
vsoch commented on issue apptainer/apptainer#1642.
@bdklahn that’s awesome! It was a fun project, definitely, here is the “official” paper if you are interested https://academic.oup.com/gigascience/article/7/5/giy023/4931737. …
vsoch commented on issue rootless-containers/usernetes#287.
Possibly coming from here? https://github.com/rootless-containers/usernetes/blob/ee1f4ea766e8a3ed35161d5f80e01394f37efdb3/Makefile#L8 Although I’m not sure if this is just the hostname for the container. But if there is some interface with the host (and we need the actual host) that could be a bug….
vsoch commented on issue aws/aws-parallelcluster-cookbook#2429.
Fixed, rebased, and tested locally. Thank you for the review @enrico-usai!…
vsoch commented on issue GoogleCloudPlatform/hpc-tools#3.
And a few more possible runs - the help says the default install dir is /opt/Libfabric (capital L) but it turns to lowercase: …
vsoch commented on issue snakemake/snakemake#2360.
@jjjermiah I’m hoping it doesn’t take me too long - @johanneskoester and I designed the executor plugin setup, and I’ve written already many Python SDK examples with batch, so I suppose I just need to pancake those two things together :laughing: :pancakes: …
vsoch commented on issue singularityhub/singularity-hpc#654.
@marcodelapierre I won’t have time this weekend (or maybe next) - I have to make and record a full talk, plus write a full paper (that I haven’t started yet) in the next two weeks, plus a second talk and entire set of experiments (for late October) that also aren’t started yet, on top of regular work, so I don’t think I’ll have time for my fun extra projects! But I’m adding to my calendar for after those are done….
vsoch commented on issue rootless-containers/usernetes#287.
okay tested! Complete output is in details sections here: https://github.com/converged-computing/usernetes-terraform-gcp/tree/main/examples/basic#test-application it looks like the main issue for creating the pods for flannel is that some subnet.env file is missing? …
vsoch commented on issue flux-framework/flux-operator#205.
> sure, my bad, sorry, I didn’t expect we have to set weight
vsoch commented on issue flux-framework/flux-operator#204.
@aojea if you give me permission I can make these changes to the PR. …
vsoch commented on issue UIUC-PPL/charm#3761.
I wanted to explore what was there - I can’t say I knew in advance! I was going off of the clone here https://charm.readthedocs.io/en/latest/ampi/05-examples.html so I was generally interested in all the AMPI apps….
vsoch commented on issue singularityhub/shpc-registry#149.
@marcodelapierre good for me when you give the final :+1: !…
vsoch commented on issue singularityhub/singularity-hpc#654.
@marcodelapierre okay let’s debug! The curly bracket is first priority. …
vsoch commented on issue rootless-containers/usernetes#287.
okay first test - the first make up works! I’m running the second command, and it cannot find socat …
vsoch commented on issue rootless-containers/usernetes#287.
@AkihiroSuda I was working on a talk and doing other terrible things today, but I’ll start to set up an equivalent build for this on our terraform + Google Cloud VMs so minimally I can test, understand what is going on, and ask questions. If not tomorrow, this week! Excited for these changes - thank you so much <3 …
vsoch commented on issue vsoch/pytest-github-report#3.
I can’t reproduce with the local tests/. Can you send me something to reproduce your error?…
vsoch commented on issue flux-framework/flux-sched#1062.
! :green_apple: :green_book: :green_circle: :green_heart: :green_salad: :tea: :green_square: …
vsoch commented on issue snakemake/snakemake#2412.
@jakevc I can’t answer for Johannes, but based on what I see in the current code, it looks like the ClusterExecutor is a kind of RemoteExecutor. So (regardless of where it is located, and indeed it could be an executor plugin itself) for the time being it’s still in snakemake, and you will want to try using it! …
vsoch commented on issue singularityhub/sregistry#439.
I
vsoch commented on issue GoogleCloudPlatform/PerfKitBenchmarker#4437.
And when I try that, I had to add a role and role binding (not ideal) along with installing kubectl, but I still get: …
vsoch commented on issue singularityhub/sregistry#439.
The sregistry command is no longer the primary means to interact with singularity registry - you typically use the library:// endpoint….
vsoch commented on issue rootless-containers/usernetes#281.
Ah interesting, so that file exists - and the xt_MASQUERADE was missing. I added it, reloaded, and I see: …
vsoch commented on issue kubernetes-sigs/jobset#261.
@danielvegamyhre perhaps there should be a set of two annotations with lower granularity that say to add only the affinity or anti affinity sections?…
vsoch commented on issue tiangolo/fastapi#10007.
Yeah this is strange - when I go to previously working versions I still get: …
vsoch commented on issue flux-framework/flux-restful-api#62.
See https://github.com/tiangolo/fastapi/issues/10007 I can’t find a way around this / getting it to work, even with previous versions….
vsoch commented on issue flux-framework/flux-restful-api#60.
hey @khoing0810 ! It looks like you are bringing in commits that were from previous work (and squashed). The way to fix this is to keep your fork main branch absolutely in sync with the one here, and always checkout fresh. Let me know if you want some help to look at this / walk through this process - most projects won’t be happy to just ignore (e.g., squash and commit) and will want to see a logical git history, so we should make sure you know how to do that. Happy weekend!…
vsoch commented on issue converged-computing/kubescaler#11.
@rajibhossen are we close to merging? It doesn’t need to be perfect for sure - you can always open another PR to follow up with tweaks. I do think you’ve made a lot of important changes we should get in soon….
vsoch commented on issue snakemake/snakemake#2412.
@jakevc ok let’s see - I’m not familiar with azure batch and I didn’t work on that particular piece, but it looks like the changes are here: https://github.com/snakemake/snakemake/commit/c9eaa4e12e4a348f93e5ea5793faaec1fd547fac#diff-3a7c4d992c26b6767b554e11c830767fec6f1d798b3f3ddfa156a02839018f34 primarily using the classes provided here: https://github.com/snakemake/snakemake-interface-executor-plugins/blob/main/snakemake_interface_executor_plugins/executors/remote.py. The check seems to be happening for this other new class, the RealExecutor https://github.com/snakemake/snakemake-interface-executor-plugins/blob/main/snakemake_interface_executor_plugins/executors/real.py. …
vsoch commented on issue singularityhub/singularity.lang#11.
Let me check in with this developer to see if he wants to contribute, even just a reference….
vsoch commented on issue singularityhub/singularity-hpc#649.
Can you test without spack?…
vsoch commented on issue rse-ops/docker-images#111.
To step back - if you are just creating a view and nuking spack, my suggestion would be to add a set of images that are “minimal” on top of the existing ones to do that. We should probably not provide one and not the other….
vsoch commented on issue rootless-containers/usernetes#281.
Yes, but do you have another suggestion? I want to get this setup working and I sense there isn
vsoch commented on issue rootless-containers/usernetes#281.
Ah! Take a look at my comment above about the control plane node (001) - I see the parent_id file that suggests it is hitting the second block. …
vsoch commented on issue opencontainers/specs.opencontainers.org#4.
thanks @sudo-bmitch !…
vsoch commented on issue jbms/sphinx-immaterial#277.
You and @jbms are really top notch maintainers. Every time I
vsoch commented on issue jbms/sphinx-immaterial#277.
The first error went away! But I think I might have hit the third error there? …
vsoch commented on issue flux-framework/flux-core#5393.
> Now, to your question about whether we could use the current logic… yes, we could find a way to do that. We’d need to, when the class is created or on first call, have it generate a FunctionWrapper instance for each of the functions. They could be simplified slightly since they’re being called in a known way, but then each call would basically be self._
vsoch commented on issue dyninst/testsuite#231.
oh nooooo! Long live the dashboard! :sob: …
vsoch commented on issue converged-computing/metrics-operator#49.
Thank you! It was a collaboration between an AI image generator and me - I have a small GPU I run in my apartment for these purposes, mostly to make hilarious pictures of Jeff Bezos riding dinosaurs (but sometimes something useful)? This guy was a few variations, and then I brought him into Gimp for a lot of manual editing - the trick to making them look good is cleaning up all the edges, fixing some of the weird parts with custom painting, and basically doing that until you are satisfied. I use other programs / online tools to create the text. I think he came out adorable! …
vsoch commented on issue rse-ops/docker-images#111.
You
vsoch commented on issue kubernetes-sigs/jobset#261.
Here is where I’m adding it (for context) if needed! https://github.com/converged-computing/metrics-operator/pull/48/files#diff-35308b81b4f76090271b240119636f94f69c25ab2ab502836d4b35dc06267c23R113…
vsoch commented on issue singularityhub/singularity-hpc#649.
This has been my experience generally with cvms- it
vsoch commented on issue ovis-hpc/ovis#1261.
Gotcha - thanks @tom95858! One more question - do you have an example config file that will output a json structure (as opposed to the tabular / terminal output I see by default)?…
vsoch commented on issue kubernetes-sigs/jobset#239.
I’d be in support! I’m definitely enjoying using Jobset - it’s the basis of my new operator: https://converged-computing.github.io/metrics-operator/getting_started/metrics.html :partying_face: …
vsoch commented on issue flux-framework/flux-core#5393.
Here are a few examples of generated wrapper script classes: …
vsoch commented on issue flux-framework/flux-core#5393.
Oh that
vsoch commented on issue ACCESS-NRI/ACCESS-OM#13.
lol No I’ve never heard of spack, what’s that? Just kidding :) Yes and yes, and I’m not interested, thanks!…
vsoch commented on issue ovis-hpc/ovis#1261.
Is there documentation somewhere that describes the interfaces used by each sampler? …
vsoch commented on issue GoogleCloudPlatform/scientific-computing-examples#57.
To add additional comment, when I made my own builds this issue seemed to go away (when I logged in I saw both my user name and the Google cloud name) so there is some issue in the setup here. But it’s not an explicit issue anymore because I’m not using these configs here. …
vsoch commented on issue rootless-containers/usernetes#281.
Possibly related to this (albeit a bit old) but maybe still relevant? https://github.com/containerd/containerd/issues/2246#issuecomment-377414459…
vsoch commented on issue rootless-containers/usernetes#281.
Heyo! Wanted to ping for next week and see if anyone @aojea or @AkihiroSuda had thoughts about the above? What should we try next? If you don’t have ideas I could take a shot at a PR to try and generalize the scripts to not be hard coded for docker-compose (in case there is some tiny error in there leading to the behavior here). Happy Sunday!…
vsoch commented on issue hpc/pavilion2#545.
Nice work @Paul-Ferrell so happy to see that merged! :partying_face: …
vsoch commented on issue pydicom/sendit#20.
hey @samar-syd7 ! I don’t actively maintain this so I can’t even guarantee that all the pieces are working. My suggestion would be to use something else, but if you have a specific error to look at I could advise. I would guess the version of Django is not working with some unpinned dependency….
vsoch commented on issue mfem/mfem#3743.
I think this will be closed with https://github.com/mfem/mfem/pull/3769 stalebot - anything else I need to do?…
vsoch commented on issue kellyjonbrazil/jc#441.
Thanks @kellyjonbrazil ! For a workaround, I’m doing two separate calls, one for CHILD and one for TASK, and then parsing them separately, but it would be good to get them both at the same time in the list!…
vsoch commented on issue ACCESS-NRI/ACCESS-OM#13.
I finally got it working after much pain - I’ve pinned all versions except for prrte so I should do that. It’s very janky but at least it seems to work? https://github.com/researchapps/pmix-compose/blob/611b0e13e381bba1e61f4d2c73ea67d2f9ba5046/Dockerfile…
vsoch commented on issue ACCESS-NRI/ACCESS-OM#13.
@harshula I ran into this error too for a container build (outside of spack) https://github.com/spack/spack/issues/30906 …
vsoch commented on issue singularityhub/singularity-hpc#656.
So I’m tempted to say this is not a bug - a container add should not be installing to the containers base, because it is simply just creating a container.yaml that then should be installed….
vsoch commented on issue kubernetes-sigs/jobset#244.
> IMO, I prefer defining startUpPolicy under each replicatedJob field. I think this implementation is easier to read. …
vsoch commented on issue snakemake/snakemake#2405.
Hey @jhiemstrawisc that’s great! For now there are a few notes here https://github.com/snakemake/snakemake-interface-executor-plugins and you can use the flux executor as an example (which now exists in both places): https://github.com/snakemake/snakemake-executor-plugin-flux…
vsoch commented on issue kubernetes-sigs/jobset#244.
This would definitely be useful for us - what I’ve had to do is typically start the workers and handle race conditions with sleeps / waiting, that sort of thing….
vsoch commented on issue expfactory/expfactory#176.
Hi @emmjayya - expfactory is based on Docker containers, so if you are able to build somewhere you have internet and then move the container, that would probably work best. Try looking at docker save and then moving that archive to the offline host in whatever means is possible….
vsoch commented on issue snakemake/snakemake#2305.
Woohoo!! What would you like to do next @johanneskoester ? I can start refactoring some of the current executors into plugins, or since we already have one for flux, remove it here. Let me know your preferences!…
vsoch commented on issue kubernetes-sigs/jobset#146.
And here is a complete updated config with the logging updates I made to the metrics operator: …
vsoch commented on issue hpc-social/jobs#10.
Hi @civodul - thank you for this contribution! We’ve had some bugs in the past adding jobs via pull request, so I’d like to ask you to please use the form here. …
vsoch commented on issue rootless-containers/usernetes#281.
okay going for bike ride / run, shutting this down for now! Thanks for helping on a Saturday!…
vsoch commented on issue terraform-google-modules/terraform-google-network#482.
okay - I’ve tried now bullseye (debian-11) and that fixed the DNS names looking weird and the /etc/hosts, and the same is true on ubuntu, but there is absolutely no eth0 device. I don’t even know how to debug this :( …
vsoch commented on issue rootless-containers/usernetes#281.
Thank you again to you both! I almost have the debian setup done, although I’m not exactly a morning person (got up to chat with you!) so I’ll probably go back to sleep for a bit and be in touch later today / this weekend, and of course this means we can chat more next week. Happy Friday and have an amazing weekend!…
vsoch commented on issue rootless-containers/usernetes#281.
The way I was debugging this before is with systemctl status, but I’m not sure what I’m looking for so it was hard to know where to look….
vsoch commented on issue kubernetes/kubernetes#117819.
Hi folks! I ran the same experiment as before with c2d-standard-112, but added the flag for the TIER-1 network. The full command: …
vsoch commented on issue flux-framework/spack#110.
Will be closed by https://github.com/spack/spack/pull/39237…
vsoch commented on issue flux-framework/flux-docs#252.
Oh that
vsoch commented on issue snakemake/snakemake#2044.
Yep that sounds good, and if we are moving this to a separate executor plugin it shouldn’t be an issue to fix here! I’ll test this again when we do that….
vsoch commented on issue kubernetes-sigs/jobset#241.
Thank you!…
vsoch commented on issue kubernetes/kubernetes#117758.
Not to open the can of worms more, but this reflects on a larger issue that Kubernetes doesn’t have the same level of granularity, both in things like timestamps or states (which are typically managed by a job manager). E.g., in Flux rfc we have an entire state diagram …