vsoch commented on issue google/dranet#151.
Some quick updates. Azure doesn’t seem to expose any ability to customize features - their default version is 1.32.5 and their newest is 1.33.1, so I don’t see a way to enable DynamicResourceAllocation. We can only get to beta (disabled still) in a 2 node cluster HBv3 120rs. …
vsoch commented on issue flux-framework/Tutorials#55.
@wihobbs the reason I’m doing it now is because I consider that a sunk cost. We can’t ensure those links (somewhere on the internet) will be persistent, but I think a user would navigate to the root of the repository and realize the structure is changed. I want to change it sooner than later so moving forward into the future (when we have more links, more people that remember them) they conform to a better structure. And in some cases, if a person tells us about an old link, we can post an update in the place they informed us had the incorrect one (generally speaking, this would be other README.md or links in YouTube videos)….
vsoch pushed to conda-forge/deid-feedstock
Merge pull request #51 from regro-cf-autotick-bot/0.4.4_hf4c4d6
deid v0.4.4
vsoch commented on issue pydicom/deid#281.
Thank you to you both! …
vsoch commented on issue google/dranet#151.
@aojea that is good thinking. I am not sure I can give direct access to the cluster, but if we did a hackathon I could drive whatever orchestration that @gauravkghildiyal needs. I am in Pacific time now - could you give me the window of hours / days that would work for the meeting (August 6th or after)? I will want to test the AKS setup first and spec out the costs….
vsoch pushed to singularityhub/shpc-registry
Merge pull request #360 from singularityhub/update/containers-2025-07-28
[bot] update/containers-2025-07-28
vsoch pushed to converged-computing/sc25-flux-eks
module3: add createsims/cganalysis crd
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to converged-computing/sc25-flux-eks
docker: createsims
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to compspec/fractale
Merge pull request #8 from compspec/add-moab
feat: add moab
vsoch pushed to rseng/software
Merge pull request #433 from rseng/update/software-2025-07-27
Update from update/software-2025-07-27
vsoch pushed to compspec/fractale
updates to moab to handle missed directives
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to converged-computing/sc25-flux-eks
Merge pull request #1 from converged-computing/finish-module2
feat: finishing tested module2
vsoch pushed to converged-computing/flux-apps-helm
Merge pull request #37 from converged-computing/gpu-variant-hpcg
Gpu variant hpcg
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export
For the HPC use case, we want to be able to export static features, raw or as labels, to the terminal or a JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to converged-computing/google-performance-study
add new overhead files!
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue kubernetes-sigs/node-feature-discovery#2183.
Ping @adrianchiris @zvonkok @ArangoGutierrez @marquiz we’ve gone through several rounds of review and have LGTM from the compatibility working group. Can any of you review this?…
vsoch pushed to converged-computing/aws-performance-study
Compare to main (#8)
- add LASSO models and shap features
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to compspec/ocifit-k8s
Merge pull request #6 from compspec/add-ml-sidecar
updates to model for final experiments
vsoch pushed to singularityhub/shpc-registry
Merge pull request #357 from singularityhub/update/containers-2025-07-21
[bot] update/containers-2025-07-21
vsoch closed issue mfem/mfem#4848.
Good example for scaling study?
Hi mfem! We are doing a scaling study (4 to 64 nodes) and looking for contender apps/ proxy apps/ benchmarks/ synthetic benchmarks. Is there one that mfem has, with some Figure of Merit, that you would recommend? Strong or weak scaling would work, and the only requirement is that I can build it into a container and run it across nodes. Thanks!…
vsoch commented on issue hariharan-devarajan/datacrumbs#30.
You might want to add a .gitignore to not add DS_Store files….
vsoch pushed to converged-computing/flux-apps-helm
update: tweaks to flux minicluster for hpcg
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch created a new branch, compare-to-main at converged-computing/aws-performance-study
vsoch pushed to compspec/ocifit-k8s
Merge pull request #5 from compspec/add-ml-sidecar
feat: adding mlserver for selection
vsoch pushed to compspec/ocifit-k8s
feat: support for customizing minicluster
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to rseng/software
Merge pull request #432 from rseng/update/software-2025-07-20
Update from update/software-2025-07-20
vsoch commented on issue pydicom/deid#280.
> Do you have any guess on what could be causing this or how to fix it?…
vsoch pushed to flux-framework/flux-operator
feat: add custom environment for view (#244)
- adding custom environment support
- bump: go version to 1.23
Also add an identifying app label for the flux operator
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to singularityhub/shpc-registry
Update container.yaml
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com>
vsoch pushed to compspec/ocifit-k8s
feat: adding mlserver for selection
This is testing an idea that a compatibility spec can direct to use (ask) an ml model. This is fully working but needs testing in a cloud now since kind (locally) does not have the instance type label.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #356 from craigmeyer/main
Update Pawsey’s openfoam and tensorflow, add Pawsey cp2k and namd
vsoch commented on issue oras-project/oras-py#216.
Is this something you’ve discussed with the primary oras maintainers in slack, or is this feature already implemented in oras go?…
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export
For the HPC use case, we want to be able to export static features, raw or as labels, to the terminal or a JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue pydicom/deid#280.
I would interact through python and put an IPython.embed() here and walk through the logic to understand what is going on (that is what I would try)….
vsoch pushed to converged-computing/jobspec-conversion
bug: ensure we do not consider empty a change
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue pydicom/deid#280.
You’d want to put logic in the custom function, which will receive the item being parsed, the value, field and the entire dicom. A development tip is to write it first like this so you get an interactive console:…
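The snippet from the original comment is elided in this feed. As a rough illustration of the tip only, a custom function with the four arguments named above could pause for inspection like this. The function name `inspect_value` and the `DEID_DEBUG` environment variable are hypothetical, invented for this sketch, and are not part of deid; the only assumption about deid is the argument list described in the comment.

```python
import code
import os


def inspect_value(item, value, field, dicom):
    """Hypothetical custom function: return the value unchanged, optionally pausing."""
    # DEID_DEBUG is a made-up switch for this sketch, not a deid setting.
    if os.environ.get("DEID_DEBUG") == "1":
        try:
            # IPython is optional; embed() opens a console in this frame.
            from IPython import embed

            embed()
        except ImportError:
            # Fall back to the standard library console with the same local names.
            code.interact(local=dict(locals()))
    # Once the logic is understood, replace this with the real transformation.
    return value
```

With the switch unset the function is a no-op pass-through, so it can stay in place while the real logic is being written.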
vsoch commented on issue google/dranet#151.
@michaelasp I have an equivalent example started - but I have a catch-22. The environment I’m working in is user space kubernetes. We can’t see any devices without a privileged pod. When we add that, we see everything. This means the selector is irrelevant. I need an environment with production Kubernetes that I can run without rootless. I had Google Cloud credits but they expired, and I’d need to pay out of pocket. @aojea can you help? It shouldn’t be too many to test this out. Once I have that, I can add the full analogous example to the docs….
vsoch pushed to converged-computing/usernetes
feat: add script to start instance
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch created a new branch, main at converged-computing/sc25-flux-eks
vsoch commented on issue pydicom/deid#280.
I’ve never used deid for private tags like that - going to ask @wetzelj for help on that one….
vsoch pushed to converged-computing/jobspec-conversion
feat: manual validation
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to compspec/fractale
feat: add additional transformers
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export
For the HPC use case, we want to be able to export static features, raw or as labels, to the terminal or a JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to vsoch/dranet
Merge branch ‘main’ into add/rdma-raw-devices
vsoch pushed to singularityhub/shpc-registry
Merge pull request #353 from singularityhub/update/containers-2025-07-14
[bot] update/containers-2025-07-14
vsoch commented on issue google/dranet#151.
Ah gotcha - understood
vsoch pushed to rseng/software
Merge pull request #431 from rseng/update/software-2025-07-13
Update from update/software-2025-07-13
vsoch commented on issue pydicom/deid#280.
My suggestion would be to turn the tags.py module into a directory proper, e.g., tags.py to tags/__init__.py, and then have a private.py with that explicit listing and function….
vsoch pushed to hpc-social/good-first-issues
Fix dropdown visibility issue by updating select background and text color (#16)
vsoch commented on issue pydicom/deid#280.
I can give a suggestion for a strategy to take - if you have very specific, scoped fields you want to keep, you can use the KEEP directive. If it’s more a global thing a strategy akin to the reverse remove_private_tags might make sense….
vsoch commented on issue google/dranet#151.
> I’d like to see an example of this at work, i.e spin up a pod with a resource request and see that the RDMA device gets properly mounted to the pod. Since a lot of that logic depends on NIC devices there may need to be some work done there for proper allocation/release of the device from the host to the pod. …
vsoch pushed to singularityhub/shpc-registry
Merge pull request #352 from singularityhub/update/containers-2025-07-11
[bot] update/containers-2025-07-11
vsoch pushed to vsoch/dranet
feat: discovery of rdma raw devices
Problem: we only see rdma devices associated with a netlink. Solution: do a discovery loop that also lists all rdma devices.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
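The idea in this commit message can be sketched outside of dranet. The example below is not the dranet implementation (that project is written in Go); it is a minimal illustration that assumes the standard Linux sysfs layout, where every RDMA device appears under `/sys/class/infiniband` and any associated network interfaces appear under that device's `device/net` directory. The function name `list_rdma_devices` and its `sysfs_root` parameter are hypothetical, chosen here so the walk can be pointed at a test directory.

```python
from pathlib import Path


def list_rdma_devices(sysfs_root="/sys"):
    """Map every RDMA device name to its network interfaces (possibly none)."""
    ib_dir = Path(sysfs_root) / "class" / "infiniband"
    if not ib_dir.is_dir():
        # No RDMA stack loaded, or not visible from this environment.
        return {}

    devices = {}
    for dev in sorted(ib_dir.iterdir()):
        net_dir = dev / "device" / "net"
        # A device with no net directory is a "raw" RDMA device: it has no
        # network interface, so a netlink-only discovery would miss it.
        if net_dir.is_dir():
            devices[dev.name] = sorted(p.name for p in net_dir.iterdir())
        else:
            devices[dev.name] = []
    return devices


if __name__ == "__main__":
    for name, netdevs in list_rdma_devices().items():
        print(name, netdevs if netdevs else "(no network interface)")
```

Listing the device directory first and attaching interfaces second means devices without an interface still show up in the result, which is the gap the commit describes.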
vsoch commented on issue google/dranet#151.
Thank you for the speedy review! I’ve fixed the small issues above, and we can discuss some of the design details before the next round. …
vsoch pushed to singularityhub/shpc-registry
Merge pull request #350 from singularityhub/update/containers-2025-07-07
[bot] update/containers-2025-07-07
vsoch pushed to compspec/fractale
Merge pull request #4 from compspec/add-k8s-transformer
feat: simple transform utility
vsoch pushed to flux-framework/flux-operator
feat: hostIPC and hostPID (#243)
I am primarily testing what these features add (or do not add) to a user space kubernetes environment. So far, I do not see any changes.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Co-authored-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to vsoch/dranet
feat: discovery of rdma raw devices
Problem: we only see rdma devices associated with a netlink. Solution: do a discovery loop that also lists all rdma devices.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch opened a pull request to google/dranet
vsoch pushed to converged-computing/usernetes
feat: start of work to support dra
We want to test if DRA (and dranet) can better utilize the infiniband devices. We have them working with UCX, and actually I am not sure if this will add anything, but it was worth trying, at minimum for the learning.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue rootless-containers/usernetes#373.
We got this working. I wish I could report why - it started working on a restart to the cluster. My suspicion is that the admin didn’t have all the kernel modules loaded that we need, but I don’t know….
vsoch commented on issue google/dranet#150.
I don’t see that it discovers the Infiniband drivers:…
vsoch pushed to singularityhub/shpc-registry
Merge pull request #348 from singularityhub/update/containers-2025-07-03
[bot] update/containers-2025-07-03
vsoch opened issue kubernetes-sigs/node-feature-discovery#2190.
Issue building ARM Graviton on AWS
I downloaded go 1.24.x linux/arm64 and am having issues building. Note that I have the …
vsoch pushed to converged-computing/aws-performance-study
compat: add early derivation of node metadata
I do not have all the nodes yet - need to go through and redo arm, etc.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export
For the HPC use case, we want to be able to export static features, raw or as labels, to the terminal or a JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue kubernetes-sigs/node-feature-discovery#2185.
It appears you can’t - if you set the white list to match all, you only get a little over 100. For the export labels/feature PR I am working on, exporting raw features != exporting labels….
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #160 from flux-framework/release-docs-2025-07-02
Update from release-docs-2025-07-02
vsoch pushed to vsoch/vsoch.github.io
nit: typo in work
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #159 from flux-framework/release-docs-2025-07-01
Update from release-docs-2025-07-01
vsoch pushed to converged-computing/performance-study
add missing size selector to ui
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export features for static node
For the HPC use case, we want to be able to export static features to the terminal or JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #347 from singularityhub/update/containers-2025-06-30
[bot] update/containers-2025-06-30
vsoch commented on issue rootless-containers/usernetes#376.
Did you see my post here?…
vsoch commented on issue flannel-io/flannel#2254.
More investigation - this is only for older kernels, before 5.3, which is probably something we only have to deal with in HPC:…
vsoch pushed to converged-computing/performance-study
add web interface
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue DLR-AMR/t8code#1615.
We can close - it looks like I got it running:…
vsoch pushed to rseng/software
Merge pull request #429 from rseng/update/software-2025-06-29
Update from update/software-2025-06-29
vsoch opened a pull request to converged-computing/flux-usernetes
vsoch pushed to vsoch/node-feature-discovery
feat: add ability to export features for static node
For the HPC use case, we want to be able to export static features to the terminal or JSON path.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch commented on issue singularityhub/singularity-hpc#670.
If you have interest! I didn’t comment because it’s marked as a Draft. Let me know when it’s ready for review….
vsoch opened issue flannel-io/flannel#2254.
bug: br_netfilter requirement prevents startup - not required in user space container
This is similar to https://github.com/flannel-io/flannel/issues/2068, but a different environment….
vsoch pushed to converged-computing/usernetes
on-premises: infiniband fully working with this setup
Infiniband is working on TOSS 4.18.0-553.56.1.1toss.t4 based on RHEL 8.10. Most of the issues were with network firewalls, kernel modules, and system security. Fixes here include creating unique CNI names for podman, adding a flag to ignore preflight errors (for the old kernel), and pinning the flannel install to a version before 0.25.x, when a check for br_netfilter was added. This check used to be part of kubeadm, and it was removed with K8s 1.30. The module is not technically needed in the podman container (it is needed on the physical host), but since the check is done in the container, it prevents flannel from starting up. For the time being, we will use an older flannel, and I will open an issue on the repository to ask for the ability to disable the check.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
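As a small aid for reasoning about setups like this one, the sketch below parses a kernel release string and reports whether it predates a given version, and separately checks whether the bridge netfilter sysctl is visible. It is not flannel's or Usernetes' code. The helper names are hypothetical, and it assumes that the file `/proc/sys/net/bridge/bridge-nf-call-iptables` is a reasonable signal that br_netfilter is loaded and visible; whether flannel's own check uses exactly this signal is not stated in the feed.

```python
import re
from pathlib import Path


def kernel_version(release):
    """Return (major, minor) parsed from a release such as '4.18.0-553.56.1.1toss.t4'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_older_than(release, major, minor):
    """True when the release is strictly older than major.minor."""
    return kernel_version(release) < (major, minor)


def br_netfilter_visible(proc_root="/proc"):
    """True when the bridge netfilter sysctl is visible from this environment."""
    # Assumption: this sysctl only appears once the br_netfilter module is loaded.
    sysctl = Path(proc_root) / "sys" / "net" / "bridge" / "bridge-nf-call-iptables"
    return sysctl.exists()


if __name__ == "__main__":
    release = "4.18.0-553.56.1.1toss.t4"  # the TOSS kernel named in the commit
    print(release, "is older than 5.3:", is_older_than(release, 5, 3))
    print("br_netfilter visible here:", br_netfilter_visible())
```

Running the visibility check both on the physical host and inside the container makes the mismatch described in the commit easy to confirm.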
vsoch commented on issue zenodo/zenodo#1606.
You know, enough time has passed that this would be a fairly straightforward task for an LLM, as long as you provide it the structure and data you need to populate it….
vsoch pushed to singularityhub/shpc-registry
Merge pull request #346 from singularityhub/update/containers-2025-06-26
[bot] update/containers-2025-06-26
vsoch opened a pull request to rootless-containers/usernetes
vsoch opened issue rootless-containers/usernetes#376.
Flannel doesn't see br_netfilter - expected?
Hi @AkihiroSuda. We are testing the latest (current master) of Usernetes, and flannel fails on deployment not seeing br_netfilter….
vsoch commented on issue rootless-containers/usernetes#374.
And I think it would be unlikely for multiple users to be using the same physical node with Usernetes. …
vsoch pushed to researchapps/usernetes
docs: note on order of starting components
flannel requires an annotation to use a host external ip for a multi-node setup. If the ip addresses that are in the private space can be routed between nodes (possible in some clouds) this is not an issue. It is only an issue in an HPC or similar environment where the private 10.x address might go to a router and not be understood (and dropped). We ran into this issue on our HPC system, and I realized it was because of the order of operations - we should make sync-external-ip first (adding the annotation) and then make install-flannel to use it. This would only be a bug for specific, multi-node environments.
Signed-off-by: vsoch <vsoch@users.noreply.github.com>