vsoch pushed to vsoch/citelang
Automated deployment to update contributors 2025-01-20 (#56)
Co-authored-by: github-actions github-actions@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #294 from singularityhub/update/containers-2025-01-20
[bot] update/containers-2025-01-20</small>
vsoch pushed to rseng/software
Merge pull request #406 from rseng/update/software-2025-01-19
Update from update/software-2025-01-19</small>
vsoch pushed to hpc-social/good-first-issues
unpin ruby/setup-ruby action version
vsoch pushed to converged-computing/fluxqueue
Merge pull request #8 from converged-computing/add-other-types
feat: support for other types</small>
vsoch pushed to converged-computing/fluxqueue
ci: add tests for fluxqueue (#6)
- ci: add tests for fluxqueue
- refactor build to be in parallel
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/fluxion-go
feat and ci: grow support with updated branch
This currently uses a custom branch, and will need to be updated when merged into flux-sched. This last change updates the CI to use the latest noble image, and adds back the shrink support and test.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/singularity-hpc
Revert “Introducing version_naming property that makes modulefile as version
vsoch pushed to converged-computing/flux-service
eks: amazonlinux2023 testing done
The EFA works on the node and in the container, at the same time! The performance seems to be trivially impacted.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to conda-forge/urlchecker-feedstock
Rebuild for CFEP-25 noarch: python syntax (#20)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.2, and conda-forge-pinning 2025.01.11.16.15.44</small>
vsoch pushed to conda-forge/gridtest-feedstock
Merge pull request #5 from regro-cf-autotick-bot/noarch_python_min-migration-1_h0ebd88
Rebuild for CFEP-25 noarch: python syntax</small>
vsoch pushed to converged-computing/flux-tutorials
Merge pull request #1 from converged-computing/test-google-builds
wip: testing packer for google cloud</small>
vsoch pushed to rseng/software
Merge pull request #405 from rseng/update/software-2025-01-12
Update from update/software-2025-01-12</small>
vsoch pushed to converged-computing/fluxqueue
docs: fix typos and function headers
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to conda-forge/qme-feedstock
Merge pull request #3 from regro-cf-autotick-bot/noarch_python_min-migration-1_h654fa6
Rebuild for CFEP-25 noarch: python syntax</small>
vsoch pushed to researchapps/nextflow
flux-executor: add simple test case for jobid with f
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
toybox: depend on virtual zlib (#48486)
vsoch pushed to converged-computing/fluxqueue
scheduler and queue: start of work (#3)
- scheduler and queue: start of work
We need a very simple custom scheduler plugin to receive and deploy (bind) node assignments, and we will do that with the fluxqueue-scheduler I am adding here. The queue will live alongside the controller and interact with the fluxion service, and I have added the skeleton for that (which needs a lot of work, but it is building, so this is a great start!).
- wip: addition of queue logic
This is a WIP to save the state of primarily the main.go/sum because it is currently building; of course, the code still needs a lot of work!
- queue: job pod is added to queue
Fluxion is also added and serving the cluster.
- fluxion submit is working!
We next need to unsuspend/ungate the pod to send to the fluxion custom scheduler, and also implement logic to react to different kubernetes events.
- feat: the full cycle to schedule a pod is working
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
update state of usernetes on azure
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-tutorials
test: ubuntu 24.04 build (#7)
Signed-off-by: vsoch vsoch@users.noreply.github.com Co-authored-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
feat: add custom build of infiniband with ubuntu 24.04
Because Microsoft is only providing an image from 2022… :/
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Delete quay.io/pawsey/cuda-hpc-python directory
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to researchapps/usernetes
cni-plugins: update to v1.6.2
problem: release 1.6.1 no longer has the targz for most platforms
solution: update to release 1.6.2, released on 1/6/2025
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Update from update-package/flux-core-2025-01-08 (#286)
- Automated deployment to update package flux-core 2025-01-08
- Add back 0.67.0 and py-ply
- Update package.py
Co-authored-by: github-actions github-actions@users.noreply.github.com Co-authored-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #135 from flux-framework/release-docs-2025-01-08
Update from release-docs-2025-01-08</small>
vsoch pushed to converged-computing/fluxqueue
fluxion submit is working!
We next need to unsuspend/ungate the pod to send to the fluxion custom scheduler, and also implement logic to react to different kubernetes events.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/aks-infiniband-install
feat: add usernetes install
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxqueue
queue: job pod is added to queue
Fluxion is also added and serving the cluster.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to conda-forge/watchme-feedstock
Rebuild for CFEP-25 noarch: python syntax (#4)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.1, and conda-forge-pinning 2025.01.06.12.46.45</small>
vsoch pushed to rseng/software
Merge pull request #404 from rseng/update/software-2025-01-05
Update from update/software-2025-01-05</small>
vsoch pushed to converged-computing/fluxqueue
wip: addition of queue logic
This is a WIP to save the state of primarily the main.go/sum because it is currently building; of course, the code still needs a lot of work!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxion
Merge pull request #4 from converged-computing/cleanup-match
match: remove unused variable count</small>
vsoch pushed to conda-forge/rse-feedstock
Rebuild for CFEP-25 noarch: python syntax (#16)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.1, and conda-forge-pinning 2025.01.03.14.12.46</small>
vsoch pushed to conda-forge/deid-feedstock
Merge pull request #46 from regro-cf-autotick-bot/noarch_python_min-migration-1_h28972a
Rebuild for CFEP-25 noarch: python syntax</small>
vsoch pushed to converged-computing/fluxqueue
Merge pull request #2 from converged-computing/add-fluxion
fluxion: add service to provide scheduler</small>
vsoch pushed to conda-forge/singularity-hpc-feedstock
Rebuild for CFEP-25 noarch: python syntax (#47)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.1, and conda-forge-pinning 2025.01.02.20.43.49</small>
vsoch pushed to flux-framework/spack
Automated deployment to update package flux-sched 2025-01-02
vsoch pushed to Amjadhpc/singularity-hpc
Ran black to format the indenting
vsoch pushed to converged-computing/flux-tutorials
azure: add usernetes scripts
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-tutorials
azure: ensure we source hpcx environment before install
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #403 from rseng/update/software-2024-12-29
Update from update/software-2024-12-29</small>
vsoch pushed to converged-computing/flux-distribute
aws: add 30 node experiment
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-tutorials
bug: vmset is not reliable with hostnames (#3)
- bug: vmset is not reliable with hostnames
We cannot be guaranteed that the lead broker is flux00000, so we need to provide automation to update/fix the issue. I am also adding a script to install the OSU benchmarks.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
freeglut: add v3.6.0 (#48292)
- freeglut: add v3.6.0
- Change the version range for the patch
Co-authored-by: jmcarcell jmcarcell@users.noreply.github.com</small>
vsoch pushed to vsoch/vsoch.github.io
add 2024 reflection post!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #402 from rseng/update/software-2024-12-22
Update from update/software-2024-12-22</small>
vsoch pushed to converged-computing/performance-study
nit: fix spacing in readme
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
paper: ensure that colors are consistent across apps
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Support oneTBB 2021.13.0 and 2022.0.0 (#48239)
vsoch pushed to conda-forge/spython-feedstock
Rebuild for CFEP-25 noarch: python syntax (#54)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.1, and conda-forge-pinning 2024.12.20.19.39.30</small>
vsoch pushed to conda-forge/oras-py-feedstock
Rebuild for CFEP-25 noarch: python syntax (#31)
- update to CFEP-25 noarch: python syntax
- MNT: Re-rendered with conda-build 24.11.2, conda-smithy 3.45.1, and conda-forge-pinning 2024.12.21.10.18.30</small>
vsoch pushed to sciworks/spack-updater
Don’t use spack external find
vsoch pushed to flux-framework/spack
xyce: update +pymi related dependencies (#48044)
vsoch pushed to converged-computing/performance-study
analysis: start of osu for paper
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/aws-osu-christmas
happy holidays evan and heidi
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to sciworks/spack-updater
Do not use externals
vsoch pushed to flux-framework/spack
re-enable flux-core
vsoch pushed to flux-framework/spack
libEnsemble: add v1.4.3 (#48144)
vsoch pushed to converged-computing/lise-azure
osu: update benchmark download address
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/jobspec-database
Merge pull request #5 from converged-computing/make-catalog
feat: add catalog of summary data</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #290 from singularityhub/update/containers-2024-12-17
[bot] update/containers-2024-12-17</small>
vsoch pushed to singularityhub/shpc-registry-cache
reago no longer in registry
vsoch pushed to singularityhub/shpc-registry
Delete quay.io/biocontainers/reago directory
This looks deprecated https://quay.io/repository/biocontainers/reago?tab=tags&tag=latest
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to rseng/software
Add --ignore-installed to pip install
vsoch pushed to flux-framework/spack
New Package: Trame (#47920)
- Add trame</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #80 from converged-computing/add-anonymous-pull-plots
add combined pull plot</small>
vsoch pushed to flux-framework/flux-python
ci: ensure twine is installed to push
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
aws: update build to use usernetes python (#18)
- aws: update build to use usernetes python
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #289 from singularityhub/update/containers-2024-12-12
[bot] update/containers-2024-12-12</small>
vsoch pushed to flux-framework/spack
ginkgo: add v1.9.0 (#47987)
Co-authored-by: Tamara Dahlgren <35777542+tldahlgren@users.noreply.github.com></small>
vsoch pushed to flux-framework/flux-operator
feat: add podsecuritycontext to expose sysctl stuffs
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-k8s
Merge pull request #87 from flux-framework/queue-extensions-bug
bug: context missing from EventsToRegister</small>
vsoch pushed to converged-computing/usernetes-python
feat: add flux helper scripts to shared (#3)
- feat: add flux helper scripts to shared
This adds the ability to customize ports.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
nit: lammps plot label should be gpu count
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-distribute
experiment: add aws 256 node runs
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rse-ops/usernetes
feat: customization of exposed service ports
Problem: when deploying usernetes in a multi-tenant environment, ports will likely clash because the traditional values are hard-coded. Solution: expose these ports (those mapped to the host in the docker-compose.yaml and kubeadm-config yaml) as environment variables to customize.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
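The idea in the commit message above, replacing hard-coded ports with environment overrides, can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual docker-compose or kubeadm configuration; the variable names, default values, and the `resolve_ports` helper are all assumptions made for the example and are not taken from the repository.

```python
import os

# Hypothetical variable names and defaults, chosen only for illustration.
DEFAULT_PORTS = {
    "PORT_KUBE_APISERVER": 6443,
    "PORT_ETCD": 2379,
    "PORT_KUBELET": 10250,
}


def resolve_ports(environ=None):
    """Return a port for each service, preferring an environment override."""
    environ = os.environ if environ is None else environ
    ports = {}
    for name, default in DEFAULT_PORTS.items():
        port = int(environ.get(name, default))
        if not 1 <= port <= 65535:
            raise ValueError(f"{name}={port} is not a valid port number")
        ports[name] = port
    return ports


# Two tenants on one host: the second overrides every port to avoid a clash.
tenant_a = resolve_ports({})
tenant_b = resolve_ports(
    {"PORT_KUBE_APISERVER": "16443", "PORT_ETCD": "12379", "PORT_KUBELET": "20250"}
)
clashes = set(tenant_a.values()) & set(tenant_b.values())
```

With the defaults left in place for both tenants, `clashes` would contain every port, which is the multi-tenant problem the commit describes.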
vsoch pushed to flux-framework/spack
dlb: add v3.5.0 (#47916)
vsoch pushed to singularityhub/shpc-registry
Merge pull request #288 from singularityhub/update/containers-2024-12-09
[bot] update/containers-2024-12-09</small>
vsoch pushed to rse-ops/usernetes
feat: customization of exposed service ports
Problem: when deploying usernetes in a multi-tenant environment, ports will likely clash because the traditional values are hard-coded. Solution: expose these ports (those mapped to the host in the docker-compose.yaml and kubeadm-config yaml) as environment variables to customize.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Update container.yaml
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to hpc-social/jobs
remove zurich job to attempt repost
vsoch pushed to flux-framework/spack
coverage.yml: fail_ci_if_error = true (#47731)
vsoch pushed to oras-project/oras-py
Merge pull request #178 from oras-project/contributors/update-2024-12-06
[tributors] contributors/update-2024-12-06</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #287 from singularityhub/update/containers-2024-12-05
[bot] update/containers-2024-12-05</small>
vsoch pushed to rseng/jobs-updater
Add missing newline to bluesky post
vsoch pushed to rse-ops/usernetes
test: try generating different kubeadm config for join command
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/zenodo-release
metadata improvements (#16)
- add title and description support + do not override related_identifiers if some exists
- fix invalid format args</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #133 from flux-framework/release-docs-2024-12-04
Update from release-docs-2024-12-04</small>
vsoch pushed to converged-computing/usernetes-python
Merge pull request #1 from converged-computing/install-prolog-epilog
feat: add prolog/epilog start scripts</small>
vsoch pushed to converged-computing/flux-usernetes
feat: add packer build for aws (#17)
- feat: add packer build for aws
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #286 from singularityhub/update/containers-2024-12-03
[bot] update/containers-2024-12-03</small>
vsoch pushed to rseng/jobs-updater
Clean up bluesky posting
vsoch pushed to converged-computing/usernetes-python
feat: add prolog/epilog start scripts
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/jobs-updater
Merge pull request #10 from rseng/add-bluesky
feat: add bluesky</small>
vsoch pushed to hpc-social/jobs
Merge pull request #25 from hpc-social/add-bluesky
feat: deploy to bluesky</small>
vsoch pushed to converged-computing/usernetes-python
feat: add prolog/epilog start scripts
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/usernetes-python
feat: add prolog/epilog start scripts
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/cise-special-issue
links: add links to special issue
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/cloud-select
Merge pull request #41 from converged-computing/contributors/update-2024-11-30
[tributors] contributors/update-2024-11-30</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #284 from singularityhub/update/containers-2024-11-28
[bot] update/containers-2024-11-28</small>
vsoch pushed to converged-computing/flux-views
rocky: add python3-devel
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-distribute
experiment: fix nodes pages 6->30
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Add CMake 3.31 minor release (#47676)
vsoch pushed to converged-computing/slurm-scrape
nit: improve titles
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-distribute
experiment: topology with kind
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/vsoch.github.io
post: for coach
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to hpc-social/hpc-social.github.io
Merge pull request #82 from hpc-social/contributors/update-2024-11-26
[tributors] contributors/update-2024-11-26</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #76 from converged-computing/fix-mixbench-stream-colors
analysis: fix colors for stream, mixbench, magma</small>
vsoch pushed to converged-computing/flux-tutorials
aws: add tutorial video link
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #283 from singularityhub/update/containers-2024-11-25
[bot] update/containers-2024-11-25</small>
vsoch pushed to vsoch/vsoch.github.io
talk: add fosdem 2024
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #400 from rseng/update/software-2024-11-24
Update from update/software-2024-11-24</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #74 from converged-computing/update-gke-gpu
analysis: fix missing amg results for gke gpu</small>
vsoch pushed to converged-computing/flux-tutorials
tutorial: flux on aws
This set of configs allows for deploying Flux Framework to bare metal VMs on AWS EC2 using packer to build and terraform to deploy. The video for the tutorial is undergoing review and will be posted when that is finished.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
Merge pull request #10 from compspec/wip-update-proot
tweak proot to use pwd and kill on exit</small>
vsoch pushed to compspec/compat-lib
test
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/lammps-time
docs: add kind images to readme
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/lammps-time
Add kind experiment (#4)
- add wip kind experiment
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
tweak proot to use pwd and kill on exit
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
visit: add v3.4.0, v3.4.1 (#47161)
- Visit: Add new versions 3.4.0 and 3.4.1
- Adios2: Restrict python, 3.11 doesn’t work for older Adios2
- VisIt: Set the VTK_VERSION for @3.4:
Older versions of VTK used the VTK_{MAJOR, MINOR}_VERSION variables for VTK detection. VisIt >= 3.4 uses the full string VTK_VERSION.
- CI: Don’t build llvm-amdgpu for non-HIP stack
- VisIt: v3.4.1 handles newer Adios2 correctly
- Visit: Add missing links in HDF5, set correct VTK version configuration parameter
- VisIt: Add py-pip requirement and patch visit with configuration changes
- HDF5 symlinks move when inside of callback
- VisIt ninja install fails with python module. Using make does not
- VisIt 3.4 has a high minimum cmake requirement
- HDF5: Early return when not mpi for mpi symlinks
- HDF5: Use platform agnostic method for creating legacy compatible MPI symlinks
- Fix VISIT_VTK_VERSION handling for 8.2.1a hack</small>
vsoch pushed to flux-framework/spack
Merge pull request #260 from flux-framework/update-package/flux-core-2024-11-20
Update from update-package/flux-core-2024-11-20</small>
vsoch pushed to converged-computing/jobspec-database
docs: add documentation in readme for reading databases
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/lammps-time
experiment: add kind cluster running results
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
Merge pull request #7 from compspec/add-close
feat: add close and perfetto support</small>
vsoch pushed to vsoch/vsoch.github.io
Update 2024-11-17-across-boundaries.md
vsoch pushed to singularityhub/shpc-registry
Merge pull request #280 from singularityhub/update/containers-2024-11-18
[bot] update/containers-2024-11-18</small>
vsoch pushed to compspec/compat-lib
Merge pull request #3 from compspec/add-python-module
feat: add supporting python module</small>
vsoch pushed to rseng/software
Merge pull request #399 from rseng/update/software-2024-11-17
Update from update/software-2024-11-17</small>
vsoch pushed to compspec/compat-lib
add simple release workflow
This will release an x86 binary, which should be suitable for basic testing.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
py-wandb: add v0.16.6 (#43891)
- py-wandb: add version v0.16.6
- fix: typo
- py-wandb: py-click when @0.15.5:, py-pathtools when @:0.15
Co-authored-by: Wouter Deconinck wdconinc@gmail.com</small>
vsoch pushed to singularityhub/shpc-registry
remove tag 1.1--py34 from biocontainers/reago
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to converged-computing/lammps-time
analysis: typo
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to conda-forge/oras-py-feedstock
oras-py v0.2.25 (#30)
- updated v0.2.25
- MNT: Re-rendered with conda-build 24.9.0, conda-smithy 3.44.3, and conda-forge-pinning 2024.11.14.06.00.25</small>
vsoch pushed to rseng/devstories-episodes-2
episode 102: dan reed “hpc dan”
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/devstories
Add thank you to our HPC Dan!
vsoch pushed to rseng/devstories
episode 102: dan reed “hpc dan”
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/lammps-time
model: add markov model for predicting next path
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #277 from singularityhub/update/containers-2024-11-11
[bot] update/containers-2024-11-11</small>
vsoch pushed to oras-project/oras-py
Retry on 500 (#168)
- workaround: retry manifest upload on quay
- decorator: get rid of inheritance
- decorator: retry on 500
Signed-off-by: Isabella do Amaral idoamara@redhat.com</small>
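A retry-on-500 decorator of the kind named in the bullets above might look roughly like this. It is a self-contained sketch, not the oras-py implementation: `ServerError`, `retry_on_500`, and `upload_manifest` are hypothetical names invented for the example, and the failing upload is simulated.

```python
import functools
import time


class ServerError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"server returned {status_code}")
        self.status_code = status_code


def retry_on_500(attempts=3, delay=0.0):
    """Retry the wrapped call when it raises ServerError with status 500."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ServerError as err:
                    # Re-raise anything that is not a 500, or the final failure.
                    if err.status_code != 500 or attempt == attempts:
                        raise
                    time.sleep(delay)

        return wrapper

    return decorator


calls = []


@retry_on_500(attempts=3)
def upload_manifest():
    """Simulated upload that fails twice with a 500 before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise ServerError(500)
    return "uploaded"


result = upload_manifest()
```

Using a plain function decorator rather than a class hierarchy keeps the retry policy separate from whatever client object performs the request.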
vsoch pushed to nicholas-sly/spack
Update var/spack/repos/builtin/packages/flux-sched/package.py
Co-authored-by: Greg Becker becker33@llnl.gov</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #278 from singularityhub/update/containers-2024-11-12
[bot] update/containers-2024-11-12</small>
vsoch pushed to converged-computing/container-chonks
container times: look into specific events (#3)
- container times: look into specific events
- container pulling times: put run1 in the readme
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/vsoch.github.io
update work
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
lua: always generate pcfile without patch and remove +pcfile variant (#47353)
- lua: add +pcfile support for @5.4: versions, without using a version-dependent patch
- lua: always generate pcfile, remove +pcfile variant from all packages
- lua: minor fixes
- rpm: minor fix</small>
vsoch pushed to converged-computing/lammps-time
add pattern ideas to fuse analysis
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
Merge pull request #2 from converged-computing/add-aws-pulling-study
Add aws pulling study</small>
vsoch pushed to converged-computing/container-chonks
google: add back re-run
I did these re-runs because the settings on the kubernetes event exporting were dropping some events, and I do not think that is appropriate data to use for a publication. I am still going to run one more final study on GKE that only tests the regular pulls using their registry.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #398 from rseng/update/software-2024-11-10
Update from update/software-2024-11-10</small>
vsoch pushed to converged-computing/lammps-time
add lammps output parsing
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/supermarket-fish-problem
add current/max speeds for gpu
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #71 from converged-computing/scaling-governor
analysis: look at scaling governor</small>
vsoch pushed to converged-computing/lammps-time
Merge pull request #2 from converged-computing/add-fuse-install
add copyright, notice, license</small>
vsoch pushed to converged-computing/flux-distribute
add topology testing
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
feat: basic recorder functionality (#1)
- feat: basic recorder functionality
We are going to want to build a bunch of hpc apps and then record what they are doing, meaning paths touched and when! This is a start.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/pypi-classifiers
try ubuntu 24.04
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
output updates
The output log now has unix nanoseconds, and also the program exits and cleans up after the command finishes running.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #276 from singularityhub/update/containers-2024-11-07
[bot] update/containers-2024-11-07</small>
vsoch pushed to rseng/gpu-search
remove partial data
I was originally saving organized based on date, but I do not anticipate doing this again so I am removing in favor of the top level.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Merge pull request #257 from flux-framework/release/flux-sched-v0.40.0
Update from release/flux-sched-v0.40.0</small>
vsoch pushed to flux-framework/spack
Merge pull request #249 from flux-framework/release/flux-security-v0.12.0
Update from release/flux-security-v0.12.0</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #130 from flux-framework/release-docs-2024-11-05
Update from release-docs-2024-11-05</small>
vsoch pushed to converged-computing/supermarket-fish-problem
plots: restore line width
We cannot see the distribution of values without the linewidth being non-zero, so I cannot remove it.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #275 from singularityhub/update/containers-2024-11-04
[bot] update/containers-2024-11-04</small>
vsoch pushed to converged-computing/performance-study
analysis: stream has incorrect title (Minife)
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #397 from rseng/update/software-2024-11-03
Update from update/software-2024-11-03</small>
vsoch pushed to converged-computing/slurm-operator
container-bases: update to rockylinux9 (#7)
- container-bases: update to rockylinux9
- powertools -> enable crb
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
Add google to readme
vsoch pushed to researchapps/flux-sched
debug: adding verbosity to grow function
We need to figure out why the function is returning -1. I am adding additional error parsing to check.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/fluxion-go
feat: shrink support for fluxion
This changeset exposes the remove_subgraph function, which we can call a shrink. It does not account for (I do not think) handling jobs properly, but should be a reasonable start to testing or debugging.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-jobset
app: add stream example
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-jobset
app: add kripke example on one node
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #274 from singularityhub/update/containers-2024-10-31
[bot] update/containers-2024-10-31</small>
vsoch pushed to flux-framework/spack
add the USE_F90_ALLOCATABLE option to Spack (#47190)
Signed-off-by: Jeff Hammond jehammond@nvidia.com</small>
vsoch pushed to converged-computing/fluxgen
bug: move miniconda to system bin (#2)
- bug: move miniconda to system bin
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
analysis: topology for aws (#67)
- analysis: topology for aws
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/flux-core-feedstock
feat: add support for systemd
This should add libsystemd0 as a dependency on the host so that we can hopefully have support for it - I’m guessing that flux will detect it on build.</small>
vsoch pushed to converged-computing/flux-distribute
docs: update title to flux-distribute
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/caliper
ci: add pre commit for linting (#41)
- ci: add pre commit for linting
- spelling
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #396 from rseng/update/software-2024-10-27
Update from update/software-2024-10-27</small>
vsoch pushed to vsoch/vsoch.github.io
post: add little monster
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #270 from singularityhub/update/containers-2024-10-24
[bot] update/containers-2024-10-24</small>
vsoch pushed to rseng/devstories-episodes-2
episode 10101: michela taufer
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/devstories
episode 101 michela taufer
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
draco: add v7.19.0 (#47032)
Co-authored-by: Cleveland cleveland@lanl.gov Co-authored-by: Kelly (KT) Thompson KineticTheory@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-python
docs: update todo in readme
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-python
update server to receive grow/shrink request
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-operator
docs: design
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #66 from converged-computing/add-lammps-fom
analysis: lammps matom steps per second</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #269 from singularityhub/update/containers-2024-10-21
[bot] update/containers-2024-10-21</small>
vsoch pushed to oras-project/oras-py
core: align config_path type annotation (#166)
- core: align config_path type annotation the oras-py CI setup uses basic auth for auth tests
Signed-off-by: tarilabs matteo.mortari@gmail.com</small>
vsoch pushed to converged-computing/ensemble-python
feat: grow/shrink requests are being hit
I need to put this into the ensemble operator next to have the request actually do something, like request the minicluster to scale up or down. I will also need to have a way to communicate the member name and namespace. This could either be done via discovery (requiring the kubernetes API within the ensemble python and the rbac to use it), or more simply done, just put the member name that is expected in the same namespace. More ideally there can be a registration step at the onset that generates a random name and sends it over to the grpc service to associate.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-python
feat: support for inequality in rule->when
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compat-lib
comment: fix comment about not working (it is)
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
--break-system-packages
vsoch pushed to converged-computing/ensemble-python
example: heartbeat example
This updates the heartbeat so it is entirely derived from the config. This can happen explicitly if the user sets logging->heartbeat to a non zero value, but it will also happen if there is a grow or shrink action used. If the user defines a grow/shrink and sets the heartbeat to 0 it will still be set to the default, 60, because grow/shrink will not work as expected without it.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
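The heartbeat rule described in the entry above can be sketched in Python as follows. This is a minimal illustration, not the ensemble-python implementation. The config layout (a `logging` section with `heartbeat`, and a `rules` list whose entries name an `action`) and the function name `resolve_heartbeat` are assumptions made for the example. Only the default of 60 and the grow/shrink override come from the commit message.

```python
DEFAULT_HEARTBEAT = 60  # default stated in the commit message (units assumed to be seconds)
SCALING_ACTIONS = {"grow", "shrink"}


def resolve_heartbeat(config: dict) -> int:
    """Return the heartbeat interval implied by a (hypothetical) config dict."""
    requested = config.get("logging", {}).get("heartbeat", 0) or 0
    # Hypothetical layout: each rule names the action it triggers.
    uses_scaling = any(
        rule.get("action") in SCALING_ACTIONS for rule in config.get("rules", [])
    )
    if requested > 0:
        return requested  # an explicit non-zero value is used as given
    if uses_scaling:
        return DEFAULT_HEARTBEAT  # grow/shrink needs a heartbeat, so 0 is overridden
    return 0  # no heartbeat requested and none required


if __name__ == "__main__":
    print(resolve_heartbeat({"logging": {"heartbeat": 0}, "rules": [{"action": "grow"}]}))
```

Under these assumptions, a config that sets the heartbeat to 0 but defines a grow or shrink action still ends up with the default interval.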
vsoch pushed to singularityhub/shpc-guts
remove break system packages
vsoch pushed to singularityhub/guts
--break-system-packages
vsoch pushed to flux-framework/spack
openldap: add v2.6.8; conflict gcc@14: for older (#47024)
vsoch pushed to converged-computing/ensemble-python
ci: add docker build for service (#3)
- ci: add docker build for service
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-python
Merge pull request #1 from converged-computing/add-support-logging-backoff
feat: support for repetitions and logging</small>
vsoch pushed to converged-computing/cloud-select
try insecure to registry init
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
--break-system-packages
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to singularityhub/shpc-guts
--break-system-packages
vsoch pushed to rseng/software
--break-system-packages
vsoch pushed to flux-framework/spack
py-cython: add v3.0.11 (#46772)
- py-cython: add v3.0.11 Add url for cython because they are using lower case for 3.0.11 Co-authored-by: Tamara Dahlgren <35777542+tldahlgren@users.noreply.github.com>
- Don’t use f-string
- Remove old version directive for 3.0.11
Co-authored-by: jmcarcell jmcarcell@users.noreply.github.com Co-authored-by: Tamara Dahlgren <35777542+tldahlgren@users.noreply.github.com></small>
vsoch pushed to converged-computing/performance-study
Update azure lammps (#64)
- azure was missing lammps size 128 and 256
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #267 from singularityhub/update/containers-2024-10-14
[bot] update/containers-2024-10-14</small>
vsoch pushed to compspec/compat-lib
compatibility: add use case for library server
We want to be able to check software compatibility, amongst other things. In Kubernetes, Node Feature Discovery (NFD) has a design where a daemon runs on the nodes, can parse what they provide, and then provides that to a central service. For our case, we can do something similar: a service (that can run on the node and either work with a local client OR a scheduler) that knows how to read a compatibility artifact directly (json payload) or retrieve it from a registry, where it describes an application or container, and then, based on the libraries needed, quickly determine if the node can satisfy the needs. I am calling this a Compatibility server and check because it extends to other things beyond libraries / software.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/singularity-hpc
autamus no longer maintained
vsoch pushed to singularityhub/shpc-registry
Add library to gh-pages
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to sciworks/spack-updater
try installing setuptools
vsoch pushed to converged-computing/flux-usernetes
plots: add plot with just bare metal and usernetes
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
analysis: add basic container pulling cost estimate
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
pulling: add cost estimates
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #266 from singularityhub/update/containers-2024-10-10
[bot] update/containers-2024-10-10</small>
vsoch pushed to flux-framework/flux-k8s
Merge pull request #85 from flux-framework/control-builds
Bug: JGF Name was removed, and build with distroless destroyed logging</small>
vsoch pushed to converged-computing/ensemble-python
feat: add support for queue metrics and actions!
With this addition, we have our first mini ensemble that is able to submit jobs at start, wait until a count is reached, and then (based on that count metric) submit another group of jobs!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
python: rework how we compute the “command” property (#46850)
Some Windows Python installations may store the Python exe in Scripts/ rather than the base directory. Update .command to search in both locations on Windows. On all systems, the search is now done recursively from the search root: on Windows, that is the base install directory, and on other systems it is bin/.</small>
vsoch pushed to converged-computing/ensemble-python
river streaming ml metrics
Add queue metrics to keep track of job groups. This includes MAD, IQR, mean, min and max so far.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
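For reference, the summary statistics named in the entry above can be computed over a batch of values with the Python standard library. This is only a sketch: the commit uses streaming metrics, whereas this example is a plain batch computation, and it assumes that MAD means median absolute deviation. The function name `summarize` and the sample values are hypothetical.

```python
import statistics


def summarize(values: list[float]) -> dict[str, float]:
    """Batch summary of a job-group metric: mean, min, max, IQR and MAD."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    median = statistics.median(values)
    # quantiles(n=4) returns the three quartile cut points (default 'exclusive' method)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
        "iqr": q3 - q1,
        # MAD assumed here to be the median absolute deviation from the median
        "mad": statistics.median(abs(v - median) for v in values),
    }


if __name__ == "__main__":
    # Hypothetical job durations in seconds
    print(summarize([1, 2, 3, 4, 5]))
```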
vsoch pushed to singularityhub/shpc-registry
Merge pull request #265 from singularityhub/update/containers-2024-10-07
[bot] update/containers-2024-10-07</small>
vsoch pushed to researchapps/eksctl
efa-installer: remove archive in 2023 files
Problem: the node consistently runs out of disk space when adding efa, resulting in an unusable cluster with scattered nodes where the installer failed. Solution: the installer archive itself is huge, and we can simply remove it and avoid this error.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
php: add v7.4.33, v8.3.12 (fix CVEs) (#46829)
- php: add v7.4.33, v8.3.12
- php: mv sbang.patch sbang-7.patch
- php: add sbang-8.patch
- [@spackbot] updating style on behalf of wdconinc
- Replace --with-libiconv= (not recognized) with --with-iconv=
Co-authored-by: wdconinc wdconinc@users.noreply.github.com Co-authored-by: Bernhard Kaindl contact@bernhard.kaindl.dev</small>
vsoch pushed to vsoch/vsoch.github.io
spelling: programmatically
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #395 from rseng/update/software-2024-10-06
Update from update/software-2024-10-06</small>
vsoch pushed to flux-framework/spack
py-rpds-py: add v0.18.1 (#46786)
vsoch pushed to converged-computing/container-chonks
experiment pulling: add streaming results
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #264 from singularityhub/update/containers-2024-10-03
[bot] update/containers-2024-10-03</small>
vsoch pushed to regro-cf-autotick-bot/deid-feedstock
Update meta.yaml
vsoch pushed to flux-framework/spack
py-rucio-clients: new package (and dependencies) (#46585)
Co-authored-by: Bernhard Kaindl contact@bernhard.kaindl.dev</small>
vsoch pushed to converged-computing/ensemble-containers
doi: add zenodo doi
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
experiment, pulling: add run 4 with streaming images
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #128 from flux-framework/release-docs-2024-10-02
Update from release-docs-2024-10-02</small>
vsoch pushed to converged-computing/container-chonks
update readme for run2?
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to oras-project/oras-py
Merge pull request #160 from oras-project/release-0.2.21
release: 0.2.21</small>
vsoch pushed to converged-computing/container-crafter
docs: add doi to README
vsoch pushed to converged-computing/container-chonks
experiment: pulling with more sizes
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #263 from singularityhub/update/containers-2024-09-30
[bot] update/containers-2024-09-30</small>
vsoch pushed to converged-computing/jobspec-database
mistral: add sample data
This is not checked by a human, but it was processed by gemini, which is pretty good. It should be an ok start to testing out training (fine tuning) with mistral
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-crafter
generator: mostly complete
This build tool now can generate an entire set of images, where each layer is unique based on the size and random filename, so a pulling study can be done and no cache can be used.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
layer digests: only count each uri once
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
digest similarity: add image size calculations
Also adding the 100th percentile since we want to calculate the larger range - ML images are big :)
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #394 from rseng/update/software-2024-09-29
Update from update/software-2024-09-29</small>
vsoch pushed to flux-framework/spack
cc: ensure that RPATHs passed to linker are unique
macOS Sequoia’s linker will complain if RPATHs on the CLI are specified more than once. To avoid errors due to this, make cc only append unique RPATHs to the final args list.
This required a few improvements to the logic in cc:
- List functions in cc didn’t have any way to append unique elements to a list. Add a contains() shell function that works like our other list functions. Use it to implement an optional "unique" argument to append() and an extend_unique(). Use that to add RPATHs to the args_list.
- In the pure ld case, we weren’t actually parsing RPATH arguments separately as we do for ccld. Fix this by adding another nested case statement for raw RPATH parsing. There are now 3 places where we deal with -rpath and friends, but I don’t see a great way to unify them, as -Wl,, -Xlinker, and raw -rpath arguments are all ever so slightly different.
- Fix ordering of assertions to make pytest diffs more intelligible. The meaning of + and - in diffs changed in pytest 6.0 and the “preferred” order for assertions became assert actual == expected instead of the other way around.
Signed-off-by: Todd Gamblin tgamblin@llnl.gov</small>
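The idea behind the unique-append helpers described in the entry above can be illustrated in Python. The actual change is written as shell functions inside Spack's cc wrapper, so this is a sketch of the technique (order-preserving de-duplication while appending), not that code. The function names mirror the ones in the commit message, but their signatures here and the sample RPATH arguments are assumptions.

```python
def contains(items: list[str], item: str) -> bool:
    """Return True if item is already present in items."""
    return item in items


def append(items: list[str], item: str, unique: bool = False) -> None:
    """Append item; when unique is True, skip items that are already present."""
    if unique and contains(items, item):
        return
    items.append(item)


def extend_unique(items: list[str], new_items: list[str]) -> None:
    """Append each new item once, keeping first-seen order."""
    for item in new_items:
        append(items, item, unique=True)


# Hypothetical RPATH arguments; the duplicate of /opt/a/lib is dropped.
args_list: list[str] = []
extend_unique(
    args_list,
    ["-Wl,-rpath,/opt/a/lib", "-Wl,-rpath,/opt/b/lib", "-Wl,-rpath,/opt/a/lib"],
)
print(args_list)
```

Keeping the first occurrence and preserving order matters here because linker search order follows the order of the arguments.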
vsoch pushed to converged-computing/container-chonks
dockerfile experiment - add summary metrics for digest similarity
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #262 from singularityhub/update/containers-2024-09-26
[bot] update/containers-2024-09-26</small>
vsoch pushed to hpc-social/hpc-social.github.io
Merge pull request #80 from hpc-social/contributors/update-2024-09-25
[tributors] contributors/update-2024-09-25</small>
vsoch pushed to converged-computing/performance-study
container similarity: remove title to improve clarity of clustermaps
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/ensemble-containers
containers: add entire google set
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to oras-project/oras-py
core: improve anon/auth token logic (#148)
- core: TokenAuth request_token fix missing auth
the method is intended to request an authenticated token, per the pydocs, but was passing headers that were always missing Authorization.
- core: use token in auth in subsequent requests
if a token was saved in auth, it shall be used in subsequent requests.
This avoids a situation where a blob upload is first attempted anonymously and then retried with a token, and the subsequent manifest upload again starts with an anonymous attempt even though a token was present from the previous flow.
- core: if 401 on 2nd attempt, avoid anon tokens
in the first flow using auth backend for token:
- try do_request with no auths at all
- the attempt to gain an anon token succeeds, but then the request fails with 401
- at this point, in the third attempt, give chance to the flow to request a token but avoid any anon tokens.
Please note: this effectively happens only on the first run of the flow. Subsequent do_request flow invocations should now succeed on the 1st request by re-using the token, which is the simplified behaviour introduced with this proposal
- guard as headers is Optional
- implement review request
- Revert “implement review request”
This reverts commit 102381c5c4ae0fdf45c8a4dd26ae1765eae9b029. This reverts commit 1e891d2bfebe4b6520a1fe6902159198c8799d62. This reverts commit 6e226672c60184cd43b6532f5a910acbf9d064ea.
this was taken care in https://github.com/oras-project/oras-py/pull/153
This reverts commit 10e010b365e56488963ca14b6e9e08b1ea7e4a7a.
- implement review comment about anon/req token
from: https://github.com/oras-project/oras-py/pull/148#discussion_r1677018164
And if the basic auth is there, skip over asking for an anon token
as it stands, in case the basic auth are present, these are exchanged for the request token.
Signed-off-by: tarilabs matteo.mortari@gmail.com
Signed-off-by: tarilabs matteo.mortari@gmail.com</small>
vsoch pushed to converged-computing/metrics-operator
custom: add support for custom container (#84)
- custom: add support for custom container
We should be able to support custom containers, and configuration of addons to them. I am not liking the design to have addons defined in parallel, and want to refactor so they are part of the metric. I am also wondering if the metrics themselves are more akin to apps. I have not looked at this project in a bit and need to think about it.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/metrics-containers
Merge pull request #37 from converged-computing/add-mpitrace-ubuntu
container: mpitrace with ubuntu jammy base</small>
vsoch pushed to converged-computing/ensemble-containers
containers: add ssh for metrics operator
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #261 from singularityhub/update/containers-2024-09-23
[bot] update/containers-2024-09-23</small>
vsoch pushed to converged-computing/ensemble-containers
container: add laghos
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Docs/Windows: Clarify supported shells and caveats (#46381)
While the existing getting started guide does in fact reference the powershell support, it’s a footnote and easily missed. This PR adds explicit, upfront mentions of the powershell support. Additionally this PR adds notes about some of the issues with certain components of the spec syntax when using CMD.</small>
vsoch pushed to converged-computing/ensemble-containers
container: kripke
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #260 from singularityhub/update/containers-2024-09-20
[bot] update/containers-2024-09-20</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #60 from converged-computing/update-mixbench-cpu
analysis: mixbench cpu</small>
vsoch pushed to flux-framework/spack
Merge pull request #241 from flux-framework/update-package/flux-sched-2024-09-18
Update from update-package/flux-sched-2024-09-18</small>
vsoch pushed to compspec/jobspec
readme: add zenodo doi
vsoch pushed to flux-framework/spack
imports: automate missing imports (#46410)
vsoch pushed to converged-computing/performance-study
Merge pull request #51 from converged-computing/catalog-mixbench
- analysis: catalog for mixbench
Every environment was run slightly differently, with very little overlap. I am not sure how we can use this data.
- debug: mixbench
Here I am adding the actual run statements across the mixbench configurations so they can be visually compared.
Signed-off-by: vsoch vsoch@users.noreply.github.com
- analysis: mixbench, parsed data
This changeset includes parsed data for GPU runs. I want to look at these more closely to decide what to further parse and plot. The CSVs are compiled from each experiment environment, and that includes interleaving. I think we might still combine regardless to make a line plot based on the index/iteration.
Signed-off-by: vsoch vsoch@users.noreply.github.com
- gpu analysis: add summary for similar/different
Here we see that there are differences in memory size, notably for Google (only 16GB) and then the other clouds. On the other clouds (32GB) the actual values are still a little different. The weirdest finding is the error correction seems to be set at a mixture of yes/no for Azure GPUs
Signed-off-by: vsoch vsoch@users.noreply.github.com
Signed-off-by: vsoch vsoch@users.noreply.github.com Co-authored-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #258 from singularityhub/update/containers-2024-09-16
[bot] update/containers-2024-09-16</small>
vsoch pushed to singularityhub/singularity-cli
Merge pull request #226 from singularityhub/contributors/update-2024-09-15
[tributors] contributors/update-2024-09-15</small>
vsoch pushed to rseng/software
Merge pull request #392 from rseng/update/software-2024-09-15
Update from update/software-2024-09-15</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #39 from converged-computing/reorganize-results
Reorganize results</small>
vsoch pushed to converged-computing/supermarket-fish-problem
Merge pull request #1 from converged-computing/add-start-of-summary
Add start of summary</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #257 from singularityhub/update/containers-2024-09-12
[bot] update/containers-2024-09-12</small>
vsoch pushed to flux-framework/flux-k8s
Merge pull request #84 from flux-framework/add-subsystem-field
jgf: update edge metadata to include subsystem</small>
vsoch pushed to converged-computing/supermarket-fish-problem
add fish
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/vsoch.github.io
update hpckm link in resume
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
flux-sched: add conflict for gcc and clang above 0.37.0
vsoch pushed to converged-computing/performance-study
Merge pull request #35 from converged-computing/add-link-supermarket-fish
single-node: add preprocessing</small>
vsoch pushed to converged-computing/performance-study
container-sizes: analysis to look at size of layers
This includes work to parse the events files and determine pull times, along with getting manifests and configs for all unique containers in the study. We filter this down to those that were used as experiment apps, and then look at overall layer sizes (histogram) and similarity based on digests. We also make plots that look at pull times for the entire study and within applications. This can be expanded if needed.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to conda-forge/juliart-feedstock
Rebuild for python 3.13 (#12)
- Rebuild for python 3.13
- MNT: Re-rendered with conda-build 24.7.1, conda-smithy 3.39.1, and conda-forge-pinning 2024.09.11.15.30.13</small>
vsoch pushed to converged-computing/performance-study
docs: add badge to readme
vsoch pushed to singularityhub/shpc-registry
Merge pull request #256 from singularityhub/update/containers-2024-09-09
[bot] update/containers-2024-09-09</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #31 from converged-computing/add-mixbench-4-and-8
mixbench: updating results for compute-engine gpu sizes 4,8</small>
vsoch pushed to flux-framework/spack
Sz3 fix (#46263)
- Updated version of sz3 Supersedes #46128
- Add Robertu94 to maintainers for SZ3
Co-authored-by: Robert Underwood runderwood@anl.gov</small>
vsoch pushed to conda-forge/helpme-feedstock
Merge pull request #11 from regro-cf-autotick-bot/rebuild-python313-0-1_h99eafa
Rebuild for python 3.13</small>
vsoch pushed to conda-forge/deid-feedstock
Merge pull request #44 from regro-cf-autotick-bot/0.3.24_h61ad89
deid v0.3.24</small>
vsoch pushed to vsoch/vsoch.github.io
add fun post on ubuntu_containerd and v100 gpus
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to pydicom/deid
Handle Datasets made from BytesIO (#265)
- Handle Datasets made from BytesIO
- fix import order
- Update version.py
- Update CHANGELOG.md</small>
vsoch pushed to flux-framework/spack
flux-sched: add back check for run environment
vsoch pushed to flux-framework/spack
flux-sched: keep check for package external
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #126 from flux-framework/release-docs-2024-09-04
Update from release-docs-2024-09-04</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #26 from milroy/marathe-data
Reorganize marathe1 data to conform with structure</small>
vsoch pushed to flux-framework/flux-k8s
Merge pull request #82 from flux-framework/update-go-version
version: update builds and CI to go 1.22</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #254 from singularityhub/update/containers-2024-09-02
[bot] update/containers-2024-09-02</small>
vsoch pushed to flux-framework/spack
package cln 1.3.7 feelpp/spack#2 (#46162)
- package cln 1.3.7 feelpp/spack#2
- add myself as maintainer
- fix style issue, rm blankline</small>
vsoch pushed to flux-framework/flux-k8s
bug: manifest path for scheduler deployment
There was a change upstream that switched the kube-scheduler back to being in bin (in the Dockerfile) but the corresponding manifest was not updated.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
Merge pull request #15 from converged-computing/add/google-cloud
adding google cloud usernetes setup</small>
vsoch pushed to converged-computing/aks-infiniband-install
update notes in readme
vsoch pushed to flux-framework/spack
Run spack updater on ubuntu latest
vsoch pushed to converged-computing/fluxnetes
allow more time for pod to run
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
Add azure (final tweaks to setup) (#14)
- build: flux on azure with usernetes, notes in readme and final tweaks
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #18 from milroy/cyclecloud-32-64-redo
Cyclecloud size 32 and 64: add missing test results</small>
vsoch pushed to converged-computing/performance-study
compute-engine: reorganize cpu preparing for gpu build
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to milroy/performance-study
remove eks-config.yaml extra build
Problem: there is a copy paste error of the pcluster image build at the bottom of eks-config.yaml Solution: remove it</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #253 from singularityhub/update/containers-2024-08-29
[bot] update/containers-2024-08-29</small>
vsoch pushed to flux-framework/fluxion-go
Merge pull request #11 from flux-framework/fix-ci-errors
build: remove boost_system from dependencies</small>
vsoch pushed to flux-framework/Tutorials
Use tagged version of DLIO Benchmark (#45)
- Use tagged version of DLIO Benchmark
Use 2.0.0 tagged version of DLIO Benchmark</small>
vsoch pushed to flux-framework/Tutorials
HPCIC 2024: Updates and Fixes DYAD Component of Tutorial (#43)
- dyad: fixes content and DYAD data dyad data loader
This commit corrects logic in the PyTorch data loader for DYAD. It also makes various corrections to the text in the DYAD notebook.
- docker: adds workaround regarding Ubuntu Jammy
The flux-sched image for Ubuntu Jammy has a system install of UCX 1.12.0. However, we are wanting to use UCX 1.13.1 with DYAD. This commit updates LD_LIBRARY_PATH to point to UCX 1.13.1 to prevent runtime issues with DYAD.
- dyad: updates the env file for DYAD notebook
In light of the name change of DLIO Profiler to DFTracer, this commit updates the env file created in the DYAD notebook to use the new names for environment variables.
- dyad: fixes bug in DYAD data loader
This commit fixes a bug in the DYAD PyTorch data loader that causes ‘brokers_per_node’ to not be set before reference.
- dyad: update multiprocessing approach for DLIO
This commit tweaks the DLIO config file to use forking for multiprocessing instead of spawning
- dyad: changes cpu-affinity for DLIO
This commit changes cpu-affinity to off when running DLIO for training for consistency
Co-authored-by: Hariharan mani.hariharan@gmail.com</small>
vsoch pushed to converged-computing/performance-study
aks sizes 32 and 64 re-run with placement group, configs and results
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
compute engine: update size64 stream and laghos runs
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #251 from singularityhub/update/containers-2024-08-26
[bot] update/containers-2024-08-26</small>
vsoch pushed to converged-computing/performance-study
compute engine size 32 results google
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #8 from converged-computing/update-lammps
re-run of lammps on eks,aks, and gke</small>
vsoch pushed to converged-computing/performance-study
aks size 32 configs and results
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
adding gke cpu size 256 results and configs
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
aws eks size 128 for cpu is complete
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #250 from singularityhub/update/containers-2024-08-22
[bot] update/containers-2024-08-22</small>
vsoch pushed to researchapps/flux-core
doc: add dependency example
Problem: there is not a concrete example of using --dependency in our current documentation. Solution: add example to man1/common/job-dependencies.rst
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/flux-core
doc: add dependency example
Problem: there is not a concrete example of using --dependency in our current documentation. Solution: add example to man1/common/job-dependencies.rst
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/eksctl
efa-installer: remove archive in 2023 files
Problem: the node consistently runs out of disk space when adding efa, resulting in an unusable cluster with scattered nodes where the installer failed. Solution: the installer archive itself is huge, and we can simply remove it and avoid this error.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
azure aks gpu docker builds
This adds the Azure GPU docker builds, specifically for AKS. We still need to build amg2023 with spack - it completely just hangs / does nothing when I spack install on my machine.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #249 from singularityhub/update/containers-2024-08-19
[bot] update/containers-2024-08-19</small>
vsoch pushed to flux-framework/spack
py-keras: add v3.5 (#45711)
vsoch pushed to converged-computing/performance-study
google cloud compute engine
This is a fully working setup for using singularity on compute engine.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #388 from rseng/update/software-2024-08-18
Update from update/software-2024-08-18</small>
vsoch pushed to converged-computing/performance-study
remove terraform - we are deploying from the webui
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
aws ec2: first run for cpu with singularity!
This is a full setup (image VM builds) for Singularity with flux on ec2. I am doing this because we needed to test singularity, and Parallel Cluster has not been working great. I want to have this working setup in case we need it (and actually I think I prefer it, and believe it to be more comparable to our other setups that use flux)!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/performance-study
azure aks for cpu: first test runs are done
Most things are working except for mixbench (segfaults) and mt-gemm needs an update to the script (and container rebuilds) because the metric output is meaningless. AMG also needs a decision on the final params since it is different from the rest - using amg from 2013 instead of from 2023 (amg2023).
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/aks-infiniband-install
ubuntu20.04 driver installer for gpu
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Update package.py
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #125 from flux-framework/release-docs-2024-08-16
Update from release-docs-2024-08-16</small>
vsoch pushed to converged-computing/performance-study
azure and eks: configs and builds
This changeset includes the first working run of anything in AKS (OSU benchmarks in AKS CPU) and docker builds to support that. I have the rest of the containers built and need to test them.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/aks-infiniband-install
test: add osu example
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #248 from singularityhub/update/containers-2024-08-15
[bot] update/containers-2024-08-15</small>
vsoch pushed to flux-framework/spack
ensure 0.37.0 is kept
vsoch pushed to compspec/compspec
Merge pull request #25 from fgeorgatos/patch-1
Update README.md - typo fixes</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #124 from flux-framework/release-docs-2024-08-14
Update from release-docs-2024-08-14</small>
vsoch pushed to flux-framework/spack
autodiff: add v1.0.2 -> v1.1.2 (#43527)
vsoch pushed to flux-framework/flux-k8s
Merge pull request #81 from flux-framework/fix-upstream-changes
fix: upstream changes for build and charts</small>
vsoch pushed to flux-framework/Tutorials
Merge pull request #42 from flux-framework/add-hpcic-2024
rename: radiuss 2024 to hpcic 2024</small>
vsoch pushed to converged-computing/performance-study
Merge pull request #1 from converged-computing/updating-dockerfile-zen4
docker: update builds for zen4</small>
vsoch pushed to converged-computing/flux-usernetes
test: adding testing pod
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/Tutorials
Merge pull request #31 from flux-framework/2024-radiuss-aws
add: flux radiuss tutorial 2024</small>
vsoch pushed to converged-computing/performance-study
events: add resources used
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Install test root update: old to new API (#45491)
- convert install_test_root from old to new API</small>
vsoch pushed to flux-framework/Tutorials
flux-workflow-examples: update content
conduit still does not compile, and a note was added about that.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxnetes
group: first fully working group through cleanup (#16)
- group: first fully working group through cleanup
This changeset adds the first completely working submit through cleanup, where all tables are properly cleaned up! We can actually see the group of pods run and go away. Next I want to add back the kubectl command so we can get an idea of job state in the queue, etc.
Signed-off-by: vsoch vsoch@users.noreply.github.com
Signed-off-by: vsoch vsoch@users.noreply.github.com Co-authored-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #246 from singularityhub/update/containers-2024-08-08
[bot] update/containers-2024-08-08</small>
vsoch pushed to hpc-social/community-blog
update ruby to 3.0
vsoch pushed to hpc-social/commercial-blog
update ruby to 3.0
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #123 from flux-framework/release-docs-2024-08-08
Update from release-docs-2024-08-08</small>
vsoch pushed to converged-computing/fluxnetes
fix
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to hpc-social/personal-blog
update ruby to 3.0 (#5)
- update ruby to 3.0
Testing updating ruby - the CI is currently failing.</small>
vsoch pushed to hpc-social/community-blog
update ruby to 3.0
vsoch pushed to hpc-social/commercial-blog
update ruby to 3.0
vsoch pushed to vsoch/oci-python
Merge pull request #21 from BeryJu/patch-1
Use raw python string for regex</small>
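For context on the raw-string change, here is a minimal sketch. The pattern is hypothetical and not taken from oci-python. In a regular string literal, a sequence such as `\d` is an invalid escape that newer Python versions warn about, while a raw literal passes the backslash through to `re` unchanged.

```python
import re

# Hypothetical pattern for illustration only; not the regex from oci-python.
# The r"" prefix keeps the backslashes literal so `re` sees \d and \. as written.
version_pattern = re.compile(r"^\d+\.\d+$")

matches = bool(version_pattern.match("3.12"))
rejects = version_pattern.match("3x12") is None
print(matches, rejects)
```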
vsoch pushed to singularityhub/shpc-registry
Merge pull request #245 from singularityhub/update/containers-2024-08-05
[bot] update/containers-2024-08-05</small>
vsoch pushed to rseng/software
Merge pull request #386 from rseng/update/software-2024-08-04
Update from update/software-2024-08-04</small>
vsoch pushed to converged-computing/performance-study
gke: add example for user metadata and oras namespaces
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
build(deps): bump black in /.github/workflows/requirements/style (#45561)
Bumps black from 24.4.2 to 24.8.0.
updated-dependencies:
- dependency-name: black dependency-type: direct:production update-type: version-update:semver-minor …
Signed-off-by: dependabot[bot] support@github.com Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com></small>
vsoch pushed to converged-computing/fluxnetes
Merge pull request #14 from converged-computing/add-owner-cleanup
add owner cleanup</small>
vsoch pushed to converged-computing/performance-study
google gpu: adding draft of gke experiments
This adds early experiment design for GKE with GPU. There are still some configs missing, and we need to finalize the number of GPUs (and thus the corresponding CPUs).
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxnetes
cleanup: basic functionality added (#13)
- cleanup: basic functionality added
This changeset adds support for a duration that drives cleanup, meaning a duration in seconds can be provided as a label, and then the label will be populated into the duration (seconds) to kick off a cleanup job after allocation. This currently is not doing a cleanup, as we will need to walk up to a parent level abstraction (often deleting the pod is not sufficient) and issue cancel to fluxion, but that will come soon/next. I am also converting the fluxion service container build to be multi-stage to hopefully make it smaller.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxnetes
Merge pull request #11 from converged-computing/add-build-images
ci: add back build images workflow</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #244 from singularityhub/update/containers-2024-08-01
[bot] update/containers-2024-08-01</small>
vsoch pushed to rseng/devstories-episodes-2
add episode 100 - andrew jones!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/devstories
add images for 100th episode post
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/flux-sched
chore: flux ion-resource jobspec argument redundancy
Problem: the flux-ion-resource.py match has several subcommands that require a jobspec positional argument. Each subparser is calling the same logic to add it, which is redundant. Solution: iterate through a list to add the same argument to all of them, eliminating the redundancy and making it easier for the developer to read.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
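The refactor described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the actual flux-ion-resource.py code: the program name and the subcommand names are hypothetical.

```python
import argparse


def build_parser():
    # "match-sketch" and the subcommand names below are hypothetical.
    parser = argparse.ArgumentParser(prog="match-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)

    commands = [
        subparsers.add_parser(name)
        for name in ("allocate", "reserve", "satisfiability")
    ]

    # One loop replaces a repeated add_argument call per subparser.
    for command in commands:
        command.add_argument("jobspec", help="path to a jobspec file")
    return parser


args = build_parser().parse_args(["allocate", "job.yaml"])
print(args.command, args.jobspec)
```

Adding a new subcommand that needs the same positional then only means extending the tuple of names.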
vsoch pushed to flux-framework/flux-k8s
Merge pull request #79 from flux-framework/fix-build-args-needed
ci: fix build ci to populate build-images.sh script</small>
vsoch pushed to converged-computing/fluxnetes
Merge pull request #9 from converged-computing/add-reservations
feat: queue is working for multiple pods!</small>
vsoch pushed to flux-framework/spack
perl-bio-ensembl-funcgen: new package (#44508)
- Adding the perl-bio-ensembl-funcgen package
- Update package.py
- Update package.py</small>
vsoch pushed to converged-computing/fluxnetes
Merge pull request #8 from converged-computing/provisional-queue
chore: reorganize packages, add provisional queue, and correct sorting</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #243 from singularityhub/update/containers-2024-07-29
[bot] update/containers-2024-07-29</small>
vsoch pushed to rseng/software
Merge pull request #385 from rseng/update/software-2024-07-28
Update from update/software-2024-07-28</small>
vsoch pushed to flux-framework/flux-k8s
ci: fix build ci to populate build-images.sh script
Problem: The upstream “hack/build-images.sh” has had most variables for the environment removed in favor of an upstream Makefile, which we do not currently use in our custom build. Solution: define these variables in our Makefile equivalently.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/fluxnetes
worker: retrieval of podspec and AskFlux
This changeset creates separate worker and podgroup fluxnetes package files, and they handle worker definition and pod group parsing functions, respectively. Up to this point we can now:
- retrieve a new pod and see if it is in a group.
- if no (size 1), add to the worker queue immediately. If yes (size N), add to the pods table to be inspected later.
- retrieve the podspec in the work function.
- parse back into podspec and ask flux for the allocation.
I next need to do two things. First, figure out how to pass the node assignment back to the scheduler - I am hoping the job object “JobRow” can be modified to add metadata. Then we need to write the function to run at the end of a schedule cycle that moves groups from the provisional table to the worker queue.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
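The routing described in this commit could look something like the sketch below. It is a Python illustration, not the fluxnetes implementation: the in-memory structures stand in for the database tables, all names are hypothetical, and the rule "move a group once all of its pods have arrived" is an assumption about the end-of-cycle function that the message says is still to be written.

```python
from collections import deque

worker_queue = deque()   # stand-in for the worker queue
provisional = {}         # stand-in for the pods table: group -> size and pods


def enqueue(pod_name, group_name, group_size):
    """Send single pods straight to the queue; hold grouped pods for later."""
    if group_size <= 1:
        worker_queue.append([pod_name])
        return
    entry = provisional.setdefault(group_name, {"size": group_size, "pods": []})
    entry["pods"].append(pod_name)


def end_of_cycle():
    """Assumed rule: move a group once every one of its pods has been seen."""
    for name in list(provisional):
        entry = provisional[name]
        if len(entry["pods"]) >= entry["size"]:
            worker_queue.append(provisional.pop(name)["pods"])


enqueue("solo", "solo", 1)
enqueue("job-0", "job", 2)
end_of_cycle()            # group is incomplete, so it stays provisional
enqueue("job-1", "job", 2)
end_of_cycle()            # group is complete, so it moves to the queue
print(list(worker_queue))
```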
vsoch pushed to flux-framework/cheat-sheet
Merge pull request #8 from flux-framework/carbon-copy-missing
typo: flux submit advanced missing --cc</small>
vsoch pushed to converged-computing/fluxnetes
feat: new queue to handle groups
This changeset adds a new queue to the fluxnetes in-tree plugin, which currently knows how to accept a pod for work, and then just sleep (basically reschedule for 5 seconds into the future). This is not currently hooked into Kubernetes scheduling because I want to develop the functionality I need first, in parallel, before splicing it in. I should still be able to schedule to Fluxion and trigger cleanup when the actual job is done. I think we might do better to remove the group CRD too - it would hugely simplify things (the in-tree plugin would barely need anything aside from the fluxion interactions and queue) and instead we can keep track of group names and counts (that are still growing) in a separate table, since we already have postgres.
There are two things I am not sure about. The first is the extent to which in-tree plugins support scheduling. I can either keep them (and then would need to integrate) or have their functionality move into what fluxion can offer. I suspect they add supplementary features since we were able to disable most of them. The second thing I am not sure about (I will figure out) is, given that we customize the plugin framework, where the right place to put sort is. If we are adding pods to a table we will need to store the same metadata (priority, timestamp, etc) to allow for this equivalent sort.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
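To make the sorting concern concrete, here is a small sketch of ordering stored rows by the metadata the message mentions. The `PodRow` type and its fields are hypothetical, and the ordering (higher priority first, then older timestamp) is an assumption about what an "equivalent sort" would mean, not something the commit states.

```python
from dataclasses import dataclass


@dataclass
class PodRow:
    # Hypothetical row shape; the real table columns may differ.
    name: str
    priority: int
    timestamp: float


def sort_rows(rows):
    # Assumed ordering: higher priority first, ties broken by older timestamp.
    return sorted(rows, key=lambda row: (-row.priority, row.timestamp))


rows = [PodRow("a", 0, 3.0), PodRow("b", 10, 5.0), PodRow("c", 10, 1.0)]
ordered = [row.name for row in sort_rows(rows)]
print(ordered)
```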
vsoch pushed to flux-framework/spack
sz: new test API (#45363)
- sz: new test API
- fix typo; check installed executable; conform to subpart naming convention
- skip tests early if not installed; remove unnecessary “_sz” from test part names
Co-authored-by: Tamara Dahlgren dahlgren1@llnl.gov</small>
vsoch pushed to researchapps/skypilot
chore: linting
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-operator
Merge pull request #231 from flux-framework/add-additional-skypilot-features
- podspec: add additional features for podspec
runtimeClassName can be used to designate nvidia for skypilot
- feat: allow non root user
Problem: skypilot (and likely others) do not run with a root user Solution: allow a non-root user that has sudo Signed-off-by: vsoch vsoch@users.noreply.github.com
- restart policy should default to always
Signed-off-by: vsoch vsoch@users.noreply.github.com
- default should be on failure
Signed-off-by: vsoch vsoch@users.noreply.github.com
Signed-off-by: vsoch vsoch@users.noreply.github.com Co-authored-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-metrics-api
Add zenodo DOI to readme
vsoch pushed to singularityhub/shpc-registry
Merge pull request #242 from singularityhub/update/containers-2024-07-22
[bot] update/containers-2024-07-22</small>
vsoch pushed to flux-framework/flux-operator
default should be on failure
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #384 from rseng/update/software-2024-07-21
Update from update/software-2024-07-21</small>
vsoch pushed to flux-framework/Tutorials
review radiuss 2024: from flux team on july 19th
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/skypilot
feat: adding flux as a cloud
This changeset adds flux as a new cloud, which largely wraps the Kubernetes cloud class, but provides the separation to make it clear we are deploying Flux, and to allow for other small tweaks to the customization. This currently works to deploy Kubernetes (via the Flux cloud) and when I uncomment the deploy_vars “module” variable it will be able to use a work in progress provisioner module, which is not added to this changeset. Note that my strategy is to make as few changes as possible, so I am using the same Kubernetes classes and templates and editing only when necessary.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
sqlite: add some newer releases (#45297)
Included: 3.46.0 (most current), 3.45.3, 3.45.1 (for possible compat with Ubuntu 24.04 LTS), 3.44.2.</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #241 from singularityhub/update/containers-2024-07-18
[bot] update/containers-2024-07-18</small>
vsoch pushed to flux-framework/spack
qt-*: add v6.7.1, v6.7.2 (#45288)
vsoch pushed to converged-computing/flux-views
Merge pull request #9 from converged-computing/add/ubuntu-noble
ubuntu: support for noble</small>
vsoch pushed to flux-framework/spack
DCMTK: fix build with libtiff (#45213)
vsoch pushed to singularityhub/shpc-registry
Merge pull request #240 from singularityhub/update/containers-2024-07-15
[bot] update/containers-2024-07-15</small>
vsoch pushed to rseng/software
Merge pull request #383 from rseng/update/software-2024-07-14
Update from update/software-2024-07-14</small>
vsoch pushed to flux-framework/spack
py-tensorflow: change py-tensorflow@2.16-rocm-enhanced to use tarball instead of branch (#45218)
- change py-tensorflow@2.16-rocm-enhanced to use tarball instead of branch
- remove revert_fd6b0a4.patch and use github commit patch url</small>
vsoch pushed to flux-framework/flux-go
Merge pull request #1 from flux-framework/add-zmq-dependency
dependency: libczmq needed</small>
vsoch pushed to converged-computing/metrics-operator-experiments
docker: add single-node cpu profile
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/flux-sched
chore: flux ion-resource jobspec argument redundancy
Problem: the flux-ion-resource.py match has several subcommands that require a jobspec positional argument. Each subparser is calling the same logic to add it, which is redundant. Solution: iterate through a list to add the same argument to all of them, eliminating the redundancy and making it easier for the developer to read.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Buildcache: remove deprecated --allow-root and preview subcommand (#45204)
vsoch pushed to singularityhub/shpc-registry
Merge pull request #239 from singularityhub/update/containers-2024-07-11
[bot] update/containers-2024-07-11</small>
vsoch pushed to flux-framework/spack
build(deps): bump docker/login-action from 3.1.0 to 3.2.0 (#44424)
Bumps docker/login-action from 3.1.0 to 3.2.0.
updated-dependencies:
- dependency-name: docker/login-action dependency-type: direct:production update-type: version-update:semver-minor …
Signed-off-by: dependabot[bot] support@github.com Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com></small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #120 from flux-framework/release-docs-2024-07-11
Update from release-docs-2024-07-11</small>
vsoch pushed to converged-computing/fluxnetes
docs: update readme with working example
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to regro-cf-autotick-bot/sregistry-feedstock
Test removing globus
vsoch pushed to flux-framework/spack
Update from update-package/flux-sched-2024-07-10 (#200)
- Automated deployment to update package flux-sched 2024-07-10
- Add back in 0.36.0
Co-authored-by: github-actions github-actions@users.noreply.github.com Co-authored-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to conda-forge/sregistry-feedstock
Rebuild for python312 (#39)
- Rebuild for python312
- MNT: Re-rendered with conda-build 24.5.1, conda-smithy 3.36.2, and conda-forge-pinning 2024.07.09.17.01.06
- Test removing globus
Co-authored-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to flux-framework/spack
pinentry: add v1.3.1 (#45073)
vsoch pushed to singularityhub/shpc-registry
Merge pull request #238 from singularityhub/update/containers-2024-07-08
[bot] update/containers-2024-07-08</small>
vsoch pushed to converged-computing/flux-netmark
Merge pull request #2 from converged-computing/add-placement
add placement group and topology api</small>
vsoch pushed to converged-computing/bare-vm-container-study
docker: add arm builds
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #382 from rseng/update/software-2024-07-07
Update from update/software-2024-07-07</small>
vsoch pushed to flux-framework/spack
spack -C
Precedence:
- Named environment
- Anonymous environment
- Generic directory</small>
vsoch pushed to flux-framework/spack
spack gc: remove debug print statement (#45067)
Signed-off-by: Todd Gamblin tgamblin@llnl.gov</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #119 from flux-framework/release-docs-2024-07-05
Update from release-docs-2024-07-05</small>
vsoch pushed to flux-framework/Tutorials
org: moving dyad to be separate module, and organizing into supplementary section
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-netmark
Merge pull request #1 from converged-computing/add-security-ssh-group
security: add user ip address to ssh port 22 ingress</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #237 from singularityhub/update/containers-2024-07-04
[bot] update/containers-2024-07-04</small>
vsoch pushed to researchapps/eksctl
Merge pull request #7828 from eksctl-io/update-release-notes
Add release notes for v0.184.0</small>
vsoch pushed to flux-framework/flux-operator
adding jobs list to fluxion controller
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
updates to ring buffer timing
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Automated deployment to update flux-core versions 2024-07-03 (#197)
Signed-off-by: github-actions github-actions@users.noreply.github.com Co-authored-by: github-actions github-actions@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #118 from flux-framework/release-docs-2024-07-03
Update from release-docs-2024-07-03</small>
vsoch pushed to singularityhub/sregistry
Automated deployment to update contributors 2024-07-02 (#448)
Co-authored-by: github-actions github-actions@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-operator
cleanup of tests and docs to run example with maximum automation
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #236 from singularityhub/update/containers-2024-07-01
[bot] update/containers-2024-07-01</small>
vsoch pushed to rseng/software
Merge pull request #381 from rseng/update/software-2024-06-30
Update from update/software-2024-06-30</small>
vsoch pushed to flux-framework/flux-operator
typo: emptyDirSizeLimit
vsoch pushed to converged-computing/jobspec-database
add missing images for manual resource parsing
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
Merge pull request #12 from converged-computing/limit-ssh-deployer-ip
ssh: test config to limit to deployer ip address</small>
vsoch pushed to converged-computing/bare-vm-container-study
add testing program with ld preload
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
Strumpack: Changed old test method to new test method (#44874)
- added try except
- Resolve style issues
Co-authored-by: Tamara Dahlgren <35777542+tldahlgren@users.noreply.github.com></small>
vsoch pushed to flux-framework/flux-operator
try pushing again
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/rainbow
wip deletion (#43)
- wip deletion endpoints for cluster and subsystems
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
build(deps): bump docker/build-push-action from 5.3.0 to 6.2.0 (#44910)
Bumps docker/build-push-action from 5.3.0 to 6.2.0.
updated-dependencies:
- dependency-name: docker/build-push-action dependency-type: direct:production update-type: version-update:semver-major …
Signed-off-by: dependabot[bot] support@github.com Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com></small>
vsoch pushed to converged-computing/jobspec-database
add pydantic model to gemini - gemma is kind of terrible
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
testing updating to save by pid (and with tgid still)
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #235 from singularityhub/update/containers-2024-06-27
[bot] update/containers-2024-06-27</small>
vsoch pushed to flux-framework/flux-operator
feat: add prototype/experiment of testing multiple applications per pod
combined with the fluxion scheduler as a service, this could be a pretty cool idea. I am not sold on this being a good idea for production, but I think it will afford interesting experiments and workflow designs.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/jobspec-database
gemma: preparing to test with template
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
LAMMPS runs for 32 iterations on singularity vs bare metal (#1)
- first round of results!
- add lists of kernel functions
- add ebpf dockerfile
- ebpf: add automation to determine functions of interest
- add gromacs mpi
This includes a lima vm, along with a simple set of steps to generate all the functions and run the script against it (and time the whole thing). This is not perfect but it is simple enough to understand and use, and we can use it for our further experiments!
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to oras-project/oras-py
fix(core): provider do_request to maintain verify in all request, basic headers (#145)
- fix(core): provider do_request to maintain verify in all request
- basic headers maintenance
- add test case
- amending test for GHA setup
- add CHANGELOG entry
- use tls_verify also for login for consistency
Signed-off-by: tarilabs matteo.mortari@gmail.com</small>
vsoch pushed to flux-framework/spack
check release: don’t fail fast
vsoch pushed to converged-computing/bare-vm-container-study
add ebpf dockerfile
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Merge pull request #234 from singularityhub/update/containers-2024-06-24
[bot] update/containers-2024-06-24</small>
vsoch pushed to converged-computing/jobspec-database
gemini: remove outliers
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #380 from rseng/update/software-2024-06-23
Update from update/software-2024-06-23</small>
vsoch pushed to converged-computing/bare-vm-container-study
preparing to run first experiment
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/eksctl
fix unit test failures for pkg/actions/nodegroup/update_test.go
vsoch pushed to flux-framework/spack
cmake: remove version deprecated in 0.22 (#44628)
vsoch pushed to flux-framework/flux-k8s
graph: index should be scoped to parent
Problem: the current strategy to derive an index is scoped to a resource globally across the graph. Solution: instead, provide a direct index counter for each new resource to ensure it is scoped to the parent
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
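The difference between a global index and a parent-scoped one can be sketched as below. This is a Python illustration under assumptions, not the flux-k8s code: the `Node` class is hypothetical, and keeping one counter per resource type on each parent is one possible reading of "a direct index counter for each new resource".

```python
from collections import defaultdict


class Node:
    """Hypothetical graph vertex that numbers its own children."""

    def __init__(self, kind, index):
        self.kind = kind
        self.index = index
        self.children = []
        # Counter lives on the parent, so numbering restarts under each parent.
        self._next_index = defaultdict(int)

    def add_child(self, kind):
        index = self._next_index[kind]
        self._next_index[kind] += 1
        child = Node(kind, index)
        self.children.append(child)
        return child


cluster = Node("cluster", 0)
node0 = cluster.add_child("node")
node1 = cluster.add_child("node")
cores0 = [node0.add_child("core").index for _ in range(2)]
cores1 = [node1.add_child("core").index for _ in range(2)]
print(cores0, cores1)
```

With a single global counter the second node's cores would continue from where the first node's cores stopped, rather than restarting at zero.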
vsoch pushed to converged-computing/rainbow
Add ssl (#42)
- save: wip to add ssl, not working yet
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
dockerfile: update analysis
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to researchapps/eksctl
Update generated files
vsoch pushed to flux-framework/spack
upcxx package: Add resilience to broken libfabric (#44618)
Some systems have a libfabric install that doesn’t work, so don’t drop dead if a call to fi_info fails (e.g. due to missing shared libraries)</small>
vsoch pushed to converged-computing/container-chonks
update layer and image similarity
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
chore: filter missing from script
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to pydicom/deid
Merge pull request #262 from nbelakovski-mssm/update_docker_docs
Update docker.md</small>
vsoch pushed to nbelakovski-mssm/deid
Update docker.md
The second line calling deid locally after running the docker container doesn’t make any sense. Also I think that adding --help helps to show how this docker container could be used.</small>
vsoch pushed to converged-computing/container-chonks
add base image analysis
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
ebpf: add early work on program to time calls
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-guts
ci: remove centos:9
vsoch pushed to singularityhub/guts
ci: update deploy action
vsoch pushed to converged-computing/ebpf-hpc
add bcc base
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
dockerfile: script to calculate base images
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
add terms
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
add topic clouds for dockerfiles
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
build: finalize build script
This script installs all the applications onto the VM. The one thing to be careful about is the spack view for AMG2023 that has all the duplicated installs (flux, mpi, etc). To be safe, it is important to target direct paths for things.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/spack
openmpi: disable remark 10441 for intel classic 2021.7.0 or newer (#44614)
- Compilation of openmpi fails when intel classic compiler 2021.7.0 or newer is used.</small>
vsoch pushed to flux-framework/flux-operator
docs: update deploy action to v4
vsoch pushed to converged-computing/jobspec-database
add license
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/bare-vm-container-study
add more analyses
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-guts
busybox 1.23 and 1.24 are deprecated
vsoch pushed to rseng/devstories-episodes-2
Add Jakob episode
vsoch pushed to rseng/devstories
update jakob post
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to vsoch/vsoch.github.io
fiber -> fabric
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/jobspec-database
add simple top2vec similar words
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/jobspec
feat: support for environment (#18)
- feat: support for environment
- attributes are under task
- add support for parsing script
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/jobspec
Add depends on (#17)
- feat: support for depends_on
This adds working support for depends on, which relies on adding a frobnicator plugin to support a dependency creation based on the job name. I am adding a full example directory and jobspec that works with the VSCode developer environment where I have “installed” the plugin.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/shpc-registry
Update quay.io/pawsey/hpc-python
Tags no longer exist.
Signed-off-by: Vanessasaurus <814322+vsoch@users.noreply.github.com></small>
vsoch pushed to converged-computing/flux-usernetes
Merge pull request #11 from converged-computing/update-plots-add-csv
update plots and csv</small>
vsoch pushed to rseng/software
Merge pull request #378 from rseng/update/software-2024-06-09
Update from update/software-2024-06-09</small>
vsoch pushed to flux-framework/spack
git: remove deprecated versions (#44631)
vsoch pushed to converged-computing/metrics-operator-experiments
performance: add missing amg2023 vtune container
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/metrics-operator-experiments
adding vtune containers
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/jobspec-database
add run_top2vec script
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flex-container
chore: update title
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to flux-framework/flux-framework.github.io
Merge pull request #117 from flux-framework/release-docs-2024-06-07
Update from release-docs-2024-06-07</small>
vsoch pushed to converged-computing/flux-usernetes
bug: launch template name should be scoped to local name
Problem: if we don’t scope the launch template name, we can get conflicts between different deployments. Solution: add local.name to it.</small>
vsoch pushed to compspec/jobspec
chore: move queries into separate module for clarity
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to oras-project/oras-py
Merge pull request #141 from tarilabs/tarilabs-20240606-manifest_config
chore: fix typing for manifest_config param of push fn</small>
vsoch pushed to converged-computing/jobspec-database
update analysis: now dataset has 31932 results
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
fix bandwidth plot (missing other setups)
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/container-chonks
add dockerfile analysis
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/jobspec
test update: requires on the level of a task is a reference
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to singularityhub/container-executable-discovery
Update README.md
vsoch pushed to flux-framework/spack
Automated deployment to update flux-core versions 2024-06-05 (#187)
Signed-off-by: github-actions github-actions@users.noreply.github.com
Co-authored-by: github-actions github-actions@users.noreply.github.com</small>
vsoch pushed to converged-computing/metrics-operator-experiments
performance: add parallel cluster runs
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to compspec/compspec-spack
Update CHANGELOG.md
vsoch pushed to compspec/compspec
add utils function to read file (#24)
- add utils function to read file
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/metrics-operator-experiments
add notes about cleanup
Google Cloud did not clean up the network because a new VM was spawned, apparently at the wrong time. The fix is to manually delete “dangling” VMs and then issue the delete command again.
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/jobspec-database
Updated matrices to double size of dataset (#1)
- update search to include job managers
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to rseng/software
Merge pull request #377 from rseng/update/software-2024-06-02
Update from update/software-2024-06-02</small>
vsoch pushed to converged-computing/metrics-operator-experiments
performance: add aws ec2 (tf) testing with singularity
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/metrics-operator-experiments
performance: completed google cpu testing 32 node cluster
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>
vsoch pushed to converged-computing/flux-usernetes
add plots for bandwidth and latency
Signed-off-by: vsoch vsoch@users.noreply.github.com</small>