Releases: determined-ai/determined
0.33.0
Release Notes
Changelog
- 0c2d3cf chore: bump version: 0.33.0-rc5 -> 0.33.0
- e165541 docs: add release notes for 0.33.0 (#9444)
- 8c69d8b chore: bump version: 0.33.0-rc4 -> 0.33.0-rc5
- e1a40b1 fix: dont utilize the default efs mount on normal aws deploys (#9437)
- ebe2698 chore: bump version: 0.33.0-rc3 -> 0.33.0-rc4
- b85b8b3 fix: set the defaults for shared_fs mount in genai correctly (#9433)
- 52c7d95 chore: bump version: 0.33.0-rc2 -> 0.33.0-rc3
- 9968dce fix: add feature gate for checking for blank admin/determined password [DET-10197] (#9425)
- 9c4fd74 chore: bump version: 0.33.0-rc1 -> 0.33.0-rc2
- 274b152 fix: Keep template modal open when config is invalid (#9424)
- f4d6f54 chore: bump version: 0.33.0-rc0 -> 0.33.0-rc1
- 2661ae0 chore: bump ngc image versions for release (#9418)
- cbc15db fix: master checks db newness before migrating [DET-10312] (#9414)
- d1b3343 chore: bump version: 0.33.0-dev0 -> 0.33.0-rc0
- ca45198 chore: lock published urls to preserve redirects
- f2cd018 chore: lock api state for backward compatibility check
- 6184f6f chore: bump version: 0.32.1-dev0 -> 0.33.0-dev0
- 4af9bfc revert: Framework splitting (#9405)
- 6fa1420 test: project create and delete react e2e [INFENG-456] (#9244)
- 860f6a8 docs: Describe config templates WebUI (#9399)
- 6ff8eb7 chore: Add slurm codeowners (#9403)
- 68b36c6 feat: require initial passwords on new cluster-up [DET-10197] (#9314)
- 0ef3e10 test: datagrid scrolling [INFENG-687] (#9379)
- 18ee0e3 chore: Update docker retag scripts (#9401)
- 6ed2976 pin setuptools in model hub tests (#9402)
- c4ebe5e feat: Release WebUI templates with notes (#9383)
- 3bbb51a feat: Display Log retention days and Remaining log retention days in Logs Tab (#9305)
- 047580c feat: update default scheduler to priority for agentrm (#9385)
- ce70c00 docs: Add more info helm install password (#9388)
- b84ee1f docs: cluster observability documentation and dashboard improvements (#9391)
- c3b3ae6 feat: helm install checks password complexity [DET-10293] (#9360)
- 5c51164 fix: Skip resource checking for unmanaged exp (#9372)
- 107e108 feat: add Sort menu to Flat Runs view (#9396)
- cb81a44 feat: Add charts to Comparison View (ET-99) (#9215)
- cd33c13 test: put flaky fix back in [INFENG-694] (#9394)
- d3e89b1 docs: add exp config for unmanaged example #2. (#9397)
- d4e23f4 chore: pin requests version < 2.32.0 so docker works (#9395)
- 5480c57 chore: don't use a seperate schema for views_and_triggers (#9392)
- 893f7f5 chore: add resource_pools intg test (#9356)
- de21593 chore: push oss images per commit (#9386)
- 95c70d4 docs: Add nav to genai docs (#9387)
- 0c42ced feat: SDK methods to fetch pachyderm configs [MD-406] (#9348)
- 0ff09e0 docs: Describe pwd requirements WebUI (#9378)
- 31bc08a refactor: rename multiRM to more intuitive name (#9350)
- df7a2af docs: Update release note (#9375)
- 2c9b9b9 feat: add pod labels with proper validation (#9364)
- 0a59c63 docs: Remove long metrics rn (#9374)
- 7e4b431 feat: add columns menu to Runs view (#9323)
- c10ae99 test: Remove flakiness of KillRun test (#9370)
- 653a0de chore: store database code as code [DET-9180] (#9302)
- d38e2e0 test: report individual test results from python tests (#9366)
- 7bce6ff chore: report ntsc names via cli at launch (#9228)
- 93c8d81 ci: keep waiting on failing workloads for sending slack alerts (#9371)
- 53edec9 test: More Page Models for Experiment Tracking [INFENG-694] (#9367)
- a96cafd feat: Framework splitting (#9318)
- 3b1d0df chore: remove test suite whose marks match no tests (#9363)
- 566b6af test: page model refactor for dropdown and select components [INFENG-694] (#9362)
- 68b7116 docs: deprecation notice for agentrm features (#9344)
- 2092943 docs: Add FSDP to deepspeed (#9182)
- 4f180db chore: update npm libraries (#9331)
- c7b78fa feat: edit/delete template from WebUI (#9353)
- 8c5fce7 test: provide GKE tests with a Helm value for initialUserPassword [DET-10196] (#9361)
- 0941fc4 feat: helm requires bootstrap password [DET-10196] (#9359)
- f91c2a3 docs: revert a doc format change to reenable slurm tests (#9358)
- 989341c feat: add options in flat run (#9341)
- 16a3f3b revert: resource pool intg tests (#9357)
- 54fb10a chore: add intg tests to resource_pool.go (#9199)
- c3901c8 fix: det ckpt download from s3. (#9332)
- 7c26fe1 refactor: columnpicker remove hard coded value (#9342)
- feb8a7b fix: remove pod labels with potentially incompatible names (#9349)
- eab4981 docs: Reformat grid tables (#9321)
- 758ffd7 chore: add retries to check-doc-links ci job (#9335)
- 80fac3d chore: update release notes date (#9334)
- efbcdee chore: update codecov to ignore e2e react [INFENG-689] (#9346)
- 2445d39 fix: Revert "feat: helm requires bootstrap password (DET-10196)" (#9345)
- df0b7f9 feat: Implement /template/rename to patch template name (#9320)
- 0a0b3c3 feat: helm requires bootstrap password [DET-10196] (#9274)
- 86aa319 chore: Bunify and add test coverage for ExperimentTotalStepTime and ExperimentNumSteps (#9333)
- 3c0eac6 test: experiement list tests [INFENG-457] (#9299)
- cc82cc9 chore: add missing setuptools to win cli tests (#9336)
- c0fdaa9 chore: remove step for authenticated master session check and use standard script (#9339)
- b868230 test: wait for background logout (#9340)
- ead928e ci: add missing var overide in ee release [skip ci] (#9338)
- cab9ac5 test: log in with the api rather than through the UI for most react tests (#9307)
- 349d2a5 feat: View templates from WebUI (#9304)
- 9d46f49 chore: update codecov to ignore e2e react [INFENG-689] (#9337)
- 6e10465 ci: send job level failure slack alerts (#9315)
- 2abacb9 docs: update "install on k8s" guide to use helm repo instead of tarball. (#9293)
- 492ef57 chore: bump version: 0.32.0-dev0 -> 0.32.1-dev0
- f74988c chore: add docs dropdown link for new version
- a1b6912 docs: add release notes for 0.32.0 (#9301)
- dab4946 feat: add integration config for pachyderm input datasets (ET-12) (#8933)
- 3d4e283 test: refactor nav spec to use sidebar pagemodels [INFENG-683] (#9326)
- 1779060 test: skip a flaky test [ET-233, ET-178] (#9324)
- 3b167c7 fix: filter action experiments, old ExperimentList (#9325)
- 5b73dc4 fix: filter batch action experiments (#9316)
- 6fb62ad feat: support for configuring the shared_fs mount path in genai (#9317)
- 5f4cbbf Revert "docs: Reformat tables with image names" (#9319)
- c9f5e8a docs: Reformat tables with image names (#9312)
- fdaa015 feat: support filter in flat run table (#9250)
- a76c549 ci: don't run test-e2e-longrunning tests on main (#9313)
- ebf19a6 chore: bumpenvs for efs-utils (#9309)
- f9a35d9 ci: run e2e-react manually [INFENG-676] (#9310)
- 9cab46d chore: drop unused postgres function experiments_best_validation_history (#9306)
- ee4f04e chore: stop writing database down migrations [RM-242] (#9289)
- 24aaff4 ci: store npm log (#9311)
- e90bd0d chore: improve messaging for e2e tests (#9286)
- 0aef4c7 fix: tensorboard metric overwrites and sync throttle [MD-328] [MD-291] (#9282)
- f9b96fe ci: don't run requests-hpc-tests on main (#9308)
- 4c314a2 chore: update efs-utils install for v2.0 (#9297)
- f6181ab test: revert runner size test-e2e-cpu (#9303)
- b0a008e feat: Update workspace for templates server side (#9272)
- 9357391 ci: circleci slack alerts should go to #ci-bots (#9300)
- 49ab75d docs: Update Chart.yaml [ci skip] (#9298)
- e31135a chore(deps): bump google.golang.org/grpc from 1.58.0 to 1.58.3 (#9292)
- f6e42cd fix: Bulk Action bug (#9255)
- a8d05fa test: skip a useTypedParams test case due to flakiness (#9287)
- 2164912 chore: dependabot upgrade grpc/go-jose/net [RM-66] (#9280)
- f1aa92e chore: log health check failures in master logs (#9291)
- 7496445 fix: proto build shouldn't run if source files are unchanged (#9290)
- 21f76e9 fix: slots being filled returned out of order on k8s [RM-42] (#9276)
- a9c8700 test: e2e no floating promises [INFENG-668] (#9283)
- 7a296fa test: flaky user test fix [INFENG-663] (#9281)
- e4c6afe chore(deps): bump golang.org/x/net from 0.21.0 to 0.23.0 (#9202)
- b8eba3a ci: revert ee rebase changes to dependabot.yaml (#9278)
- 9ee9270 ci: gate hpc by request (#9198)
- 670ac40 chore: make command's run startup-hook.sh [RM-159] (#9275)
- b78020d feat: Create template through WebUI (#9263)
- abcc7b4 fix: Hide runs in archived experiments (#9270)
- b602ff2 docs: fix master config doc typo (#9256)
- 6bd2a8c ci: try to fix slurm podman tests by not building agent binary (#9273)
- 9c068d2 feat: webui create user prompts for password [DET-10221] (#9240)
- ae91042 feat: reuse HTTP sessions (#9116)
- a3f0fcf fix: show non det pods in other namespaces than 'default' [RM-141] (#9268)
- a611cf0 chore: stop publishing helm charts to NGC. (#9271)
- 2905180 test: increase runner size for react e2e (#9269)
- f8f8672 ci: try to fix podman tests by building proto once (#9267)
- 95b5164 feat: timeout change and package dedupe [ET-243] (#9265)
- 55b7fd9 chore: Image rename bumpenvs (#9253)
- 4d87127 test: some react tests are flaky [INFENG-663] (#9264)
- 86328cb fix: users can be removed from all groups in Web UI (#9259)
- aea83df chore: enable genai to connect to db over TLS (#9260)
- 703e6bd feat: Archive & Unarchive run (#9143)
- 8794e42 fix: historical-usage date calculation bug (#9257)
- cda4363 test: increase the timeout on a new users test [INFENG-455] (#9258)
- bd7b5ef test: user tests continued [INFENG-455] (#9214)
- dd4d0f9 ci:...
0.32.1
Release Notes
Changelog
- 7d0b38a chore: bump version: 0.32.1-rc0 -> 0.32.1
- 351826c docs: add release notes for 0.32.1 (#9351)
- 947585f chore: bump version: 0.32.1-dev0 -> 0.32.1-rc0
- f9da12f chore: lock api state for backward compatibility check
- 1e8f8de fix: remove pod labels with potentially incompatible names (#9349)
- 6995ca6 chore: bump version: 0.32.0 -> 0.32.1-dev0
0.32.0
Release Notes
Changelog
- a1b7242 chore: bump version: 0.32.0-rc8 -> 0.32.0
- d8580c2 docs: add release notes for 0.32.0 (#9301)
- 2244f71 chore: bump version: 0.32.0-rc7 -> 0.32.0-rc8
- 0322dc7 fix: filter action experiments, old ExperimentList (#9325)
- 5ebb008 chore: bump version: 0.32.0-rc6 -> 0.32.0-rc7
- b208794 fix: filter batch action experiments (#9316)
- 991818b chore: bump version: 0.32.0-rc5 -> 0.32.0-rc6
- e277782 fix: Bulk Action bug (#9255)
- b2663af chore: bump version: 0.32.0-rc4 -> 0.32.0-rc5
- ee63b67 fix: users can be removed from all groups in Web UI (#9259)
- 00b95c3 chore: bump version: 0.32.0-rc3 -> 0.32.0-rc4
- 642e323 fix: historical-usage date calculation bug (#9257)
- f506989 chore: bump version: 0.32.0-rc2 -> 0.32.0-rc3
- 1047e78 fix: hew update for select bug in log viewer (#9249)
- 4c59c9c chore: bump version: 0.32.0-rc1 -> 0.32.0-rc2
- f8ad009 fix: undo default log retention in values.yaml (#9245)
- 4b3adb9 docs: add a release note for aurora issue. (#9241)
- 004fe70 fix: allow genai deployments with agent GIDs set to share data properly (#9243)
- be231d9 chore: bump version: 0.32.0-rc0 -> 0.32.0-rc1
- 714264e chore: bump version: 0.32.0-dev0 -> 0.32.0-rc0
- dc88b9f chore: bump version: 0.31.1-dev0 -> 0.32.0-dev0
- 7ffdadf ci: add determined-ee context to python ee publish (#9234)
- c18ac83 fix: properly merge resource configs (#9233)
- 3b39d3c chore: add log retention to help charts (#9216)
- 3646395 chore: lock published urls to preserve redirects
- 80d8909 chore: lock api state for backward compatibility check
- 39b948c feat: add genai user role to rbac (#9206)
- 43289e9 test: ee and oss have separate handling (#9218)
- 1ca3613 fix: debounce userSettings update (#9220)
- ab382b4 chore: update the license date (#9225)
- ff10ac0 docs: Fix broken links (#9219)
- ac68df8 chore: default observability.enable_prometheus to true (#9222)
- 26c1940 chore: upgrade protoc used in CI (#7935)
- 9f6bbc9 chore: Add streaming updates feature flag [MD-371] (#9190)
- f8b3736 ci: Exclude deploy/README.md from build (#9211)
- 3bfc212 fix: hew update for chart scroll bug (#9210)
- da8a040 feat: CLI allows and requires creating a user with a password DET-10184 (#9112)
- fbccaf1 chore: clean up rm module [RM-202] (#9191)
- 8caf3cb test: user tests [INFENG-455] (#9152)
- 3568f27 fix: Skip expected error from web socket (#9194)
- 1b212ae feat: add kill run endpoint (#9061)
- e7d870e test: use devcluster for react tests [INFENG-449] (#9185)
- bd4a54e fix: shared cluster test to work in OSS again (#9195)
- b874acb docs: fix another instance of broken docs link (#9208)
- 86be18a ci: pass ee into args to prevent latest main deploying as ee (#9207)
- f74ab9c docs: Describe multi rm k8s (#9025)
- 6fb1c52 ci: deploy awscli to system (#9188)
- 9cfbb59 docs: fix nvidia device plugin link for EKS (#9204)
- 3e865c6 test: skip flakey user provision tests (#9203)
- 598784d chore: make multi-RM an EE-only feature [RM-166] (#9192)
- 6d2be52 ci: fix test-det-deploy-local (#9196)
- 5f312ed test: can't launch NSC test assert 404 instead of 403 (#9197)
- 4b1c937 test: fix a test util issue with master config schema assumptions (#9193)
- 0bc13d8 feat: non-blocking metrics reports [MD-144] (#9107)
- 2ced9b9 ci: do dry runs of publish-docs for RCs (#9186)
- 72344e0 feat: Use feature flag for streaming updates - manually update project store (#9170)
- dd7f4b5 docs: add profiling section for trainer API UG [MD-373] (#9177)
- 06586f0 fix: better exception handling in detached mode (#9183)
- 283daab feat: Unfork Enterprise Edition (EE) and require license key for EE features (#9168)
- f233c95 docs: FAQ for python SDK ckpt download, k8s deprecation labels. (#9187)
- 6fcefac chore: bump version: 0.31.0-dev0 -> 0.31.1-dev0
- 19688a9 chore: add docs dropdown link for new version
- 2b2e96a docs: add release notes for 0.31.0 (#9159)
- b194686 chore: style fix for helm initialUserPassword (#9158)
- a5e9f0c chore: add option to auto pick the only matching name on partial hits (#9108)
- 371c90b fix: louden server errors coming from deleteCheckpoints (#9184)
- 0765e38 chore: pass correct master scheme to genai (#9181)
- 26f5e0b fix: report errors from deletecheckpoints endpoint + improve feedback (#9178)
- 1037d83 chore: bumpenv update NGC base images version to 24.03 (#9132)
- 1cc9cd7 fix: count determined-system pods as det pods [RM-148] (#9148)
- 0fc247c fix: single-searcher MNIST example runs for multiple epochs (#9160)
- d41c4a7 fix: fix docs and wording (#9179)
- 5541e54 feat: RM-130 add determined info as pod labels (#9140)
- ee15da0 test: Djanicek/infeng 456/workspaces and projects (#9117)
- e6c0c99 chore: add typing annotations for zmq (#9176)
- 4ceaed0 docs: Add readme to toc (#9175)
- 3105407 chore: make the data_dir consistent to other advertised devc configs (#9157)
- d38fc3c fix: Reset table offset when filtering for models (#9167)
- 338d5d3 docs: remove max supported k8s version. (#9171)
- 35d249f chore: add flake8 relative-import rule (#9169)
- ffed598 feat: support for mounting a hostPath for the shared file system in genai (#9161)
- 2f874b9 test: experiment list page models and sample test [INFENG-451] (#9139)
- fd45ed8 ci: merge EE and OSS doc deploy together [INFENG-625] (#9162)
- 0b2eab0 docs: Copy debug to exp config (#9120)
- 3f70a46 chore: style fix for helm tls (#9163)
- 8a94574 chore: new image publishing (#9090)
- 8b83122 fix: TensorBoard visualization from batch actions. (#9156)
- 384e5c0 fix: fix disable button condition in launch jupyter notebook modal (#9155)
- b109108 feat: add helm master level config for tcd startup hooks (#9135)
- 0228a95 ci: publish-docs installs awscli into user space (#9153)
- 746ba26 chore: add alert metric for Prometheus and add Grafana alert docs [RM-118] (#9150)
- 291565b fix: keras and tensorflow import errors in new versions (#9141)
- 831df43 feat: create flat runs view [ET-24] (#9023)
- 5854b8b chore: add a devcluster config to run Determined across multiple Kubernetes clusters locally (#9151)
- d0497da fix: fix docs for log retention (#9149)
- bd29f1f fix: cli gives misleading error message when logging in with a bad password [MD-277] (#8990)
- 95f87d7 fix: ensure all columns have widths (#9136)
- 3f7a396 test: fix test_logging typehint syntax error (#9142)
- 93e7bdf test: ignore e2e test cases in vitest (#9128)
- 4d1b8ae docs: revert helm values change for multirm (#9145)
- c3d13b6 docs: revert-multiRM-mc-doc (#9144)
0.31.0
Release Notes
Changelog
- 583e0c3 chore: bump version: 0.31.0-rc7 -> 0.31.0
- 29574c4 docs: add release notes for 0.31.0 (#9159)
- 40c34cb chore: bump version: 0.31.0-rc6 -> 0.31.0-rc7
- 75b7e43 fix: louden server errors coming from deleteCheckpoints (#9184)
- 44503bb chore: bump version: 0.31.0-rc5 -> 0.31.0-rc6
- 956df40 fix: fix docs and wording (#9179)
- dae548d fix: report errors from deletecheckpoints endpoint + improve feedback (#9178)
- 592280d chore: style fix for helm tls (#9163)
- 7565447 chore: bump version: 0.31.0-rc4 -> 0.31.0-rc5
- 2daa1fc fix: TensorBoard visualization from batch actions. (#9156)
- 4bbc20d fix: fix disable button condition in launch jupyter notebook modal (#9155)
- ac15a86 chore: bump version: 0.31.0-rc3 -> 0.31.0-rc4
- 990fbfb feat: add helm master level config for tcd startup hooks (#9135)
- a6ae2aa ci: publish-docs installs awscli into user space (#9153)
- d5deffb chore: bump version: 0.31.0-rc2 -> 0.31.0-rc3
- 7f7d2bf fix: fix docs for log retention (#9149)
- 776c5c3 fix: ensure all columns have widths (#9136)
- e8b4fd7 chore: bump version: 0.31.0-rc1 -> 0.31.0-rc2
- 691d190 test: fix test_logging typehint syntax error (#9142)
- 61999f4 docs: revert helm values change for multirm (#9145)
- be36ecd chore: bump version: 0.31.0-rc0 -> 0.31.0-rc1
- f78ccf8 docs: revert-multiRM-mc-doc (#9144)
- 3014dba chore: bump version: 0.31.0-dev0 -> 0.31.0-rc0
- 828532a chore: lock api state for backward compatibility check
- 0547f7f chore: bump version: 0.30.1-dev0 -> 0.31.0-dev0
- 55ef649 chore: change multirm log messages to trace level [RM-151] (#9138)
- 1bb2fe4 feat: expose hyperparameters in experiments api to avoid using deprecated config property for experiment (#9012)
- 5a588e0 chore: lock published urls to preserve redirects
- f46bc69 chore: lock api state for backward compatibility check
- 28b3aff feat: add cluster wide startup hook for tasks (#9124)
- fe2b616 docs: Describe pwds default accounts (#9137)
- d1c268b fix: down migrations (#9133)
- f7a5260 chore: update PyPi metadata (#8971)
- 2c3ce29 chore: set the default db storage as docker volume instead of a mount (#9127)
- 8b11e3a ci: publish docs without installing awscli (#9126)
- 133f838 fix: prevent table breaking on null columnWidths [ET-161] (#9131)
- ec43809 fix: det gcp down doesn't have a det_version argument (#9121)
- c89b3df fix: reduce time and increase reliability of tests (#9125)
- d5f807d feat: helm deploys with a password (#9113)
- 8a7832a fix: unlock mutex for experiment ResourcePool() [RM-152] (#9119)
- e70d38e docs: add a python sdk example for log following. (#8981)
- 3028efb docs: add helm doc updates (#9122)
- cf2f2be fix: fix regression caused by join on trials view (#9091)
- bdab9e4 feat: create Searches view (#9089)
- 65339d2 chore: PR template again [INFENG-600] (#9118)
- 0c6985b chore: update github PR template [INFENG-600] (#9098)
- f4b0471 docs: add instructions on deploying determined via HPE MLDES [SAAS-1877] (#9105)
- c32ac6f chore: add test to CODEOWNERS [INFENG-605] (#9115)
- 25767b9 fix: helm value for gke tests (#9114)
- a0847b8 fix: match GetJobQueueStats behavior in k8s RM to agent RM [RM-136] (#9097)
- 2ef5ab9 chore: better k8s testing with shared gke cluster (#9074)
- 7fc8d7a chore: add nightly gke cluster cleanup job (#9031)
- 5cb7927 chore: bump version: 0.30.0-dev0 -> 0.30.1-dev0
- 7f65779 chore: add docs dropdown link for new version
- 4ae4075 docs: add release notes for 0.30.0 (#9103)
- 9dce6f0 feat: Add model version streaming (#9029)
- 2c6fec7 test: user-page-models (#9084)
- 75b1ff4 feat: det deploy aws adds tags to dynamic agents [RM-140] (#9106)
- d6059e9 feat: Create MoveRun endpoint (#9001)
- 91d7e08 feat: Pre-select ws when launching notebook (#9109)
- f03a8a8 fix: add missing k8s job submission times to allocations (#9028)
- b8bf396 chore: upgrade Bun to fix race condition in tests [DET-10193] (#9082)
- bc8c31c fix: make sure that the genai helm chart services work across namespaces (#9102)
- 58cd22b ci: INFENG-600 remove single commit legacy validation (#9104)
- 943b2cb fix: prevent checkpoint modals from closing on their own [ET-116] [ET-120] (#9094)
- d4eed0e chore: change RM log message back to Debug level (#9093)
- 2af21ee chore: unshadow more builtins (#9092)
- 519d702 docs: update multiRM docs (#9050)
- 3688c3f fix: job queue panic for multirm [RM-123] (#9079)
- f78b9aa fix: add change in master config to devcluster.yaml (#9087)
- 8f02a7f fix: fix master config and experiment config for log retention (#9075)
- fe1a6bb fix: no more shadowing "license" (#9085)
- 3f2d6ab fix: spacing issue with exp list pagination (#9067)
- 37abc6c fix: stop showing loading indicator in queued state (#9081)
- 4050eda chore: bumpenvs for jupyter security update. (#9077)
- fe66b86 chore: CODEOWNERS deploy owned by RM not MD (#9064)
- 94e5d21 ci: more printing about state of master (#9058)
- a3834ac chore: change log level for multirm messages [RM-125] (#9080)
- 0099f4e chore: update old e-mail address (#8944)
- b726cf9 feat: initialize genai shared_fs permissions to agent group in helm deployment (#9065)
- 5031807 chore: api level check if agent slotstats are pre computed (#9073)
- 1a38f0c feat: add /health endpoint [RM-114] (#9062)
- 8217508 style: update harness to eliminate I2041 flake8 errors (#8960)
- 2c7d2a1 feat: add new endpoints to change log retention for experiments and trials (#8982)
- 6c4bc44 fix: slot stats are not filled in everywhere (#9070)
- 53bf20e fix: fix TestScheduleRetention (#9069)
- 1e45918 fix: remove parent_id from create_experiment (#9068)
- 17287f5 fix: API migration to improve performance in resource pool page (#9056)
- 85bb3c8 feat: add log retention for database logger (#8622)
- 8f5de35 fix: remove calls to Pytorch Dataset len (#8647)
- d9e1088 feat: webui nav sidebar dropdown text changes (#9063)
- 06c86ee chore: Remove /lore redirect from deployment template (#9057)
- 18dd29e docs: Update release notes (#9044)
- 8d2a763 refactor: change GlideTable into a reusable component (ET-25) (#8956)
- 93bca2a fix: loading experiments without filterset (#9059)
- a07f0fb faster migrations (#9060)
- e8dba6d feat: add slot stats to /agents endpoints (#9048)
0.30.0
Release Notes
Changelog
- 5a63518 chore: bump version: 0.30.0-rc5 -> 0.30.0
- 97aaa02 docs: add release notes for 0.30.0 (#9103)
- c108443 chore: bump version: 0.30.0-rc4 -> 0.30.0-rc5
- 4ce78b2 fix: prevent checkpoint modals from closing on their own [ET-116] [ET-120] (#9094)
- 8bcdcc8 chore: bump version: 0.30.0-rc3 -> 0.30.0-rc4
- e90238a chore: bump version: 0.30.0-rc2 -> 0.30.0-rc3
- b8db2e6 fix: slot stats are not filled in everywhere (#9070)
- d2e3a5c fix: remove parent_id from create_experiment (#9068)
- 61958ef fix: API migration to improve performance in resource pool page (#9056)
- 62d102b chore: bump version: 0.30.0-rc1 -> 0.30.0-rc2
- bc241b6 docs: Update release notes (#9044)
- 2e31ece fix: loading experiments without filterset (#9059)
- 4efaede chore: bump version: 0.30.0-rc0 -> 0.30.0-rc1
- d2949d3 faster migrations (#9060)
- 4c6e35c feat: add slot stats to /agents endpoints (#9048)
- f32dc82 chore: bump version: 0.30.0-dev0 -> 0.30.0-rc0
- 10030a6 chore: lock published urls to preserve redirects
- 220f820 chore: bump version: 0.29.2-dev0 -> 0.30.0-dev0
- 1e6f0f7 feat: Use filtered resource pools when creating notebook (#9045)
- 74fe16b feat: profiling v2 [MD-27] (#9032)
- 133d127 docs: revert multirm docs changes #9016
- 1992c97 chore: optional DB migrations (#9047)
- 84ba688 fix: docs lint (#9052)
- 848b216 feat: add command det model delete (#9039)
- 1202d5c refactor: DET-9976 remove agentID type from agentrm (#9040)
- 0710c58 docs: Describe editorrestricted (#9049)
- 02da36f chore: mark db-dependent tests as needing to run in integration (#9041)
- 6c88e8d fix: move experiment SQL error (#9042)
- 3fa0df1 Revert "docs: add EditorRestricted role release note (#9007)" (#9046)
- 60cb003 test: Jcom/infeng 454/sign in tests (#9013)
- f08b406 ci: tag CI-deployed resources (#9043)
- 1868723 build(deps): bump google.golang.org/protobuf from 1.28.0 to 1.33.0 (#8996)
- d4ab20b build(deps): bump github.com/docker/docker (#9026)
- e4bc377 test: playwright config and browser usability (#9024)
- f6b9ac8 build(deps): bump github.com/jackc/pgx/v4 from 4.12.0 to 4.18.2 (#8987)
- c811947 chore: helm for multirm kubeconfig_path (#9033)
- 4441d6d feat: Add template to py sdk create_experiment (#8927)
- 5ac1b85 chore: revert helm for multirm kubeconfig_path (#9030)
- 6fec24d chore: helm for multirm kubeconfig_path (#9015)
- 0518785 feat: streaming update code generation for typescript (#8988)
- 39afa3c docs: add documentation for multirm (#9016)
- 7e37c22 chore: add grpc based auth fallback to proxied requests (#8980)
- 5e1f2af fix: Experiment.await_first_trial exits when Experiment is terminal (#9022)
- a603f4c chore: logins return Sessions (#8883)
- 93b6aa2 feat: SearchFlatRuns api call for flat runs table support (#8852)
- fa43bff ci: test-perf uses determined version from github (#9019)
- 137bfcd feat: add model streaming (#8973)
- 8bf280d refactor: consolidate experiment list selection state (#8860)
- 674cd73 ci: DRY skip logic and clarity on step name (#9002)
- 00d145f chore: bump version: 0.29.1-dev0 -> 0.29.2-dev0
- a3ba9e9 chore: add docs dropdown link for new version
- e922a41 docs: add release notes for 0.29.1 (#9014)
- dfed63d chore: reassign ml-sys CODEOWNERship to model-dev (#9000)
- eac7ddf test: document ui e2e with backend test instructions for local (#9005)
- bc1b431 docs: add EditorRestricted role release note (#9007)
- f52f43b chore: warn about det deploy det-version mistmach (#8994)
- 5b17df3 chore: limit code coverage report to files in src; omit generated files (#9003)
- f73fd09 fix: escape regex in ProjectDeleteModal (#8998)
- 73fd1cd feat: Add multi RM name to K8s (#8993)
- 978a02e ci: Djanicek/infraeng 487/circle test runner (#8977)
- 4730d76 chore: ban http.Transport & http.Client; add cleanhttp (#8991)
- 52572d4 fix: improved textcell performance for novels (#8986)
- 89d4708 docs: add EditorRestricted role to rbac docs (#8984)
0.29.1
Release Notes
Changelog
- 6f0810b chore: bump version: 0.29.1-rc2 -> 0.29.1
- d13dfac docs: add release notes for 0.29.1 (#9014)
- 8093bee chore: bump version: 0.29.1-rc1 -> 0.29.1-rc2
- cce4e6b chore: warn about det deploy det-version mistmach (#8994)
- a2576be chore: bump version: 0.29.1-rc0 -> 0.29.1-rc1
- 05a75b3 fix: escape regex in ProjectDeleteModal (#8998)
- de8d02d chore: bump version: 0.29.1-dev0 -> 0.29.1-rc0
- 0a2fd28 chore: change GKE version (#8989)
- 055dd83 docs: Update Deploy on GCP (#8985)
- 47cb6fd fix: remove error text in continue trial modal (#8923)
- 2f40476 chore: bump version: 0.29.0-dev0 -> 0.29.1-dev0
- 84a846e chore: add docs dropdown link for new version
- 0fd6b61 docs: add release notes for 0.29.0 (#8955)
- 115bf13 fix: remove duplicate permissions in rbac CLI output (#8972)
- cc2e9b4 chore: Bumpenvs for NGC+ images (#8975)
- 1a35e5d test: add e2e_tests for multirm k8s [RM-11] (#8926)
- 18154f6 chore: add type ResourcePoolName string (#8978)
- a22656d chore: remove panics from rm initialization (#8983)
- 4ae0987 chore: amend contributing doc to point to correct make rule, as of #2892 (#8947)
- 26c985c fix: Check auth validity before setting isAuthenticated (#8967)
- f67c473 fix: nil deref in ReadPreemptionStatus (#8979)
- 6e1acf4 chore: multirm unique resource pool config changes [RM-74] (#8974)
- ca29879 chore: add multirm router layer to rm module (#8963)
- 2395dcb fix: stopping states are not handled in restore properly [RM-69] (#8958)
- b06c923 feat: allow k8srm to connect with a kubeconfig (#8953)
- f8f860d chore: react-virtuoso LogViewer companion (#8862)
- bf21896 chore: revert multirm refactors (#8962)
- 4309f7f feat: Display resource managers information (#8951)
- 4d538ae test: remove last quarantined test (#8922)
- 68017dd ci: update performance test script for breaking Determined change (#8961)
- b2b85d7 chore: [RM-68] improve readability for unit test (#8950)
- e1ca242 feat: Connect ProjectStore with streaming updates (#8834)
- 191a144 fix: don't access agentState when it may be nil (#8921)
- dcaa893 fix: update default aux container limits and instance types (#8959)
- c7e5d43 docs: fix pre_publish check (#8957)
- 6ecd81e chore: update AMIs - Nvidia minor version bump (#8945)
- e108ed7 chore: set CGO_ENABLED=0 (#8941)
- 54aa739 chore: fix multirm unit test flake (#8949)
- e8b0165 chore: add resource manager name/metadata to resourcepoolv1 proto (#8948)
- a5b425a test: Add e2e test for streaming updates python client (#8901)
- 77d1ede fix: no data plot in chart with data (#8935)
- f416354 test: refactor usage within test_local (#8913)
- c3012ff chore: add multirm module to ResourceManager (#8857)
- d507edd test: CLI workflows in CI use new Python images (#8943)
- fa856ab chore: remove support for Python 3.7; prefer 3.8 (#7329)
- c404c8e fix: [RM-6] remove global max-slots-per-pod default when multiple RMs… (#8938)
- 0d61d15 build: bump up ci setup_remote_docker version (#8942)
- 60436ea chore: pin pandas and ray versions for ray tests (#8932)
- f997cd8 fix: malformed config with gcp up with --initial-user-password (#8936)
- 967e41f build: bump ci cpu image to latest ubuntu 2004 (#8940)
- 592a566 feat: streaming updates python client [MD-246] (#8778)
- 2dfc4f2 chore: remove unused constant (#8934)
- b99ad9f fix: det deploy gcp down shouldn't check quotas (#8931)
- 21fb6d1 fix: det dev curl support for URLs with curly brackets. (#8930)
- 225dba3 fix: specify go1.22.0 (#8929)
- 63adae5 fix: cli fails when listing providers [DET-10127] (#8903)
- 9e8cd68 fix: slurm launcher authenticates preemption notification (#8928)
- acded32 tc: Add release note 8851 (#8864)
- dffda27 chore: cover and bunify project functions in postgres_experiments.go (#8912)
- beac348 fix: SSO button link target (#8925)
- 5367f4f chore: add codeowners for resource-mgmt team files (#8879)
- feb73de tc: Remove broken link (#8924)
- cd88bb5 chore: revert pod spec and test changes (#8920)
- 1dfd6d9 chore: bump up ebs size to 400gb for genai deployments
- 3a3b668 fix: canonicalize master urls shim code (#8919)
- ef49195 test: fix failing Go TestResourceCreationFailed test (#8918)
- 94c7bfe chore: minor tweaks as modev takes over streaming updates (#8909)
- 392f054 ci: fix failing nightlies after auth PR (#8904)
- ad7d260 chore: fix mp.pool test_streaming_metrics_api (#8917)
- 8af4148 test: upload test results to datadog (#8910)
- aab9b42 test: remove redundant (and brittle) assertion (#8894)
- 59385a0 feat: log podspec [DET-9861] (#8899)
- 6857ecf chore: refactor ResourceManager interface for multirm (#8847)
- 9f40603 test: skip tests that need to get scheduler type (#8911)
- 4c50601 chore: upgrade Go from 1.21 -> 1.22 (#8914)
0.29.0
Release Notes
Changelog
- 5079570 chore: bump version: 0.29.0-rc4 -> 0.29.0
- 8fa5b5a docs: add release notes for 0.29.0 (#8955)
- fffde7f chore: bump version: 0.29.0-rc3 -> 0.29.0-rc4
- f939a0f fix: no data plot in chart with data (#8935)
- 5a74e37 build: bump ci cpu image to latest ubuntu 2004 (#8940)
- ad84759 build: bump up ci setup_remote_docker version (#8942)
- f0d9768 fix: malformed config with gcp up with --initial-user-password (#8936)
- 2a61ab3 chore: bump version: 0.29.0-rc2 -> 0.29.0-rc3
- 435e90a chore: fix mp.pool test_streaming_metrics_api (#8917)
- 18e2ea4 chore: bump version: 0.29.0-rc1 -> 0.29.0-rc2
- 799373f fix: slurm launcher authenticates preemption notification (#8928)
- 641174c tc: Add release note 8851 (#8864)
- f275252 chore: bump up ebs size to 400gb for genai deployments
- 8c855b7 fix: SSO button link target (#8925)
- 8d4acd5 tc: Remove broken link (#8924)
- 06875df fix: canonicalize master urls shim code (#8919)
- e5ae865 chore: bump version: 0.29.0-rc0 -> 0.29.0-rc1
- b847ede chore: bump version: 0.29.0-dev0 -> 0.29.0-rc0
- 28c385c chore: lock published urls to preserve redirects
- cbfd3c2 chore: lock api state for backward compatibility check
- b30f609 chore: bump version: 0.28.2-dev0 -> 0.29.0-dev0
- ad94c17 fix: return error from websocket handler if socket id is taken (#8877)
- 4618389 style: update genai logo on sidebar (#8907)
- 8f82087 test: fix tensorboard reattach k8s flake [RM-39] (#8906)
- d24b19a test: unquarantine deploy-local tests (#8896)
- 7c6bec9 chore: refactor proto, schema, and jobservice for multiRM (#8875)
- ca96da1 fix: Genai helm service fix (#8885)
- a89e51e fix: trial comparison text overflow bug fix (#8869)
- 9817a4d chore: add trigger to abort checkpoint deletion (#8878)
- 2689b0b chore: delete unused functions [RM-41] (#8888)
- 9a6afd2 docs: Organize docs (#8898)
- a8ac657 chore: small build system fixes (#8900)
- fa98bf3 fix: add missing ci context to preview cluster
- b15d508 fix: add deploy last main missing ci context (#8892)
- b47b477 chore: cleanup stray comments (#8889)
- ae08265 feat: force default user passwords for all det deploy and CI clusters [RM-28] (#8851)
- be1ab85 fix: unnecessary group related api calls during the initial group page loading (#8882)
- f37bc3e fix: move e2e_tests changes for slurm test from EE to OSS (#8887)
- 93ced86 fix: add missing check for external sessions on exp launch (#8859)
- 944732a ci: more e2e test fixes (#8881)
- ab9505c ci: fix e2e tests in ee (#8880)
- 0bc3106 docs: Add llm blog link to home page (#8874)
- c029327 docs: add link checker utility (#8738)
- e1da471 chore: api's default retry now session's default retry (#8872)
- 7bb9dbc chore: master config updates for multirm [RM-3, RM-4, RM-5, RM-7, RM-29] (#8831)
- f101f3d chore: add allocation info for cluster ui [DET-10018] (#8616) (#8876)
- 72d54be chore: canonicalize master urls everywhere [MLG-878] (#8670)
- e3709bd chore: document internal api errors (#8865)
- 27a279e fix: e2e CPU tests have wrong maxSlotsPerPod number (#8870)
- 03b9b30 chore: bunify postgres_jobs.go (#8858)
- e9ac112 build(deps): bump peter-evans/create-or-update-comment from 3 to 4 (#8760)
- dc3e41e Fix broken links (#8825)
- bccdf0c fix: stop allowing multi-container allocations to launch in single agent config (#8833)
- a1214d7 chore: add allocation info for cluster ui [DET-10018] (#8616)
- 76ec233 chore: refactor a bunch of auth-related python (#8347)
- 66b1e6c chore: bump version: 0.28.1-dev -> 0.28.2-dev0
- f250ad9 chore: add docs dropdown link for new version
- 9d44ca1 docs: add release notes for 0.28.1 (#8861)
- ac8c440 fix: allow experiments to configure k8s sidecars (#8854)
- d07ec40 ci: fix broken ci due to queue version change (#8853)
- c656aac chore: use npm build for hew (#8845)
- 6b63750 feat: add a master API to fetch a trial by external id. (#8730)
- e78a4c0 fix: correctly source bucket region when using minio (#8850)
- dba5f0f fix: replace `react-window` with `react-virtuoso` in transfer component (#8800)
- 2a183da ci: fix performance feature branch using wrong db (#8835)
- 47061fa fix: revert config work from #8765 and #8789 due to feature regressions (#8849)
- a5f38cb chore: remove GetAllocationSummary from RM interface (#8846)
- de28a57 chore: cover postgres_jobs.go (#8841)
- ba8250a chore: update backend coverage target (#8798)
- 556639d fix: show error message from backend API for workspace deletion (#8848)
- 08dfa43 fix: job queue test failures (#8843)
- 876f9c3 chore: configure agent log level through config file (#8819)
- ba03375 chore: move project id onto runs (#8794)
0.28.1
Release Notes
Changelog
- f6cb624 chore: bump version: 0.28.1-rc3 -> 0.28.1
- baaa3bd docs: add release notes for 0.28.1 (#8861)
- fbf9df4 chore: bump version: 0.28.1-rc2 -> 0.28.1-rc3
- a965f15 ci: fix broken ci due to queue version change (#8853)
- d91e8b0 chore: bump version: 0.28.1-rc1 -> 0.28.1-rc2
- 3129d33 fix: revert config work from #8765 and #8789 due to feature regressions (#8849)
- 1888b90 chore: bump version: 0.28.1-rc0 -> 0.28.1-rc1
- c443073 fix: show error message from backend API for workspace deletion (#8848)
- 1fc1496 fix: job queue test failures (#8843)
- a74685f chore: bump version: 0.28.1-dev -> 0.28.1-rc0
- 5b2e32d chore: cleanup the last traces of experiment git fields. [MD-258] (#8830)
- 92a380f feat: Generic task restore (#8802)
- b0fa7dc feat: generic tasks: support startup hooks (#8840)
- ca80022 chore: bunify postgres_checkpoints and add tests (#8783)
- a4dbc03 chore: fix error on terminating experiments on restart (#8837)
- aa98d82 chore: agent state wasn't getting deleted and logged error (#8838)
- bb469fa fix: update hew with bugfixes (#8839)
- 393cfde Fix broken ref (#8836)
- 7a13863 perf: improve GetExperiments + SearchExperiments counting (#8801)
- d8d9965 chore: remove unused SetAllocationName (#8829)
- 1946d9a docs: Update slurm install (#8832)
- 1fd21e7 fix: Fix small typo in Webhook documentation (#8820)
- e341e27 feat: Generic Tasks (#8724)
- fff85e3 fix: handle helm templating in older go template versions (#8828)
- f300d97 chore: hide genai helm values config and fix var name (#8821)
- 6206bde feat: add streaming updates core functionality and project streaming (#8669)
- ed61121 fix: stop truncating log timestamps to avoid missing logs [WEB-1791] (#8815)
- 43d3f21 fix: check for models before deleting workspace (#8804)
- bb59fa2 ci: wait longer for performance test db to startup (#8796)
- cfffe96 docs: Remove legacy pages (#8818)
- 1c3f3c4 fix: mitigate many unnecessary api calls in user management table (#8816)
- 4612c41 fix: agent config precedence (#8656)
- 762fcef feat: Deploy GenAI in Helm (#8727)
- 8e067d9 fix: remove possible hang from ship_logs.py [MLG-1565] (#8803)
- 1daf9d3 docs: remove duplicated note (#8813)
- 56e7000 fix: remove extra quotes around IdentifyTask (#8792)
- 3805ebd chore: add testing for k8s informer panic (#8810)
- a35696d refactor: condense trial update functions (#8808)
- 45c578b chore: bump version: 0.28.0-dev0 -> 0.28.1-dev
- 6520629 chore: add docs dropdown link for new version
- ed2136d docs: add release notes for 0.28.0 (#8807)
- 8258565 chore: bump version: 0.27.2-dev0 -> 0.28.0-dev
- c5afb6c fix: fetch experiment in case config data is not contained (#8789)
- 4e17ef7 chore: differentiate between programmatic and web page requests (#8795)
- a1a6e20 chore: add ee helm chart changes to oss (#8799)
- 65c811c docs: Add mention of RPMs to on-prem _index.rst (#8773)
- ad765d4 docs: adds/corrects EE changes, merges to OSS (#8788)
- f1a45ae perf: update proto_checkpoint_view to use index (#8793)
- abd590d Revert "docs: Update oidc and saml docs (#8777)" (#8791)
- bb88b01 fix: improve trial log request cancelling (#8787)
- 17f305f ci: make perf tests only alert on failure (#8790)
- 422f5aa perf: avoid loading model def in experiment model (#8742)
- 7698452 perf: improve GetExperiments showTrialData performance (#8753)
- e801cfe perf: add index to checkpoints_v2 id (#8758)
- 71db4e1 perf: add indexes to tasks and allocations (#8757)
- e0e6cf0 perf: improve get_workspaces query (#8751)
- ef656bc perf: improve resource agg performance (#8735)
- e873381 fix: retry watcher failure causes infinite loop (#8786)
- ba2f190 fix: replace experiment config (#8765)
- 85d1053 chore: rename postgres_command_intg_test.go (#8785)
- 40a70cf test: performance test CI work (#8761)
- 36a2e29 chore: bunify db/postgres_tasks.go (#8764)
- 07494cf fix: update hew to a version without broken documentcard prompts (#8782)
- 9502059 feat: GCS client should retry on `TooManyRequests`. (#8780)
- ec850ae test: add intg tests for db/postgres_tasks.go (#8750)
- 9ec2f7d chore: update gke version to comply with latest release for e2e tests (#8781)
- cefa242 chore: persist checkpoint storage backend ID (#8690)
- 905e449 chore: migrate db schema trials to runs (#8723)
- dfbb926 chore: clean up leftover debug print statements (#8755)
0.28.0
Release Notes
Changelog
- 7f9b082 chore: bump version: 0.28.0-rc4 -> 0.28.0
- ed1b7f0 docs: add release notes for 0.28.0 (#8807)
- c4b6f57 chore: bump version: 0.28.0-rc3 -> 0.28.0-rc4
- 27ce0a2 chore: add ee helm chart changes to oss (#8799)
- 959a096 chore: add ee helm chart changes to oss (#8799)
- f513174 chore: bump version: 0.28.0-rc2 -> 0.28.0-rc3
- 083c314 chore: bump version: 0.28.0-rc1 -> 0.28.0-rc2
- e272bf0 chore: bump version: 0.28.0-rc0 -> 0.28.0-rc1
- 6080d39 chore: bump version: 0.27.2-rc4 -> 0.28.0-rc0
- 3cbce1d chore: bump version: 0.27.2-rc3 -> 0.27.2-rc4
- 89df98b docs: adds/corrects EE changes, merges to OSS (#8788)
- 2e27c71 chore: bump version: 0.27.2-rc2 -> 0.27.2-rc3
- 1abe34f chore: bump version: 0.27.2-rc1 -> 0.27.2-rc2
- e23e162 fix: improve trial log request cancelling (#8787)
- 55b5bd4 chore: bump version: 0.27.2-rc0 -> 0.27.2-rc1
- 5edfd81 fix: retry watcher failure causes infinite loop (#8786)
- 6a21d44 fix: update hew to a version without broken documentcard prompts (#8782)
- 74e341d chore: bump version: 0.27.2-dev0 -> 0.27.2-rc0
- ea9e903 chore: lock published urls to preserve redirects
- 0321e1f chore: lock api state for backward compatibility check
- 3783f2b docs: Update oidc and saml docs (#8777)
- 141afa4 docs: update dependency version in contributing readme (#8776)
- 994527f fix: Text filter on ProjectMoveModal (#8775)
- aa65c07 chore: use vite-plugin-svg-to-jsx package (#8772)
- 98c61f3 test: do not import model_hub test requirements (#8771)
- 1e2da10 ci: retry git fetch for early stopping checks (#6318)
- c73712b docs: Replace basic quickstart (#8770)
- fda515d fix: python requirements for pytest and moto (#8769)
- 78929c0 fix: Filter value resets when switching column types [WEB-1949] (#8731)
- 7ddf965 docs: Fix minor issues (#8768)
- 31f6f99 fix: add default transport to proxy connection (#8767)
- 149b7fa build(deps): bump slackapi/slack-github-action from 1.24.0 to 1.25.0 (#8766)
- 719169a docs: Fix dropdown url (#8763)
- 56406a2 Update helm chart config ref (#8762)
- 5973f8e chore: bump version: 0.27.1-dev0 -> 0.27.2-dev0
- 260c2bc chore: add docs dropdown link for new version
- 4b4d14a docs: add release notes for 0.27.1 (#8746)
- 7841d9e feat: the new quick start guide link (#8759)
- 995311a feat: expconf flag to force scheduling on a single node/container/pod (#8743)
- 64d588f refactor: use hew Tree and Divider components [WEB-1920] (#8736)
- f771acb fix: cease many model fetch api calls in checkpoint tab (#8749)
- 96b9064 docs: Add qs for webui users (#8754)
- d68ffaa docs: API deprecate returning config for bulk endpoints (#8732)
- f21a516 tests: cover queries inside internal/users/postgres_users.go (#8729)
- 90a57cb fix: Experiment table, right-click context menu [WEB-1942] (#8756)
- 2ffc18f chore: import missing EE helm chart change [ci skip] (#8747)
- 87b6cf3 fix: use the new genai docker repo (#8745)
- 7c3650f chore: `make devcluster` to rebuild bindings before harness and webui. (#8748)
- 7f3ddfb feat: Add a modal to enable/disable Agents [WEB-1718] (#8721)
- bd0a9ea fix: pagination fix in model detail page (#8744)
- 43c074e feat: helm option to mount shared_fs checkpoints to master (#8741)
- 9f06d35 fix: use selected checkpoints when registering (#8739)
- 6db8c06 test: cover agent_state.go SQL queries (#8740)
- eb48302 test: cover db.GroupCheckpointUUIDsByExperimentID (#8508)
- d661404 fix: compress data from API for the page load performance improvement (#8720)
- b71da7a fix: batch metric writes to TensorBoard [MLG-990] (#8688)
- bd78ec1 feat: Preserve 'redirect' query during logout [GAS-489] (#8728)
- 62941a2 refactor: remove antd App component [WEB-1922] (#8713)
- bf5b1d1 chore: fix unused-imports warning in protos build. (#8726)
- 0bda0d9 fix: use hew Alert [WEB-1918] (#8711)
- 1c21f6a chore: Move from internal glide-table-grid to v6.0.0 [WEB-1945] (#8725)
- f32e015 fix: local checkpoint download path fix (#8722)
- 190af1d docs: [FE-270] add PBS known issue - Cluster tab does not display GPU information (#8719)
- 6d744f7 feat: content-length for tar checkpoint downloads (#8684)
- 11e3ba9 chore: upgrade [email protected] (#8718)
- 92fe3a6 docs: [FE-269] Add documentation detailing configuration steps to set the values for ngpus. (#8714)
- b69a49c chore: update github path in docker docs (#8687)
- faea553 chore: codecov reports to match go coverage reports (#8696)
- 0782c35 chore: standardize oidc/saml group & display attribute names in helm config (#8689)
- acca434 chore: update oss/ee oidc & saml helm config (#8680)
- 7188b69 fix: use Hew dropdown on FilterGroup [WEB-1938] (#8715)
- a410c45 chore: Upgrade to vite 5 (#8676)
- dbeb458 fix: support `CommandState` for experiment icon (#8709)
- 83fe474 docs: fix references on children of "training reference" root (#8708)
- 71eaa5a chore: Replace antd reset.css with modern-normalize (#8706)
- e0e08b6 fix: Update hew for chart fix, avoid error from Typography.Label (#8712)
- fef93a4 build(deps): bump actions/cache from 3 to 4 (#8710)
- 4aedded docs: Update docs to pass linter (#8705)
- 00c2746 fix: restore original user store poll on leaving workspace details (#8702)
- 73760cd Revert "docs: Update docs to pass linter" (#8704)
- 4b7b705 [docs] Update docs to pass linter (#8703)
- 2402133 docs: Update Docker Installation Instructions (#8659)
- f8a2434 docs: Update Linux distros, add WSL, and archs to Quickstart (#8662)
- 132919f docs: Overhaul WSL deployment instructions (#8658)
- e7dc7aa chore: Replace custom archived note with Hew badge (#8695)
- f2899cc fix: fix CreateExperiment for Remote Users (#8700)
- f00768f chore: remove unused files (#8698)
- 2e60167 chore: TrialsComparisonModal style fixes [WEB-1919] [WEB-1909] (#8674)
- e8d6448 Revert "fix: restore original user store poll on leaving workspace details"
- 6d3f9ff fix: playwright fix (#8699)
- c869ce7 fix: restore original user store poll on leaving workspace details
0.27.1
Release Notes
Changelog
- e05d57d chore: bump version: 0.27.1-rc1 -> 0.27.1
- 94316ae docs: add release notes for 0.27.1 (#8746)
- b8a772a feat: helm option to mount shared_fs checkpoints to master (#8741)
- 3abd7c7 fix: use selected checkpoints when registering (#8739)
- c479a28 chore: bump version: 0.27.1-rc0 -> 0.27.1-rc1
- 6e58dca fix: playwright fix (#8699)
- 204c32e fix: use Hew dropdown on FilterGroup [WEB-1938] (#8715)
- f2baf39 fix: support `CommandState` for experiment icon (#8709)
- 7a9b0fd fix: Update hew for chart fix, avoid error from Typography.Label (#8712)
- 4085b42 fix: fix CreateExperiment for Remote Users (#8700)
- 23574cf chore: bump version: 0.27.1-dev0 -> 0.27.1-rc0
- e2cf980 chore: bumpenvs 0.27.1 (#8701)
- 4fffb09 docs: Update references to RHEL and CentOS to Enterprise Linux (#8660)
- 6ccc7dc docs: Fix broken ext link (#8697)
- 06d7669 docs: fix a few hyperlinks in the Python SDK reference. (#8693)
- e0558f9 fix: Workspace icons display redundant tooltips [WEB-1912] (#8677)
- 0329667 fix: docs version switcher [MLG-1524] (#8692)
- 130ebb7 fix: Trial APIs `--local` should respect `.detignore`. [MLG-1352] (#8683)
- 3d69af2 test: make compute_stats have a min tolerance of 3 seconds (#8691)
- 4a30ce8 chore: improve error message for continuing a completed hp search [DET-10041] (#8636)
- 11a0f27 fix: add zmq heartbeat in DistributedContext [MLG-1133] (#8681)
- 10aaf02 chore: update helm image path to use main branch (#8686)
- 8671fbd feat: set jupyter notebook file browser root to `/`. (#8678)
- 2fdda37 fix: update hf trainer api example (#8685)
- 4118806 fix: HP ScatterPlots scrollbar (#8682)
- 248e88c fix: update helm chart for new logo (#8675)
- 4e8d9a9 feat: Truncate and pad cell values in glide table [WEB-1778] (#8665)
- 38e6825 fix: Remove out of place spinner at dashboard page (#8668)
- df67efe fix: use FixedSizeList for column picker to fix jitter (#8628)
- 127cf98 fix: break out stopping... states from stopped states in webui (#8672)
- 766577d feat: add health check for genai deployments (#8613)
- 0d93855 feat: `det (notebook|tensorboard|shell|command) ls --sort-by ...` [DET-6126] (#8649)
- 39ad129 fix: wait on algolia search upload (#8661)
- f5bebc3 chore: delete unused project model (#8673)
- c99bfc3 fix: Helm master-service clusterip mngt. openshift (#8546)
- 4354c8e chore: reinstate original DuplicateError to postgres_users.go (#8657)
- 67a2c40 chore: bump version: 0.27.0-dev0 -> 0.27.1-dev0
- 8446c43 chore: add docs dropdown link for new version
- 8651f41 docs: add release notes for 0.27.0 (#8671)
- 25f0ba9 chore: Add warning to Patch Mast config to specify changes are ephemeral (#8577)
- fd4c1a0 fix: don't drop active_user on expired tokens [MLG-1494] (#8653)
- 9dc4fc8 feat: add `det e unpause` alias. (#8562)
- 7d9f295 style: reformat `det (n|t|s|c)` arguments code. (#8652)
- c1dbed9 chore: Remove antd usage from determined [WEB-1723] (#8605)
- 0c9e19f feat: conditionally add genai to the sidebar (#8496)
- 4f4ef5a fix: stop using distutils (for python 3.12) [MLG-1519] (#8667)
- 1e8baee fix: Add theme to TableFilterDropdown [WEB-1952] (#8663)
- 24cfbb8 docs: Add section on viewing topology (#8638)
- ef26ad2 fix: Revert "chore: Job/task displays Running instead of Scheduled (#8335)" (#8654)
- 926a6de chore: delete unused experiment fields in db (#8639)
- 33dcfbd fix: replace View Logs link with useNavigate (#8655)
- 87f713f fix: Check if loading before saying no workspaces / projects [WEB-1904] (#8627)
- ae072f7 docs: Reorder the tutorials for visibility (#8650)
- 5774f5c chore: improve string_to_bool help text of cli [MLG-1208] (#8651)
- ab08345 chore: move postgres_command.go to command module (#8648)
- c1a7461 chore: add Go coverage ratchet to unit and integration tests (#8602)
- c0614a8 fix(cli/deploy/simple-rds): set template default snapshot (#8645)
- 2403cd5 fix: Sort order in users.active column (#8646)
- b60f8de fix: Trial metric chart series always have unique colors (#8626)
- d1b585c test: fix `test_tf_keras_parallel`. (#8641)
- 5f19a53 feat(cli/deploy): add db snapshot flag for simple-rds [INFENG-269] (#8443)
- 2e6a634 chore: print TB status message to logs [MLG-1119] (#8642)
- c8d50a3 fix: ResourcepoolDetails polls for job stats [WEB-1914] (#8635)
- 2604235 chore: Update hew to 0.6.22 (#8634)
- 34891a1 docs: Add oidc note about uniqueness (#8637)
- 001d827 chore: extend CLI NTSC timeout [MLG-870] (#8632)
- 86aab14 docs: Document scim display name attribute (#8606)
- 134c2e1 fix: reduce tfkeras iris const batch size (#8633)
- a5e4b70 docs: Mention examples in the git readme (#8623)