Pauline Bailly-Masson
7b325cd573
Fix security issue 5 ( #42072 )
...
fix
Co-authored-by: Pauline <pauline@Paulines-MacBook-Pro-2.local >
2025-11-06 19:50:59 +01:00
Pauline Bailly-Masson
a9e2b80c71
add workflow to check permissions and advise a set of permissions req… ( #42071 )
...
add workflow to check permissions and advise a set of permissions required
Co-authored-by: Pauline <pauline@Paulines-MacBook-Pro-2.local >
2025-11-06 18:55:01 +01:00
Yih-Dar
5aa7dd07da
Revert back to use GitHub context ( #42066 )
...
* check
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-06 14:41:58 +01:00
Yih-Dar
76fea9b482
Fix another Argument list too long in pr_slow_ci_suggestion.yml ( #42061 )
...
* fix
* trigger
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-06 13:33:23 +01:00
Yih-Dar
8a96f5fbe8
Be careful at explicit checkout actions ( #42060 )
...
final
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-06 11:01:06 +01:00
Yih-Dar
17fdaf9b7a
Avoid explicit checkout in workflow ( #42057 )
...
* remove explicit checkout
* check 1
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-06 09:31:20 +01:00
Yih-Dar
bb65d2d953
Fix pr_slow_ci_suggestion.yml after #42023 ( #42049 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-05 22:10:12 +01:00
Yih-Dar
57bdb4a680
Cleanup workflow - part 1 ( #42023 )
...
* part 1
* part 2
* part 3
* part 4
* part 5
* fix 1
* check 1
* part 6
* part 7
* part 8
* part 9
* part 10: rename file
* OK: new_model_pr_merged_notification.yml
* part 11
* fix 2
* revert check
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-05 21:01:06 +01:00
Yih-Dar
561233cabf
Change trigger time for AMD CI ( #42034 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-05 14:17:12 +01:00
Pauline Bailly-Masson
20396951af
CodeQL workflow for security analysis ( #42015 )
...
* CodeQL workflow for security analysis
Created CodeQL workflow to use reusable workflow from internal and simplified configuration.
* Update CodeQL workflow for main branch only and removing python from analysis
Restrict CodeQL analysis to 'actions' language only.
* Disable pull_request trigger in CodeQL workflow temporarily
Comment out pull_request trigger for CodeQL workflow
2025-11-05 10:59:37 +01:00
Rémi Ouazan
dd4e048e75
Reduce the number of benchmark in the CI ( #42008 )
...
Changed how benchmark cfgs are chosen
2025-11-04 14:07:17 +01:00
Yih-Dar
6d4450e341
Fix torch+deepspeed docker file ( #41985 )
...
* fix
* delete
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-04 10:41:22 +00:00
Yih-Dar
258c76e4dc
Fix run slow v2: empty report when there is only one model ( #42002 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-04 06:46:21 +01:00
Guillaume LEGENDRE
1619a3475f
fix (CI): Refactor SSH runners ( #41991 )
...
* Change ssh runner type
* Add wait step to SSH runner workflow
* Rename wait step to wait2 in ssh-runner.yml
* Remove wait step from ssh-runner.yml
Removed the wait step from the SSH runner workflow.
* Update runner type for single GPU A10 instance
* Update SSH runner version to 1.90.3
* Add sha256sum to ssh-runner workflow
* Update runner type and remove unused steps
2025-11-03 18:16:32 +01:00
Rémi Ouazan
ff0f7d6498
More data in benchmarking ( #41848 )
...
* Reduce scope of cross-generate
* Rm generate_sall configs
* Workflow benchmarks more
* Prevent crash when FA is not installed
2025-11-03 18:05:26 +01:00
Rémi Ouazan
80305364e2
Move the Mi355 to regular docker ( #41989 )
...
* Move the Mi355 to regular docker
* Disable gfx950 compilation for FA on AMD
2025-11-03 16:41:06 +01:00
Mohamed Mekkouri
a623cda427
[kernels] Add Tests & CI for kernels ( #41765 )
...
* first commit
* add tests
* add kernel config
* add more tests
* add ci
* small fix
* change branch name
* update tests
* nit
* change test name
* revert jobs
* addressing review
* reenable all jobs
* address second review
2025-11-03 16:36:52 +01:00
Yih-Dar
8fb854cac8
Run slow v2 ( #41914 )
...
* Super
* Super
* Super
* Super
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-11-01 19:40:40 +01:00
Yih-Dar
cad7eeeb5e
Minor fix in docker image build workflow ( #41949 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-30 11:02:11 +01:00
Jitesh Gupta
76fc50a152
Cache latest pytorch amd image locally on mi325 CI runner cluster ( #41926 )
2025-10-29 19:45:37 +01:00
Yih-Dar
10d557123b
Update some workflow files ( #41892 )
...
* update
* update
* final check
* final check
* final clean
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-29 14:42:05 +01:00
Yih-Dar
e2e8dbed13
CI workflow for Flash Attn ( #41857 )
...
ci for flash attn
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-25 09:45:47 +02:00
Anton Vlasjuk
2c5b888c95
[Onnx docs] Remove some traces ( #41791 )
...
fix
2025-10-23 10:34:25 +02:00
Luc Georges
71db0d49e9
feat: add benchmark v2 ci with results pushed to dataset ( #41672 )
2025-10-20 08:56:58 +01:00
Yih-Dar
307c523854
further improve utils/check_bad_commit.py ( #41658 ) ( #41690 )
...
* fix
* Update utils/check_bad_commit.py
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
2025-10-17 23:07:00 +02:00
Steven Liu
b9bd8c45a1
[CI] Build translated docs ( #41632 )
...
fix
2025-10-16 14:01:33 +02:00
Marc Sun
2b2c20f315
Update issue template ( #41573 )
...
* update
* fix
2025-10-15 13:54:37 +02:00
Rémi Ouazan
94df0e6560
Benchmark overhaul ( #41408 )
...
* Big refactor, still classes to move around and script to re-complexify
* Move to streamer, isolate benches, propagate num tokens
* Some refacto
* Added compile mode to name
* Re-order
* Move to dt_tokens
* Better format
* Fix and disable use_cache by default
* Fixed compile and SDPA backend default
* Refactor results format
* Added default compile mode
* Always use cache
* Fixed cache and added flex
* Plan for missing modules
* Experiments: no cg and shuffle
* Disable compile for FA
* Remove wall time, add sweep mode, get git commit
* Review compliance, start
* Apply suggestions from code review
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
* Update benchmark_v2/framework/benchmark_runner.py
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
* Disable workflow
* Pretty print
* Added some pretty names to have pretty logs
* Review n2 compliance (end?)
* Style and end of PR
---------
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com >
2025-10-14 21:41:43 +02:00
Marc Sun
1a3a5f5289
Remove SigOpt ( #41479 )
...
* remove sigopt
* style
2025-10-09 18:05:55 +02:00
Yih-Dar
42bcc81ba2
Minor security fix for ssh-runner.yml ( #41317 )
...
security issue
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-03 14:14:34 +02:00
Yih-Dar
7adb43e60a
Build doc in 2 jobs: en and other languages ( #41290 )
...
* separate
* separate
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-02 14:33:57 +00:00
Yih-Dar
e1f1d32af0
Remove some previous team members from allow list of triggering GitHub Actions ( #41263 )
...
* delete
* delete
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-10-02 16:32:28 +02:00
Luc Georges
639ad8ccd9
feat: use aws-highcpu-32-priv for amd docker img build ( #41285 )
...
* feat: use `aws-highcpu-32-priv` for amd docker img build
* feat: add `workflow_dispatch` event to docker build CI
2025-10-02 12:53:14 +00:00
Yih-Dar
9d8f693c7e
add peft team members to issue/pr template ( #41262 )
...
* add
* Update .github/PULL_REQUEST_TEMPLATE.md
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
2025-10-01 17:26:59 +00:00
Yih-Dar
8e7b0655f1
update code owners ( #41221 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-30 16:21:19 +02:00
Tom Aarsen
1f1e93e095
Align pull request template to bug report template ( #41220 )
...
The only difference is that I don't direct users to https://discuss.huggingface.co/ for hub issues.
2025-09-30 14:25:41 +02:00
Ákos Hadnagy
399c589dfa
Separate docker images for Nvidia and AMD in benchmarking ( #41119 )
...
Separate docker images for Nvidia and AMD
2025-09-29 17:03:27 +02:00
Guillaume LEGENDRE
2dcb20dcec
CI Runners - move amd runners mi355 and 325 to runner group ( #41193 )
...
* Update CI workflows to use devmi355 branch
* Add workflow trigger for AMD scheduled CI caller
* Remove unnecessary blank line in workflow YAML
* Add trigger for workflow_run on main branch
* Update workflow references from devmi355 to main
* Change runner_scale_set to runner_group in CI config
2025-09-29 11:14:19 +02:00
Yih-Dar
03c92884b5
Update team member list for some CI workflows ( #41094 )
...
* update list
* update list
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-23 09:48:40 +00:00
Yih-Dar
1bb69cce82
Fix CI jobs being all red 🔴 (false positive) ( #41059 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-22 16:51:00 +02:00
Ákos Hadnagy
b9d337b6f3
Add write token for uploading benchmark results to the Hub ( #41047 )
...
* Separate write token for Hub upload
* Address review comments
* Address review comments
2025-09-22 14:13:46 +00:00
Ákos Hadnagy
67097bf340
Fix benchmark runner argument name ( #41012 )
2025-09-20 10:53:56 +02:00
Yuanyuan Chen
96a3e898cd
RUFF fix on CI scripts ( #40805 )
...
Signed-off-by: Yuanyuan Chen <cyyever@outlook.com >
2025-09-19 13:50:26 +00:00
Joao Gante
fce746512b
[docs] rm stray tf/flax autodocs references ( #40999 )
...
rm tf references
2025-09-19 12:04:12 +01:00
Ákos Hadnagy
61eff450d3
Benchmarking v2 GH workflows ( #40716 )
...
* WIP benchmark v2 workflow
* Container was missing
* Change to sandbox branch name
* Wrong place for image name
* Variable declarations
* Remove references to file logging
* Remove unnecessary step
* Fix deps install
* Syntax
* Add workdir
* Add upload feature
* typo
* No need for hf_transfer
* Pass in runner
* Runner config
* Runner config
* Runner config
* Runner config
* Runner config
* mi325 caller
* Name workflow runs properly
* Copy-paste error
* Add final repo IDs and schedule
* Review comments
* Remove wf params
* Remove parametrization from workflow files
* Fix callers
* Change push trigger to pull_request + label
* Add back schedule event
* Push to the same dataset
* Simplify parameter description
2025-09-19 08:54:49 +00:00
Yih-Dar
5ac3c5171a
Track the CI (model) jobs that don't produce test output files (process being killed etc.) ( #40981 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-18 18:27:27 +02:00
Yih-Dar
738b223f57
Add captured actual outputs to CI artifacts ( #40965 )
...
* fix
* fix
* Remove `# TODO: ???` as it makes me `???`
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-18 15:40:53 +02:00
Yih-Dar
270da89708
Remove runner_map ( #40880 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-09-16 15:18:07 +02:00
Arthur
96d3795cfc
Update model tags and integration references in bug report ( #40881 )
2025-09-15 12:08:29 +02:00
Ákos Hadnagy
9c804f7ec4
Redirect MI355 CI results to dummy dataset ( #40862 )
2025-09-14 18:42:49 +02:00