-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make sure nvtx3 is used #4937
make sure nvtx3 is used #4937
Conversation
I verified kaldi compiles with this patch even on nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.0 (generation older drivers). |
I'll merge this and we will see what happens :) |
we got that same failure with lattice-utils-test. |
yeah, that wasn't supposed to fix that -- that was fixing some compilation
issues in ubuntu docker
The log of the failed test is here:
https://github.com/kaldi-asr/kaldi/actions/runs/10886905072/artifacts/1938314600
y.
…On Mon, Sep 16, 2024 at 7:08 PM Daniel Povey ***@***.***> wrote:
we got that same failure with lattice-utils-test.
—
Reply to this email directly, view it on GitHub
<#4937 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACUKYX5KR27LQJTDDMAHINTZW4GADAVCNFSM6AAAAABODZGOLSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNJTGQ3DMNZUHA>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
It looks like it might possibly be a failure with OpenFST's ShortestPath algorithm with n > 1 (i.e. requesting more than one path in n-best). Does anyone have time to see whether they made any changes to that algorithm from 1.7.x to 1.8.x? |
I can try maybe next week unless someone else volunteers
y.
…On Tue, Sep 17, 2024 at 6:13 AM Daniel Povey ***@***.***> wrote:
It looks like it might possibly be a failure with OpenFST's ShortestPath
algorithm with n > 1 (i.e. requesting more than one path in n-best). Does
anyone have time to see whether they made any changes to that algorithm
from 1.7.x to 1.8.x?
—
Reply to this email directly, view it on GitHub
<#4937 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACUKYX5YV6OEPR2OY2EKXXDZW6UALAVCNFSM6AAAAABODZGOLSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNJUGQ3DQNZYGA>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
I tried to have a look at diffs of the OpenFst code but don't see anything that looks very relevant. It will probably be necessary to see the FSTs in the failing test. |
Last time I was looking at this, the fats looked the same only the weights
were different (it was as if the second part of the composite lattice
weight disappeared, ie was eg 15 instead of 15, 0_1)
Y.
…On Tue, Sep 17, 2024 at 14:33 Daniel Povey ***@***.***> wrote:
I tried to have a look at diffs of the OpenFst code but don't see anything
that looks very relevant. It will probably be necessary to see the FSTs in
the failing test.
—
Reply to this email directly, view it on GitHub
<#4937 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACUKYXY556RVWN3NHSSY5PDZXAOTPAVCNFSM6AAAAABODZGOLSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNJVGYYTOMZXGI>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
* upstream/master: (76 commits) [CI] Automatically refresh docker images (kaldi-asr#4942) make sure nvtx3 is used (kaldi-asr#4937) [CI] Upgrade checkout action (kaldi-asr#4940) Upload logs after build failure (kaldi-asr#4939) upload the error logs to artifact repository (kaldi-asr#4938) fix cuda build with openfst 1.8.3 (kaldi-asr#4936) fix the non-compiling tests, thx to @csukuangfj add support for later openfst versions make nonconst catches const (kaldi-asr#4926) Support openfst-1.8.2 Support openfst-1.8.1 Support openfst-1.8.0 Support openfst-1.7.6 make codefactor happier improve compatibility with C++ standard, esp. C++20 do a full cleanup on apple silicon disable warning about unused flags msse and msse on Apple Silicon fix tests and address comments catch exception by reference so that compiler does not complain Fix reported issues w.r.t python2.7 and some apple silicone quirks ...
As far as I know, this is what nvToolsExt repo recommends (https://github.com/NVIDIA/NVTX) and they make it sound older version (sub-3) is ancient. Not sure how valid that claim is, but latest cuda docker (docker.io/nvidia/cuda:12.6.1-cudnn-devel-ubuntu22.04) needs this changes to compile -- otherwise it will end up complaining about mixing nvtx versions -- see https://github.com/jtrmal/kaldi/actions/runs/10830626409/job/30050811732#step:7:6551
I don't have much access to different grids/setups so not sure if this would be good enough or it will cause troubles -- in that case we would have to somehow detect nvtx3 and create define for conditional compiling. Or just get nvtx3 from github -- version 3 is header-only, so doesn't need any compilation and then linkage with a resulting library
TLDR: official docker image nvidia/cuda:12.6.1-cudnn-devel-ubuntu22.04 doesn't compile without this