stanc gives useless error message when model source has non-ASCII characters #501

Closed
ksvanhorn opened this issue Mar 9, 2018 · 4 comments

Comments

@ksvanhorn

Summary:

If you have a file that contains a non-ASCII character, then running stanc on the file just tells you that there was a C++ exception, and nothing else.

Description:

If your Stan source contains a non-ASCII character, then stanc just dies with "c++ exception (unknown reason)", which doesn't help in tracking down the cause of the problem, and makes it look like some sort of system instability.

Reproducible Steps:

Write out the "Eight Schools" model to a file schools.stan, then replace all occurrences of the identifier "mu" with the Unicode character "μ". Save the result as UTF-8. Then run

stanc("schools.stan")

Current Output:

The following error message:

Error in stanc("~/Tmp/foo.stan") : c++ exception (unknown reason)

Expected Output:

Something telling me that I have illegal, non-ASCII characters in my Stan source.

RStan Version:

2.17.2

R Version:

R version 3.4.3 (2017-11-30)

Operating System:

OS X 10.13.3

@maverickg
Contributor

maverickg commented Mar 9, 2018 via email

@bob-carpenter

That's not what happens for me. I'm on Mac OS X 10.10.5 and RStan 2.17.3 (one beyond where you're at). Is this still an issue for you in 2.17.3?

parameters {
  real μ;
}
model {
  μ ~ normal(0, 1);
}

Here's what I see:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

  error in 'modelbdbe5bf695ec_pe' at line 2, column 10
  -------------------------------------------------
     1: parameters {
     2:   real μ;
                 ^
     3: }
  -------------------------------------------------

PARSER EXPECTED: <identifier>

I agree that could be a much clearer message as to why there's a problem.

@ksvanhorn
Author

I think this is related to issue #431; when I reinstalled rstan by compiling from source as recommended here under "Troubleshooting," this problem went away.

BTW, in response to Bob's comment, the really nasty problem here was not having any idea what line contained the error; even the minimal message "PARSER EXPECTED: <identifier>", with a specific line indicated, is hugely more helpful than "c++ exception (unknown reason)".
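
For anyone who hits the bare "c++ exception (unknown reason)" and needs to find the offending line by hand, a pre-check outside stanc can list every non-ASCII character with its position. This is only a minimal sketch in Python, assuming the model file is saved as UTF-8; the function name find_non_ascii is hypothetical and is not part of Stan or RStan.

```python
from pathlib import Path


def find_non_ascii(path):
    """Return (line, column, character) for each non-ASCII character in a file.

    Lines and columns are 1-based and count characters, not bytes.
    Assumption: the file is UTF-8 encoded.
    """
    hits = []
    text = Path(path).read_text(encoding="utf-8")
    # Split on "\n" only, so unusual Unicode line separators are still reported.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # anything outside the 7-bit ASCII range
                hits.append((line_no, col_no, ch))
    return hits
```

Calling find_non_ascii("schools.stan") on the modified model would return one tuple per "μ". Because columns here count characters rather than bytes, they may not match the column a parser reports.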

@bob-carpenter

Thanks for reporting back. I'll close the issue, then. We really do need to be more proactive in the parser about non-ASCII characters and catch the problems right away. I'm going to open an issue to do just that in Stan:

stan-dev/stan#2485

There's already an issue to allow Unicode with UTF-8 encodings; I link to that issue from the issue above.

3 participants