Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Ultimate C parser crashes when there are Doxygen group comments in the C code #667

Open
bahnwaerter opened this issue May 23, 2024 · 3 comments
Labels

Comments

@bahnwaerter
Copy link
Member

The current C parser crashes when there are Doxygen group comments in the C code. This issue can be reproduced with the latest version of Ultimate Automizer (0.2.4-dev-1ececfe with OpenJDK 11.0.23) using the following input:

//#SAFE

typedef long unsigned int __uint32_t;
typedef __uint32_t uint32_t;

/** \ingroup  CMSIS_Core_FunctionInterface
    \defgroup CMSIS_Core_RegAccFunctions CMSIS Core Register Access Functions
  @{
 */

/**
  \brief   Get Control Register
  \details Returns the content of the Control Register.
  \return               Control Register value
 */
static inline uint32_t __get_CONTROL(void)
{
  uint32_t result;

  __asm volatile ("MRS %0, control" : "=r" (result) );
  return(result);
}

/**
  \brief   Set Control Register
  \details Writes the given value to the Control Register.
  \param [in]    control  Control Register value to set
 */
static inline void __set_CONTROL(uint32_t control)
{
  __asm volatile ("MSR control, %0" : : "r" (control) : "memory");
}

/*@} end of CMSIS_Core_RegAccFunctions */

int main(void)
{
  uint32_t control = __get_CONTROL();
  assert(control >= 0x00000000);
  __set_CONTROL(control);
  return 0;
}

Considering this example input, the C parser crashes when trying to parse the end marker of a Doxygen group comment (/*@} end of CMSIS_Core_RegAccFunctions */ in line 34). The following error message is reported during a crash:

[...]
[2024-05-23 17:51:32,051 WARN  L194 CommentParser]: Syntax Error: #120[rbrace](0/1 - 0/2)
[...]
[2024-05-23 17:51:32,421 INFO  L174    ResultUtil]:  * Results from CDTParser:
[2024-05-23 17:51:32,421 INFO  L198    ResultUtil]:   - SyntaxErrorResult [Line: 34]: Incorrect Syntax
[2024-05-23 17:51:32,422 INFO  L202    ResultUtil]:     Syntax Error: #120[rbrace](0/1 - 0/2)

In this example, the C parser for the comments expects ACSL specifications and instead gets Doxygen group comments that violate the grammar of ACSL. Unfortunately, the parser cannot ignore the Doxygen group comments.

@schuessf
Copy link
Contributor

We expect any comment starting with @ to be ACSL und thus try to parse the comment as such, see here.
It would be possible to add exceptions there, for this example to work, we could just ignore { and }.

@bahnwaerter
Copy link
Member Author

We expect any comment starting with @ to be ACSL und thus try to parse the comment as such, see here. It would be possible to add exceptions there, for this example to work, we could just ignore { and }.

That's a good idea, but we first have to check that the braces cannot appear in ACSL.

@bahnwaerter
Copy link
Member Author

In fact, the braces can also occur in ACSL comments. Here is an example from the ACSL standard (see 2.2.7 Structures, Unions and Arrays in logic), where parentheses are used in logic expressions for initialization and array access.

//@ type point = struct { real x; real y; };
//@ type triangle = point[3];
//@ logic point origin = { .x = 0.0 , .y = 0.0 };
/*@ logic triangle t_iso = { [0] = origin,
  @                          [1] = { .y = 2.0 , .x = 0.0 }
  @                          [2] = { .x = 2.0 , .y = 0.0 }};
  @*/

Fortunately, the braces are never at the beginning of a comment. We can therefore ignore the braces from Doxygen group comments at the beginning of each comment without restricting the ACSL syntax.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants