Support complex pointer arithmetic in dereference #230

Merged: 1 commit, Aug 6, 2025

Conversation

@jserv (Collaborator) commented Aug 5, 2025

Previously, the parser only handled simple identifier dereference (*var) and would crash on complex expressions like *(ptr + offset). This change extends the dereference operator handling to accept general expressions in parentheses.

This enables compilation of previously failing patterns:

  • *(p + 4) - direct offset
  • *(p + i + 2) - variable in expression
  • *(p + i * 2) - arithmetic in expression

It also handles consecutive asterisks ('**pp', '***ppp') by counting dereference levels and applying them iteratively.
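
As a rough illustration of the forms listed above, the following sketch exercises each one in plain C. The helper names and the sample data are made up for this example and are not taken from the project's test suite.

```c
/* Hypothetical helpers: each reads through one of the dereference forms. */
int deref_constant_offset(int *p) { return *(p + 4); }
int deref_variable_offset(int *p, int i) { return *(p + i + 2); }
int deref_scaled_offset(int *p, int i) { return *(p + i * 2); }
int deref_two_levels(int **pp) { return **pp; }
int deref_three_levels(int ***ppp) { return ***ppp; }

/* Returns 0 when every form reads the expected element,
 * otherwise the number of the first check that failed. */
int check_dereference_forms(void)
{
    int data[8] = {5, 10, 20, 30, 40, 50, 60, 70};
    int *p = data;
    int **pp = &p;
    int ***ppp = &pp;
    int i = 3;

    if (deref_constant_offset(p) != 40) return 1;    /* data[4] */
    if (deref_variable_offset(p, i) != 50) return 2; /* data[3 + 2] */
    if (deref_scaled_offset(p, i) != 60) return 3;   /* data[3 * 2] */
    if (deref_two_levels(pp) != 5) return 4;         /* data[0] via **pp */
    if (deref_three_levels(ppp) != 5) return 5;      /* data[0] via ***ppp */
    return 0;
}
```

Calling `check_dereference_forms()` from `main` and returning its result gives a single exit code that is zero only if all five forms compile and evaluate as expected.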

Summary by Bito

This pull request enhances the parser's capability to handle complex pointer arithmetic in dereference operations, allowing expressions like *(ptr + offset) and supporting multiple dereferencing levels. It improves robustness by preventing crashes with intricate pointer patterns and includes a comprehensive suite of tests to validate these enhancements.

@jserv jserv requested review from ChAoSUnItY, DrXiao and vacantron and removed request for ChAoSUnItY and DrXiao August 5, 2025 08:17

@ChAoSUnItY (Collaborator) left a comment

The logic changes look fine, but could you add some test cases showing that complex pointer arithmetic is parsed correctly on both the LHS and the RHS?
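
A test along these lines might look like the sketch below, which writes through the complex forms on the left-hand side and then reads the same slots back on the right-hand side. The function name and values are hypothetical and chosen only for illustration.

```c
/* Hypothetical round-trip test: store through *(expr) on the LHS,
 * then load through the same forms on the RHS. */
int roundtrip_through_offsets(void)
{
    int buf[4] = {0, 0, 0, 0};
    int *p = buf;
    int i = 1;

    *(p + 1) = 10;     /* LHS, constant offset: writes buf[1] */
    *(p + i + 1) = 20; /* LHS, variable in offset: writes buf[2] */
    *(p + i * 3) = 12; /* LHS, arithmetic in offset: writes buf[3] */

    /* RHS: read the three slots back and sum them (10 + 20 + 12). */
    return *(p + 1) + *(p + i + 1) + *(p + i * 3);
}
```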

When expressions like arr[0] + arr[1] + arr[2] were parsed, the compiler
was incorrectly applying pointer arithmetic scaling to the values read
from the array elements, resulting in wrong calculations.

The issue was in read_lvalue() which was handling the '+' operator after
array indexing as if it were pointer arithmetic. After arr[0], we have
an integer value, not a pointer, so the '+' should be handled by the
expression parser, not by read_lvalue.

This fix adds a check to ensure pointer arithmetic handling only occurs
when we have a pointer/array that hasn't been dereferenced (i.e., when
lvalue->is_reference is false).
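
The guard described above can be modeled with a small standalone sketch. The struct and function names here are hypothetical stand-ins, not the compiler's actual definitions; only the `is_reference` flag and its meaning are taken from the description.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compiler's lvalue record. */
typedef struct {
    bool is_ptr;       /* operand has pointer or array type */
    bool is_reference; /* per the description: false while not yet dereferenced */
    int elem_size;     /* size in bytes of the pointed-to element */
} lvalue_model_t;

/* A '+' is treated as pointer arithmetic only for a pointer/array
 * operand that has not been dereferenced or indexed. */
bool wants_pointer_arithmetic(const lvalue_model_t *lv)
{
    return lv->is_ptr && !lv->is_reference;
}

/* Scale the right-hand operand by the element size only when the guard holds;
 * otherwise leave it alone for the ordinary expression parser. */
int scaled_addend(const lvalue_model_t *lv, int addend)
{
    return wants_pointer_arithmetic(lv) ? addend * lv->elem_size : addend;
}
```

In this model, `arr` followed by `+ 1` scales the addend by the element size, while `arr[0]` followed by `+ arr[1]` leaves it unscaled, which is the distinction the fix relies on.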

Test case that was failing:
  int arr[3] = {10, 20, 12};
  return arr[0] + arr[1] + arr[2];  // Was returning 26 instead of 42
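
For contrast, here is a minimal sketch placing the failing case next to a genuine pointer-arithmetic case. The function names are made up for this example. In the first function every '+' adds plain integers; in the second, the '+' steps the pointer by one element before the dereference.

```c
/* Integer addition: each operand is a value already read from the array. */
int sum_indexed_elements(void)
{
    int arr[3] = {10, 20, 12};
    return arr[0] + arr[1] + arr[2];
}

/* Pointer arithmetic: arr decays to a pointer and + 1 advances one int. */
int read_via_pointer_offset(void)
{
    int arr[3] = {10, 20, 12};
    return *(arr + 1);
}
```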
@jserv jserv force-pushed the pointer-arithmetic branch from 79290dd to 8ce5f7c Compare August 6, 2025 02:27
@jserv jserv requested a review from ChAoSUnItY August 6, 2025 02:34
@jserv jserv merged commit 37eee15 into master Aug 6, 2025
12 checks passed
@jserv jserv deleted the pointer-arithmetic branch August 6, 2025 10:25