Skip to content

Conversation

adamziel
Copy link
Collaborator

@dmsnell noted:

the passing around of the index with the -1 sentinel value is hard for me to follow. did you consider setting something like $this->currently_examined_attribute_name = null;?

This PR switches from to using indices to using a list of names.

or is the issue reentrancy in all of these functions?

Block markup parser is not re-entrant. We should never need to process block markup that blows up the memory so I didn't bother implementing re-entrancy there.

Testing instructions

CI

@adamziel adamziel force-pushed the simplify-url-attribute-iterations branch from e4dd7b5 to 75f322c Compare September 10, 2025 15:55

while ( count( $this->inspecting_html_attributes ) > 0 ) {
$attr = $this->inspecting_html_attributes[ count( $this->inspecting_html_attributes ) - 1 ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the order does or doesn’t matter, and that would suggest [count( $ ) - 1] or [0] is more appropriate. [0] is clear and easy to reason about. array_pop() is nice because I think it mostly updates an internal counter and frees the last item, vs array_shift() which does a mem copy?

or maybe not. I skimmed the PHP source and it looks like array_shift() might just unset the element and update the array value’s internal pointer, so maybe they are comparable in performance.

Copy link
Collaborator Author

@adamziel adamziel Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array_shift() seems to be much slower empirically, which is why I went for count( $ ) - 1 even though I don't like it visually:

array_pop duration: 0.14569401741028 seconds
array_shift duration: 3.1403939723969 seconds
<?php

function benchmark_array_operations($iterations = 10000) {
    $array = range(1, 1000);

    // Benchmark array_pop
    $start_time = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $temp_array = $array;
        while (!empty($temp_array)) {
            array_pop($temp_array);
        }
    }
    $pop_duration = microtime(true) - $start_time;

    // Benchmark array_shift
    $start_time = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $temp_array = $array;
        while (!empty($temp_array)) {
            array_shift($temp_array);
        }
    }
    $shift_duration = microtime(true) - $start_time;

    echo "array_pop duration: " . $pop_duration . " seconds\n";
    echo "array_shift duration: " . $shift_duration . " seconds\n";
}

benchmark_array_operations();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the thing that seemed different was that it could trigger re-hashing the array. if that’s the case, it would definitely over-represent in a synthetic microbenchmark.

it all seems reasonable what you are doing though 👍

@adamziel adamziel merged commit de6af10 into trunk Sep 10, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants