[WIKI-553] chore: improved pages components tracking #7966

NarayanBavisetti · 2025-10-14T15:09:35Z

Description

improved the tracking in pages for the images and mentions.

Type of Change

Improvement (change that would cause existing functionality to not work as expected)

Summary by CodeRabbit

Bug Fixes
- Prevents duplicate page activity entries and removes outdated ones when mentions change.
- Handles empty or missing content gracefully to avoid errors.
Refactor
- Streamlined component extraction into a single pass for faster, more reliable updates to page activity.
- Consolidated logging and error handling to improve observability.

before the refactor

after the refactor

there is no major shift in the memory, but have reduced the size of the content being passed in the worker (before this all the description formats were being passed in the task, now its change only description_html is been sent.

makeplane · 2025-10-14T15:09:44Z

Linked to Plane Work Item(s)

[WIKI-553] Implement tracking for editor attachment components

This comment was auto-generated by Plane

coderabbitai · 2025-10-14T15:10:03Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Centralizes HTML component extraction into a COMPONENT_MAP with extract_all_components and get_entity_details, updates page_transaction to use unified extraction and diffing (bulk insert/delete of PageLog), adds logging and import adjustments, and updates view call sites to pass new_description_html/old_description_html/page_id.

Changes

Cohort / File(s)	Summary
Page transaction refactor and helpers `apps/api/plane/bgtasks/page_transaction_task.py`	- Added `COMPONENT_MAP`-driven `extract_all_components(description_html)` for single-pass parsing of HTML into component->mentions lists. - Added `get_entity_details(component, mention)` to normalize `entity_name`, `entity_type`, `entity_identifier` per component. - Refactored `page_transaction(new_description_html, old_description_html, page_id)` to compute `old_components`/`new_components`, diff IDs per component, bulk-insert new `PageLog` entries (`ignore_conflicts=True`) and bulk-delete removed entries; preserved Page retrieval and error handling. - Added `logger` and adjusted imports placement.
View call-site updates (transaction payload args) `apps/api/plane/app/views/page/base.py`	- Replaced prior `new_value/old_value` call-site payloads with structured keyword args: `new_description_html`, `old_description_html`, and `page_id` across create, partial_update, binary update, duplication, and pages description endpoints. No public API signature changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Task as PageTransactionTask
  participant Extract as ComponentExtractor
  participant DB as DB (Page / PageLog)
  participant Log as Logger

  Note over Task,Extract: Single-pass extraction for old and new HTML

  Task->>DB: Fetch Page by page_id
  alt Page missing / error
    Task->>Log: log exception
    Task-->>Task: exit
  else Page found
    Task->>Extract: extract_all_components(old_description_html)
    Extract-->>Task: old_components {component: [mentions]}
    Task->>Extract: extract_all_components(new_description_html)
    Extract-->>Task: new_components {component: [mentions]}

    Note over Task: Normalize mentions -> entity details
    loop each component in new_components
      Task->>Task: get_entity_details(component, mention) -> (id,type,name)
    end

    Task->>DB: Query existing PageLog entries for page
    Task->>Task: Compute deleted_entity_ids and new_unseen_entities
    Task->>DB: Bulk insert new PageLog (ignore_conflicts=True)
    Task->>DB: Bulk delete removed PageLog entries
    Task-->>Task: finish
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hopped through HTML in one quick pass,
Found mentions, names, each little class.
Old crumbs removed, new crumbs kept near,
I bulk-insert, bulk-delete — tidy and clear.
A rabbit’s logbook saved without a fuss. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description is incomplete and lacks critical information required by the template. While the "Description" and "Type of Change" sections are present, the "Test Scenarios" section—which appears to be a required field based on the template structure—is entirely missing. The description itself is also quite brief, providing only minimal detail about what was improved. Additionally, the PR objectives note indicates that a reviewer requested benchmark results be added to the PR description, but this has not been completed.	The author should expand the PR description to include a "Test Scenarios" section that documents how the changes were tested and validated. More technical detail should be added to the "Description" section to explain the refactoring approach and benefits. Additionally, as requested by the reviewer, benchmark results comparing the old and new logic should be included in the description to demonstrate the performance impact of these changes.
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title "[WIKI-553] chore: improved pages components tracking" accurately reflects the main objective of this pull request. The summary shows the changes introduce a centralized component extraction mechanism and refactor page transaction workflows to improve tracking of images and mentions in pages. The title is concise, specific, and clearly communicates the primary improvement being made without unnecessary noise or vague terms.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch chore-page-component-tracking

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull Request Overview

This PR enhances page component tracking by expanding support beyond mentions to include image components and implementing a more flexible, optimized architecture for tracking page content changes.

Refactored component extraction to handle multiple component types in a single pass
Added support for image-component tracking alongside existing mention-component tracking
Improved performance with optimized HTML parsing and increased bulk operation batch size

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

apps/api/plane/bgtasks/page_transaction_task.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

apps/api/plane/bgtasks/page_transaction_task.py (1)

130-136: Consider adding success logging for observability.

Adding log statements for successful bulk operations helps with monitoring and debugging in production.

         # Bulk insert and cleanup
         if new_transactions:
             PageLog.objects.bulk_create(
                 new_transactions, batch_size=50, ignore_conflicts=True
             )
+            logger.info(f"Created {len(new_transactions)} PageLog entries for page {page_id}")

         if deleted_transaction_ids:
             PageLog.objects.filter(transaction__in=deleted_transaction_ids).delete()
+            logger.info(f"Deleted {len(deleted_transaction_ids)} PageLog entries for page {page_id}")

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 97e6622 and 9790ff4.

📒 Files selected for processing (1)

apps/api/plane/bgtasks/page_transaction_task.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

apps/api/plane/bgtasks/page_transaction_task.py (2)

apps/api/plane/db/models/page.py (2)

Page (19-73)

PageLog (76-113)

apps/api/plane/utils/exception_logger.py (1)

log_exception (9-20)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Lint API
GitHub Check: Analyze (javascript)

🔇 Additional comments (4)

apps/api/plane/bgtasks/page_transaction_task.py (4)

3-3: LGTM!

The addition of logging import and logger setup improves observability for background task execution.

Also applies to: 11-17

42-69: LGTM!

The single-pass component extraction approach is efficient and the defensive error handling (returning empty dicts for missing HTML or exceptions) prevents cascading failures.

72-79: LGTM!

The normalization logic cleanly separates raw attribute extraction from entity detail mapping, and the fallback for unknown components prevents errors.

18-35: No change needed for entity_type. PageLog.entity_type isn’t used in any downstream filters or logic, so leaving it null in both extractors is intentional.

apps/api/plane/bgtasks/page_transaction_task.py

dheeru0198 · 2025-10-16T14:35:42Z

@NarayanBavisetti Can you benchmark the old and new logic and add the results to the PR description?

…component-tracking

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

apps/api/plane/bgtasks/page_transaction_task.py (2)
41-68: Consider logging exceptions for better observability.

The exception handler at line 66 silently returns empty results without logging. While this defensive approach prevents the task from failing, it makes debugging harder when HTML parsing issues occur.

Apply this diff to add logging:
     except Exception:
+        logger.warning("Failed to extract components from HTML", exc_info=True)
         return {component: [] for component in component_map.keys()}
This maintains the defensive behavior while providing visibility into parsing failures.

41-77: Optional: Consider adding type hints for clarity.

The new utility functions would benefit from type hints to make the interface clearer:
from typing import Dict, List, Any

def extract_all_components(description_html: str) -> Dict[str, List[Dict[str, Any]]]:
    """
    Extracts all component types from the HTML value in a single pass.
    Returns a dict mapping component_type -> list of extracted entities.
    """
    # ...

def get_entity_details(component: str, mention: dict) -> Dict[str, Any]:
    """
    Normalizes mention attributes into entity_name, entity_type, entity_identifier.
    """
    # ...

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9790ff4 and 913c134.

📒 Files selected for processing (2)

apps/api/plane/app/views/page/base.py (4 hunks)
apps/api/plane/bgtasks/page_transaction_task.py (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-10-17T08:23:54.935Z

Learnt from: NarayanBavisetti
PR: makeplane/plane#7966
File: apps/api/plane/bgtasks/page_transaction_task.py:108-127
Timestamp: 2025-10-17T08:23:54.935Z
Learning: In the page transaction system (apps/api/plane/bgtasks/page_transaction_task.py), entity names (entity_name) and entity identifiers (entity_identifier) for mentions and components remain constant once set and are not hardcoded values that change, so PageLog records don't need to handle updates to existing entity attributes.

Applied to files:

apps/api/plane/bgtasks/page_transaction_task.py

📚 Learning: 2025-10-17T08:21:37.502Z

Learnt from: NarayanBavisetti
PR: makeplane/plane#7966
File: apps/api/plane/bgtasks/page_transaction_task.py:37-39
Timestamp: 2025-10-17T08:21:37.502Z
Learning: In `apps/api/plane/bgtasks/page_transaction_task.py`, the `component_map` variable is intentionally kept separate from `COMPONENT_MAP` to allow other modules to extend it with additional component types in the future, even though it currently just copies `COMPONENT_MAP`.

Applied to files:

apps/api/plane/bgtasks/page_transaction_task.py

🧬 Code graph analysis (2)

apps/api/plane/bgtasks/page_transaction_task.py (2)

apps/api/plane/db/models/page.py (2)

Page (19-73)

PageLog (76-113)

apps/api/plane/utils/exception_logger.py (1)

log_exception (9-20)

apps/api/plane/app/views/page/base.py (1)

apps/api/plane/bgtasks/page_transaction_task.py (1)

page_transaction (81-142)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Analyze (javascript)

🔇 Additional comments (3)

apps/api/plane/app/views/page/base.py (1)

140-144: LGTM! Consistent parameter refactoring across all call sites.

All page_transaction.delay() invocations correctly pass the new structured parameters (new_description_html, old_description_html, page_id). The old/new values are captured at the right time:

Create and duplicate flows correctly pass old_description_html=None

Update flows correctly capture the old value before save

Transaction tasks are triggered before the database save, ensuring correct old values

Also applies to: 174-178, 508-512, 558-562

apps/api/plane/bgtasks/page_transaction_task.py (2)

81-142: LGTM! Efficient component-based extraction with correct diff logic.

The refactored approach is solid:

Single-pass HTML parsing for all component types (lines 92-94)

Correct set-based diffing to identify additions and deletions (lines 103-105)

Proper skip logic to avoid duplicate logs (line 109)

Bulk operations for performance (lines 131-136)

The transformation from ad-hoc extraction to a unified, extensible component map is a clear improvement in maintainability.

81-142: Provide benchmark results as requested.

Per the PR comments, @dheeru0198 requested benchmark comparison between the old and new logic. Please add the benchmark results to the PR description to help assess the performance impact of this refactoring.

Consider benchmarking:

HTML parsing time for typical page sizes

Database bulk operation performance

Overall task execution time

This data will help validate that the refactoring maintains or improves performance.

chore: page components tracking

9790ff4

NarayanBavisetti requested review from Copilot and dheeru0198 October 14, 2025 15:09

NarayanBavisetti added the requires approval label Oct 14, 2025

Copilot AI reviewed Oct 14, 2025

View reviewed changes

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

apps/api/plane/bgtasks/page_transaction_task.py Show resolved Hide resolved

apps/api/plane/bgtasks/page_transaction_task.py Show resolved Hide resolved

pushya22 modified the milestones: 0.12.1-dev, 0.13-dev Oct 15, 2025

NarayanBavisetti added 2 commits October 17, 2025 13:39

Merge branch 'preview' of github.com:makeplane/plane into chore-page-…

cb52148

…component-tracking

chore: changed the transaction task

913c134

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[WIKI-553] chore: improved pages components tracking #7966

[WIKI-553] chore: improved pages components tracking #7966

NarayanBavisetti commented Oct 14, 2025 •

edited

Loading

Uh oh!

makeplane bot commented Oct 14, 2025

Uh oh!

coderabbitai bot commented Oct 14, 2025 •

edited

Loading

Other AI code review bot(s) detected

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

dheeru0198 commented Oct 16, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[WIKI-553] chore: improved pages components tracking #7966

Are you sure you want to change the base?

[WIKI-553] chore: improved pages components tracking #7966

Conversation

NarayanBavisetti commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of Change

Summary by CodeRabbit

Uh oh!

makeplane bot commented Oct 14, 2025

Uh oh!

coderabbitai bot commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

dheeru0198 commented Oct 16, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

NarayanBavisetti commented Oct 14, 2025 •

edited

Loading

coderabbitai bot commented Oct 14, 2025 •

edited

Loading