Skip to content

Conversation

@addison74
Copy link
Contributor

Problem

This issue was reported first time in 2021 here #1653.

OpenMage frontend and backend forms could allow administrators and users to enter data for addresses, tax rates, tags, and admin accounts. Without explicit whitespace sanitization, leading or trailing spaces may be accidentally included in critical fields. This can result in:

  • Data inconsistencies (e.g., duplicate records, failed lookups)
  • User confusion (e.g., login or search issues)
  • Problems with validation, filtering, and integration with third-party systems

Solution

This PR trims leading and trailing whitespace from string data in several key models. Trimming is performed at the model level before saving, ensuring form input is stored cleanly and consistently in the database.

Modified Files and Rationale

  • Tag.php

    • Trim tag name before saving.
      Tags may be created with accidental whitespace, resulting in duplicates or unreliable searching. Trimming ensures consistency.
  • Rate.php

    • Trim code, country, region, postcode, zip range, and rate before saving.
      Tax rates configured in the backend can include spaces, causing validation failures or configuration errors. Trimming guarantees correct setup.

    Why is string conversion used with trim() before saving in this file? In this method, several fields (such as code, country, region, postcode, zip range, and rate) are populated from backend forms, where their values may be null, numeric, or string depending on user input, system defaults, or import sources. The PHP trim() function only operates safely on strings. If trim() is called on null or a numeric value, it will silently cast the value to string (e.g., null becomes an empty string, an integer becomes its string representation) and remove any leading or trailing whitespace. This prevents PHP warnings or errors, and ensures all input data is consistently sanitized before storage. Explicitly converting to string before trim() guarantees:

    • No unexpected errors, regardless of the original value type.
    • All fields are normalized to strings after trimming, which matches how data is stored in the database for these attributes.
    • Numeric fields (like rate) are validated after trimming, so type consistency is maintained.
      This approach is safe and robust for all typical Magento backend input scenarios, and ensures clean, reliable data storage.
  • User.php

    • Trim firstname, lastname, email, and username before saving.
      Whitespace in admin account fields can cause login and notification issues. Passwords are intentionally left untouched.
  • Address.php

    • Trim all string fields before saving.
      Addresses are manually entered and prone to accidental spaces. Trimming ensures clean data for shipping, validation, and export.
  • Http.php

    • Override getParam() to trim strings and arrays returned by request parameters.
      This sanitizes input at the earliest stage, preventing whitespace propagation into business logic and storage.

Testing and Compatibility

  • Changes are backward compatible and do not alter business logic.
  • When editing existing data in pages such as Orders, Invoices, Shipments, and Credit Memos, any string fields that contain leading or trailing spaces will be automatically trimmed before saving. This ensures that updates made through backend forms or API calls result in clean and consistent data, preventing issues with searching, filtering, exporting, and integration caused by accidental whitespace.
  • Fields where whitespace may be intentional (e.g. CMS content) are not affected.
  • Modifications tested on backend forms for reliability.

Summary

This PR standardizes data sanitization for critical backend models in Magento OpenMage LTS, preventing data corruption, improving user experience, and strengthening downstream integrations by ensuring stored values are free of unintended whitespace. Password fields, as well as inputs and textareas where users may intentionally use leading or trailing spaces (such as CMS content, product descriptions, or other free-form text fields), are intentionally left unmodified. This preserves the user's intended formatting and content. Trimming is only applied to fields where whitespace is unlikely to be deliberate and may cause data integrity issues, such as usernames, emails, codes, and other identifiers.

@github-actions github-actions bot added Component: Core Relates to Mage_Core Component: Sales Relates to Mage_Sales Component: Tag Relates to Mage_Tag Component: Admin Relates to Mage_Admin Component: Tax Relates to Mage_Tax labels Oct 1, 2025
@addison74
Copy link
Contributor Author

This is before implementation. The buyer inserted spaces before and after the text.

before

This is after pressing the [Save] button. the spaces have been removed.

after

After implementation, all parameters that come from the frontend forms, with some exceptions, have their spaces removed so that after the update there is no need to save before, but I kept it because the administrator can insert the spaces himself.

All green, ready for review.

@sreichel
Copy link
Contributor

sreichel commented Oct 1, 2025

Like that PR. 👍

Do you want to add some tests? (PHPunit/cypress)

@addison74
Copy link
Contributor Author

It is fine to me.

addison74 and others added 2 commits October 1, 2025 23:22
Removed redundant comment block from the save manipulations method.
@sreichel sreichel requested a review from Copilot October 2, 2025 05:23
Copilot

This comment was marked as outdated.

@sreichel sreichel requested a review from Copilot October 2, 2025 05:39
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

@sonarqubecloud
Copy link

@sreichel
Copy link
Contributor

imho ... I'd not touch "before_save" and http requests.

To trim withespaces i'd add getter (and setter) methods to the models.

@addison74
Copy link
Contributor Author

I chose to implement trimming in the _beforeSave() method (and, where relevant, in the HTTP request layer) for several reasons:

Centralized Data Integrity: Trimming in _beforeSave() ensures that, regardless of how data is set on the model (direct assignment, API, import, or via a form), it is always sanitized before being persisted. This reduces the risk of inconsistent data entering the database from multiple sources and keeps the logic predictable.

Framework Consistency: Magento's standard practice is to perform data normalization at the point of persistence rather than at the property accessor (getter/setter). This minimizes the risk of unintended side effects during runtime operations and keeps the model's behavior in line with community expectations.

Minimizing Side Effects: While trimming in setters/getters can help catch whitespace earlier, it may introduce unexpected behavior for developers who expect those methods to be "pure" and not modify data invisibly. This can lead to confusion or subtle bugs, especially for fields where whitespace may be intentional (such as free-form text or content fields).

Selective Trimming: The PR deliberately avoids trimming fields like passwords or content/textareas where whitespace may be valid and intentional. Trimming is only applied to fields where leading/trailing spaces are almost always accidental and problematic (codes, identifiers, names, emails, etc.).

HTTP Request Layer: Trimming here is a safeguard for user input at the earliest stage, but care is taken not to affect parameters where whitespace may be legitimate.

I appreciate your perspective about handling this at the getter/setter level, and I agree that in some cases (for critical identifiers or user-facing data) this can be beneficial. However, I believe that sanitizing at the persistence layer provides a more consistent and maintainable solution, especially given Magento's architecture and diverse data sources.

If there are specific models or scenarios where setter/getter trimming would provide clear benefits over the current approach, I am happy to discuss them further or consider a hybrid solution.

I am reviewing and considering feedback provided by Copilot comments throughout various sections of the code. Where appropriate, I will make further adjustments to improve clarity, maintainability, and consistency, ensuring that the final implementation aligns with best practices and community standards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Admin Relates to Mage_Admin Component: Core Relates to Mage_Core Component: Sales Relates to Mage_Sales Component: Tag Relates to Mage_Tag Component: Tax Relates to Mage_Tax

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants