Ignore multi-line string contents optionally in line_length_linter #2799

MichaelChirico · 2025-02-27T19:45:52Z

Closes #856. Progress on #2737

codecov · 2025-02-27T19:50:14Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.24%. Comparing base (2b70b52) to head (a80c6e9).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2799   +/-   ##
=======================================
  Coverage   99.24%   99.24%           
=======================================
  Files         129      129           
  Lines        7282     7316   +34     
=======================================
+ Hits         7227     7261   +34     
  Misses         55       55

AshesITR

Looking at the tests I'm wondering whether all you need is to suppress lints if the lint line satisfies line1 < line < line2 for some //STR_CONST?

R/line_length_linter.R

MichaelChirico · 2025-03-03T18:04:44Z

> Looking at the tests I'm wondering whether all you need is to suppress lints if the lint line satisfies line1 < line < line2 for some //STR_CONST?

there's also the issue of how to deal with lines like this:

x <- 'long string starting on a short
line which also has long strings in
the other parts of the body'

Before this rule, there's 3 lints, but under ignore_string_bodies=TRUE here, there are 0
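
To make the line1/line2 discussion concrete, here is a minimal base-R sketch (not lintr's implementation) that inspects the parse data for the snippet above. The helper name strictly_inside is hypothetical. It shows that the whole string is one STR_CONST token spanning lines 1 to 3, and that the strict test line1 < line < line2 only covers line 2, so the lines that open and close the string need separate handling.

```r
# The three-line string from the example above.
code <- paste(
  "x <- 'long string starting on a short",
  "line which also has long strings in",
  "the other parts of the body'",
  sep = "\n"
)

# keep.source = TRUE is required so that getParseData() has source references.
parse_data <- utils::getParseData(parse(text = code, keep.source = TRUE))
str_data <- parse_data[
  parse_data$token == "STR_CONST",
  c("line1", "col1", "line2", "col2")
]
print(str_data)

# Hypothetical helper: is `line` strictly between the first and last
# line of at least one string constant?
strictly_inside <- function(line, str_data) {
  any(str_data$line1 < line & str_data$line2 > line)
}

inside <- vapply(1:3, strictly_inside, logical(1L), str_data = str_data)
print(inside) # FALSE TRUE FALSE
```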

MichaelChirico · 2025-05-07T23:10:05Z

Bumping for review :)

NEWS.md

Bisaloo

I had to come back to this several times because the current version of is_in_string_body() is quite complex. I think part of it is the switch between vectorized and vapply logic and the easily missed changes between str_idx/long_idx as well as str_data/parse_data.

Bisaloo · 2025-09-20T18:29:03Z

R/line_length_linter.R

+#' @param length Maximum line length allowed. Default is `80L` (Hollerith limit).
+#' @param ignore_string_bodies Logical, default `FALSE`. If `TRUE`, the contents
+#'   of string literals are ignored. The quotes themselves are included, so this
+#'   mainly affects wide multiline strings, e.g. SQL queries.


Suggested change

-#'   mainly affects wide multiline strings, e.g. SQL queries.
+#'   only affects wide multiline strings, e.g. SQL queries.

?

Looking at the code, this seems clearer & more precise.

Bisaloo · 2025-10-18T06:08:53Z

R/line_length_linter.R

+#' @param length Maximum line length allowed. Default is `80L` (Hollerith limit).
+#' @param ignore_string_bodies Logical, default `FALSE`. If `TRUE`, the contents
+#'   of string literals are ignored. The quotes themselves are included, so this
+#'   mainly affects wide multiline strings, e.g. SQL queries.


Could you add some minimal examples for this new argument please?

Bisaloo · 2025-10-18T06:49:17Z

R/line_length_linter.R

+is_in_string_body <- function(parse_data, max_length, long_idx) {
+  str_idx <- parse_data$token == "STR_CONST"
+  if (!any(str_idx)) {
+    return(rep(FALSE, length(long_idx)))
+  }
+  str_data <- parse_data[str_idx, ]
+  if (all(str_data$line1 == str_data$line2)) {
+    return(rep(FALSE, length(long_idx)))
+  }
+  # right delimiter just ends at 'col2', but 'col1' takes some sleuthing
+  str_data$line1_width <- nchar(vapply(
+    strsplit(str_data$text, "\n", fixed = TRUE),
+    function(x) x[1L],
+    FUN.VALUE = character(1L),
+    USE.NAMES = FALSE
+  ))
+  str_data$col1_end <- str_data$col1 + str_data$line1_width
+  vapply(
+    long_idx,
+    function(line) {
+      # strictly inside a multi-line string body
+      if (any(str_data$line1 < line & str_data$line2 > line)) {
+        return(TRUE)
+      }
+      on_line1_idx <- str_data$line1 == line
+      if (any(on_line1_idx)) {
+        return(max(str_data$col1_end[on_line1_idx]) <= max_length)
+      }
+      # use parse data to capture possible trailing expressions on this line
+      on_line2_idx <- parse_data$line2 == line
+      any(on_line2_idx) && max(parse_data$col2[on_line2_idx]) <= max_length
+    },
+    logical(1L)
+  )
+}


I think this is correct but I'm really struggling to wrap my head around this function, which concerns me a little bit for maintainability.

In particular, all current tests pass with this much simpler version of the vapply() call

Suggested change

 is_in_string_body <- function(parse_data, max_length, long_idx) {
   str_idx <- parse_data$token == "STR_CONST"
   if (!any(str_idx)) {
     return(rep(FALSE, length(long_idx)))
   }
   str_data <- parse_data[str_idx, ]
   if (all(str_data$line1 == str_data$line2)) {
     return(rep(FALSE, length(long_idx)))
   }
-  # right delimiter just ends at 'col2', but 'col1' takes some sleuthing
-  str_data$line1_width <- nchar(vapply(
-    strsplit(str_data$text, "\n", fixed = TRUE),
-    function(x) x[1L],
-    FUN.VALUE = character(1L),
-    USE.NAMES = FALSE
-  ))
-  str_data$col1_end <- str_data$col1 + str_data$line1_width
   vapply(
     long_idx,
     function(line) {
       # strictly inside a multi-line string body
-      if (any(str_data$line1 < line & str_data$line2 > line)) {
-        return(TRUE)
-      }
-      on_line1_idx <- str_data$line1 == line
-      if (any(on_line1_idx)) {
-        return(max(str_data$col1_end[on_line1_idx]) <= max_length)
-      }
-      # use parse data to capture possible trailing expressions on this line
-      on_line2_idx <- parse_data$line2 == line
-      any(on_line2_idx) && max(parse_data$col2[on_line2_idx]) <= max_length
+      any(str_data$line1 < line & str_data$line2 > line)
     },
     logical(1L)
   )
 }

Are we missing tests?
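
One way to look for missing tests is to run both versions side by side on small snippets and see where they disagree. The sketch below is a hedged illustration in base R, not part of the PR. The names is_in_string_body_full, is_in_string_body_simple and compare_versions are hypothetical, the function bodies are copied from this thread, and long_idx is assumed to hold the line numbers of over-long lines, computed here with nchar() on the raw source lines.

```r
# Version from the PR, as quoted in this thread.
is_in_string_body_full <- function(parse_data, max_length, long_idx) {
  str_idx <- parse_data$token == "STR_CONST"
  if (!any(str_idx)) {
    return(rep(FALSE, length(long_idx)))
  }
  str_data <- parse_data[str_idx, ]
  if (all(str_data$line1 == str_data$line2)) {
    return(rep(FALSE, length(long_idx)))
  }
  # width of the part of each string that sits on its first line
  str_data$line1_width <- nchar(vapply(
    strsplit(str_data$text, "\n", fixed = TRUE),
    function(x) x[1L],
    FUN.VALUE = character(1L),
    USE.NAMES = FALSE
  ))
  str_data$col1_end <- str_data$col1 + str_data$line1_width
  vapply(
    long_idx,
    function(line) {
      if (any(str_data$line1 < line & str_data$line2 > line)) {
        return(TRUE)
      }
      on_line1_idx <- str_data$line1 == line
      if (any(on_line1_idx)) {
        return(max(str_data$col1_end[on_line1_idx]) <= max_length)
      }
      on_line2_idx <- parse_data$line2 == line
      any(on_line2_idx) && max(parse_data$col2[on_line2_idx]) <= max_length
    },
    logical(1L)
  )
}

# Simplified version from the review suggestion.
is_in_string_body_simple <- function(parse_data, max_length, long_idx) {
  str_idx <- parse_data$token == "STR_CONST"
  if (!any(str_idx)) {
    return(rep(FALSE, length(long_idx)))
  }
  str_data <- parse_data[str_idx, ]
  if (all(str_data$line1 == str_data$line2)) {
    return(rep(FALSE, length(long_idx)))
  }
  vapply(
    long_idx,
    function(line) any(str_data$line1 < line & str_data$line2 > line),
    logical(1L)
  )
}

# Hypothetical harness: one row per over-long line, one column per version.
compare_versions <- function(code, max_length) {
  lines <- strsplit(code, "\n", fixed = TRUE)[[1L]]
  long_idx <- which(nchar(lines) > max_length)
  parse_data <- utils::getParseData(parse(text = code, keep.source = TRUE))
  data.frame(
    line = long_idx,
    full = is_in_string_body_full(parse_data, max_length, long_idx),
    simple = is_in_string_body_simple(parse_data, max_length, long_idx)
  )
}

# Only line 2 exceeds 10 characters, and it is strictly inside the string,
# so both versions agree on it.
code <- "x <- 'aa\nbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\ncc'"
res <- compare_versions(code, max_length = 10L)
print(res)
```

Any snippet where the full and simple columns differ would be a candidate for a new test case. The branches for lines that open or close a multi-line string exist only in the longer version, so those are natural places to look.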

MichaelChirico added 2 commits February 27, 2025 19:43

ignore string contents optionally

98ebda0

expect_no_lint

1478a4c

test with no string input

75052f4

AshesITR reviewed Mar 1, 2025

MichaelChirico commented Mar 3, 2025

MichaelChirico added 3 commits March 3, 2025 09:55

revert as.integer()

cdcadd8

Merge branch 'main' into line-length-str-body

08856b1

more tests against off-by-one issue

b44c55a

Merge branch 'main' into line-length-str-body

39627ce

MichaelChirico added 4 commits May 8, 2025 13:42

Merge branch 'main' into line-length-str-body

e9018ec

Merge branch 'main' into line-length-str-body

314dd8b

Merge branch 'main' into line-length-str-body

ff20158

Merge branch 'main' into line-length-str-body

744f09e

MichaelChirico requested review from Bisaloo, IndrajeetPatil and olivroy and removed request for olivroy June 27, 2025 18:55

Merge branch 'main' into line-length-str-body

844c4e7

MichaelChirico added this to the 3.3.0 milestone Jul 29, 2025

MichaelChirico changed the title ~~Ignore multi-string contents optionally in line_length_linter~~ Ignore multi-line string contents optionally in line_length_linter Jul 29, 2025

Merge branch 'main' into line-length-str-body

2ac86a8

MichaelChirico commented Jul 29, 2025

MichaelChirico added 3 commits July 29, 2025 12:55

typo: with->width

4c0a210

Merge branch 'main' into line-length-str-body

eaf77c1

Merge branch 'main' into line-length-str-body

a80c6e9

Bisaloo reviewed Oct 18, 2025

3 participants
