Refactor CropBox and PassThrough with FunctorFilter #4892
* Add separate loop for non-dense point clouds
* Pass FunctionObject to setFunctionObject by value so it can be moved
* Implement experimental::CropBox
* Copy and set up the same CropBox unit tests
* Implement experimental::PassThrough
* Copy and set up the same PassThrough unit tests
You chose the approach to let |
const std::uint8_t* pt_data = reinterpret_cast<const std::uint8_t*>(&cloud.at(idx));
memcpy(&field_value, pt_data + *field_offset_, sizeof(float));
if (!std::isfinite(field_value))
  return *negative_;
Why do you not return false here?
Returning negative_ ensures that NaN points are always removed by FunctorFilter (the condition on line 99 is then always false for any value of negative_):
pcl/filters/include/pcl/filters/experimental/functor_filter.h
Lines 97 to 106 in fad821e
  for (const auto index : *indices_) {
    // function object returns true for points that should be selected
    if (negative_ != functionObject_(*input_, index)) {
      indices.push_back(index);
    }
    else if (extract_removed_indices_) {
      removed_indices_->push_back(index);
    }
  }
}
In the original PassThrough, filtering NaN points is not affected by negative_:
pcl/filters/include/pcl/filters/impl/passthrough.hpp
Lines 55 to 71 in fad821e
if (filter_field_name_.empty ())
{
  // Only filter for non-finite entries then
  for (const auto ii : *indices_) // ii = input index
  {
    // Non-finite entries are always passed to removed indices
    if (!std::isfinite ((*input_)[ii].x) ||
        !std::isfinite ((*input_)[ii].y) ||
        !std::isfinite ((*input_)[ii].z))
    {
      if (extract_removed_indices_)
        (*removed_indices_)[rii++] = ii;
      continue;
    }
    indices[oii++] = ii;
  }
}
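To make the argument above concrete, here is a minimal, self-contained sketch of the selection rule and of a pass-through predicate that returns negative for non-finite values. The Point, Cloud, functorFilter, PassThroughPred and passThroughZ names are hypothetical stand-ins for illustration, not the PCL classes; only the comparison negative != predicate(...) and the memcpy-by-offset read are taken from the snippets quoted in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Hypothetical stand-ins for pcl::PointXYZ, pcl::PointCloud and pcl::Indices.
struct Point { float x, y, z; };
using Cloud = std::vector<Point>;
using index_t = std::size_t;
using Indices = std::vector<index_t>;

// Same selection rule as the quoted loop: keep an index when the predicate
// result differs from `negative`.
template <typename Pred>
Indices functorFilter(const Cloud& cloud, bool negative, Pred pred)
{
  Indices kept;
  for (index_t i = 0; i < cloud.size(); ++i) {
    if (negative != pred(cloud, i))
      kept.push_back(i);
  }
  return kept;
}

// Pass-through predicate on a single float field, read by byte offset.
struct PassThroughPred {
  std::size_t field_offset;
  float min, max;
  bool negative;

  bool operator()(const Cloud& cloud, index_t idx) const
  {
    float value;
    const auto* pt_data = reinterpret_cast<const std::uint8_t*>(&cloud.at(idx));
    std::memcpy(&value, pt_data + field_offset, sizeof(float));
    // Returning `negative` makes `negative != result` false in the loop,
    // so non-finite points are dropped whatever the flag is.
    if (!std::isfinite(value))
      return negative;
    return value >= min && value <= max;
  }
};

// Convenience wrapper filtering on the z field (assumed helper).
Indices passThroughZ(const Cloud& cloud, float min, float max, bool negative)
{
  return functorFilter(
      cloud, negative, PassThroughPred{offsetof(Point, z), min, max, negative});
}
```

With a cloud whose z values are 0.5, 2 and NaN and limits [0, 1], this sketch keeps only the first point when negative is false and only the second when negative is true; the NaN point is removed in both cases.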
Agreed, I also thought about that, but my concern is that it would have two If it is a negligible overhead, we could also do it that way.
Tests look good, though a bit long
7f98d23 to f30572a
filter.setInputCloud(this->getInputCloud());
filter.applyFilter(indices);
if (this->extract_removed_indices_)
  *removed_indices_ = *filter.getRemovedIndices(); // copy
This copy is quite a tragedy; otherwise it looks shorter than the previous version.
As removed_indices_ is protected, we have no choice except adding a new API that returns a non-const pointer to it.
Why can't we do removed_indices_ = filter.getRemovedIndices();?
Because getRemovedIndices() returns only an IndicesConstPtr (and the other overload returns by copying).
An option I would see is to add a function applyFilter(Indices& indices, Indices& removed_indices) to the functor filter. Indices would be placed in the given removed_indices instead of the member removed_indices_. Then that function could be called like filter.applyFilter(indices, *removed_indices). Just an idea, haven't thought it through completely.
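A rough sketch of how such a two-output overload might look. MiniFunctorFilter and MiniWrapper are hypothetical, heavily simplified classes working on a plain vector of floats; they only illustrate the idea of writing removed indices straight into a caller-owned container, and are not the actual FunctorFilter or CropBox code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using index_t = std::size_t;
using Indices = std::vector<index_t>;

// Hypothetical, simplified functor filter (not the PCL class).
template <typename Pred>
class MiniFunctorFilter {
public:
  explicit MiniFunctorFilter(Pred pred, bool negative = false)
  : pred_(std::move(pred)), negative_(negative)
  {}

  // Suggested overload: removed indices are written into the container
  // supplied by the caller instead of a member of this filter.
  void applyFilter(const std::vector<float>& input,
                   Indices& indices,
                   Indices& removed_indices) const
  {
    indices.clear();
    removed_indices.clear();
    for (index_t i = 0; i < input.size(); ++i) {
      if (negative_ != pred_(input, i))
        indices.push_back(i);
      else
        removed_indices.push_back(i);
    }
  }

private:
  Pred pred_;
  bool negative_;
};

// Hypothetical wrapper filter that owns its own removed_indices_.
class MiniWrapper {
public:
  void applyFilter(const std::vector<float>& input, Indices& indices)
  {
    const auto in_range = [](const std::vector<float>& in, index_t i) {
      return in[i] >= 0.f && in[i] <= 1.f;
    };
    MiniFunctorFilter<decltype(in_range)> filter(in_range);
    // The inner filter fills the wrapper's member directly: no copy afterwards.
    filter.applyFilter(input, indices, removed_indices_);
  }

  const Indices& getRemovedIndices() const { return removed_indices_; }

private:
  Indices removed_indices_;
};
```

The trade-off in this sketch is one extra overload on the functor filter in exchange for dropping the copy in every filter built on top of it.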
Or we could make a change all the way up in Filter. Three possibilities for better integration:
- add a ctor for removed_indices
- add an additional output in filter
- add an additional function with an output arg as IndicesPtr
Which one seems the best way forward?
I couldn't come up with any use case that would benefit from applying these changes to Filter, so I would prefer to limit this to the functor filter only.
Maybe either:
- @mvieth's suggestion
- add an additional function with an output arg as IndicesPtr in FunctorFilter
89c66f8 to 294593a
         (pt.array() <= max_pt_.array()).template head<3>().all();
};

auto filter = advanced::FunctorFilter<PointT, decltype(lambda)>(
Could this be static?
Done
I found it can't easily be static: the lambda is passed only through the ctor, and a static variable is not initialized again by the ctor on later calls. As a result the transformation can't be updated, because assigning a new FunctorFilter is not allowed (and that is the only way I know to update a static variable).
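The limitation described here can be reproduced in isolation. In the sketch below, Holder is a hypothetical stand-in for a filter that receives its functor only through the constructor; the static_asserts rely on the standard rule that a closure type with captures has a deleted copy-assignment operator.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical holder that receives its functor only through the constructor.
template <typename F>
struct Holder {
  explicit Holder(F f) : f_(std::move(f)) {}
  F f_;
};

// Returns the value seen through a function-local static holder.
int thresholdSeenByStaticHolder(int threshold)
{
  auto lambda = [threshold]() { return threshold; };
  using HolderT = Holder<decltype(lambda)>;

  // A capturing closure type cannot be assigned, so neither can the holder.
  static_assert(!std::is_copy_assignable<HolderT>::value,
                "holder cannot be copy-assigned");
  static_assert(!std::is_move_assignable<HolderT>::value,
                "holder cannot be move-assigned");

  // Initialized on the first call only; later calls reuse the first lambda.
  static HolderT holder(lambda);
  return holder.f_();
}
```

Calling thresholdSeenByStaticHolder(1) and then thresholdSeenByStaticHolder(2) returns 1 both times in this sketch, which mirrors the stale transformation described above.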
e34169b to 955c885
const auto lambda = [&](const PointCloud& cloud, index_t idx) {
  const Eigen::Vector4f pt = pt_transform * cloud.at(idx).getVector4fMap();
  return (pt.array() >= min_pt_.array()).template head<3>().all() &&
         (pt.array() <= max_pt_.array()).template head<3>().all();
};
Do you think a check for pt_transform being identity would help in increasing the speed a tiny bit?
Possibly, but I think that would be rare in real-world scenarios. More ideas: 1. make pt_transform a 3x4 matrix because the last element of pt is not used anyway. 2. if pt_transform is translation only or rotation only, compute only what is necessary instead of full matrix-vector-multiplication (these ideas can be explored later though)
Possibly, but I think that would be rare in real-world scenarios.
I usually transform my point cloud first so multiple downstream logic can run independently on it. So all crop-box/filter operations happen with no transform. Having no TF might not be so rare
Having no TF might not be so rare
I agree this is not so rare. From my experience working with lidar, I use CropBox to just remove the points far from the origin without any TFs, just to reduce the runtime for downstream tasks.
I added a check for identity transformation for now
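For illustration, an identity check of this kind could look like the sketch below. It uses plain std::array types instead of Eigen so that it stays self-contained; Vec3, Mat4, makeIdentity, isIdentity, transformPoint and cropBox are all hypothetical names, and the exact element-wise comparison is an assumption rather than what the PR does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;
using Mat4 = std::array<std::array<float, 4>, 4>; // row-major homogeneous transform

Mat4 makeIdentity()
{
  Mat4 m{}; // value-initialized to zeros
  for (std::size_t i = 0; i < 4; ++i)
    m[i][i] = 1.f;
  return m;
}

// Exact element-wise comparison against the identity matrix.
bool isIdentity(const Mat4& m)
{
  for (std::size_t r = 0; r < 4; ++r)
    for (std::size_t c = 0; c < 4; ++c)
      if (m[r][c] != (r == c ? 1.f : 0.f))
        return false;
  return true;
}

// Applies the upper 3x4 part of the transform to a point with w = 1.
Vec3 transformPoint(const Mat4& m, const Vec3& p)
{
  Vec3 out{};
  for (std::size_t r = 0; r < 3; ++r)
    out[r] = m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3];
  return out;
}

std::vector<std::size_t> cropBox(const std::vector<Vec3>& cloud,
                                 const Mat4& pt_transform,
                                 const Vec3& min_pt,
                                 const Vec3& max_pt)
{
  // Checked once, outside the per-point loop.
  const bool identity = isIdentity(pt_transform);
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < cloud.size(); ++i) {
    const Vec3 pt = identity ? cloud[i] : transformPoint(pt_transform, cloud[i]);
    bool inside = true;
    for (std::size_t k = 0; k < 3; ++k)
      inside = inside && pt[k] >= min_pt[k] && pt[k] <= max_pt[k];
    if (inside)
      kept.push_back(i);
  }
  return kept;
}
```

The check costs 16 comparisons per call, while the multiplication it skips would otherwise run once per point.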
- make pt_transform a 3x4 matrix because the last element of pt is not used anyway.
I just benchmarked it and, surprisingly, found that 3x4 is a bit slower. I guess this is because the following condition check can run with SSE:
return (pt.array() >= min_pt_.array()).template head<3>().all() &&
       (pt.array() <= max_pt_.array()).template head<3>().all();
With a 3x4 matrix, pt will be a Vector3f, which may not trigger SSE.
Benchmark
3x4
---------------------------------------------------------------
Benchmark                    Time             CPU    Iterations
---------------------------------------------------------------
320p_CropBox              1.50 ms         1.50 ms           461
320p_FunctorCropBox       1.35 ms         1.35 ms           580
480p_CropBox              3.03 ms         3.03 ms           230
480p_FunctorCropBox       2.72 ms         2.72 ms           253
720p_CropBox              7.93 ms         7.90 ms            67
720p_FunctorCropBox       8.25 ms         8.24 ms           118
1080p_CropBox             20.8 ms         20.8 ms            49
1080p_FunctorCropBox      18.5 ms         18.5 ms            37
1440p_CropBox             38.6 ms         38.5 ms            27
1440p_FunctorCropBox      33.2 ms         33.2 ms            26
4x4
---------------------------------------------------------------
Benchmark                    Time             CPU    Iterations
---------------------------------------------------------------
320p_CropBox              1.57 ms         1.57 ms           445
320p_FunctorCropBox       1.28 ms         1.28 ms           674
480p_CropBox              3.03 ms         3.03 ms           229
480p_FunctorCropBox       2.46 ms         2.46 ms           273
720p_CropBox              7.88 ms         7.88 ms            67
720p_FunctorCropBox       7.55 ms         7.54 ms           134
1080p_CropBox             21.1 ms         21.1 ms            48
1080p_FunctorCropBox      17.0 ms         17.0 ms            40
1440p_CropBox             38.3 ms         38.2 ms            27
1440p_FunctorCropBox      30.6 ms         30.6 ms            28
955c885 to 994a389
Hi, I just tested this MR. The improvement in quality of code architecture is huge, but I haven't seen any significant filtering speedup on average. Both on benchmarks and on real data. CPU: Ryzen 7 2700x
@BaltashovIlia Thanks for trying out this work! The main goal of using I just reran the benchmark of the new
Compiled with default settings This PR is still WIP and experimental, so feel free to give us suggestions or opinions on our idea.
On Ryzen 5800x + GCC 9.3.0 + pcl default compilation settings:
Glad to see at least it isn't slower than the original one on the AMD side 😌 Thanks for benchmarking it on your setup! @BaltashovIlia
Description
- FunctorFilter to the existing filters
- FunctorFilter API
- CropBox and PassThrough with FunctorFilter
Related
- functor_filter #4247
TODO