[NPUW] Update NPUW caching properties #31876
base: master
Conversation
If it fixes some behavior, I believe there must be some tests reflecting it?
- DEFINE_OPT(NPUW_LLM_CACHE_ROPE, bool, true, npuw::llm::cache_rope, CompileTime);
+ DEFINE_OPT(NPUW_LLM_CACHE_ROPE, bool, true, npuw::llm::cache_rope, RunTime);
It impacts the IR that's being compiled, so how is it a runtime option?
All of our properties are RunTime. CompileTime is used for the NPU compiler. Also, the caching property won't work if it's not RunTime, since we would then need to define isAvailable in filteredConfig somewhere in the plugin.
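To make the distinction in this reply concrete, here is a toy model of options tagged with a mode, where one tag routes an option to the compiler and the other keeps it with the plugin. Everything in it (`Mode`, `ToyOption`, `select_by_mode`, the `TOY_COMPILER_FLAG` key) is hypothetical and only illustrates the idea; it is not the plugin's actual `DEFINE_OPT`, `isAvailable`, or `filteredConfig` code.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the mode tag passed as the last DEFINE_OPT argument.
enum class Mode { CompileTime, RunTime };

// Hypothetical option record: a key, its mode tag, and a string value.
struct ToyOption {
    std::string key;
    Mode mode;
    std::string value;
};

// Collects the options carrying the requested mode tag.
// Assumption: CompileTime options go to the compiler, RunTime options stay
// with the plugin, as the reply above describes.
std::map<std::string, std::string> select_by_mode(const std::vector<ToyOption>& opts, Mode wanted) {
    std::map<std::string, std::string> selected;
    for (const auto& opt : opts) {
        if (opt.mode == wanted) {
            selected.emplace(opt.key, opt.value);
        }
    }
    return selected;
}
```

In this model, calling `select_by_mode(opts, Mode::RunTime)` returns only the plugin-side options, so an option tagged CompileTime would never reach plugin-side logic such as caching without extra handling.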
- DEFINE_OPT(NPUW_LLM_SHARED_HEAD, bool, true, npuw::llm::shared_lm_head, CompileTime);
+ DEFINE_OPT(NPUW_LLM_SHARED_HEAD, bool, true, npuw::llm::shared_lm_head, RunTime);
Same thing here
Answered above
struct NPUW_LLM_ADDITIONAL_PREFILL_CONFIG final : OptionBase<NPUW_LLM_ADDITIONAL_PREFILL_CONFIG, ov::AnyMap> {
static std::string_view key() {
What is this, and why is it needed?
To support caching for any additional configs we might pass
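For a config map to take part in caching, it has to be written out and read back without loss. The sketch below shows one possible round trip, assuming a plain `std::map<std::string, std::string>` in place of `ov::AnyMap` and a length-prefixed text format invented for this example. `ToyConfig`, `serialize_config`, and `deserialize_config` are hypothetical names, not functions from the plugin.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical simplification: string values only, unlike ov::AnyMap.
using ToyConfig = std::map<std::string, std::string>;

// Writes every key and value as "<length>:<bytes>". std::map iterates in key
// order, so equal configs always produce the same blob.
std::string serialize_config(const ToyConfig& cfg) {
    std::ostringstream out;
    for (const auto& [key, value] : cfg) {
        out << key.size() << ':' << key << value.size() << ':' << value;
    }
    return out.str();
}

// Parses a blob produced by serialize_config back into a map.
ToyConfig deserialize_config(const std::string& blob) {
    ToyConfig cfg;
    std::size_t pos = 0;
    // Reads one "<length>:<bytes>" field at pos and advances pos past it.
    auto read_field = [&]() {
        const std::size_t colon = blob.find(':', pos);
        if (colon == std::string::npos) {
            throw std::invalid_argument("missing length prefix");
        }
        const std::size_t len = std::stoul(blob.substr(pos, colon - pos));
        if (colon + 1 + len > blob.size()) {
            throw std::invalid_argument("truncated field");
        }
        std::string field = blob.substr(colon + 1, len);
        pos = colon + 1 + len;
        return field;
    };
    while (pos < blob.size()) {
        std::string key = read_field();
        std::string value = read_field();
        cfg.emplace(std::move(key), std::move(value));
    }
    return cfg;
}
```

Length prefixes are used here so that values containing separators such as `:` survive the round trip; a real implementation would also need to handle the non-string value types that `ov::AnyMap` can hold.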
@@ -276,4 +306,95 @@ struct NPUW_LLM_GENERATE_CONFIG final : OptionBase<NPUW_LLM_GENERATE_CONFIG, ov:
return false;
}
};

struct NPUW_LLM_ADDITIONAL_GENERATE_CONFIG final : OptionBase<NPUW_LLM_ADDITIONAL_GENERATE_CONFIG, ov::AnyMap> {
Move this under the macro.
Done
build_jenkins
Thanks to the refactoring, the net effect is just two more lines. That's the way to go!
build_jenkins
No description provided.