using GLiNER for entities and relationships with explicit data? #174
Unanswered
ElJefeDSecurIT
asked this question in
Q&A
Replies: 0 comments
Hi! I'm kind of an old-timer hacker but a first-timer to ML training, and I'm looking to build out a knowledge graph for the security domain. I've just come across GLiNER and I'm interested in trying my hand at either fine-tuning it or perhaps building my own model. I'm working on an experiment that takes the ATT&CK framework relationships, which are all in JSON, and I've cooked up 29k annotated sentences, like so:
{"sentence": "Lokibot uses Visual Basic.", "data": [["Lokibot", "malware", 0, 7], ["Visual Basic", "attack-pattern", 13, 25]]}
{"sentence": "Conti uses SMB/Windows Admin Shares.", "data": [["Conti", "malware", 0, 5], ["SMB/Windows Admin Shares", "attack-pattern", 11, 35]]}
{"sentence": "FunnyDream uses ccf32.", "data": [["FunnyDream", "campaign", 0, 10], ["ccf32", "malware", 16, 21]]}
Now, I already have a list of labels that I think align with my labeled terms, and I also want to keep common-use labels (person, location, etc.). Lastly, I'd love for it to be a super-accurate extraction model post-training, but I just need some validation:
Am I on the right path here? What am I missing? Should I split my dataset 70/30 into training and test data? Should I train a clean model or fine-tune? If anyone could offer a few pointers on which way to go, that would be most helpful. 🙏🏽
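To make the 70/30 question concrete, this is the kind of split I have in mind: a minimal sketch using only the standard library. The `split_records` helper, the seed value, and the placeholder records are all hypothetical, and I'm not claiming 70/30 is the right ratio.

```python
import random

def split_records(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and hold out a fraction as test data."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records standing in for the annotated sentences.
records = [{"sentence": f"example {i}", "data": []} for i in range(10)]
train, test = split_records(records)
print(len(train), len(test))
```

One thing I'm unsure about is whether a purely random split is enough here, since the same malware or technique names will show up on both sides of it.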