Open
Labels: enhancement (New feature or request)
Description
As requested as part of #24
It would be neat to support CoNLL-U Plus:
- export only the requested fields (and mark the output CoNLL-U with `global.columns`)
- allow reading in a CoNLL-U Plus file
- it also supports custom columns, but I am hesitant to support those. Perhaps we can use them: if a custom field is registered as a custom attribute in spaCy's private extension namespace `._.`, then we may use that as the destination. Will have to think about it some more.
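To make the export idea concrete, here is a minimal sketch in plain Python. It is not this library's API: the function name `to_conllu_plus` and the token-as-dict representation are assumptions made purely for illustration. It writes the `global.columns` header followed by only the requested columns, tab-separated, using `_` for missing values.

```python
def to_conllu_plus(tokens, fields):
    """Serialize one sentence to CoNLL-U Plus text (hypothetical helper).

    tokens: list of dicts keyed by column name (e.g. "ID", "FORM", "UPOS").
    fields: the column names to export, in output order.
    """
    if not fields:
        raise ValueError("at least one field is required")

    # The first line declares which columns are present, separated by spaces.
    lines = ["# global.columns = " + " ".join(fields)]

    for token in tokens:
        cells = []
        for field in fields:
            value = token.get(field)
            # Missing or empty values become "_"; a HEAD of 0 must be kept.
            cells.append("_" if value is None or value == "" else str(value))
        lines.append("\t".join(cells))

    # A blank line terminates the sentence.
    lines.append("")
    return "\n".join(lines) + "\n"
```

For example, `to_conllu_plus(tokens, ["ID", "FORM", "UPOS"])` would emit three columns per token even if the token dicts carry more annotations.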
Here is an example of a CoNLL-U Plus file. Note how the first line indicates which fields are present (separated by spaces).
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# newdoc id = mf920901-001
# newpar id = mf920901-001-p1
# sent_id = mf920901-001-p1s1A
# text = Slovenská ústava: pro i proti
# text_en = Slovak constitution: pros and cons
1 Slovenská slovenský ADJ AAFS1----1A---- Case=Nom|Degree=Pos|Gender=Fem|Number=Sing|Polarity=Pos 2 amod _ _
2 ústava ústava NOUN NNFS1-----A---- Case=Nom|Gender=Fem|Number=Sing|Polarity=Pos 0 root _ SpaceAfter=No
3 : : PUNCT Z:------------- _ 2 punct _ _
4 pro pro ADP RR--4---------- Case=Acc 2 appos _ LId=pro-1
5 i i CCONJ J^------------- _ 6 cc _ LId=i-1
6 proti proti ADP RR--3---------- Case=Dat 4 conj _ LId=proti-1
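Reading such a file could look roughly like the sketch below. It is an illustration only: `read_conllu_plus` and the returned structure are hypothetical names, not part of this project. It assumes the columns in the actual file are tab-separated (the example above is rendered with spaces) and that the `global.columns` declaration is on the first line.

```python
def read_conllu_plus(text):
    """Parse CoNLL-U Plus text (hypothetical helper).

    Returns (columns, sentences), where each sentence is a dict with
    "metadata" (comment key/value pairs) and "tokens" (one dict per row).
    """
    lines = text.splitlines()
    prefix = "# global.columns ="
    if not lines or not lines[0].startswith(prefix):
        raise ValueError("first line must declare global.columns")

    # Column names in the header are separated by whitespace.
    columns = lines[0][len(prefix):].split()

    sentences = []
    current = {"metadata": {}, "tokens": []}
    for line in lines[1:]:
        if not line.strip():
            # A blank line closes the current sentence.
            if current["tokens"]:
                sentences.append(current)
            current = {"metadata": {}, "tokens": []}
        elif line.startswith("#"):
            # Comment lines such as "# sent_id = ..." become metadata.
            key, _, value = line[1:].partition("=")
            current["metadata"][key.strip()] = value.strip()
        else:
            cells = line.split("\t")
            if len(cells) != len(columns):
                raise ValueError(
                    f"expected {len(columns)} columns, got {len(cells)}"
                )
            current["tokens"].append(dict(zip(columns, cells)))

    # Handle a final sentence that is not followed by a blank line.
    if current["tokens"]:
        sentences.append(current)
    return columns, sentences
```

Each token comes back as a dict keyed by the declared column names, so downstream code can look up only the fields that are actually present.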
If you want to see this implemented, please give this post a thumbs up so that I know what to prioritize.
LaRuaNa