This repository contains data packages (corpora, models, tokenizers, etc.) for use with NLTK.
To install data using the NLTK downloader, run:
import nltk
nltk.download()
For detailed instructions, please see the NLTK website.
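Calling `nltk.download()` with no arguments opens the interactive downloader. For scripts, it is often more convenient to fetch a single package by its identifier. The following is a minimal sketch, not an official recipe: the helper name `ensure_nltk_package` is hypothetical, and the `punkt` identifier and `tokenizers/punkt` resource path mentioned afterwards are used only as an illustration.

```python
def ensure_nltk_package(package_id, resource_path):
    """Download an NLTK data package only if it is not already installed.

    package_id    -- identifier passed to nltk.download(), e.g. "punkt"
    resource_path -- path checked with nltk.data.find(), e.g. "tokenizers/punkt"

    Returns True if the resource is available after the call.
    """
    # Imported lazily so that defining this helper does not require NLTK.
    import nltk

    try:
        # nltk.data.find() raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # quiet=True suppresses progress output; the call returns a bool.
        return nltk.download(package_id, quiet=True)
```

For example, `ensure_nltk_package("punkt", "tokenizers/punkt")` would download the package on first use and skip the download on later runs, assuming the package is installed under that resource path.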
Licensing Transparency (PR #242)
- Added a top-level LICENSE (Apache License 2.0) for the repository.
- Added LICENSE-OVERVIEW.md summarizing the licensing structure, with emphasis on the diversity of dataset licenses and the importance of reviewing individual terms.
- Added DATASET-LICENSES.md, a comprehensive, grouped list of all data packages and their licenses, highlighting any ambiguous or unclarified licensing.
- These changes improve transparency, support responsible use, and aid compliance for all users.
- Introduced a detailed CONTRIBUTING.md with step-by-step instructions for adding a new data package using Git and GitHub.
- Please see CONTRIBUTING.md for instructions on adding datasets and making other contributions.
- Contributors are encouraged to clarify dataset licenses and to consult the new licensing overview and dataset license table.
For instructions on adding new data packages, please see CONTRIBUTING.md. For licensing details, see LICENSE-OVERVIEW.md and DATASET-LICENSES.md.