ekaf/nltk_data

 
 

Data Distribution for NLTK

This repository contains data packages (corpora, models, tokenizers, etc.) for use with NLTK.

Installation

To install data using the NLTK downloader, run:

import nltk
nltk.download()

For detailed instructions, please see the NLTK website.
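Calling nltk.download() with no arguments opens the interactive downloader. For scripted setups it can be more convenient to fetch a single package by its identifier, and only when it is not already installed. The following is a minimal sketch of that idea, not part of this repository: the helper name ensure_nltk_package is hypothetical, and the "punkt" identifier with its "tokenizers/punkt" resource path is used purely as an example.

```python
def ensure_nltk_package(package_id, resource_path):
    """Download `package_id` only if `resource_path` is not installed yet.

    Returns True if a download was performed, False if the resource
    was already available. Hypothetical helper, for illustration only.
    """
    # Imported lazily so this helper can be defined even where NLTK
    # is not installed yet.
    import nltk

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
        return False
    except LookupError:
        # nltk.download returns True on success and False on failure.
        if not nltk.download(package_id, quiet=True):
            raise RuntimeError(f"could not download NLTK package {package_id!r}")
        return True


# Example call (requires NLTK and network access):
# ensure_nltk_package("punkt", "tokenizers/punkt")
```

Checking with nltk.data.find first avoids contacting the download server on every run. The resource path depends on the package category (corpora, tokenizers, taggers, and so on), so it has to be supplied alongside the package identifier.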


Recent Enhancements

Licensing Transparency (PR #242)

  • Added a top-level LICENSE (Apache License 2.0) for the repository.
  • Added LICENSE-OVERVIEW.md summarizing the licensing structure, with emphasis on the diversity of dataset licenses and the importance of reviewing individual terms.
  • Added DATASET-LICENSES.md, a comprehensive, grouped list of all data packages and their licenses that flags any package whose licensing is ambiguous or unclarified.
  • These changes improve transparency, support responsible use, and aid compliance for all users.

Contribution Guidelines

  • Added a detailed CONTRIBUTING.md with step-by-step instructions for adding a new data package using Git and GitHub, and for making other contributions.
  • Contributors are encouraged to clarify dataset licenses and to consult the new licensing overview and dataset license table.

For instructions on adding new data packages, please see CONTRIBUTING.md. For licensing details, see LICENSE-OVERVIEW.md and DATASET-LICENSES.md.
