A Python tool for extracting, analyzing, and managing metadata from Markdown-based knowledge bases. The processor parses Markdown files to extract tags, headings, links, and other structured information, supporting advanced knowledge management workflows.
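For illustration, the heart of the extraction step can be pictured as a small pass over each file. This is a minimal sketch using regular expressions; the module layout, function names, and exact patterns in this repository may differ:

```python
import re
from pathlib import Path

# Hypothetical patterns for the metadata types named above.
TAG_RE = re.compile(r"(?<!\w)#([\w/-]+)")            # inline tags, e.g. #project/alpha
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")     # standard [text](target) links

def extract_metadata(path: Path) -> dict:
    """Return the tags, headings, and links found in one Markdown file."""
    text = path.read_text(encoding="utf-8")
    return {
        "tags": TAG_RE.findall(text),
        "headings": [
            {"level": len(hashes), "text": title.strip()}
            for hashes, title in HEADING_RE.findall(text)
        ],
        "links": [
            {"text": label, "target": target}
            for label, target in LINK_RE.findall(text)
        ],
    }

if __name__ == "__main__":
    print(extract_metadata(Path("sample_data/example.md")))
```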
- Extracts metadata, tags, and structural elements from Markdown files
- Modular architecture for analyzers, extractors, and enrichers (sketched after this list)
- Easily extensible for new metadata types or processing logic
- Command-line interface for batch processing
- Comprehensive test suite
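The modular design can be pictured as a small common interface that each extractor implements. The base class and wiring below are assumptions for illustration, not the repository's actual API:

```python
import re
from abc import ABC, abstractmethod

class Extractor(ABC):
    """Hypothetical base class: turn raw Markdown text into metadata records."""

    @abstractmethod
    def extract(self, text: str) -> list[dict]:
        ...

class TodoExtractor(Extractor):
    """Example extension: collect '- [ ]' / '- [x]' task items."""

    TODO_RE = re.compile(r"^[-*]\s+\[([ x])\]\s+(.+)$", re.MULTILINE)

    def extract(self, text: str) -> list[dict]:
        return [
            {"done": mark == "x", "text": item.strip()}
            for mark, item in self.TODO_RE.findall(text)
        ]

# Running every registered extractor over a document:
extractors: list[Extractor] = [TodoExtractor()]
doc = "- [x] write README\n- [ ] add tests\n"
for ex in extractors:
    print(type(ex).__name__, ex.extract(doc))
```

Under this scheme, supporting a new metadata type amounts to writing one more `Extractor` subclass and registering it.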
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/knowledgebase-processor.git
  cd knowledgebase-processor
  ```
- Install Poetry (if not already installed):

  ```bash
  curl -sSL https://install.python-poetry.org | python3 -
  ```
- Install dependencies:

  ```bash
  poetry install
  ```
Run all tests using the provided script:

```bash
poetry run python scripts/run_tests.py
```
To process your knowledge base, use:

```bash
poetry run python scripts/run_processor.py
```

For available options and arguments, run:

```bash
poetry run python scripts/run_processor.py --help
```
The `process-and-load` command processes all files in a knowledge base directory, generates RDF data, and loads it into a SPARQL endpoint in a single step.
Basic usage:

```bash
poetry run python -m knowledgebase_processor.cli.main process-and-load /path/to/knowledgebase
```
Options:

- `--pattern PATTERN`: Only process files matching the given glob pattern (e.g., `*.md`).
- `--graph GRAPH_URI`: Specify the named graph URI to load data into.
- `--endpoint ENDPOINT_URL`: Override the default SPARQL endpoint URL.
- `--update-endpoint UPDATE_ENDPOINT_URL`: Specify a separate SPARQL update endpoint.
- `--cleanup`: Remove temporary RDF files after loading.
Example invocations:

```bash
# Process all Markdown files and load into the default endpoint
poetry run python -m knowledgebase_processor.cli.main process-and-load ./sample_data

# Process only daily notes and load into a specific named graph
poetry run python -m knowledgebase_processor.cli.main process-and-load ./sample_data --pattern "daily-note-*.md" --graph "http://example.org/graph/daily"

# Use a custom SPARQL endpoint and clean up temporary files
poetry run python -m knowledgebase_processor.cli.main process-and-load ./sample_data --endpoint http://localhost:3030/ds --cleanup
```
Progress and errors will be reported in the console. For more options, run:

```bash
poetry run python -m knowledgebase_processor.cli.main process-and-load --help
```
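Once loading has finished, the data can be queried from the endpoint. As a minimal sketch, assuming a Fuseki-style query endpoint at `http://localhost:3030/ds/query` and the named graph from the example above (adjust both to your setup):

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint and graph URI, matching the examples above.
ENDPOINT = "http://localhost:3030/ds/query"
QUERY = """
SELECT ?s ?p ?o
WHERE { GRAPH <http://example.org/graph/daily> { ?s ?p ?o } }
LIMIT 10
"""

# Standard SPARQL protocol: POST the query form-encoded, ask for JSON results.
data = urllib.parse.urlencode({"query": QUERY}).encode()
request = urllib.request.Request(
    ENDPOINT,
    data=data,
    headers={"Accept": "application/sparql-results+json"},
)
with urllib.request.urlopen(request) as response:
    results = json.load(response)

for row in results["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```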
The processor also handles wikilinks such as `[[A wikilink]]`, extracting them alongside standard Markdown links.
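For illustration, wikilink matching can be as simple as a regular expression over the common `[[target]]` and `[[target|alias]]` forms. This is a hedged simplification; the repository's actual parser may handle more syntax:

```python
import re

# Matches [[target]] and [[target|alias]]; a simplification for illustration.
WIKILINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

text = "See [[A wikilink]] and [[some-note|a friendly alias]]."
for target, alias in WIKILINK_RE.findall(text):
    print({"target": target.strip(), "alias": (alias or target).strip()})
```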
Fork the repository, create a feature branch, and submit a pull request. Please ensure all tests pass before submitting.