
Reducing Computational Requirements for Large Language Models

Overview

This repository contains my master's dissertation, "Methods of Reducing Computational Requirements for Large Language Models". The research explores techniques for compressing large language models (LLMs) to reduce their computational requirements, making them more accessible to consumers and to small organizations with strict security or privacy requirements. The study focuses on post-training quantization methods such as GGUF, AWQ, and VPTQ, as well as pruning techniques, and evaluates their effectiveness on selected LLMs: Gemma 2 9B, Llama 3.1 8B, and Qwen2.5 7B.
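For illustration, the snippet below is a minimal sketch of what one such post-training quantization step can look like, using the AutoAWQ library. The model id, output path, and quantization settings are assumptions for the sketch, not the exact configuration used in the dissertation.

```python
# Minimal sketch: 4-bit post-training quantization with AutoAWQ.
# Model id, output path, and config values are illustrative assumptions.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2.5-7B-Instruct"  # one of the evaluated model families
quant_path = "qwen2.5-7b-instruct-awq"   # hypothetical output directory

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Common AWQ defaults: 4-bit weights with a quantization group size of 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Calibrate on AutoAWQ's default calibration data and quantize the weights.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The 4-bit, group-size-128 configuration shown here is a widely used AWQ default; other methods covered in the dissertation, such as GGUF and VPTQ, expose analogous bit-width and grouping choices through their own tooling.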

Key Contributions

  • Investigates the effectiveness of popular post-training quantization methods (GGUF, AWQ, VPTQ) and of pruning techniques (a minimal pruning sketch follows this list).
  • Evaluates the impact of these methods on model performance across a range of benchmarks.
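As a rough illustration of the pruning side of this comparison, the sketch below applies unstructured L1 (magnitude) pruning using PyTorch's built-in pruning utilities. The 30% sparsity level and the restriction to linear layers are illustrative assumptions, not the settings evaluated in the dissertation.

```python
# Minimal sketch: unstructured magnitude pruning with PyTorch.
# Sparsity level and layer selection are illustrative assumptions.
import torch.nn as nn
import torch.nn.utils.prune as prune
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")  # example model

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 30% smallest-magnitude weights in each linear layer.
        prune.l1_unstructured(module, name="weight", amount=0.3)
        # Make the pruning permanent (remove the mask re-parameterization).
        prune.remove(module, "weight")
```

Note that unstructured pruning like this mainly reduces the number of nonzero weights; realizing memory or latency savings typically also requires sparse storage formats or kernels.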

Dissertation PDF

The full dissertation is available directly from this repository. Use the link below to view or download the PDF:

Reducing Computational Requirements for Large Language Models (PDF)
