Post-training quantization reduces the computational demand of Large Language Models (LLMs) but can weaken some of their capabilities. Since LLM abilities emerge with scale, smaller LLMs are more sensitive to quantization. In this paper, we explore how quantization affects smaller LLMs’ ability to perform retrieval-augmented generation (RAG), specifically in longer contexts. We chose personalization for evaluation because it is a challenging task for RAG, as it requires long-context reasoning over multiple documents. We compare the original FP16 and the quantized INT4 performance of multiple 7B and 8B LLMs on two tasks while progressively increasing the number of retrieved documents to test how quantized models fare in longer contexts. To better understand the effect of retrieval, we evaluate three retrieval models in our experiments. Our findings reveal that if a 7B LLM performs the task well, quantization does not impair its performance or its long-context reasoning capabilities. We conclude that it is possible to utilize RAG with quantized smaller LLMs.
The value of a decision can be increased by analyzing the decision logic and its outcomes. The more often a decision is taken, the more data becomes available about its results. More available data leads to smarter decisions and increases the value the decision has for an organization. The research field addressing this problem is Decision mining. By conducting a literature study on the current state of Decision mining, we aim to identify research gaps and areas where Decision mining can be improved. Our findings show that the concepts used in the Decision mining field and related fields are ambiguous and overlapping. We identify future research directions to increase the quality and maturity of Decision mining research. In particular, a change is needed from a business-process-oriented Decision mining approach to a decision-focused approach.