Post-training quantization reduces the computational demand of Large Language Models (LLMs) but can weaken some of their capabilities. Since LLM abilities emerge with scale, smaller LLMs are more sensitive to quantization. In this paper, we explore how quantization affects smaller LLMs' ability to perform retrieval-augmented generation (RAG), specifically over longer contexts. We chose personalization for evaluation because it is a challenging domain for RAG, requiring long-context reasoning over multiple documents. We compare the original FP16 and the quantized INT4 performance of multiple 7B and 8B LLMs on two tasks while progressively increasing the number of retrieved documents, to test how quantized models fare with longer contexts. To better understand the effect of retrieval, we evaluate three retrieval models in our experiments. Our findings reveal that if a 7B LLM performs the task well, quantization does not impair its performance or its long-context reasoning capabilities. We conclude that it is possible to use RAG with quantized smaller LLMs.
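A minimal sketch of the kind of comparison described above, assuming Hugging Face Transformers with bitsandbytes 4-bit quantization; the model name, prompt format, query, and retrieved documents are illustrative placeholders, not the paper's actual setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative 7B model, not necessarily one evaluated in the paper

def load_model(int4: bool):
    """Load the same model either in FP16 or quantized to 4-bit (INT4) via bitsandbytes."""
    if int4:
        bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
        return AutoModelForCausalLM.from_pretrained(MODEL_NAME, quantization_config=bnb, device_map="auto")
    return AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16, device_map="auto")

def rag_prompt(question: str, documents: list[str], k: int) -> str:
    """Concatenate the top-k retrieved documents; larger k means a longer context."""
    context = "\n\n".join(documents[:k])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def answer(model, tokenizer, question: str, documents: list[str], k: int) -> str:
    inputs = tokenizer(rag_prompt(question, documents, k), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    question = "What are the user's favourite genres?"            # placeholder personalization query
    documents = ["User profile document ...", "Past review ..."]  # placeholder retrieved documents
    for k in (1, 2):                  # in the paper, k is progressively increased to stress longer contexts
        for int4 in (False, True):    # FP16 baseline vs INT4 quantized
            model = load_model(int4)
            print(f"k={k}, int4={int4}: {answer(model, tokenizer, question, documents, k)}")
```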
This white paper is presented by the Ethics Working Group of the uNLock Consortium. It presents the findings of the Working Group from the conceptual phase of its investigation into the ethical issues of the uNLock solution, which provides identity management for sharing and presenting medical COVID-19 credentials (test results) in the context of healthcare institutions. We outline the direct and indirect stakeholders of the uNLock solution and map values, benefits, and harms to the respective stakeholders. The resulting conceptual framework has allowed us to lay down key norms and principles of Self-Sovereign Identity (SSI) in the specific context of the uNLock solution. We hope that adherence to these norms and principles can serve as groundwork for the anticipatory mitigation of moral risks and hazards stemming from the implementation of the uNLock solution and similar solutions. Our findings suggest that even an early-stage conceptual investigation within the framework of Value Sensitive Design (VSD) reveals numerous ethical issues. The proposed implementation of the uNLock app in the healthcare context did not proceed beyond the prototype stage; our investigation was therefore limited to the conceptual stage and did not involve the practical application of the VSD method, in which norms and values are translated into engineering requirements. Nevertheless, our findings suggest that applying the VSD method in this context is a promising approach that helps to identify moral conflicts and risks at a very early stage of the technological development of SSI solutions. Furthermore, we would like to stress that, in light of our findings, it became painfully obvious that hasty implementation of a medical credentials system without thorough ethical assessment risks creating more ethical issues than it addresses.
In this paper, we explore the design of web-based advice robots to enhance users' confidence in acting upon the provided advice. Drawing from research on algorithm acceptance and explainable AI, we hypothesise four design principles that may encourage interactivity and exploration, thus fostering users' confidence to act. Through a value-oriented prototype experiment and value-oriented semi-structured interviews, we tested these principles, confirming three of them and identifying an additional one. The four resulting principles, which appear to contribute to the values of agency and trust, are: (1) put the context questions and the resulting advice on one page and allow live, iterative exploration, (2) use action- or change-oriented questions to adjust the input parameters, (3) actively offer alternative scenarios based on counterfactuals, and (4) show all options instead of only the recommended one(s). Our study integrates the Design Science Research approach with a Value Sensitive Design approach.