Metadata-Version: 2.1
Name: questionnaire_mistral
Version: 1.5
Home-page: https://github.com/skillfi/questionnairemistral
Author: Alex
License: Apache 2.0 License
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain
Requires-Dist: langchain-community
Requires-Dist: langchain-huggingface
Requires-Dist: playwright
Requires-Dist: html2text
Requires-Dist: sentence-transformers
Requires-Dist: faiss-cpu
Requires-Dist: pandas
Requires-Dist: peft ==0.4.0
Requires-Dist: trl ==0.4.7
Requires-Dist: pypdf
Requires-Dist: bitsandbytes
Requires-Dist: accelerate
Requires-Dist: datasets
Provides-Extra: torch
Requires-Dist: torch ; extra == 'torch'
Requires-Dist: torchvision ; extra == 'torch'
Requires-Dist: torchaudio ; extra == 'torch'

MistralAI Questionnaire
=================================

This project provides a toolkit for generating questionnaires (question-answer pairs) from documents (``txt``, ``docx``, ``pdf``) and saving them as a ``.csv`` dataset.

Requirements
------------

Before starting, install PyTorch. The command below uses the CUDA 12.4 wheel index:

.. code-block:: bash

   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

The following libraries are also required:

- ``langchain``
- ``langchain-community``
- ``langchain-huggingface``
- ``playwright``
- ``html2text``
- ``sentence-transformers``
- ``faiss-cpu``
- ``pandas``
- ``peft==0.4.0``
- ``trl==0.4.7``
- ``pypdf``
- ``bitsandbytes``
- ``accelerate``
- ``datasets``

Description
-----------

ModelManager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This class loads a Mistral AI model and generates question-answer (QA) pairs.

Constructor
^^^^^^^^^^^

.. code-block:: python

   __init__(self, model_name)

- **model_name**: The path or name of the pre-trained model.


Methods
^^^^^^^

- **setup_tokenizer()**: Loads and configures the tokenizer for the model.
- **setup_bitsandbytes_parameters()**: Configures the BitsAndBytes quantization parameters.
- **from_pretrained()**: Loads the model with pre-trained weights and quantization configuration.
- **print_model_parameters(examples)**: Prints the number of trainable and total parameters of the model.
- **__call__(self, *args, **kwargs)**: The main entry point for running generation tasks.
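
As an illustration of what ``print_model_parameters`` reports, the sketch below shows one common way to count trainable and total parameters. It is not the package's implementation: ``count_parameters``, ``format_report`` and ``FakeParam`` are hypothetical names introduced here, and ``FakeParam`` only stands in for a PyTorch parameter so the example runs without PyTorch installed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class FakeParam:
    """Hypothetical stand-in for a torch parameter (exposes numel() and requires_grad)."""
    size: int
    requires_grad: bool

    def numel(self) -> int:
        # Number of scalar elements held by this parameter.
        return self.size


def count_parameters(parameters: Iterable) -> Tuple[int, int]:
    """Return (trainable, total) element counts over an iterable of parameters."""
    trainable = 0
    total = 0
    for param in parameters:
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    return trainable, total


def format_report(trainable: int, total: int) -> str:
    """Build a one-line summary; the percentage is 0 when there are no parameters."""
    pct = 100 * trainable / total if total else 0.0
    return f"trainable params: {trainable} || all params: {total} || trainable%: {pct:.2f}"


# One frozen parameter with 1000 elements and one trainable parameter with 250.
params = [FakeParam(1000, False), FakeParam(250, True)]
trainable, total = count_parameters(params)
print(format_report(trainable, total))
```

With a real PyTorch model, the same function could be called as ``count_parameters(model.parameters())``, since PyTorch parameters provide both ``numel()`` and ``requires_grad``.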

Usage
-----

To generate QA pairs, create an instance of the ``ModelManager`` class and call it with the required arguments. Calling the instance invokes its ``__call__`` method.

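.. code-block:: python

   from questionnaire_mistral.models import ModelManager

   # Load the model from a local path or a model name.
   model = ModelManager(model_name="path_to_model")

   # document, task, document_content and task_count must be defined by the caller.
   model(document=document, task=task, document_content=document_content, task_count=task_count)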

License
-------

The project is distributed under the Apache 2.0 License.
