Retrieval augmented generation (RAG)

https://huggingface.co/docs/transformers/model_doc/rag

Retrieval-augmented generation (RAG) is a technique that combines pre-trained dense retrieval (DPR) and sequence-to-sequence models. RAG models retrieve documents, pass them to a seq2seq model, then marginalize to generate outputs.

RAG with mistral by Ollama

from langchain_community.llms import Ollama
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain import hub
from langchain.chains.question_answering import load_qa_chain

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OllamaEmbeddings(model="mistral"))

question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)
￼
llm = Ollama(model="mistral")

llm(question)

rag_prompt = hub.pull("rlm/rag-prompt-mistral")
rag_prompt.messages

chain = load_qa_chain(llm, chain_type="stuff", prompt=rag_prompt)
chain({"input_documents": docs, "question": question}, return_only_outputs=True)

Task Decomposition approaches involve breaking down complex tasks into smaller, manageable sub-tasks that can be handled independently. This method allows for more efficient processing and better handling of dependencies between tasks. Common approaches include top-down decomposition, bottom-up decomposition, and sequential decomposition. Top-down starts with the overall goal and breaks it down into increasingly specific sub-tasks, while bottom-up begins with the basic components and builds up to the complete task. Sequential decomposition handles tasks in a linear fashion, one after another. (Refer to “Artificial Intelligence: A Modern Approach” by Russell & Norvig for more detailed information.)

RAG with microsoft/Orca-2-13b

The task is to ask question based on a blog article. First, we load the article to Chroma vector database.

from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFacePipeline
from langchain import hub
from langchain.chains.question_answering import load_qa_chain
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

ef = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma.from_documents(all_splits, ef, persist_directory="./chroma_db")

Then we build a text-generation pipleline.

question = "What is Instruction Prompting?"
docs = db.similarity_search(question, k=5)

model_id = "microsoft/Orca-2-13b"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto",
                                             torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
pipe = pipeline("text-generation", model=model,
                tokenizer=tokenizer, max_new_tokens=256)

llm = HuggingFacePipeline(pipeline=pipe)

## 0-shot
llm(question)

Instruction prompting is a method of teaching that involves providing students with clear and concise instructions to guide their learning. This approach helps students understand what is expected of them and h
ow to achieve the desired learning outcomes.

Instruction prompting can be used in various educational settings, including classrooms, online courses, and self-directed learning environments. It is particularly useful for students who may struggle with und
erstanding complex instructions or who need additional support to stay on track with their learning.

There are several key components of instruction prompting:

Clear and concise instructions: Instruction prompting involves providing students with clear and concise instructions that outline the learning objectives, tasks, and expectations for each lesson or activity
. This helps students understand what they need to do and how to do it.

Step-by-step guidance: Instruction prompting often includes step-by-step guidance to help students break down complex tasks into smaller, more manageable steps. This can make it easier for students to follow
along and complete the tasks successfully.

Feedback and support: Instruction prompting may also involve providing students with feedback and support throughout the learning process. This can include verbal

Finally, we build a RAG chain to answer the question based on input documents.

rag_prompt = hub.pull("rlm/rag-prompt")
chain = load_qa_chain(llm, chain_type="stuff", prompt=rag_prompt)
chain({"input_documents": docs, "question": question}, return_only_outputs=True)

{‘output_text’: ‘ Instruction Prompting is a process of generating instruction candidates based on a small set of demonstrations in the form of input-output pairs.’}

01-ai/Yi-6B

Small Yi-6B for 0-shot.

model_id = "01-ai/Yi-6B"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer,
                max_new_tokens=256,
                eos_token_id=tokenizer.eos_token_id)
llm_yi = HuggingFacePipeline(pipeline=pipe)
llm_yi(question)

Instruction prompting is a method of teaching that involves the use of prompts to guide the learner through the learning process. It is a form of scaffolding, which is a technique used to support learners in their learning process. Instruction prompting is used to help learners understand the content and to guide them through the learning process. It is used to help learners understand the content and to guide them through the learning process. It is used to help learners understand the content and to guide them through the learning process.

How does Instruction Prompting work?

Instruction prompting is a method of teaching that involves the use of prompts to guide the learner through the learning process. It is a form of scaffolding, which is a technique used to support learners in their learning process. Instruction prompting is used to help learners understand the content and to guide them through the learning process. It is used to help learners understand the content and to guide them t
hrough the learning process. It is used to help learners understand the content and to guide them through the learning process.

What are the benefits of Instruction Prompting?

Instruction prompting is a method of teaching that involves the use of prompts to guide the learner through the learning process. It is a form of sc

chain_yi = load_qa_chain(llm_yi, chain_type="stuff", prompt=rag_prompt)
chain_yi({"input_documents": docs,  "question": question}, return_only_outputs=True)

{‘output_text’: ‘ Prompt LLM to generate instruction candidates based on a small set of demonstrations in the form of input-output pairs. E.g. \n\nThe instruction is.’}

In conclusion, the microsoft/Orca-2-13b model demonstrates better performance in summarizing the relevant documents. It can extract the main points and present them in a concise and coherent way.