Step 1: Create a new Hugging Face project!
To make sure everyone can collaborate on the same project, follow the steps in the Hugging Face Collaboration resource.
Step 2: Build your chatbot in the app.py file.
Start by building a basic version of your chatbot, similar to what we did in Lesson 11: Generative AI. Refer back to the lesson if you need a reminder of how to set up a basic chatbot that uses an LLM to generate responses.
Step 3: Add your HF token as a secret in your Space.
The steps for how to do this are in Lesson 11: Generative AI if you need a reminder! You won’t be able to use the Hugging Face Inference API in this Space until you’ve set up this token as a secret. Remember to use “HF_TOKEN” for the name of your secret and paste your token in the value field.
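Once the secret is saved, your Space exposes it to app.py as an environment variable, so you can read it like this (the local-development message is just an example):

```python
# The "HF_TOKEN" secret set in your Space settings appears to app.py as an
# environment variable with the same name.
import os

hf_token = os.environ.get("HF_TOKEN")
if hf_token is None:
    # Helpful when running locally, where the Space secret isn't available.
    print("HF_TOKEN is not set - add it as a secret in your Space settings.")
```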
Step 4: Create a requirements.txt file.
The requirements.txt file is crucial for your Hugging Face Space project, as it specifies all the Python dependencies your chatbot needs to run. When your app is deployed, Hugging Face will automatically install the dependencies listed in this file!
- To add this file to your project, go to the Files page for your Space. Then click the Contribute button and select Create a new file.
- To implement Retrieval Augmented Generation (RAG) in your final chatbot using the tools outlined in Lesson 12: Semantic Search, you’ll need to include these dependencies in your requirements.txt file:

```
gradio
huggingface_hub
sentence-transformers
torch
numpy
```

Each of these packages serves a specific purpose in your application:
- gradio: Creates the web interface for your chatbot
- huggingface_hub: Provides the InferenceClient to interact with the Zephyr model
- sentence-transformers: Powers your RAG system with the MiniLM embedding model you saw in the RAG Rebuild exercise
- torch: Required for tensor operations and similarity calculations
- numpy: Used for various numerical operations
- After adding the dependencies to the requirements.txt file, click the Commit changes to main button.
Step 5: Incorporate RAG into your chatbot!
Now that you have a basic chatbot, you can add a knowledge base and use what you learned in Lesson 12: Semantic Search to incorporate Retrieval Augmented Generation (RAG) into your chatbot!
- Add a knowledge base related to your specific topic.
- Modify the code in your app.py file:
  - Load and process your knowledge base text file.
  - Chunk the knowledge base and use an embedding model to convert the chunks to embeddings.
  - Create a function that converts the user’s query to an embedding and calculates similarity scores to return the most relevant context from your knowledge base.
  - Pass this context to the LLM along with the user’s query to generate the response!
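The retrieval steps above can be sketched as follows. In your real app you'd embed with the sentence-transformers MiniLM model from Lesson 12; here a tiny word-count "embedding" stands in so the sketch runs anywhere, and the sample knowledge base, chunk size, and function names are all illustrative.

```python
# Retrieval sketch: chunk the knowledge base, embed the chunks, then return
# the chunk most similar to the query. The embed() below is a word-count
# stand-in; swap it for SentenceTransformer.encode in your app.
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def chunk_text(text, chunk_size=50):
    """Naive fixed-size chunking by word count."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def retrieve(query, chunks, chunk_embeddings, top_k=1):
    """Return the top_k chunks most similar to the query."""
    query_embedding = embed(query)
    scored = sorted(
        zip(chunks, chunk_embeddings),
        key=lambda pair: cosine_similarity(query_embedding, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:top_k]]

knowledge_base = (
    "Penguins are flightless birds that live mostly in the Southern Hemisphere. "
    "The Sahara is the largest hot desert in the world."
)
chunks = chunk_text(knowledge_base, chunk_size=12)
chunk_embeddings = [embed(c) for c in chunks]
context = retrieve("Where do penguins live?", chunks, chunk_embeddings)[0]
# This context is then prepended to the prompt you send to the LLM.
```

The retrieved `context` would be added to the system prompt or user message before calling the model, exactly as in the RAG Rebuild exercise.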
Step 6: Make it better!
You have a chatbot that combines a language model with a search or retrieval system to generate more accurate and fact-based responses by pulling in relevant external information. 🙌 Now you can continue making improvements. Here are some things you might consider implementing in your project to make it even better:
- Improve the user experience by changing the layout or design
- Use Gradio’s built-in tools to provide example questions for the user
- Change your chunking strategy to identify the most effective approach for your specific knowledge base
- Add guardrails to handle cases where the query is not related to the knowledge base
- Prevent truncated responses
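As one example of the guardrails idea, you could refuse to answer when no chunk is similar enough to the query. The threshold value, fallback message, and function names below are illustrative, not a fixed recipe:

```python
# Guardrail sketch: if the best similarity score is below a threshold, the
# query probably isn't covered by the knowledge base, so return a fallback
# instead of letting the model guess.
FALLBACK = "Sorry, I can only answer questions about my knowledge base."

def guarded_context(scores_and_chunks, threshold=0.3):
    """scores_and_chunks: list of (similarity_score, chunk) pairs."""
    best_score, best_chunk = max(scores_and_chunks)
    if best_score < threshold:
        return None  # signal the caller to use the fallback reply
    return best_chunk

def answer(query_scores):
    context = guarded_context(query_scores)
    if context is None:
        return FALLBACK
    return f"(LLM answer grounded in: {context})"
```

Tuning the threshold against a few off-topic test questions is a quick way to see how strict your guardrail should be.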