Google Click to Deploy containers
Deploy a sample Retrieval Augmented Generation (RAG) application on GKE using Ray, Jupyter notebooks, LangChain, Mistral-7B, and Cloud SQL with pgvector. The application generates and stores vector embeddings for a sample dataset, then augments user prompts to an LLM with context retrieved from those embeddings.
Retrieval Augmented Generation (RAG) is a technique that gives Large Language Models (LLMs) additional context related to a user’s prompt. Its benefits include supplying external information (from knowledge repositories, for example) and introducing “grounding”, which helps the LLM generate an appropriate response.
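The retrieve-then-augment flow described above can be sketched in a few lines of Python. This is a minimal illustration only, not the deployed application's code: the `embed`, `cosine`, `retrieve`, and `augment` helpers are hypothetical names, a bag-of-words count vector stands in for a real embedding model, and an in-memory list stands in for a vector store such as Cloud SQL with pgvector.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, store, k=1):
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def augment(query, store, k=1):
    """Build an LLM prompt that places the retrieved context before the question."""
    context = "\n".join(retrieve(query, store, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical sample dataset; embeddings are computed once and stored.
documents = [
    "GKE runs containerized workloads on Kubernetes.",
    "Cloud SQL is a managed relational database service.",
]
store = [(doc, embed(doc)) for doc in documents]

# The augmented prompt is what would be sent to the LLM.
prompt = augment("What is Cloud SQL?", store)
print(prompt)
```

In a real deployment, the toy `embed` function would be replaced by an embedding model and the list by a vector database query, but the shape of the flow (embed, store, retrieve, augment) would likely stay the same.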
Google Kubernetes Engine (GKE) and Cloud SQL, combined with the open source tools and frameworks Ray, LangChain, Hugging Face TGI, and Jupyter, address many challenges of building RAG-powered GenAI chatbot applications, including multi-tenancy, scaling, reliability, and cost. Customers commonly adopt GKE for large-scale deployment of container-based, multi-tenant applications.
By using this product you agree to the Google Cloud Marketplace Terms of Service and the terms and conditions of the following software license(s): PyTorch