Catalog Details
CATEGORY
deployment
CREATED BY
UPDATED AT
June 17, 2024
VERSION
1.0
What this pattern does:
Serves a large language model (LLM) with GPUs on Google Kubernetes Engine (GKE). This design creates a GKE Standard cluster that uses multiple L4 GPUs and prepares the GKE infrastructure to serve either of the following models:
1. Falcon 40b
2. Llama 2 70b
Caveats and Considerations:
Depending on the data format of the model, the number of GPUs varies. In this design, each model uses two L4 GPUs.
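To illustrate why the GPU count depends on the data format, here is a minimal sketch that estimates a lower bound on the number of L4 GPUs needed just to hold the model weights. The bytes-per-parameter table and the helper names (`weight_memory_gb`, `min_gpus_for_weights`) are assumptions made for illustration and are not part of this design. The estimate ignores the KV cache, activations, and runtime overhead, so real deployments need headroom beyond these numbers.

```python
import math

# Each NVIDIA L4 GPU has 24 GB of memory.
L4_MEMORY_GB = 24

# Approximate bytes per parameter for common data formats.
# Illustrative assumption: quantization metadata overhead is ignored.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_memory_gb(params_billion, data_format):
    """Memory for the weights alone, in GB (1e9 params * bytes / 1e9)."""
    return params_billion * BYTES_PER_PARAM[data_format]


def min_gpus_for_weights(params_billion, data_format, gpu_memory_gb=L4_MEMORY_GB):
    """Lower bound on GPU count: weights only, no cache or overhead."""
    return math.ceil(weight_memory_gb(params_billion, data_format) / gpu_memory_gb)


if __name__ == "__main__":
    for name, size in [("Falcon 40b", 40), ("Llama 2 70b", 70)]:
        for fmt in BYTES_PER_PARAM:
            gb = weight_memory_gb(size, fmt)
            gpus = min_gpus_for_weights(size, fmt)
            print(f"{name:12s} {fmt:5s} {gb:6.1f} GB weights -> at least {gpus} L4 GPU(s)")
```

Under these assumptions, a 70-billion-parameter model fits its weights on two L4 GPUs only in a 4-bit format, while 16-bit weights would need at least six.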
Compatibility: