
What this pattern does:

Serve a large language model (LLM) with GPUs on Google Kubernetes Engine (GKE). This pattern creates a GKE Standard cluster that uses multiple L4 GPUs and prepares the GKE infrastructure to serve either of the following models:

1. Falcon 40b
2. Llama 2 70b

Caveats and Considerations:

Depending on the data format of the model, the number of GPUs varies. In this design, each model uses two L4 GPUs.
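
To illustrate why the GPU count depends on the data format, here is a rough sketch that estimates the minimum number of L4 GPUs needed just to hold a model's weights. It assumes 24 GB of memory per L4 GPU and takes the parameter counts from the model names (40 billion and 70 billion). The function name `min_gpus_for_weights` is hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead, so treat the result as a lower bound rather than a sizing recommendation.

```python
# Assumption: each NVIDIA L4 GPU provides 24 GB of memory.
L4_MEMORY_BYTES = 24 * 10**9


def min_gpus_for_weights(num_params: int, bits_per_param: int,
                         gpu_memory_bytes: int = L4_MEMORY_BYTES) -> int:
    """Lower bound on the GPU count needed to hold the model weights only.

    Hypothetical helper for illustration. It ignores the KV cache,
    activations, and framework overhead, which all need extra memory.
    """
    # Total weight size in bytes for the chosen data format.
    weight_bytes = num_params * bits_per_param // 8
    # Ceiling division using integers, with a floor of one GPU.
    return max(1, -(-weight_bytes // gpu_memory_bytes))


if __name__ == "__main__":
    models = {"Falcon 40b": 40 * 10**9, "Llama 2 70b": 70 * 10**9}
    for name, params in models.items():
        for bits in (16, 8, 4):
            gpus = min_gpus_for_weights(params, bits)
            print(f"{name} at {bits}-bit weights: at least {gpus} L4 GPU(s)")
```

Under these assumptions, the same model can need several times more GPUs at 16-bit precision than at 4-bit, which is why the data format drives the GPU count. A real deployment should leave headroom beyond the weights-only figure.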

Compatibility:


Serve an LLM with multiple GPUs in GKE
