Setting up an Agent (Hosting an LLM)

An Agent is a machine that hosts an LLM and responds to the prompts it receives.
Agents are typically high-end computers with at least 8 GB of RAM/VRAM, which is needed to run inference at a reasonable speed (>10 tokens per second).
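
To check whether a candidate machine reaches the >10 tokens per second mentioned above, throughput can be measured by timing a single generation call. The following is only a minimal sketch: `generate` is a hypothetical callable standing in for whatever model call the agent actually uses, and `fake_generate` is a dummy used purely for illustration.

```python
import time
from typing import Callable, List

# Threshold quoted in the text above (tokens per second).
MIN_TOKENS_PER_SECOND = 10.0


def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Return generation throughput as tokens divided by elapsed seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def measure(generate: Callable[[str], List[str]], prompt: str) -> float:
    """Time one call to `generate` and return its tokens-per-second rate."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return tokens_per_second(len(tokens), elapsed)


def fake_generate(prompt: str) -> List[str]:
    """Hypothetical stand-in for a real model call (assumption, not a real API)."""
    time.sleep(0.1)          # pretend inference takes about 0.1 s
    return ["tok"] * 20      # pretend the model produced 20 tokens


if __name__ == "__main__":
    rate = measure(fake_generate, "Hello")
    verdict = "fast enough" if rate > MIN_TOKENS_PER_SECOND else "too slow"
    print(f"{rate:.1f} tokens/s ({verdict})")
```

In practice, `fake_generate` would be replaced by the real inference call, and the measurement would be averaged over several prompts, since a single run can be skewed by model loading or caching.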
Currently, there are two ways to host an LLM: