Servers
TL;DR
Learn how to build the servers needed to power your agents. Dive into the tutorials below to check them out, or keep scrolling to learn more.
- Ollama: Power the decision-making and response-generation processes of your agents with LMs.
- SearXNG: Enhance the knowledge store of your agents by giving them the ability to search the web.
- Milvus: Create vectorstores² for all of your documents that can be searched by your agents.
- Multi-Server Setup: Combine these servers into a one-stop server stack for all of your needs.
Why are we using servers?
An AI agent requires methods to make decisions, generate responses, and in the best builds, use tools.
FAQ
How can we give agents these abilities? How do we create an agent in the first place?
Well, with LangChain and LangGraph, agents are already built in. This means the whole process of creating the agent is abstracted away. All we need to do is pass the agent whatever tools we want it to use and an LM to act as a stand-in brain.
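For a taste of how little code that takes, here's a minimal sketch, assuming the langgraph and langchain-ollama packages; the model name is a placeholder, and the tool list is left empty for now.

```python
# A minimal sketch of creating a prebuilt agent; the model name is a
# placeholder, and we haven't given the agent any tools yet.
from langgraph.prebuilt import create_react_agent
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1")          # the LM acting as the stand-in brain
agent = create_react_agent(llm, tools=[])   # tools get added here as we build them

result = agent.invoke({"messages": [("user", "Hello!")]})
print(result["messages"][-1].content)
```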
FAQ
Ok, why not just pass it an LM? Why should we give our agents tools?
To work their magic, LMs are fed a large amount of data over a period of time and build up rules for generating responses based on that data. So their knowledge is only as good as the data they were fed, and for practical purposes that data can only go up to a certain cutoff date. In a lot of cases, though, it will be important to keep our agents up to date.
Also, some of the information we'll want our agents to analyze will be esoteric: it was possibly unavailable, or available only in very small doses, when the LM was fed its data. This means the LM won't have many connections between our data and its knowledge store, so the rules it built up won't apply well. Its responses will be uninformed and off base, or complete gibberish.
As a general rule, the more context our agents have, the better their responses will be¹. By giving our agents the proper tools, we can give them the ability to use other stores of information relevant to our projects. This way our agents will generate more informed responses and make more informed decisions, even when discussing matters unfamiliar to the LM.
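As a rough sketch of what "a proper tool" can look like in LangChain, here's a hypothetical document-search tool; the function body is a placeholder for a real lookup.

```python
# A hypothetical custom tool, assuming the langchain-core package;
# the body is a stand-in for a real document lookup.
from langchain_core.tools import tool

@tool
def search_project_docs(query: str) -> str:
    """Search our own project documents for passages relevant to the query."""
    # In later tutorials, this kind of tool queries a vectorstore;
    # here it just returns a canned string.
    return f"(top passages matching: {query!r})"
```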
Ok, giving our agents tools does sound pretty useful.
FAQ
But it sounds like we'll want a handful of tools. How will we create and keep track of them all?
After having a similar conversation with myself, I learned that two common ways of getting tools to agents are cloud servers and local servers. Using cloud servers is really convenient: you don't need to set anything up (besides some sort of account), and all the server upkeep is out of your hands. This is probably the way to go if you don't mind paying for server use or having your data on a server managed by someone else. It also seems to be a promising route for production-level scalability.
However, I found myself compelled to learn how to set up the servers I would need on my own machine and manage them myself. I think it was mostly because I wanted all of my data on my local machine, and I had recently gotten myself a viable enough GPU for hosting some of the medium-sized LLMs. Also, if you haven't noticed by now, I really like learning.
What I found was that hosting all the servers I needed was really easy once I understood how to use platforms like Docker.
Why are we using Docker?
What we'll see in these tutorials is that each of our local servers can be set up and built in a Docker container using just a little bit of code. Then, it's just a simple command to start and stop the containers that house the servers whenever you need them.
We'll also see that the containers can be combined into one stack so that all our servers can be started and stopped simultaneously. With this setup, you'll be able to easily start the LM and tool servers to pass over to your agents when you need them and simply stop all the servers when you're done, all on your local machine.
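If you're curious what that looks like from code, here's a minimal sketch using the Docker SDK for Python (the docker package); the container names are hypothetical placeholders, and the tutorials themselves drive this with Docker's own command-line tooling instead.

```python
# A minimal sketch of starting/stopping a stack of already-built containers
# with the Docker SDK for Python; container names are hypothetical.
import docker

client = docker.from_env()

SERVERS = ["ollama", "searxng", "milvus-standalone"]  # placeholder names

def start_stack():
    for name in SERVERS:
        client.containers.get(name).start()

def stop_stack():
    for name in SERVERS:
        client.containers.get(name).stop()
```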
A viable alternative that I hear great things about, but that I haven't tried to get working myself, is the completely open-source project Podman. It's more lightweight and secure than Docker because it doesn't rely on a daemon or require root privileges to run containers, or pods. I plan to include this platform in my tutorials as an alternative to Docker at some point, because I want to learn how to use both.
For the agent brain, we're going to use LMs served with Ollama, and to enhance these LMs, we're going to get a SearXNG metasearch engine and a Milvus vectorstore up and running. All of these servers can be containerized in Docker and passed over to our agents with ease.
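As a preview, here's a rough sketch of what that hand-off can look like, assuming the langchain-ollama, langchain-community, and langchain-milvus integration packages and each server's default local port; the model and collection names are placeholders.

```python
# A rough sketch of wiring the local servers to LangChain clients; the
# ports are common defaults, and model/collection names are placeholders.
from langchain_ollama import ChatOllama, OllamaEmbeddings
from langchain_community.utilities import SearxSearchWrapper
from langchain_milvus import Milvus

# Ollama serves the LM that acts as the agent's brain.
llm = ChatOllama(model="llama3.1", base_url="http://localhost:11434")

# SearXNG gives the agent up-to-date web-search results.
search = SearxSearchWrapper(searx_host="http://localhost:8080")

# Milvus stores and searches embeddings of our own documents.
vectorstore = Milvus(
    embedding_function=OllamaEmbeddings(model="nomic-embed-text"),
    connection_args={"uri": "http://localhost:19530"},
    collection_name="my_documents",
)
```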
Check out any of the tutorials above to get started! To see how to pass all these servers over to our agents in order to create assistants that can grab us up-to-date information or information relevant to our own personal documents, check out the agents tutorials.
1. I've found that this only works up to a point. Too much context at once can cause confusion. To solve this, summarizing the context or breaking the agent task up into smaller tasks to give it less context at a time works pretty well.
2. A vectorstore is a special type of database that can be used to store and search your data. It stores your data alongside additional representations called embeddings and searches it using special structures called indices. The type of indices used and the way your data is embedded determine the kinds of searches that can be done. For example, think of the difference between searching data for a specific keyword versus searching based on an abstract embedding that maps nuanced relationships between the data. In the first case, we'll only get the results we expect: whatever contains the specific keyword. In the second case, we can get results based on relationships in the data that aren't immediately apparent.
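To make that difference concrete, here's a toy sketch (not from the tutorials) contrasting keyword matching with embedding similarity; the three-dimensional vectors are made up for illustration, whereas real embeddings come from a model.

```python
# Toy comparison of keyword search vs. embedding (semantic) search.
# The vectors are made-up 3-d stand-ins for real model embeddings.
import numpy as np

docs = {
    "Grizzly bears hibernate in winter.": np.array([0.9, 0.1, 0.0]),
    "The stock market fell sharply today.": np.array([0.0, 0.2, 0.9]),
}

def keyword_search(query: str) -> list[str]:
    # Only returns documents containing the literal query string.
    return [d for d in docs if query.lower() in d.lower()]

def semantic_search(query_vec: np.ndarray) -> list[str]:
    # Ranks documents by cosine similarity to the query vector.
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)

print(keyword_search("sleep"))                        # [] -- no literal match
print(semantic_search(np.array([0.8, 0.2, 0.1]))[0])  # bear sentence ranks first
```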