Mastering AI Agents with Hugging Face Smolagents: Build Your Own Multi-Tool Chatbot
7XipjLJEGP4 • 2026-01-12
Have you ever wanted to build your own AI assistant? I'm not talking about just a chatbot, but something that can actually do things: use tools, browse the internet, generate images, and wrap it all up in a slick user interface. Well, if that sounds exciting, you are in exactly the right place. Today, we're going to tear down the mystery behind all of that. We're doing a deep dive into the smolagents library from Hugging Face, and you're about to see just how easy and incredibly powerful building your own AI agent can be.

So, what's on the docket for today? Here's our road map. We'll start by defining our quest. Then we'll set up our digital workshop, getting all the tools we need. After that, the real fun starts: we're going to bring our agent to life, teach it some frankly amazing new skills, and then build it a sleek control panel. By the end of this, you'll be more than ready to kick off your very own agent-building adventure.

All right, let's jump right into part one: our quest. The whole world of AI agents can feel super intimidating, but our goal today is to cut right through that noise. We're going to make this whole process not just something you can understand, but something you can actually do. And let's be real, it's so easy to feel that way. You hear all these buzzwords flying around (LangChain, LangGraph, agentic loops), you see these super complicated diagrams, and you think, "Nope, this is way beyond me." Well, today we are flipping that script completely. We're going to replace that overwhelming feeling with a genuine sense of "I can do this" by showing you exactly what's going on under the hood, but in a way that's so much easier to get your head around.

Now, this isn't just some hypothetical. This is our actual goal today: we are going to build the exact application you see on screen.
A chatbot that can think for itself, figure out a plan, use a whole bunch of different tools to solve a problem, and then give you the answer. For instance, look at this prompt: generate an image of the chancellor of Germany from 2010 playing the flute. Now, think about that. The agent first has to figure out who the chancellor even was back in 2010, and then it has to use a different tool to create the image. That kind of sophisticated multi-step thinking is exactly what we're going to build together by the end of this explainer.

Okay, time to roll up our sleeves. Any great project, whether it's a painting or a piece of code, starts with getting your workshop set up correctly. So let's get our digital tools and materials in order. The foundation here is actually way simpler than you might think. It all kicks off with one single line in your terminal: pip install smolagents. That's it. That one command installs the entire minimalist framework we're about to use.

Next up, we need our API keys. Think of these like the keys to a supercar's engine: they're what give our code access to the powerful AI models that will act as our agent's brain. Our primary key is going to be a Hugging Face token. This thing unlocks a massive universe of really powerful open-source models, and getting one is super easy: just go to your Hugging Face profile, find Access Tokens, and create a new one. Now, we're also going to grab keys for OpenAI and Anthropic. We're not doing this because we absolutely need them right now, but to show you just how flexible this framework is. You'll see in a bit just how easy it is to swap different brains in and out of our agent.

Okay, our workshop is set up, the tools are laid out. Now for the really exciting part, the moment we've been building towards: we're going to put together the core pieces of our agent and see it think for the very first time. First up, we have the tool-calling agent.
The best way to think about this is as the project manager, or maybe the conductor of an orchestra. When you give it a task, this is the part that looks at the problem, checks out all the tools it has available, and decides: okay, I need this tool, I need to give it this information, and then I'll do this with the result. It's the absolute maestro of the entire operation.

Now, if the agent is the manager, the inference client model is the connection to the actual brain, the large language model (LLM) that does all the reasoning. We're going to start with an open-source model from Hugging Face called Kimi K2 Instruct. You can actually find models like this yourself: just go to the Models tab on Hugging Face, filter for text generation with inference available, and see what pops up. And here's the best part: not only are these models incredibly powerful, but as we're about to see, they are ridiculously cost-effective compared to some of the big proprietary names out there.

And this is where the magic of AI agents really starts to click. Our third component is the web search tool. It comes pre-built with smolagents, and with this one little addition, our agent is no longer trapped by its training data. It's not stuck in the past. It can now tap into the entire internet for real-time, up-to-the-second information. This is literally our agent's window to the world.

So here's the beautiful part: putting all these pieces together is simple. We initialize the model, telling it which one we want to use. Then we initialize our tool-calling agent: we give it a list of its tools, right now just the web search, and we tell it which model to use for its brain. And that's it. Seriously, in just a handful of lines of code, we've built a thinking, reasoning AI agent that is connected to the internet.

Okay, let's take it for a spin. We're going to ask it something it could never know from its training data. Now watch this.
The agent actually logs its entire thought process for us. First, it sees the query and realizes: hey, I don't know this. So it logs a tool call: it decides to use the web search tool to find the answer. Then it gets an observation back, the result from that web search. And now, armed with that new piece of information, it puts it all together to formulate the final answer. This, right here, is that famous agentic loop in action. And smolagents just lays it all out for us, clear as day.

So, our agent can think and it can search the web. That's already pretty cool. But the real magic, the thing that makes agents so special, starts right now. We're going to teach it brand new custom skills. This is where our agent goes from being a generic tool to becoming our specialized assistant.

Now, just take a second to appreciate how elegant this is. If you've ever tried to do this from scratch, you know the pain: you have to write your function, and then you have to manually create a complex JSON schema just to describe what your function does to the AI. It's awful. But with smolagents, all of that just disappears. All you do is write a normal Python function and add one little line, @tool, right above it. That's it. The library does all the complicated background work to make it understandable for the AI. It's incredible.

Okay, this is absolutely critical, so listen up. When you make a custom tool, the single most important part is not your code. It's the docstring, that little text description right underneath the function name. The AI, the LLM brain, never ever looks at your Python code. It only reads this description. You have to think of this as the instruction manual you're writing for the AI: you're telling it exactly what this tool is for and when it should use it. A clear, well-written docstring is the secret to a tool that works every time.

So, let's talk about a few best practices here. First, always use type hints.
Saying that an input like prompt is a string (str) just helps the agent structure its requests properly. Second, and I can't say this enough, describe your tool thoroughly. Be super explicit. And here's a little pro tip for you: if you think you might ever share your tool on the Hugging Face Hub for others to use, put your import statements inside the function itself. That makes it totally self-contained and portable.

And this is where it all comes together. We're going to give our agent a brand new custom tool, a function called generate_image. Now, let's go back to that really complex query from the beginning and watch the agent's logic. It knows it can't make the image right away because it's missing a key piece of information. So what's its first move? It uses the web search tool to find out who the chancellor was. As soon as it gets that answer, Angela Merkel, it immediately triggers a second tool call to our image generator, feeding the output from the first tool directly into the second. This ability to chain tools together is what separates a simple chatbot from a true AI agent.

But what if you don't even want to write the tool yourself? Well, this is where things get absolutely wild. Hugging Face Spaces has thousands of public AI apps that people have built for all sorts of things. And with one command, Tool.from_space, you can point your agent at any one of those public apps and add it to its toolbox. You just give it the Space ID and a quick description, and your agent can now use that app as if it were its own built-in skill. It is a massive, game-changing shortcut.

So, our agent is now a multi-talented powerhouse. It can think, it can search, it can perform custom tasks. Now it's time for the final upgrades: we're going to look at swapping out its brain, and then we're going to give it a professional-grade control panel, a real user interface. Okay, let's talk brains.
We started with that open model, Kimi K2, which costs about two bucks for every million tokens of output, but smolagents makes it so easy to switch. You can import the OpenAI model class and pop in a model like GPT-4o, which for that same million tokens might run you closer to $14. You can even use that exact same class to connect to other providers like Anthropic, just by changing the base URL. This flexibility is key: it lets you pick the right balance of power and price for whatever you're building.

I mean, this chart really puts it in perspective. The blue bar is our open model at $2. The red one is the proprietary model at $14. That's a seven-times price difference. When you're just starting out, developing and experimenting and running tons of tests, that cost difference is an absolute game-changer. Starting with these powerful but super affordable open models is just a smarter way to build.

And now for the grand finale. This might be the most magical part of the whole thing. We've got this amazing agent running in our code, but how do we let other people use it? How do we share it? Well, it's almost laughably simple. You import GradioUI from the library, you call it with your agent, and then you just launch it. That is it. Two lines of code, and that command instantly spins up a complete, professional, sharable web application for your agent. It even gives you a public URL. It's just wow.

We have covered so much ground today. We went from just an idea all the way to a fully functional multi-tool AI agent with its own web app. So let's just take a quick second to recap the powerful new skills you've just picked up and talk about what's next for you on your own agent-building adventure. I mean, just look at what you can do now. You know how to set up the entire environment. You can initialize an agent with a brain and tools. You can build your own custom tools from scratch. You can pull in thousands of other tools from the Hugging Face Hub.
And you can deploy the whole thing as a web app for anyone to use. You've now seen the entire life cycle of creating an AI agent, from the first line of code to the final product. But really, this isn't the end. It's the starting line. The real question now is: what are you going to build with these new skills? Are you going to make an agent that automates your email, or one that interacts with your favorite APIs to get work done? Maybe a creative partner to help you write or code. The possibilities are, and I mean this literally, endless.

I really hope you enjoyed this deep dive into smolagents. It is such a fantastic, simple way to really understand how agents work under the hood. We're going to be exploring more advanced frameworks like OpenAI's Assistants and LangChain in future explainers to solve even bigger problems. So, if you want to continue on this journey with us, make sure you subscribe. Thanks so much for watching, and I can't wait to see you in the next one.