How to Build an LLM Application With Google Gemini

5 Jun 2024

It seems like there are endless possibilities for innovation with LLMs. If you’re like me, you’ve used GenAI applications and tools — like ChatGPT built into Expedia, Copilot for code writing, or even DALL-E for generating images. But as a technologist, I want to do more than just use LLM-based tools. I want to build my own.

I told DALL-E to “Generate a watercolor of a computer programmer thinking about all the things he could build with LLMs.” Yeah, this is pretty spot-on.

With all new technologies, becoming a builder means starting simple. It’s like this for any new programming language I’m learning or any new framework I’m checking out. Building with LLMs is no different. So, that’s what I’m going to walk through here. I’m going to build a quick and dirty API that interacts with Google Gemini, effectively giving me a little chatbot assistant of my own.

Here’s what we’ll do:

  1. Briefly introduce Google Gemini.
  2. Build a simple Node.js application.
  3. Deploy the application to Heroku.
  4. Test it.

What Is Google Gemini?

Most everyday consumers know about ChatGPT, which is built on the GPT-4 LLM. But when it comes to LLMs, GPT-4 isn’t the only game in town. There’s also Google Gemini (whose chatbot was formerly known as Bard). On most of the performance benchmarks Google has published (such as multi-discipline college-level reasoning problems or Python code generation), Gemini outperforms GPT-4.

What does Gemini say about itself?

As developers, we can access Gemini via the Gemini API in Google AI Studio. There are also SDKs available for Python, JavaScript, Swift, and Android.

Alright. Let’s get to building.

Build the Node.js Application

Our Node.js application will be a simple Express API server that functions like a Gemini chatbot. It will listen on two endpoints. First, a POST request to /chat (which will include a JSON payload with a message attribute) will send the message to Gemini and then return the response. Our application will keep a running chat conversation with Gemini. This turns our chatbot into a helpful assistant who can hold onto notes for us.

Second, if we send a POST request to /reset, this will reset the chat conversation to start from scratch, effectively erasing Gemini’s memory of previous interactions with us.

If you want to skip this code walkthrough, you can see all the code at my GitHub repo here.

Initialize the Application

To get started, we initialize our Node.js application and install dependencies.

~/project$ npm init -y && npm pkg set type="module"
 
~/project$ npm install @google/generative-ai dotenv express

Then, we add this to the scripts in our package.json file:

"scripts": {
    "start": "node index.js"
  },

The index.js file

Our application consists of one file, and it’s pretty simple. We’ll walk through it a section at a time.

First, we import all the packages we’ll be using. Then, we initialize the SDK from Google AI. We’ll use the gemini-pro model. Lastly, we call startChat(), which creates a new ChatSession instance for what Google calls a multi-turn conversation.

import 'dotenv/config';
import express from 'express';
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-pro"});
let chat = model.startChat();

Next, we instantiate a new Express app, which is our API server.

const app = express();
// Parse JSON request bodies into req.body.
app.use(express.json());

Then, we set up our listener for POST requests to the /chat endpoint. We make sure that the JSON payload body includes a message. We use our chat object to send that message to Gemini. Then, we respond to our API caller with the response text from Gemini.

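app.post('/chat', async (req, res) => {
  // Reject requests that lack a non-empty "message" string.
  const message = req.body?.message;
  if (typeof message !== 'string' || !message.length) {
    res.status(400).send('"message" missing in request body');
    return;
  }

  try {
    // Send the message through the shared ChatSession and relay Gemini's reply.
    const result = await chat.sendMessage(message);
    const response = await result.response;
    res.status(200).send(response.text());
  } catch (err) {
    // Report upstream failures instead of leaving the request hanging.
    console.error(err);
    res.status(502).send('Error communicating with Gemini');
  }
});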
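
The payload check can also be pulled out into a small pure function, which makes it easy to test without starting the server. Here is a minimal sketch of that idea. Note that validateChatBody is a hypothetical helper name (it is not part of Express or the Gemini SDK), and it assumes we only want to accept non-empty string messages.

```javascript
// Hypothetical helper (not part of Express or the Gemini SDK).
// Returns { ok: true, message } for a usable payload,
// or { ok: false, error } describing why it was rejected.
function validateChatBody(body) {
  // A missing or unparsed body may arrive as undefined or a non-object.
  if (body === null || typeof body !== 'object') {
    return { ok: false, error: 'request body must be a JSON object' };
  }
  const { message } = body;
  // Assumption: only non-empty, non-whitespace strings count as a message.
  if (typeof message !== 'string' || message.trim().length === 0) {
    return { ok: false, error: '"message" missing in request body' };
  }
  return { ok: true, message };
}

console.log(validateChatBody({ message: 'Hello' }));
console.log(validateChatBody({}));
```

Inside the /chat handler, a failed check would map to the same 400 response shown above.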

Keep in mind that, by using a ChatSession, there’s a stored, running history of our interaction with Gemini across all API calls. Giving Gemini a “memory” of our conversation is helpful for context.
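
To make the idea of a running history concrete, here is a toy sketch of what a multi-turn session has to do: record each user message and each model reply, and pass the whole list along with the next message. This is not the SDK’s ChatSession implementation. ToyChatSession and echoModel are made-up names, and the { role, parts } shape is an assumption modeled on the content format the Gemini SDK uses.

```javascript
// Toy illustration only: this is NOT the SDK's ChatSession.
class ToyChatSession {
  constructor(generate) {
    this.generate = generate; // async (history) => reply text
    this.history = [];
  }

  async sendMessage(text) {
    const userTurn = { role: 'user', parts: [{ text }] };
    // The model sees every earlier turn plus the new message.
    const reply = await this.generate([...this.history, userTurn]);
    // Only record the exchange once the model has answered.
    this.history.push(userTurn, { role: 'model', parts: [{ text: reply }] });
    return reply;
  }
}

// Stand-in for a real model: reports how many turns of context it received.
const echoModel = async (history) =>
  `I can see ${history.length} turn(s) of context.`;

async function demo() {
  const session = new ToyChatSession(echoModel);
  console.log(await session.sendMessage('Remember that I need basil.'));
  console.log(await session.sendMessage('What do I need?'));
}
demo();
```

The second reply reports more context than the first, because the first exchange has been added to the history by then.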

But what if you want Gemini to start over completely and forget all previous context? For this, we have the /reset endpoint. This simply starts up a new ChatSession.

app.post('/reset', async (req, res) => {
  chat = model.startChat();
  res.status(200).send('OK');
})

Finally, we start up our server to begin listening.

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`)
})

As a side note, this entire project is just a mini demo. It’s not meant to be used in production! The way I’ve designed it for now (without authentication), anybody with the URL can send a request to /chat or /reset. In a production setup, we would have proper authentication in place, and each user would have their own instance of a conversation with Gemini, which nobody else could manipulate.
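
One way to give each user a separate conversation is to keep a map from a user identifier to that user’s chat session. Below is a rough sketch of that idea, assuming the user ID comes from whatever authentication layer is in place. SessionStore and createSession are hypothetical names, and a real deployment would also need expiry and storage that survives restarts.

```javascript
// Hypothetical per-user session registry (illustrative sketch).
class SessionStore {
  // createSession is a factory, e.g. () => model.startChat() in this app.
  constructor(createSession) {
    this.createSession = createSession;
    this.sessions = new Map();
  }

  // Return the caller's session, creating it on first use.
  get(userId) {
    if (!this.sessions.has(userId)) {
      this.sessions.set(userId, this.createSession());
    }
    return this.sessions.get(userId);
  }

  // Forget one user's conversation without touching anyone else's.
  reset(userId) {
    this.sessions.delete(userId);
  }
}

// Usage with a stand-in factory so the sketch runs without the Gemini SDK.
const store = new SessionStore(() => ({ history: [] }));
store.get('alice').history.push('I need basil');
store.reset('bob'); // no effect on alice
console.log(store.get('alice').history.length); // 1
```

With something like this, /chat would look up the caller’s session before sending the message, and /reset would clear only that caller’s entry.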

Obtaining a Gemini API Key

At this point, we’re almost ready to go. The last thing we need is an API key to access the Gemini API. To get an API key, start by signing up for a Google AI for Developers account.

Once you’re logged in, select Launch Google AI Studio to start a new Google Gemini project.

Within the project, click on Get API key to navigate to the API keys page. Then, click on Create API key to generate a key. Copy the value.

In your project, copy the file called .env.template as a new file called .env. Paste in the value of your Gemini API key. Your .env file should look similar to this:

GEMINI_API_KEY=ABCDEFGH0123456789_JJJ
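
Because a missing key only shows up later as a failed API call, it can help to check for it at startup. This is a small sketch of such a check. Note that requireEnv is a hypothetical helper, not something dotenv or the Gemini SDK provides.

```javascript
// Hypothetical helper: read a required setting or fail fast with a clear message.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with an explicit object standing in for process.env.
const apiKey = requireEnv('GEMINI_API_KEY', {
  GEMINI_API_KEY: 'ABCDEFGH0123456789_JJJ',
});
console.log(apiKey.length > 0); // true
```

In index.js, the result of requireEnv('GEMINI_API_KEY') could be passed to the GoogleGenerativeAI constructor in place of reading process.env directly.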

Test Our Application Locally

With everything in place, we can spin up our server locally to test it.

~/project$ npm start

> [email protected] start
> node index.js
Server is running on port 3000

In a different terminal, we can send some curl requests:

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"I would like to bake a shepherds pie to feed 8 people. As you come up with a recipe, please keep a grocery list for me with all of the ingredients that I would need to purchase."}' \
  http://localhost:3000/chat

**Shepherd's Pie Recipe for 8**
**Ingredients:**
**For the Filling:**
* 1 pound ground beef
* 1/2 pound ground lamb
* 2 medium onions, diced
…
**For the Mashed Potatoes:**
* 3 pounds potatoes, peeled and quartered
* 1/2 cup milk
…
**Instructions:**
**For the Filling:**
1. Heat a large skillet over medium heat. Add the ground beef and lamb
and cook until browned.
…

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"I also need to buy fresh basil, for a different dish (not the shepherds pie). Add that to my grocery list too."}' \
  http://localhost:3000/chat

**Updated Grocery List for Shepherd's Pie for 8, and Fresh Basil:**
* 1 pound ground beef
* 1/2 pound ground lamb
* 2 medium onions
* 2 carrots
* 2 celery stalks
* 1 bag frozen peas
* 1 bag frozen corn
* 1 tablespoon Worcestershire sauce
* 1 teaspoon dried thyme
* 1 cup beef broth
* 1/4 cup tomato paste
* 3 pounds potatoes
* 1/2 cup milk
* 1/4 cup butter
* **Fresh basil**

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What items on my grocery list can I find in the produce section?"}' \
  http://localhost:3000/chat

The following items on your grocery list can be found in the produce
section:
* Onions
* Carrots
* Celery
* Potatoes
* Fresh basil

$ curl -X POST http://localhost:3000/reset

OK

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What items are on my grocery list?"}' \
  http://localhost:3000/chat

I do not have access to your grocery list, so I cannot give you the
items on it.

It’s working. Looks like we’re ready to deploy!

Deploy Our Application to Heroku

To deploy our application, I’ve opted to go with Heroku. It’s quick, simple, and low-cost. I can get my code running in the cloud with just a few simple steps, without getting bogged down in all of the nitty-gritty infrastructure concerns. This way, I can just focus on building cool applications.

After signing up for a Heroku account and installing the CLI, here’s what it takes to deploy.

Add Procfile to the Codebase

We need to include a file called Procfile which tells Heroku how to start up our application. The contents of Procfile look like this:

web: npm start

We commit this file to our codebase repo.

Log in to Heroku (via the CLI)

~/project$ heroku login

Create App

~/project$ heroku create gemini-chatbot

Creating ⬢ gemini-chatbot... done
https://gemini-chatbot-1933c7b1f717.herokuapp.com/ | https://git.heroku.com/gemini-chatbot.git

Add Gemini API Key as Config Environment Variable

~/project$ heroku config:add \
  --app gemini-chatbot \
  GEMINI_API_KEY=ABCDEFGH0123456789_JJJ

Setting GEMINI_API_KEY and restarting ⬢ gemini-chatbot... done, v3
GEMINI_API_KEY: ABCDEFGH0123456789_JJJ

Push Code to Heroku Remote

~/project$ git push heroku main

...
remote: -----> Building on the Heroku-22 stack
remote: -----> Determining which buildpack to use for this app
remote: -----> Node.js app detected
...
remote: -----> Build succeeded!
remote: -----> Discovering process types
remote:        Procfile declares types -> web
remote: 
remote: -----> Compressing...
remote:        Done: 45.4M
remote: -----> Launching...
remote:        Released v4
remote:        https://gemini-chatbot-1933c7b1f717.herokuapp.com/ deployed to Heroku

That’s it? That’s it.

Test Our Deployed Application

With our application deployed, let’s send some curl requests to our Heroku app URL.

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"If I ask you later for my PIN, remind me that it is 12345."}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Sure, if you ask me for your PIN later, I will remind you that it is 12345.
**Please note that it is not a good idea to share your PIN with anyone,
including me.** Your PIN is a secret code that should only be known to you.
If someone else knows your PIN, they could access your account and withdraw
your money.

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What is my PIN?"}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Your PIN is 12345.

$ curl -X POST https://gemini-chatbot-1933c7b1f717.herokuapp.com/reset

OK

$ curl -X POST -H 'content-type:application/json' \
  --data '{"message":"What is my PIN?"}' \
  https://gemini-chatbot-1933c7b1f717.herokuapp.com/chat

Unfortunately, I am unable to provide your personal PIN as I do not have
access to your private information. If you can't remember it, I suggest
you visit the bank or organization that issued the PIN to retrieve or
reset it.
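
The same requests can be scripted from Node.js instead of typed as curl commands. The sketch below assumes Node.js 18 or later, where fetch is available globally. buildChatRequest and sendChat are hypothetical helper names, and the base URL is whichever server you are testing. Using JSON.stringify also sidesteps shell quoting problems with long messages.

```javascript
// Build the URL and fetch options for a POST to /chat (hypothetical helper).
function buildChatRequest(baseUrl, message) {
  return {
    url: new URL('/chat', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ message }),
    },
  };
}

// Send one message and return the reply text; throws on a non-2xx status.
async function sendChat(baseUrl, message) {
  const { url, options } = buildChatRequest(baseUrl, message);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.text();
}

// Example (requires the server to be running):
// sendChat('http://localhost:3000', 'What is my PIN?').then(console.log);
```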

Conclusion

Now is a great time to build LLM-based applications. Ride the wave!

We’ve walked through how to build a simple LLM-based application on top of Google Gemini. Our simple chatbot assistant is basic, but it’s a great way to familiarize yourself with the Gemini API and its associated SDKs. And by using Heroku for deployment, you can offload the secondary concerns so that you can focus on learning and building where it counts.