When you need to pass a lot of data to an AI model for analysis, you'll want to consider its conversational memory, also known as the context window. If you send more than the window can hold, the model will lose track of parts of your instructions, producing hallucinations and useless results.
The Google Gemini 1.5 Pro model can process a large amount of data in one go, supporting up to a million tokens in each prompt—a little over 700,000 words. Plus, 1.5 Pro is multimodal, meaning it can work with up to one hour of video, 9.5 hours of audio, and over 30,000 lines of code.
I wrote this Gemini API guide to help you understand what Gemini can do and how to set up the API calls to start talking to it. With its massive context window, you can send extremely long prompts with complex instructions, examples, and data to transform—all without getting a machine learning certification. Let's go.
If you're looking to connect the Gemini models to the apps you use at work, you can do it without any API setup. Learn more about how Zapier can add AI to your existing workflows and how to use Gemini with Zapier.
Before you begin
I wrote this article for anyone with a solid grasp of how software works. Depending on your experience, you may need a few extra skills to configure the Gemini API connection.
If you're not familiar with what an API is and how it works, check out Zapier's API guide—it'll explain everything about calls, status codes, and returned data.
One more detail: computers need a structured way to communicate during API calls, and that's why it's important to understand what JSON is and how this notation system works.
If everything I just said is already familiar to you, let's move on.
What is the Gemini API?
The Gemini API provides access to Google's AI model suite:
Gemini 1.0 Pro, a natural language processing (NLP) model with chat and code generation features
Gemini 1.5 Pro, a multimodal model with up to a 1 million token context window
Gemini 1.5 Flash, a faster multimodal model with tighter input and output limits
When connecting these models to your internal tools, apps, or products, you can leverage their features wherever you're working without constantly switching to the chat interface.
There are two ways to connect to the Gemini API. The first one is by using the free plan via Google AI Studio—this is the easiest way to set it up and the one I'll walk through in a bit. If you prefer deeper control and integrations with other models, you can set up access via the Google Vertex AI Model Garden.
What is Google Vertex AI?
Google Vertex AI is a service that lets you access all of Google's AI models as well as others available on the market, such as Anthropic's Claude and Meta's Llama. Sign up for a Google Cloud Platform account, grab your $300 worth of free credits, and start setting them up. It's useful if you want to integrate multiple models or AI features in the same project and want to manage billing and usage in one dashboard.
There's a disadvantage here: using Vertex AI is complicated, as it's meant for developers who know how to set up cloud infrastructure. If you're super tech-forward, you can open the documentation and use ChatGPT to guide you, but I'd recommend having someone with coding skills on speed dial.
The tutorial in this article focuses instead on Google AI Studio and the API free plan, but if you like challenges or prefer to hop on the paid plan, here are some Vertex AI resources to get started with:
What can you do with the Gemini API?
Gemini API features
The Gemini models can:
Generate text
Generate images
See images and video
Analyze and process audio
Turn text into speech
Turn speech into text
Not all models can run these tasks, so check out the capabilities of each one on the technical specifications page. I'll break down what each model can do in the next section.
How to apply the Gemini API to your projects
Gemini 1.5 Pro and 1.5 Flash are multimodal models—meaning they can take text, audio, video, files, or images—and can help with:
Summarizing text
Understanding images or video and answering questions about those files
Recognizing objects
Understanding digital content such as charts, tables, or web pages
Generating structured content such as HTML or JSON
Image and video captioning
Generating long-form text content
Reasoning
Analyzing audio for transcription, summarization, or regenerating as a Q&A format
Multimodal processing, meaning it can take multiple types of input (text and images, for instance) and generate an output based on all of them
While Gemini 1.0 Pro isn't as powerful as its bigger siblings, you can use it for tasks like:
Answering questions based on content passed in a prompt
Classifying and applying labels to text
Extracting entities
Generating code
How to brainstorm uses and test Gemini for your projects
There are two ways you can get a feel for how Gemini responds to your requests. The first one is using the free Gemini chatbot. Similar to ChatGPT, you can send messages and get responses that will help you understand if the output is what you're looking for.
The other option is the Google AI Studio. Offering far more input and output controls, you can work your way to the perfect prompt for the task at hand.
Gemini API pricing
The Gemini API has two pricing tiers. The free one isn't private, meaning the data you pass to Google may be used to train future models. The paid one is managed in Google Vertex AI: it has a harder setup process, but your data is safer. Here's the walkthrough of both plans.
Gemini API free plan
When you open the pricing page, you'll see that you can use the Gemini API for free up to:
15 requests per minute
1 million tokens per minute
1,500 requests per day
The free plan doesn't include context caching. This means that Google Cloud won't let you store reusable long prompts on their servers—this feature is helpful for repetitive tasks or chatbots with long instructions.
Free is an attractive word, but please note: prompts and responses on the free plan will be used to train Google's models. If you plan to use this tier, be sure to never feed sensitive data to the API, as it may end up a part of Gemini's future training data.
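If you call the free plan from a script, it helps to stay under the per-minute request limit listed above. Here's a minimal sketch of a sliding-window throttle, assuming the 15 requests per minute figure from this section. The class name `RequestThrottle` and its methods are hypothetical names made up for this illustration; they aren't part of any Google library.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests=15, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.sent = deque()  # timestamps of recent requests, oldest first

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else how long to wait."""
        now = self.clock()
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self.window_seconds - (now - self.sent[0])

    def record_request(self):
        """Call this right after sending a request."""
        self.sent.append(self.clock())
```

Before each call, you'd check `seconds_until_allowed()`, sleep for that long if it isn't zero, send the request, and then call `record_request()`. This only covers the per-minute request limit, not the token or daily limits.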
Gemini API pay-as-you-go pricing plan
Like most of the competition, Google has pay-as-you-go pricing for each request. You'll pay less for inputs and more for outputs, and context caching has its own price tag. The price also varies with the size of the prompt: anything crossing the 128,000-token threshold costs twice as much. Here's the breakdown:
Input: $0.35 per million tokens
Output: $1.05 per million tokens
Context caching: $0.08 per million tokens
Your data won't be used to train any models. You can only access the pay-as-you-go option if you use the Gemini API via Vertex AI—again, be ready to code or bring someone who can.
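To get a rough idea of what a call might cost, you can turn the rates above into a small calculator. This is only a sketch: it assumes the doubling past 128,000 prompt tokens applies to both the input and output rates, and it leaves out context caching. The function name `estimate_cost_usd` is made up for this example, and prices change, so check the pricing page before relying on the numbers.

```python
INPUT_USD_PER_MILLION = 0.35   # rate quoted in this article
OUTPUT_USD_PER_MILLION = 1.05  # rate quoted in this article
LONG_PROMPT_THRESHOLD = 128_000
LONG_PROMPT_MULTIPLIER = 2     # assumption: applied to input and output alike


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request in US dollars (caching not included)."""
    multiplier = LONG_PROMPT_MULTIPLIER if input_tokens > LONG_PROMPT_THRESHOLD else 1
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return multiplier * (input_cost + output_cost)


print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
```

Under these assumptions, a 100,000-token prompt with a 2,000-token response comes to a little under four cents.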
How to get a Gemini API key and set up Gemini API connections
Put on your favorite playlist, turn off notifications, and let's start building.
Step 1: Create a Google AI Studio account
Go to the Gemini API website, and click Sign In to Google AI Studio. Follow the steps to create a new account or sign in with your existing Google credentials.
Step 2: Open the API documentation and reference
Each API works differently, so you'll need to rely on the documentation to understand the features and use cases. The API reference, on the other hand, is an in-depth technical collection of commands, parameters, and setup instructions to help you implement it in your project.
Here are the links to the Gemini API documentation and to the generate content API reference.
Step 3: How to get a Gemini API key
Once you log in to Google AI Studio, read, accept, and close the informational pop-ups that appear on your screen. You can test the Gemini models here, adjusting some basic settings on the right side.
On the top left of the screen, click the Get API key button.
Then, click the Create API key button.
Accept and close the safety setting reminder. After that, click the Create API key in new project button.
Google will create a new API key. Copy that key, and then close the pop-up.
Very important: You need to keep this API key safe at all times. If someone finds your key, they may use it themselves, and depending on the usage, that may disable the endpoint for you. Don't share this key with anyone who doesn't need it, and if you're publishing an app to the public web, be sure to read up on your API security best practices.
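One common way to keep a key out of your code (and out of anything you share or publish) is to read it from an environment variable. Here's a minimal sketch in Python. The variable name `GEMINI_API_KEY` and the helper `load_api_key` are just names chosen for this example.

```python
import os


def load_api_key(env_var="GEMINI_API_KEY"):
    """Read the API key from an environment variable instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage, once the variable is set in your terminal or hosting platform:
# api_key = load_api_key()
```

This way, the key never appears in the file itself, so it won't leak if you paste the code somewhere.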
Back on the API keys dashboard, you'll notice a new key is added to the list and a new section with a cURL command appears below. If you can't see it, try refreshing your browser page.
Let's break down what each of these lines means.
curl \
curl is a command-line tool that sends the request from a terminal. The backslash continues the command on the next line for better readability; it doesn't affect the command itself. We won't be needing this.
-H 'Content-Type: application/json' \
This is the request header, marked by the -H flag. It contains the Content-Type key set to the application/json value. This tells the API endpoint what kind of data to expect. Postman, the platform I'm using to call the API in this tutorial, already sets this up by default, so we won't need this either.
-d '{"contents":[{"parts":[{"text":"Explain how AI works"}]}]}' \
The -d flag marks the data you're passing with the request. Written in JSON, "contents" marks the content of the request, which is broken up into "parts." It contains one "text" part with the value "Explain how AI works"—this is the prompt you're sending to the AI model.
-X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY'
The -X flag sets the HTTP request type—in this case, it's the POST method. The URL refers to the API endpoint URL, the place you're sending your request to.
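If you'd rather see the same pieces in code than in Postman, here's a rough Python equivalent of the cURL command, using only the standard library. It builds the request (method, URL, header, and JSON body) without sending it. The function name `build_request` is made up for this sketch, and `YOUR_API_KEY` is still a placeholder.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt, api_key, model="gemini-1.5-flash-latest"):
    """Assemble the same POST request the cURL command describes."""
    url = f"{API_ROOT}/{model}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),          # the -d flag
        headers={"Content-Type": "application/json"},   # the -H flag
        method="POST",                                  # the -X flag
    )


request = build_request("Explain how AI works", "YOUR_API_KEY")

# To actually send it, swap in a real key and uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Each argument maps to one flag of the cURL command, which makes it easier to see what Postman is doing behind its interface.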
Step 4: Set up the API call
We have all the details we need to start calling the Gemini API. If you want to follow along, you can create a free account in Postman, an API design and testing platform. You can also get the job done with a no-code app builder or internal tool builder.
If you're following along in Postman, log in to the dashboard and click on New Request at the top of the screen.
We'll start by adding the details in the -X flag. On the request input field, click the dropdown displaying GET and change it to POST. Then, copy the -X flag URL without the single quotes at the beginning and end, and paste that in.
Directly below, you'll see the interface updates to show a new parameter. Postman detected it in the URL: it's the "key=YOUR_API_KEY". This is where you'll pass your unique API key. Delete YOUR_API_KEY (either on the URL input field or the query parameters table below), and replace it with your key.
Note that I've blurred my API key in the following image, but you'll be able to see yours just fine on your screen.
Setting up a new API is more troubleshooting marathon than smooth sailing, so let's make a mistake to understand how this works. Click the Send button, and see what happens.
That's an HTTP 400 Bad Request error. The message tells us the contents aren't specified. Not surprising: we haven't configured the request body in Postman, so it's passing an empty request to the API.
Whenever you get an error when making an API call, watch out for syntax mistakes and refer back to the documentation and API reference. If you're having trouble working through it, remember that ChatGPT can help you with the requests and JSON formatting.
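When you move from Postman to code, it helps to pull the useful bits out of an error response instead of reading the raw JSON. The sketch below assumes the error body follows the usual Google API shape, with an "error" object holding "code", "message", and "status". The sample payload is illustrative, not a copy of a real response, and `describe_error` is a name invented for this example.

```python
import json


def describe_error(response_text):
    """Turn an error response body into a one-line summary."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return "Response was not valid JSON."
    if not isinstance(payload, dict) or "error" not in payload:
        return "No error object found in the response."
    error = payload["error"]
    code = error.get("code", "?")
    status = error.get("status", "UNKNOWN")
    message = error.get("message", "(no message)")
    return f"{code} {status}: {message}"


# Illustrative payload, assumed shape only.
sample = '{"error": {"code": 400, "message": "contents is not specified", "status": "INVALID_ARGUMENT"}}'
print(describe_error(sample))
```

A summary like this is easier to log, or to paste into a chatbot when you're asking for troubleshooting help.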
Step 5: Set up the body of the call
Let's set up the body of the call to fix the error. Back on the top request tab in Postman, under the endpoint input field, click the Body tab and then choose raw.
We'll paste the entirety of the content in the -d flag of the request from the last step. I've formatted it for readability. You can copy and paste it into line 1 of the interface:
{
  "contents": [
    {
      "parts": [
        {
          "text": "Explain how AI works"
        }
      ]
    }
  ]
}
That should work. Only one way to be sure: click the Send button.
If you've followed this guide to the letter so far, you'll see something new in the response tab: an HTTP 200 OK status code and the full response from Gemini.
There's an interesting aspect of this long response. Scroll all the way down and take a look at this.
The "citationMetadata" and "citationSources" keys show that part of the response closely matches an existing source: the page shown on the "uri" key. Keep in mind that these keys flag text that resembles a known source; on their own, they don't prove that Gemini searched the web to generate our response. The API also took longer to reply than normal: in my case, I waited 10 seconds for the response.
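In code, you usually want just the generated text and any cited pages rather than the whole response. Here's a small sketch that digs them out, assuming the response has a "candidates" list whose items hold "content", "parts", and (sometimes) the "citationMetadata" key mentioned above. The sample response is a trimmed, made-up illustration, and both function names are hypothetical.

```python
def extract_text(response):
    """Join the text parts of the first candidate, or return an empty string."""
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def extract_citation_uris(response):
    """Collect every "uri" listed under citationMetadata, if any."""
    uris = []
    for candidate in response.get("candidates", []):
        sources = candidate.get("citationMetadata", {}).get("citationSources", [])
        uris.extend(source["uri"] for source in sources if "uri" in source)
    return uris


# Made-up response, shaped like the one described in this step.
sample_response = {
    "candidates": [
        {
            "content": {"parts": [{"text": "AI works by "}, {"text": "learning patterns."}]},
            "citationMetadata": {"citationSources": [{"uri": "https://example.com/ai"}]},
        }
    ]
}

print(extract_text(sample_response))
print(extract_citation_uris(sample_response))
```

Both helpers return empty results instead of crashing when a key is missing, which is handy because not every response includes citations.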
Step 6: Pass your prompts
We're talking to the Gemini API now, but this isn't useful if you can't pass your prompts. On the Postman request tab, replace the value inside the "text" key with your text. Make sure the double quotes remain both at the beginning and end, otherwise the call will return an error.
As you click the Send button, you'll see the new response on the bottom part of the screen.
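Quotes are the most common way to break the body: if your prompt itself contains double quotes or line breaks, they need to be escaped inside the JSON. Rather than doing that by hand, you can let a JSON library build the body for you. Here's a minimal sketch in Python; `build_body` is a name made up for this example.

```python
import json


def build_body(prompt):
    """Build the request body, letting json.dumps escape quotes and newlines."""
    return json.dumps({"contents": [{"parts": [{"text": prompt}]}]}, indent=2)


# A prompt with double quotes and a line break inside it.
prompt = 'Summarize this quote: "Less is more."\nKeep it short.'
print(build_body(prompt))
```

The printed body shows the inner quotes written as \" and the line break as \n, and you can paste it straight into the raw body field.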
Step 7: Change the settings
You can add more parameters to the body of the call to control the generation settings in Gemini: head over to the model reference page of the API documentation to see them all.
This page can be confusing to read for the first time. The first section you want to look at is the Request body guidance.
When you copy and paste the entirety of this JSON into the Postman request body, you'll be able to control the generation process. Notice that the parameters are marked with the acceptable data types—string, integer, float, number—so remember to replace them with actual values before running the call.
If there's any parameter that you don't need, you can simply delete it from the body. Make sure to also delete all the brackets associated with it, so every part of the call is properly opened and closed. Postman will alert you if it finds these issues, and if you can't find a way to make it work, paste it into ChatGPT and ask it to correct your syntax.
The second useful part of the page is the explanation of what each parameter does, right below the Request body section.
You'll find handy explanations of what each parameter means, what it does, what models it works with, and the accepted values.
I decided to add some of these parameters to my call on Postman.
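For reference, a body with generation settings might look like the sketch below. It assumes the settings live in a "generationConfig" object with camelCase field names, as shown in the REST reference. The top-k value of 64 and the 100-token limit come from this article; the temperature and top-p values are placeholders I picked for illustration.

```python
import json

body = {
    "contents": [{"parts": [{"text": "Explain how AI works"}]}],
    "generationConfig": {
        "temperature": 0.7,      # placeholder value
        "topP": 0.95,            # placeholder value
        "topK": 64,
        "maxOutputTokens": 100,
    },
}

# Print it in a form you can paste into the raw body field.
print(json.dumps(body, indent=2))
```

Any setting you leave out falls back to the model's default, so you only need to include the ones you want to change.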
Here's a quick guide to what each of these configuration parameters means:
temperature controls creativity and randomness: lower values make responses more predictable, and higher values make them more varied.
top_p sets a cumulative probability cutoff: the model only picks among the most probable tokens whose probabilities add up to that value, so lower values narrow the vocabulary.
top_k controls how many candidate tokens the model considers at each step. For example, a top_k of 64 tells the model to pick only among the 64 most probable tokens.
max_output_tokens caps the response length. In my case, I limited mine to only 100 tokens.
And here's the result I got after clicking the Send button.
As you can see, the max output tokens parameter cut the response short, which means the settings are working as intended.
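To make the top_k and top_p settings more concrete, here's a toy illustration of the two filters on a tiny, made-up probability table. It's a simplification of the general technique, not a description of how Gemini implements sampling internally, and the function names are invented for this sketch.

```python
def top_k_filter(probabilities, k):
    """Keep only the k most probable tokens."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probabilities, p):
    """Keep the most probable tokens until their combined probability reaches p."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, probability in ranked:
        kept[token] = probability
        cumulative += probability
        if cumulative >= p:
            break
    return kept


# Made-up probabilities for the next token.
toy = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}

print(top_k_filter(toy, 2))    # keeps "cat" and "dog"
print(top_p_filter(toy, 0.9))  # keeps "cat", "dog", and "fish"
```

In both cases, the model would then pick its next token only from what's left, which is why lower values make the output more focused.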
Step 8: Change the AI model
So far, we've been chatting with the latest version of Gemini 1.5 Flash, but there are other models we can call using this API. You can do so by changing the model name in the endpoint URL.
In the Postman request input field, find the name of the Gemini model you're using.
Replace that name with another model's name. You can find the full list on this page or copy and paste one of them from here:
gemini-1.5-pro-latest
gemini-1.0-pro
Make sure the forward slash at the beginning and the colon at the end remain in the endpoint URL. Once you click the Send button, your instructions will be sent to the new model, and you may see a response with a different level of quality.
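If you're switching models in code rather than in Postman, the same rule applies: replace only what sits between the forward slash and the colon. Here's a small sketch of that swap. The helper name `swap_model` is made up, and `YOUR_API_KEY` remains a placeholder.

```python
def swap_model(endpoint_url, new_model):
    """Replace the model name that sits between "/models/" and the colon."""
    prefix, rest = endpoint_url.split("/models/", 1)
    _old_model, action = rest.split(":", 1)
    return f"{prefix}/models/{new_model}:{action}"


url = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY"
)
print(swap_model(url, "gemini-1.5-pro-latest"))
```

Everything else in the URL, including the generateContent action and the key parameter, stays exactly as it was.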
Step 9: Integrate Gemini into your apps
You can integrate basic Gemini functionality into your apps using Google AI Studio and the free API. Refer to the documentation of your no-code or internal tool builder on how to connect APIs to it, and you'll be able to start setting up calls in no time. For example, here's a guide on how to set up API connections using FlutterFlow.
But if you're looking to deeply integrate Gemini models into your apps and safeguard your data, using Vertex AI via Google Cloud Platform is the best option. In this case, you'll need to know how to code, or at least be able to read and understand it. Alternatively, find an expert to help set up the API endpoints and calls, and from that point on, you can add those settings into your product or app.
Use Zapier to connect to Gemini
I know you just read an entire guide on how to use the Gemini API, but depending on what you're looking to do, there's an easier way. Zapier can connect Google AI Studio and Google Vertex AI with thousands of other apps, with powerful workflows and an easy-to-use interface. Learn more about how Google Gemini's models work with Zapier, or get started with one of these pre-made workflows.
Generate draft responses to new Gmail emails with Google AI Studio (Gemini)
Promptly reply to Facebook messages with custom responses using Google AI Studio (Gemini)
Send prompts to Google Vertex AI from Google Sheets and save the responses
Zapier is the leader in workflow automation—integrating with thousands of apps from partners like Google, Salesforce, and Microsoft. Use interfaces, data tables, and logic to build secure, automated systems for your business-critical workflows across your organization's technology stack. Learn more.
Empower your apps with Gemini API
Google may have started last in the AI race (remember the PaLM model?), but it's working hard to catch up and provide the largest context window on the market. If you're struggling with hallucinations or projects with a lot of data, Gemini is a great option to try before you venture into embeddings, vector databases, or retrieval-augmented generation, especially if you're a no-coder like me.