Singleton Sessions, Retries, and Rate Limits in Python Requests

Python · Requests · API Clients

Shared session

One reusable session per base URL can centralize headers, adapters, and connection behavior.

Retries

Retries help after temporary HTTP failures such as 429 and 503.

Rate limiting

Rate limiting is preventive. It controls pacing before the server needs to push back.

Introduction

Most beginner API code works the same way at first. You import requests, send a request, inspect the response, and move on. That is enough to get data back. It is not enough to build something calm, repeatable, and reliable. The moment a script makes repeated calls, hits temporary failures, or bumps into rate limits, the easy version starts to feel incomplete.

This article is not a general Singleton explanation. It is a practical one. The question here is what happens when one shared session object becomes useful inside real API code, and how retries and rate limiting fit beside that decision.

First version

The First Version Usually Works

import requests


response = requests.get("https://api.example.com/resource", timeout=10)
print(response.status_code)
print(response.text)

There is nothing wrong with this. It sends a request and returns a response. The problem begins when the same base URL is being called repeatedly and the code still behaves as though each call is a completely separate event.

Why sessions matter

Why a Session Changes the Shape of the Code

A session gives repeated requests one shared home. That means headers, cookies, and connection behavior can live in one place instead of being rebuilt over and over.

import requests


session = requests.Session()
response = session.get("https://api.example.com/resource", timeout=10)
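To make "one shared home" concrete, here is a minimal sketch of setting defaults once on a session. The header values, including the `example-client/0.1` user agent string, are placeholders chosen for illustration and are not something this article's API is known to require. The request is only prepared, not sent, so the merged headers can be inspected without touching the network.

```python
import requests

# Placeholder values for illustration only.
session = requests.Session()
session.headers.update({
    "Accept": "application/json",
    "User-Agent": "example-client/0.1",
})

# Preparing a request through the session merges in the session's headers.
request = requests.Request("GET", "https://api.example.com/resource")
prepared = session.prepare_request(request)

print(prepared.headers["Accept"])
print(prepared.headers["User-Agent"])
```

Any later `session.get(...)` call goes through the same merge, so the defaults do not need to be repeated per request.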

Once you are reusing a session, the next question becomes obvious. If several parts of the same application talk to the same API, should each part quietly create its own session, or should that base URL reuse one shared session object?

Why this matters: Sessions are not only about convenience. They also make it easier to attach shared behavior once instead of recreating it on every request.

Singleton-style manager

A Singleton-Style Session Manager

Singleton matters here because one base URL often benefits from one shared session policy.

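import requests
from requests.adapters import HTTPAdapter


class SessionManager:
    # One manager (and therefore one session) per base URL.
    _instances = {}

    def __new__(cls, base_url):
        # Build the session only the first time this base URL is seen.
        if base_url not in cls._instances:
            instance = super().__new__(cls)
            instance.session = requests.Session()
            # An integer max_retries covers failed DNS lookups, socket
            # connections, and connection timeouts, not HTTP status codes.
            instance.session.mount("https://", HTTPAdapter(max_retries=3))
            cls._instances[base_url] = instance
        return cls._instances[base_url]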

This is not Singleton in the abstract. It is Singleton in service of a specific job: one shared session per base URL.

base_url = "https://api.example.com"

first_manager = SessionManager(base_url)
second_manager = SessionManager(base_url)

print(first_manager is second_manager)  # True
print(first_manager.session is second_manager.session)  # True

The important point is not the pattern name by itself. The important point is that one reusable session object can now carry connection rules for every request aimed at that same API base.
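The manager above creates instances without any locking, and it treats `https://api.example.com` and `https://api.example.com/` as two different keys. The sketch below is one possible variant, not the article's implementation: the class name `LockedSessionManager`, the lock, and the trailing-slash normalization are all assumptions added for illustration. The lock only guards creation of the manager. It says nothing about whether a single session is safe to share across threads.

```python
import threading

import requests
from requests.adapters import HTTPAdapter


class LockedSessionManager:
    """Hypothetical variant of SessionManager guarded by a lock."""

    _instances = {}
    _lock = threading.Lock()

    def __new__(cls, base_url):
        # Assumption: a trailing slash should not create a second session.
        key = base_url.rstrip("/")
        with cls._lock:
            if key not in cls._instances:
                instance = super().__new__(cls)
                instance.base_url = key
                instance.session = requests.Session()
                instance.session.mount("https://", HTTPAdapter(max_retries=3))
                cls._instances[key] = instance
            return cls._instances[key]


a = LockedSessionManager("https://api.example.com")
b = LockedSessionManager("https://api.example.com/")
c = LockedSessionManager("https://other.example.com")

print(a is b)                  # True: same key after normalization
print(a.session is c.session)  # False: different base URL, different session
```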

Retries

Retries Solve a Different Problem

A shared session helps with reuse. It does not by itself solve temporary HTTP failures. Responses like 429 (Too Many Requests) or 503 (Service Unavailable) often call for patience rather than immediate failure.

from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


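retry = Retry(
    total=3,                      # give up after three retries
    backoff_factor=1,             # scales the growing delay between retries
    status_forcelist=[429, 503],  # response codes that trigger a retry
)

# The adapter carries the policy; it takes effect once mounted on a session.
adapter = HTTPAdapter(max_retries=retry)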

The point of back-off is to avoid hammering the same endpoint in a tight loop. With a non-zero backoff_factor, the delay grows as consecutive failures accumulate, so later retries wait longer than earlier ones.
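To get a feel for the numbers, the sketch below approximates the delay before each retry. It assumes the formula urllib3 documents for `Retry`, roughly `backoff_factor * 2 ** (retry_number - 1)` with no delay before the first retry and a cap of 120 seconds. The helper name `backoff_schedule` is hypothetical, and actual timing can differ by urllib3 version or when the server sends a `Retry-After` header, so treat the output as an estimate.

```python
def backoff_schedule(total, backoff_factor, backoff_max=120):
    """Approximate the delay before each retry, ignoring Retry-After headers."""
    delays = []
    for retry_number in range(1, total + 1):
        if retry_number == 1:
            # Assumption: the first retry is sent without waiting.
            delays.append(0.0)
        else:
            delay = backoff_factor * (2 ** (retry_number - 1))
            delays.append(float(min(backoff_max, delay)))
    return delays


# Same settings as the Retry object above: total=3, backoff_factor=1.
print(backoff_schedule(total=3, backoff_factor=1))  # [0.0, 2.0, 4.0]
```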

Important: Retries are useful for temporary conditions. They are not a substitute for understanding why the API is rejecting or delaying requests in the first place.

Attach retry behavior

Mounting Retry Behavior Onto the Session

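from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

base_url = "https://api.example.com"
session = SessionManager(base_url).session

retry = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 503],
)

# Mounting on the same prefix replaces the adapter SessionManager installed.
adapter = HTTPAdapter(max_retries=retry)
session.mount("https://", adapter)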

Now the retry policy lives with the shared session instead of being scattered around individual requests.

That is one of the biggest advantages of reusing a session object. Cross-cutting behavior becomes easier to define once and apply consistently.
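One way to check that the policy really lives on the session is to ask the session which adapter it would use for a given URL. The sketch below does that without sending a request. The helper name `build_session` is hypothetical, and mounting the adapter on `http://` as well is an assumption that goes beyond what the article's snippets do.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session():
    """Hypothetical helper: one session, one retry policy, both URL schemes."""
    retry = Retry(total=3, backoff_factor=1, status_forcelist=[429, 503])
    adapter = HTTPAdapter(max_retries=retry)

    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)  # assumption: cover plain HTTP too
    return session


session = build_session()

# get_adapter returns the adapter that would handle this URL.
adapter = session.get_adapter("https://api.example.com/resource")
print(adapter.max_retries.total)
print(adapter.max_retries.status_forcelist)
```

If the printed values match the `Retry` settings, every request sent through that session to a matching URL prefix uses the same policy.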

Rate limiting

Rate Limiting Is More Preventive Than Retry Logic

Retries answer the question, “What should happen after a temporary failure?” Rate limiting answers the earlier question, “How fast should requests be sent in the first place?”

import time


class RateLimiter:
    def __init__(self, requests_per_minute):
        self.requests_per_minute = requests_per_minute
        # Minimum number of seconds between two consecutive requests.
        self.interval = 60 / requests_per_minute
        # None means no request has been sent yet, so the first call never waits.
        self.last_request_time = None

    def wait(self):
        # A monotonic clock is unaffected by system clock adjustments.
        current_time = time.monotonic()

        if self.last_request_time is not None:
            time_since_last_request = current_time - self.last_request_time
            if time_since_last_request < self.interval:
                time.sleep(self.interval - time_since_last_request)

        self.last_request_time = time.monotonic()

This is a different responsibility. Instead of recovering after the server pushes back, it slows the client down so that fewer pushbacks happen in the first place.
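Pacing logic is awkward to verify when every check means actually sleeping. The sketch below shows one way around that, assuming it is acceptable to pass the clock and sleep functions in as parameters. The class name `PacedLimiter` and the fake clock are hypothetical and exist only for this illustration. The pacing rule itself is the same as above: at least `60 / requests_per_minute` seconds between calls.

```python
import time


class PacedLimiter:
    """Hypothetical limiter with an injectable clock and sleep, for testing."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # no request sent yet

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


# Fake clock: "sleeping" just advances the fake time and records the duration.
fake_now = [100.0]
slept = []


def fake_clock():
    return fake_now[0]


def fake_sleep(seconds):
    slept.append(seconds)
    fake_now[0] += seconds


limiter = PacedLimiter(requests_per_minute=30, clock=fake_clock, sleep=fake_sleep)
limiter.wait()      # first call: nothing to wait for
limiter.wait()      # immediately after: waits the full 2-second interval
fake_now[0] += 0.5  # pretend the request took half a second
limiter.wait()      # waits only the remaining 1.5 seconds
print(slept)        # [2.0, 1.5]
```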

Putting it together

Putting the Pieces Together

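# 60 requests per minute means at least one second between calls,
# so this loop needs at least 99 seconds to finish.
rate_limiter = RateLimiter(requests_per_minute=60)
session = SessionManager("https://api.example.com").session

for _ in range(100):
    rate_limiter.wait()  # pace first, then send
    response = session.get(
        "https://api.example.com/resource",
        timeout=10,
    )
    print(response.status_code)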

This is the practical shape you want to notice. The session owns connection behavior. The retry policy owns temporary HTTP recovery. The rate limiter owns pacing. One concern per component.
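The loop above assumes every call eventually returns a response. As a hedged sketch of how the three pieces might sit behind one small function, the helper below paces each call and records `None` when a request fails outright, for example after retries are exhausted. The function name `fetch_paced` and the choice to record `None` are assumptions made for illustration. It accepts any session-like object with a `get` method and any limiter with a `wait` method.

```python
import requests


def fetch_paced(session, limiter, urls, timeout=10):
    """Hypothetical helper: pace each call and collect status codes.

    A failed request is recorded as None instead of stopping the loop.
    """
    results = []
    for url in urls:
        limiter.wait()  # pacing happens before every request
        try:
            response = session.get(url, timeout=timeout)
            results.append(response.status_code)
        except requests.exceptions.RequestException:
            # Covers connection errors, timeouts, and exhausted retries.
            results.append(None)
    return results
```

With the objects from this article it could be called as `fetch_paced(session, rate_limiter, ["https://api.example.com/resource"])`. The session still owns connection and retry behavior, and the limiter still owns pacing.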

Clean mental model: A reusable session is about reuse. Retries are about recovery. Rate limiting is about pacing. Keeping those responsibilities separate makes client code much easier to reason about.

What to keep

What a Beginner Should Keep

The important lesson is not the pattern name. It is responsibility. A shared session can make API code cleaner when one base URL really does deserve one reusable interaction object. Retries and rate limiting solve different problems beside that choice, and together they make the client behave much more reliably.

FAQ

Frequently Asked Questions

These are the practical questions beginners usually have when shared sessions, retries, and rate limits first start to come together.

Why use a requests.Session() instead of plain requests.get() calls?

A session gives repeated requests one shared home for headers, cookies, adapters, and connection behavior.

What is the point of the Singleton-style session manager here?

It makes one shared session available per base URL so different parts of the same application can reuse the same session policy.

Do retries and sessions solve the same problem?

No. Sessions help with reuse and shared behavior. Retries help recover from temporary HTTP failures.

Why retry on 429 or 503?

Because those responses often indicate temporary conditions where a short wait and retry may succeed.

What does backoff_factor do?

It scales the delay between consecutive retries. That delay grows as failures accumulate, so the client does not hit the same endpoint again in an aggressive tight loop.

How is rate limiting different from retries?

Rate limiting is preventive. It controls how quickly requests are sent so the client is less likely to hit server-side limits in the first place.

Should all three ideas live in one big class?

Usually no. It is cleaner when the session, retry behavior, and rate limiter each keep a distinct responsibility.

What is the simplest takeaway from this whole setup?

Reuse the session, recover thoughtfully from temporary failures, and pace requests before the server forces you to slow down.

Further reading

Further Reading

If you want the broader “when does Singleton help?” discussion next, read When the Singleton Pattern Actually Helps.

If you want implementation details next, read Comparing Two Singleton Implementations in Python.

If you want a lower-level URL-handling companion, pair this with the base URL parsing post.

Raell Dottin