Building a Hacker News Podcast Generator with Gemini 3 and ElevenLabs

Jan 19, 2026


TL;DR

This article shows how to build a simple podcast generator that turns Hacker News posts into short audio summaries using a single Tensorlake Application. The entire agent workflow runs on Tensorlake to achieve reliable execution and scalable data preparation, including web scraping, text cleaning, summarization, and audio generation. Gemini and ElevenLabs are invoked as external services from within Tensorlake functions for summarization and text-to-speech.

I spend a lot of time on Hacker News, but I rarely have the time to read every post and the long articles they link to. On many days, I just want a quick way to understand what people are discussing without opening multiple tabs and skimming through everything.

That led me to build a small podcast generator. The idea is simple: collect content linked from Hacker News, distill it into short summaries, and turn those summaries into audio that I can listen to while doing other things.

What I wanted to explore in this project was not just summarization or text-to-speech, but how to run an agent workflow cleanly and reliably. Instead of stitching together scripts, background jobs, and retries by hand, I wanted a single execution model where data preparation and tool invocation are orchestrated predictably.

In this project, the entire agent workflow runs as a single Tensorlake Application. Tensorlake is used as the execution runtime that coordinates scraping, text preparation, summarization, and audio generation. Gemini and ElevenLabs are invoked from within this workflow as external services, while Tensorlake manages execution, orchestration, and outputs.


Architecture Overview

The system is designed as a single, end-to-end workflow that transforms trending links into short, podcast-style audio summaries.

Content retrieval and preparation: Links sourced from Hacker News are fetched and converted into clean, readable text so they can be reliably processed by downstream steps.
Text summarization: The prepared content is sent to a large language model to generate concise summaries suitable for audio narration.
Audio generation: These summaries are then converted into spoken audio using a text-to-speech service.
End-to-end execution: All steps run as part of one coordinated workflow, ensuring that data flows consistently from content ingestion to final audio output, with intermediate results preserved across stages.


Why Tensorlake Runs the Agent

This project requires running multiple execution steps in sequence, including content retrieval, text normalization, summarization, and audio generation. These steps must be coordinated reliably as part of a single agent workflow rather than as loosely connected scripts or background jobs.

These requirements concern execution rather than business logic, which makes Tensorlake a good fit for running the agent.

Durable function execution: Each step of the podcast agent runs as a durable Tensorlake function. For example, if the summarization step fails due to a transient Gemini API error, Tensorlake retries only that function. Previously completed steps, such as web scraping and text preparation, are not re-executed. This makes the workflow resilient to partial failures without requiring the entire pipeline to be restarted.
Serverless orchestration with dynamic fan-out: The scraping stage handles a variable number of articles or links that are only discovered at runtime. The workflow coordinates content retrieval and preparation within a single managed execution.
Built-in orchestration and state handling: Tensorlake manages the ordering and data flow between steps. Scraped content flows into text preparation, then into summarization, and finally into audio generation, all as a single, managed execution. Retries, timeouts, and dependencies between functions are handled by Tensorlake rather than by custom glue code.
Native support for large inputs and outputs: The workflow produces outputs of different sizes, including full article text, summary scripts, and MP3 audio files. Tensorlake supports function inputs and outputs of arbitrary size, so the agent does not need special handling for large text payloads or audio files.

Together, these capabilities allow the podcast agent to be implemented as a single Tensorlake Application composed of multiple Tensorlake Functions, with external services like Gemini and ElevenLabs invoked only where needed.
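To make the retry behavior concrete, here is a toy, in-memory sketch of the idea in plain Python. It is not Tensorlake's implementation or API: `run_step`, `completed`, and the stub steps are hypothetical names used only to show a failing step being retried while the result of an earlier step is reused.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory store of finished step results (illustration only;
# a durable runtime would persist this outside the process).
completed: Dict[str, Any] = {}


def run_step(name: str, fn: Callable[[], Any], max_attempts: int = 3) -> Any:
    """Run a step; on failure retry only this step, reusing cached results."""
    if name in completed:
        return completed[name]
    last_error = None
    for _ in range(max_attempts):
        try:
            completed[name] = fn()
            return completed[name]
        except RuntimeError as err:  # RuntimeError stands in for a transient failure
            last_error = err
    raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts") from last_error


calls = {"scrape": 0, "summarize": 0}


def scrape() -> str:
    calls["scrape"] += 1
    return "article text"


def summarize() -> str:
    calls["summarize"] += 1
    if calls["summarize"] == 1:
        raise RuntimeError("transient API error")  # the first attempt fails
    # The scrape result comes from the cache; scrape() is not called again.
    return "summary of " + run_step("scrape", scrape)


run_step("scrape", scrape)
summary = run_step("summarize", summarize)
```

In this sketch the summarization stub runs twice while the scraping stub runs once, which mirrors the behavior described above for a transient Gemini error.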


Building the Podcast Generator

Now, we’ll build a simple podcast generator that automatically pulls the top articles from Hacker News, processes each article individually, and generates podcast audio for each one. For every selected article, the system crawls and cleans the page content, produces a concise summary using Gemini, and converts that summary into natural-sounding audio using ElevenLabs.

In this project, we build a single Tensorlake application where every step is implemented as a Tensorlake function. These functions are composed into one end-to-end agent that crawls content, prepares clean text, generates a podcast script, and produces the final audio file output.
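The shape of that composition can be sketched in plain Python before any Tensorlake code is involved. This is only an illustration under assumptions: `run_pipeline` and its stage parameters are hypothetical names, and the lambdas are stubs standing in for the real scraper, Gemini, and ElevenLabs calls.

```python
from typing import Callable, Iterable, List, Tuple


def run_pipeline(
    urls: Iterable[str],
    crawl: Callable[[str], List[str]],
    clean: Callable[[List[str]], str],
    summarize: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> List[Tuple[str, bytes]]:
    """Process each article independently and return (url, audio) pairs."""
    results = []
    for url in urls:
        pages = crawl(url)        # raw page text for this article
        text = clean(pages)       # consolidated, readable text
        script = summarize(text)  # short podcast-style script
        results.append((url, synthesize(script)))
    return results


# Stub stages (hypothetical) so the flow can be run without any external service.
episodes = run_pipeline(
    ["https://example.com/a"],
    crawl=lambda url: [f"page from {url}", ""],
    clean=lambda pages: " ".join(p for p in pages if p),
    summarize=lambda text: text.upper(),
    synthesize=lambda script: script.encode("utf-8"),
)
```

In the actual project each stage is a Tensorlake function rather than a plain callable, so the runtime, not a Python loop, handles ordering and retries.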


Prerequisites

Before starting, make sure you have the following:

Python 3.11 or later
Gemini API Keys
ElevenLabs API Keys

All steps below are executed locally using Python.

Generate a Gemini API Key

Go to https://aistudio.google.com/
Create an API key
Keep it available for the next step
Replace the key in your .env file


Generate an ElevenLabs API Key

Go to https://elevenlabs.io/app/developers/api-keys
Create an account
Generate an API key from your profile
Replace the key in your .env file

Step 1: Set Up a Virtual Environment

Create a new project folder and open it in your editor.

Then create and activate a virtual environment.

python -m venv venv

Activate the environment:

On Windows:

venv\Scripts\activate

On macOS or Linux:

source venv/bin/activate

Once activated, your terminal should show that the virtual environment is in use.

Step 2: Install Dependencies

Create a requirements.txt file in the project folder with the following content:

tensorlake
pydoll-python
streamlit
google-genai
requests
python-dotenv
beautifulsoup4

Install the required dependencies:

pip install -r requirements.txt

This installs Tensorlake, the headless browser dependency used by the scraper, and the libraries required for Gemini and ElevenLabs integration.

Create the .env file

The .env file is used to securely store API keys and configuration values outside the source code, and it will be referenced in the following steps when integrating Gemini and ElevenLabs.

In the project directory, create a file named:

.env

Add the following exact variable names:

GEMINI_API_KEY=PASTE_YOUR_GEMINI_API_KEY_HERE
ELEVENLABS_API_KEY=PASTE_YOUR_ELEVENLABS_API_KEY_HERE

Now, create a Python file named podcast_agent.py
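For local runs, it can help to fail fast when a key is missing from .env. The following is a minimal sketch, assuming python-dotenv (already listed in requirements.txt) is used to load the file; `missing_keys` and `load_api_keys` are hypothetical helper names, not part of the repository.

```python
import os
from typing import Iterable, List, Mapping

REQUIRED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required variable names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


def load_api_keys() -> None:
    """Load .env from the current directory and raise if a key is missing."""
    from dotenv import load_dotenv  # provided by python-dotenv

    load_dotenv()
    missing = missing_keys(os.environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `load_api_keys()` once at the top of a local script surfaces a misspelled variable name before any API call is attempted.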

Step 3: Web Scraping with Tensorlake

Web scraping is implemented as a Tensorlake Function within a Tensorlake Application, making it a composable and reusable stage in the podcast generation workflow. The pipeline begins with a source selection function, fetch_hackernews_top_articles, which automatically retrieves the top Hacker News articles and serves as the entry point for downstream processing.
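The article does not show the body of fetch_hackernews_top_articles, so the following is only a sketch of how such a source-selection step could work. It assumes the public Hacker News Firebase API (`topstories.json` and `item/<id>.json`); the helper names `pick_article_urls` and `fetch_top_article_urls` and the `limit` and `scan` parameters are hypothetical.

```python
import json
from typing import Dict, List, Optional
from urllib.request import urlopen

HN_API = "https://hacker-news.firebaseio.com/v0"


def pick_article_urls(items: List[Optional[Dict]], limit: int) -> List[str]:
    """Keep stories that link to an external URL, up to `limit`."""
    # Text-only posts (for example Ask HN) have no "url"; deleted items can be None.
    urls = [item["url"] for item in items if item and item.get("url")]
    return urls[:limit]


def fetch_top_article_urls(limit: int = 5, scan: int = 30) -> List[str]:
    """Fetch the current top stories and return their external links."""
    with urlopen(f"{HN_API}/topstories.json", timeout=10) as resp:
        story_ids = json.load(resp)[:scan]
    items = []
    for story_id in story_ids:
        with urlopen(f"{HN_API}/item/{story_id}.json", timeout=10) as resp:
            items.append(json.load(resp))
    return pick_article_urls(items, limit)
```

Separating the filtering from the network calls keeps the selection logic easy to test without touching the API.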

For each selected article, the crawl function performs a controlled depth-first traversal, starting from the article URL and bounded by configurable parameters such as max_depth and max_links. This keeps execution predictable while allowing limited link discovery when needed. The crawler runs as an independent Tensorlake Function, enabling isolated execution and observability.
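A bounded depth-first traversal of this kind can be sketched as follows. This is not the repository's crawl function: link extraction is injected as a hypothetical `get_links` callable, and `max_links` is assumed here to cap the links followed per page, which may differ from how the project interprets it.

```python
from typing import Callable, Iterable, List, Optional, Set


def crawl(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_depth: int = 1,
    max_links: int = 5,
    visited: Optional[Set[str]] = None,
    depth: int = 0,
) -> List[str]:
    """Depth-first crawl that returns URLs in the order they were visited."""
    visited = set() if visited is None else visited
    if start_url in visited:
        return []  # already processed, skip
    visited.add(start_url)
    order = [start_url]
    if depth >= max_depth:
        return order  # depth limit reached, do not follow links
    # Follow at most max_links outgoing links from this page.
    for link in list(get_links(start_url))[:max_links]:
        order.extend(crawl(link, get_links, max_depth, max_links, visited, depth + 1))
    return order
```

Because both limits are plain parameters, the worst-case number of page fetches is known before the crawl starts, which is what keeps execution predictable.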

Actual page retrieval is handled by a dedicated scraper function, fetch_content, which executes inside a purpose-built scraper image containing Chromium and PyDoll. This setup ensures reliable rendering of JavaScript-heavy pages. Each page fetch runs as an isolated function invocation, so failures on individual pages do not interrupt the overall crawl.

HTML pages are normalized into clean, readable text, while binary assets such as images or PDFs are detected and handled separately with appropriate metadata. Domain boundaries are enforced, and visited URLs are tracked to prevent redundant processing.
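Two of the checks described above, the domain boundary and the binary-asset detection, can be illustrated with small helpers. This is a sketch under assumptions: the function names are hypothetical, the domain check compares exact hosts, and the list of text content types is an illustrative choice rather than the project's actual rule.

```python
from urllib.parse import urlparse

# Content types treated as readable text in this sketch (an assumption).
TEXT_TYPES = ("text/html", "application/xhtml+xml", "text/plain")


def same_domain(url: str, root_url: str) -> bool:
    """True when url is on the same host as the crawl's starting URL."""
    return urlparse(url).netloc.lower() == urlparse(root_url).netloc.lower()


def is_binary(content_type: str) -> bool:
    """Treat anything that is not a known text type as a binary asset."""
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime not in TEXT_TYPES
```

A crawler would apply `same_domain` before queueing a link and `is_binary` to the response's Content-Type header before attempting text extraction.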

Refer to the full code in the project repository: https://github.com/tensorlakeai/examples/tree/main/podcast-agent


Step 4: Summarization with Gemini

Once the crawl completes, a Tensorlake function extracts clean, readable text from the scraped results. This step normalizes the data and prepares it for language model input by removing empty content and consolidating text across pages.
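The preparation step can be pictured with a short sketch. It assumes each scraped page is a dictionary with a `"text"` field, which is a hypothetical shape chosen for illustration; `prepare_text` is not the repository's function.

```python
import re
from typing import Dict, List


def prepare_text(pages: List[Dict[str, str]]) -> str:
    """Drop empty pages, collapse whitespace, and join pages into one string."""
    chunks = []
    for page in pages:
        # Missing or None text is treated the same as an empty page.
        text = re.sub(r"\s+", " ", page.get("text") or "").strip()
        if text:
            chunks.append(text)
    return "\n\n".join(chunks)
```

Joining pages with a blank line keeps page boundaries visible to the language model without adding markup.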

The summarization step uses Gemini through the google-genai client. The cleaned article text is passed to the gemini-2.5-flash model with a prompt designed to generate a concise, podcast-style script. Input size is intentionally limited to remain compatible with free or low-tier usage.

Gemini is used strictly as an external inference service, while Tensorlake manages execution, orchestration, and data flow between functions.


from tensorlake.applications import function  # Tensorlake Applications SDK


@function(secrets=["GEMINI_API_KEY"])
def summarize_with_gemini(clean_text: str) -> str:
    """
    Generate a podcast-style summary.
    """
    from google import genai
    import os

    # The key is injected by Tensorlake as an environment variable
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # Cap the article at 6,000 characters to stay within free or low-tier limits
    prompt = f"""
    Create a short podcast-style summary of the following article.
    Keep the tone clear, neutral, and easy to listen to.

    Article:
    {clean_text[:6000]}
    """

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt
    )

    return response.text
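The hard cut at 6,000 characters can end the input in the middle of a sentence. One optional refinement, shown here as a sketch and not something the project does, is to cut at the last sentence boundary inside the limit; `truncate_at_sentence` is a hypothetical helper.

```python
def truncate_at_sentence(text: str, limit: int = 6000) -> str:
    """Cut text to at most `limit` characters, ending on a sentence if possible."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Position of the last sentence-ending punctuation followed by a space.
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    # Fall back to the hard cut when no sentence end is found.
    return cut[: end + 1] if end != -1 else cut
```

With this helper, the prompt would use `truncate_at_sentence(clean_text)` in place of the slice.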

Step 5: Audio Generation with ElevenLabs

The final step converts the generated podcast script into audio using ElevenLabs. A dedicated Tensorlake function reads the text content and sends it to the ElevenLabs Text-to-Speech API using a fixed voice ID.

The model used is eleven_v3, configured with stability and similarity settings to produce clear and natural narration. The resulting MP3 audio is returned as a file object, completing the podcast generation pipeline.

This separation allows audio generation to be retried or swapped with different voices or models without modifying the rest of the application.

from tensorlake.applications import File, function  # Tensorlake Applications SDK


@function(secrets=["ELEVENLABS_API_KEY"])
def generate_audio(script_text: str) -> File:
    """
    Convert podcast script text into audio using ElevenLabs TTS.
    """
    import os
    import requests

    # Fixed voice ID; change it to use a different narrator
    VOICE_ID = "21m00Tcm4TlvDq8ikWAM"

    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"

    headers = {
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json",
        "Accept": "audio/mpeg",
    }

    payload = {
        "text": script_text,
        "model_id": "eleven_v3",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.5,
        },
    }

    # A timeout keeps a stalled request from hanging the function
    response = requests.post(url, json=payload, headers=headers, timeout=120)

    # Surface API errors so the function fails visibly and can be retried
    if response.status_code != 200:
        raise RuntimeError(
            f"ElevenLabs TTS failed: {response.status_code} {response.text}"
        )

    # Return the MP3 bytes as a file output
    return File(
        content=response.content,
        content_type="audio/mpeg",
    )
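Since the voice and model are the parts most likely to change, one way to make them swappable is to build the request from parameters. The sketch below reuses the endpoint and payload fields from the function above; `build_tts_request` is a hypothetical helper and not part of the repository.

```python
from typing import Any, Dict, Tuple

DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # voice ID used in the article


def build_tts_request(
    text: str,
    voice_id: str = DEFAULT_VOICE_ID,
    model_id: str = "eleven_v3",
    stability: float = 0.5,
    similarity_boost: float = 0.5,
) -> Tuple[str, Dict[str, Any]]:
    """Return the endpoint URL and JSON payload for one text-to-speech call."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return url, payload
```

The returned pair can be passed straight to `requests.post(url, json=payload, headers=headers)`, so trying another voice or model only changes the arguments, not the function body.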

The complete implementation, including the Tensorlake application, scraping logic, summarization agent, and audio generation pipeline, is available in the project repository:

Repository: https://github.com/tensorlakeai/examples/tree/main/podcast-agent

This structure makes it easy to extend the agent with additional processing steps or deploy it as a cloud-hosted Tensorlake application.

Run the script:

python podcast_agent.py

Creating the UI

Now that the backend workflow is in place, the next step is to build the user interface. The UI is implemented using Streamlit and serves as a lightweight interaction layer for the Tensorlake-powered podcast generation pipeline. The interface automatically selects the top article from Hacker News and exposes only basic configuration options.

The interface focuses on clarity and ease of use. All core processing, such as crawling, summarization, and audio generation, runs inside the Tensorlake application, while the UI only triggers the workflow and displays the final podcast audio with playback and download options.

For the full implementation, refer to the app.py file in the repository.

Installing Streamlit

Install Streamlit in the same virtual environment used for the project:

pip install streamlit

Now run the file app.py using the command below:

streamlit run app.py

We’ll get an interface like this:

Deploying to the Tensorlake Cloud

Login

Authenticate with Tensorlake from your terminal:

tensorlake login

Set Secrets

The podcast agent uses external services (Gemini and ElevenLabs), so secrets must be configured.

Option A: Using Tensorlake UI

Go to Agentic Apps → Secrets
Add the following secrets:
- GEMINI_API_KEY
- ELEVENLABS_API_KEY
Save the changes

Option B: Using CLI

tensorlake secrets set GEMINI_API_KEY=your_gemini_key
tensorlake secrets set ELEVENLABS_API_KEY=your_elevenlabs_key

Secrets are securely injected into the functions at runtime.

Verify that the secrets are set correctly:

tensorlake secrets list

Export Tensorlake API Key

If required for local testing or automation, export your Tensorlake API key:

export TENSORLAKE_API_KEY=tl_apiKey_xxxxxxxxx

This allows the CLI and local execution helpers to communicate with Tensorlake.

Deploy the Agent

Deploy the application using the same podcast_agent.py file:

tensorlake deploy podcast_agent.py

During deployment:

Tensorlake validates the application and functions
Container images are built
The agent is registered under Agentic Apps

On successful deployment, Tensorlake returns a permanent endpoint for the agent.

Invoke the Agent

Option A: Using Tensorlake UI

Open Agentic Apps → Your App
Click Invoke
Provide input parameters (example):

Option B: Using CLI

curl https://api.tensorlake.ai/applications/podcast_agent \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '{ "url": "example_string", "max_depth": 3, "max_links": 5}'

After you invoke, a Request ID will be generated.
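The same invocation can be made from Python. The sketch below mirrors the curl command using only the standard library; the endpoint, header, and input fields are taken from that command, while `build_invoke_request` is a hypothetical helper and the shape of the response is not assumed.

```python
import json
from urllib.request import Request

# Endpoint copied from the curl example above.
ENDPOINT = "https://api.tensorlake.ai/applications/podcast_agent"


def build_invoke_request(api_key: str, url: str, max_depth: int = 3, max_links: int = 5) -> Request:
    """Build the POST request that curl's --json flag would send."""
    body = json.dumps({"url": url, "max_depth": max_depth, "max_links": max_links})
    return Request(
        ENDPOINT,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# To send it (requires network access and a valid key):
#   import os
#   from urllib.request import urlopen
#   req = build_invoke_request(os.environ["TENSORLAKE_API_KEY"], "https://example.com")
#   with urlopen(req, timeout=30) as resp:
#       print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect before any call is made.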

Observe Execution (Graph & Timing)

After invocation, Tensorlake provides a full execution overview, including:

Observable function graph
Execution timing per function
Clear parent → child function relationships

At the end of the workflow, you have:

The top Hacker News articles, automatically selected and processed
Clean, normalized text extracted from each article using the Tensorlake crawler
Concise podcast-style summaries generated by Gemini for each article
High-quality MP3 audio outputs, one per article, generated using ElevenLabs
Clear visibility into each pipeline stage through Tensorlake’s function-level execution model

Key Takeaways

This project demonstrates how a structured, end-to-end workflow can transform live web content into podcast-ready audio with minimal manual input.
Tensorlake acts as the execution backbone, orchestrating article selection, crawling, summarization, and audio generation as discrete, composable functions.
By separating source selection, scraping, summarization, and voice synthesis into individual Tensorlake functions, the pipeline remains easy to understand, debug, and extend.
External models such as Gemini and ElevenLabs are used purely for inference, while Tensorlake manages execution boundaries, retries, and function-level isolation.
The same architectural pattern can be readily extended to other automated content workflows, such as daily news podcasts, research summaries, or technical briefings.

If you want to build similar execution-heavy AI workflows without managing infrastructure, explore what Tensorlake offers for running tools, preparing data, and scaling agent-style applications.

Start with the Tensorlake Applications Quickstart, experiment with the cookbooks, and see how far you can take this pattern with your own use cases.

