🧩 Tutorial: Structured Output & Multi-step Chains with HuggingFace + OpenAI (LangChain)
In this tutorial you'll learn:

- How to get JSON output from LLMs with `JsonOutputParser`
- How to get strongly-typed objects using `PydanticOutputParser`
- How to do multi-step prompting (report ➜ summary) with:
  - Manual steps (HuggingFace)
  - A single chain pipeline (OpenAI)
- How to use `StructuredOutputParser` + `ResponseSchema` for custom structured outputs

We'll use:

- `google/gemma-2-2b-it` via `HuggingFaceEndpoint`
- `ChatOpenAI` for OpenAI models

Assumes:

- `.env` contains your keys (HuggingFace, OpenAI)
- `pip install langchain langchain-core langchain-openai langchain-huggingface pydantic python-dotenv`
1. JSON Output Using `JsonOutputParser` (HuggingFace + Gemma)
🎯 Goal
Ask the model for 5 facts about a topic and get them as proper JSON instead of free-text paragraphs.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# Parser that expects JSON
parser = JsonOutputParser()

template = PromptTemplate(
    template='Give me 5 facts about {topic}\n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `JsonOutputParser`:
  - Provides `get_format_instructions()` → instructions like "Return output as a JSON object …"
- `PromptTemplate`:
  - Injects those instructions into the prompt via `{format_instruction}`.
- `chain = template | model | parser`:
  - Builds a mini pipeline:
    1. Format the prompt
    2. Call Gemma
    3. Parse the response as JSON
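Conceptually, the parser's job is to pull the JSON out of the model's raw text and load it. Here is a stdlib-only sketch of that idea (this is not LangChain's actual implementation, and `raw_reply` is a made-up model response):

```python
import json
import re

def parse_json_reply(raw_reply: str):
    """Extract the first JSON value from an LLM reply.

    Models often wrap JSON in fenced code blocks, so strip those first.
    Simplified sketch -- LangChain's JsonOutputParser is far more robust.
    """
    match = re.search(r"```(?:json)?\s*(.*?)```", raw_reply, re.DOTALL)
    payload = match.group(1) if match else raw_reply
    return json.loads(payload)

# Hypothetical model reply, fenced the way chat models often format JSON
raw_reply = '```json\n["Fact one", "Fact two"]\n```'
facts = parse_json_reply(raw_reply)
print(facts)  # a real Python list, not a string
```

This is why the chain's output is directly usable in code: by the time `parse_json_reply` (or the real parser) returns, you are holding native Python objects.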
⏰ When to use this?

- When you need raw JSON, not just text:
  - Lists of facts
  - Attributes of a product
  - Configuration-like data
❓ Why is it useful?

- Easier to use in code:
  - You get a Python dict/list directly.
  - No need to manually `json.loads()` loosely formatted text.
🧾 Example result (shape, not exact):

```json
[
  "Black holes are regions of spacetime where gravity is so strong that nothing can escape.",
  "They are formed from the remnants of massive stars that have collapsed.",
  "The boundary of a black hole is called the event horizon.",
  "Supermassive black holes are found at the center of many galaxies.",
  "Black holes can grow by absorbing matter and merging with other black holes."
]
```
2. Strongly Typed Output with `PydanticOutputParser` (HuggingFace + Gemma)
🎯 Goal
Generate a fictional person (name, age, city) with type constraints.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# Pydantic model for structured output
class Person(BaseModel):
    name: str = Field(description='Name of the person')
    age: int = Field(gt=18, description='Age of the person')
    city: str = Field(description='Name of the city the person belongs to')

parser = PydanticOutputParser(pydantic_object=Person)

template = PromptTemplate(
    template='Generate the name, age and city of a fictional {place} person.\n{format_instruction}',
    input_variables=['place'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
final_result: Person = chain.invoke({'place': 'Sri Lankan'})
print(final_result)
print(final_result.name, final_result.age, final_result.city)
```
🔍 What's happening?

- `Person(BaseModel)` defines the schema:
  - `age: int` with `gt=18` (must be > 18).
- `PydanticOutputParser`:
  - Generates detailed format instructions (JSON format, field names/types).
  - Parses the model output into a `Person` instance.
- You get runtime validation:
  - If the model returns invalid data, Pydantic raises an error.
⏰ When to use this?

- In backend services where:
  - The LLM result feeds directly into a DB or API response
  - You need guaranteed fields & types
❓ Why is this better than plain JSON?

- You get:
  - Validation
  - IntelliSense / autocomplete
  - Clean, typed Python objects
🧾 Example result (approx):

```
name='Kamal Perera' age=29 city='Colombo'
```

You can also call `final_result.model_dump()` to get a dict.
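To see why the `gt=18` constraint matters, here is a stdlib-only sketch of the kind of check Pydantic runs at parse time (the data is hypothetical; real Pydantic raises a `ValidationError` with much richer detail):

```python
def validate_person(data: dict) -> dict:
    """Roughly what the Person schema enforces: field types plus age > 18.

    Pydantic does this automatically; this sketch only illustrates the idea.
    """
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int) or data["age"] <= 18:
        raise ValueError("age must be an int greater than 18")
    if not isinstance(data.get("city"), str):
        raise ValueError("city must be a string")
    return data

ok = validate_person({"name": "Kamal Perera", "age": 29, "city": "Colombo"})

try:
    validate_person({"name": "Too Young", "age": 12, "city": "Galle"})
except ValueError as err:
    print(err)  # the bad age is rejected before it reaches your code
```

The key point: invalid model output fails loudly at the parser boundary instead of silently corrupting downstream logic.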
3. Two-Step Prompting (Report ➜ Summary) with HuggingFace (Manual Steps)
🎯 Goal

- Generate a detailed report on a topic
- Then generate a 5-line summary of that report

💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

# Step 1: create and send first prompt
prompt1 = template1.invoke({'topic': 'black hole'})
result = model.invoke(prompt1)

# Step 2: create summary prompt using result.content
prompt2 = template2.invoke({'text': result.content})
result1 = model.invoke(prompt2)

print(result1.content)
```
🔍 What's happening?

- First prompt: `"Write a detailed report on black hole"` → Gemma → long explanation.
- Second prompt: `"Write a 5 line summary on the following text:\n<report>"` → Gemma → short summary.

You're manually:

- Building prompts via `template.invoke`
- Calling `model.invoke` for each step
⏰ When to use this manual style?

- When experimenting, learning, or debugging each step.
- When you want to inspect intermediate results (e.g. the full report) separately.
❓ Why is it useful?

- Shows clearly how multi-step reasoning works.
- You can log intermediate outputs, save them, etc.
4. Same Multi-Step Prompting as a Single Chain (OpenAI + `StrOutputParser`)
Now do the same, but as a single pipeline using the `|` operator.
💻 Code
```python
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

model = ChatOpenAI()

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

parser = StrOutputParser()

# Build the chain:
# topic → template1 → model → string → template2 → model → string
chain = template1 | model | parser | template2 | model | parser

result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `template1` receives `{'topic': 'black hole'}` and formats the prompt.
- `model` generates a report.
- `parser` converts that to a plain string.
- `template2` uses that string as `{text}`.
- `model` generates the summary.
- The last `parser` returns the summary as a string.

All in one line:

`template1 | model | parser | template2 | model | parser`
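Under the hood, the `|` operator is essentially left-to-right function composition. A plain-Python sketch of the same data flow, with stub functions standing in for the real template, model, and parser (all names here are made up for illustration):

```python
from functools import reduce

def pipe(*steps):
    """Compose callables left to right -- a simplified stand-in for LCEL's | operator."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)

# Stubs playing the roles of template1 | model | parser | template2 | model | parser
def fill_report_prompt(inputs):
    return "Write a detailed report on " + inputs["topic"]

def fake_model(prompt):
    return "<model reply to: " + prompt + ">"

def to_string(reply):
    return str(reply)

def fill_summary_prompt(text):
    return "Write a 5 line summary on the following text:\n" + text

chain = pipe(fill_report_prompt, fake_model, to_string,
             fill_summary_prompt, fake_model, to_string)

print(chain({"topic": "black hole"}))
```

Each step's output becomes the next step's input, which is exactly why the report text can flow straight into the summary prompt without any manual plumbing.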
⏰ When to use a chain like this?

- When you don't need to inspect intermediate results.
- When building reusable components (e.g. a "report + summary" service).
❓ Why is this pattern nice?

- Clear, declarative pipeline.
- Easy to reuse in apps, Streamlit, FastAPI, etc.
5. Structured Output With `StructuredOutputParser` + `ResponseSchema` (HuggingFace)
This is another way to define structure (an older but still useful API).
🎯 Goal
Ask for 3 facts about a topic, each in its own named field.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

load_dotenv()

# HuggingFace model
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

schema = [
    ResponseSchema(name='fact_1', description='Fact 1 about the topic'),
    ResponseSchema(name='fact_2', description='Fact 2 about the topic'),
    ResponseSchema(name='fact_3', description='Fact 3 about the topic'),
]

parser = StructuredOutputParser.from_response_schemas(schema)

template = PromptTemplate(
    template='Give 3 facts about {topic}.\n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `ResponseSchema` defines named fields: `fact_1`, `fact_2`, `fact_3`.
- `StructuredOutputParser`:
  - Generates instructions telling the LLM how to format its output.
  - Parses the result into a dict like:

```python
{
    'fact_1': '...',
    'fact_2': '...',
    'fact_3': '...',
}
```
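The format instructions ask the model for a fenced JSON snippet containing exactly those keys. A stdlib-only sketch of the parsing half (the reply text is hypothetical, and the real `StructuredOutputParser` is more forgiving about formatting):

```python
import json
import re

EXPECTED_KEYS = {"fact_1", "fact_2", "fact_3"}

def parse_structured_reply(raw_reply: str) -> dict:
    """Pull the fenced JSON block out of a reply and check the expected keys."""
    match = re.search(r"```json\s*(.*?)```", raw_reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced JSON block found in the reply")
    data = json.loads(match.group(1))
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError("missing keys: " + ", ".join(sorted(missing)))
    return data

# Hypothetical model reply, in the shape the format instructions request
raw_reply = ('Here you go:\n```json\n'
             '{"fact_1": "Nothing escapes the event horizon.", '
             '"fact_2": "They form from collapsed massive stars.", '
             '"fact_3": "Supermassive ones sit in galactic centers."}\n```')

print(parse_structured_reply(raw_reply))
```

If a key is missing or the JSON is malformed, the error surfaces at parse time rather than later, mirroring what the real parser gives you.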
⏰ When to use this?

- When you want specific, named outputs that aren't naturally modeled as a list.
- Good for:
  - Sections like `intro`, `body`, `conclusion`
  - `title`, `description`, `keywords`
❓ Pydantic vs StructuredOutputParser?

- Pydantic:
  - Strong typing, validation, `BaseModel` classes.
- `StructuredOutputParser`:
  - Quick way to define field names + descriptions.
  - No Pydantic dependency; gives you a dict.
🧠 Big Picture – Which Parser to Use?

| Parser | Output Type | Best For |
|---|---|---|
| `JsonOutputParser` | dict / list | Generic JSON structures without strong typing |
| `PydanticOutputParser` | Pydantic model | Validated, typed data for backend logic |
| `StructuredOutputParser` | dict (fixed keys) | Simple named fields, quick schema |
| `StrOutputParser` | plain string | Normal text generation / single-step content |