🧩 Tutorial: Structured Output & Multi-step Chains with HuggingFace + OpenAI (LangChain)
In this tutorial you'll learn:

- How to get JSON output from LLMs with `JsonOutputParser`
- How to get strongly-typed objects using `PydanticOutputParser`
- How to do multi-step prompting (report ➜ summary) with:
  - Manual steps (HuggingFace)
  - A single chain pipeline (OpenAI)
- How to use `StructuredOutputParser` + `ResponseSchema` for custom structured outputs

We'll use:

- `google/gemma-2-2b-it` via `HuggingFaceEndpoint`
- `ChatOpenAI` for OpenAI models

Assumes:

- `.env` contains your keys (HuggingFace, OpenAI)
- `pip install langchain langchain-core langchain-openai langchain-huggingface pydantic python-dotenv`
1. JSON Output Using `JsonOutputParser` (HuggingFace + Gemma)
🎯 Goal
Ask the model for 5 facts about a topic and get them as proper JSON instead of free-text paragraphs.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# Parser that expects JSON
parser = JsonOutputParser()

template = PromptTemplate(
    template='Give me 5 facts about {topic}\n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `JsonOutputParser`:
  - Provides `get_format_instructions()` → instructions like "Return output as a JSON object …"
- `PromptTemplate`:
  - Injects those instructions into the prompt via `{format_instruction}`.
- `chain = template | model | parser`:
  - Builds a mini pipeline:
    1. Format the prompt
    2. Call Gemma
    3. Parse the response as JSON
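Conceptually, the parser's job is to pull the JSON out of the model's raw text and load it. Here is a stdlib-only sketch of that idea (this is not LangChain's actual implementation, and `raw_reply` is a made-up model response):

```python
import json
import re

def parse_json_reply(raw_reply: str):
    """Extract the first JSON value from an LLM reply.

    Models often wrap JSON in fenced code blocks, so strip those first.
    Simplified sketch -- LangChain's JsonOutputParser is far more robust.
    """
    match = re.search(r"```(?:json)?\s*(.*?)```", raw_reply, re.DOTALL)
    payload = match.group(1) if match else raw_reply
    return json.loads(payload)

# Hypothetical model reply, fenced the way chat models often format JSON
raw_reply = '```json\n["Fact one", "Fact two"]\n```'
facts = parse_json_reply(raw_reply)
print(facts)  # a real Python list, not a string
```

This is why the chain's output is directly usable in code: by the time `parse_json_reply` (or the real parser) returns, you are holding native Python objects.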
⏰ When to use this?

- When you need raw JSON, not just text:
  - Lists of facts
  - Attributes of a product
  - Configuration-like data
❓ Why is it useful?

- Easier to use in code:
  - You get a Python dict/list directly.
  - No need to manually `json.loads()` loosely formatted text.
🧾 Example result (shape, not exact):

```json
[
  "Black holes are regions of spacetime where gravity is so strong that nothing can escape.",
  "They are formed from the remnants of massive stars that have collapsed.",
  "The boundary of a black hole is called the event horizon.",
  "Supermassive black holes are found at the center of many galaxies.",
  "Black holes can grow by absorbing matter and merging with other black holes."
]
```
2. Strongly Typed Output with `PydanticOutputParser` (HuggingFace + Gemma)
🎯 Goal
Generate a fictional person (name, age, city) with type constraints.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# Pydantic model for structured output
class Person(BaseModel):
    name: str = Field(description='Name of the person')
    age: int = Field(gt=18, description='Age of the person')
    city: str = Field(description='Name of the city the person belongs to')

parser = PydanticOutputParser(pydantic_object=Person)

template = PromptTemplate(
    template='Generate the name, age and city of a fictional {place} person.\n{format_instruction}',
    input_variables=['place'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
final_result: Person = chain.invoke({'place': 'Sri Lankan'})
print(final_result)
print(final_result.name, final_result.age, final_result.city)
```
🔍 What's happening?

- `Person(BaseModel)` defines the schema:
  - `age: int` with `gt=18` (must be > 18).
- `PydanticOutputParser`:
  - Generates detailed format instructions (JSON format, field names/types).
  - Parses the model output into a `Person` instance.
- You get runtime validation:
  - If the model returns invalid data, Pydantic raises an error.
⏰ When to use this?

- In backend services where:
  - The LLM result feeds directly into a DB or API response
  - You need guaranteed fields & types
❓ Why is this better than plain JSON?

- You get:
  - Validation
  - IntelliSense / autocomplete
  - Clean, typed Python objects
🧾 Example result (approx):

```
name='Kamal Perera' age=29 city='Colombo'
```

You can also call `final_result.model_dump()` to get a dict.
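To see why the `gt=18` constraint matters, here is a stdlib-only sketch of the kind of check Pydantic runs at parse time (the data is hypothetical; real Pydantic raises a `ValidationError` with much richer detail):

```python
def validate_person(data: dict) -> dict:
    """Roughly what the Person schema enforces: field types plus age > 18.

    Pydantic does this automatically; this sketch only illustrates the idea.
    """
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int) or data["age"] <= 18:
        raise ValueError("age must be an int greater than 18")
    if not isinstance(data.get("city"), str):
        raise ValueError("city must be a string")
    return data

ok = validate_person({"name": "Kamal Perera", "age": 29, "city": "Colombo"})

try:
    validate_person({"name": "Too Young", "age": 12, "city": "Galle"})
except ValueError as err:
    print(err)  # the bad age is rejected before it reaches your code
```

The key point: invalid model output fails loudly at the parser boundary instead of silently corrupting downstream logic.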
3. Two-Step Prompting (Report ➜ Summary) with HuggingFace (Manual Steps)
🎯 Goal

- Generate a detailed report on a topic
- Then generate a 5-line summary of that report

💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

# Step 1: create and send first prompt
prompt1 = template1.invoke({'topic': 'black hole'})
result = model.invoke(prompt1)

# Step 2: create summary prompt using result.content
prompt2 = template2.invoke({'text': result.content})
result1 = model.invoke(prompt2)

print(result1.content)
```
🔍 What's happening?

- First prompt: `"Write a detailed report on black hole"` → Gemma → long explanation.
- Second prompt: `"Write a 5 line summary on the following text:\n<report>"` → Gemma → short summary.

You're manually:

- Building prompts via `template.invoke`
- Calling `model.invoke` for each step
⏰ When to use this manual style?

- When experimenting, learning, or debugging each step.
- When you want to inspect intermediate results (e.g. the full report) separately.
❓ Why is it useful?

- Shows clearly how multi-step reasoning works.
- You can log intermediate outputs, save them, etc.
4. Same Multi-Step Prompting as a Single Chain (OpenAI + `StrOutputParser`)
Now do the same, but as a single pipeline using the `|` operator.
💻 Code
```python
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

model = ChatOpenAI()

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

parser = StrOutputParser()

# Build the chain:
# topic → template1 → model → string → template2 → model → string
chain = template1 | model | parser | template2 | model | parser

result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `template1` receives `{'topic': 'black hole'}` and formats the prompt.
- `model` generates a report.
- `parser` converts that to a plain string.
- `template2` uses that string as `{text}`.
- `model` generates the summary.
- The last `parser` returns the summary as a string.

All in one line:

`template1 | model | parser | template2 | model | parser`
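Under the hood, the `|` operator is essentially left-to-right function composition. A plain-Python sketch of the same data flow, with stub functions standing in for the real template, model, and parser (all names here are made up for illustration):

```python
from functools import reduce

def pipe(*steps):
    """Compose callables left to right -- a simplified stand-in for LCEL's | operator."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)

# Stubs playing the roles of template1 | model | parser | template2 | model | parser
def fill_report_prompt(inputs):
    return "Write a detailed report on " + inputs["topic"]

def fake_model(prompt):
    return "<model reply to: " + prompt + ">"

def to_string(reply):
    return str(reply)

def fill_summary_prompt(text):
    return "Write a 5 line summary on the following text:\n" + text

chain = pipe(fill_report_prompt, fake_model, to_string,
             fill_summary_prompt, fake_model, to_string)

print(chain({"topic": "black hole"}))
```

Each step's output becomes the next step's input, which is exactly why the report text can flow straight into the summary prompt without any manual plumbing.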
⏰ When to use a chain like this?

- When you don't need to inspect intermediate results.
- When building reusable components (e.g. a "report + summary" service).
❓ Why is this pattern nice?

- Clear, declarative pipeline.
- Easy to reuse in apps, Streamlit, FastAPI, etc.
5. Structured Output With `StructuredOutputParser` + `ResponseSchema` (HuggingFace)
This is another way to define structure (an older but still useful API).
🎯 Goal
Ask for 3 facts about a topic, each in its own named field.
💻 Code
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

load_dotenv()

# HuggingFace model
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)
model = ChatHuggingFace(llm=llm)

schema = [
    ResponseSchema(name='fact_1', description='Fact 1 about the topic'),
    ResponseSchema(name='fact_2', description='Fact 2 about the topic'),
    ResponseSchema(name='fact_3', description='Fact 3 about the topic'),
]

parser = StructuredOutputParser.from_response_schemas(schema)

template = PromptTemplate(
    template='Give 3 facts about {topic}.\n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser
result = chain.invoke({'topic': 'black hole'})
print(result)
```
🔍 What's happening?

- `ResponseSchema` defines named fields: `fact_1`, `fact_2`, `fact_3`.
- `StructuredOutputParser`:
  - Generates instructions telling the LLM how to format its output.
  - Parses the result into a dict like:

```python
{
    'fact_1': '...',
    'fact_2': '...',
    'fact_3': '...',
}
```
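The format instructions ask the model for a fenced JSON snippet containing exactly those keys. A stdlib-only sketch of the parsing half (the reply text is hypothetical, and the real `StructuredOutputParser` is more forgiving about formatting):

```python
import json
import re

EXPECTED_KEYS = {"fact_1", "fact_2", "fact_3"}

def parse_structured_reply(raw_reply: str) -> dict:
    """Pull the fenced JSON block out of a reply and check the expected keys."""
    match = re.search(r"```json\s*(.*?)```", raw_reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced JSON block found in the reply")
    data = json.loads(match.group(1))
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError("missing keys: " + ", ".join(sorted(missing)))
    return data

# Hypothetical model reply, in the shape the format instructions request
raw_reply = ('Here you go:\n```json\n'
             '{"fact_1": "Nothing escapes the event horizon.", '
             '"fact_2": "They form from collapsed massive stars.", '
             '"fact_3": "Supermassive ones sit in galactic centers."}\n```')

print(parse_structured_reply(raw_reply))
```

If a key is missing or the JSON is malformed, the error surfaces at parse time rather than later, mirroring what the real parser gives you.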
⏰ When to use this?

- When you want specific, named outputs that aren't naturally modeled as a list.
- Good for:
  - Sections like `intro`, `body`, `conclusion`
  - `title`, `description`, `keywords`
❓ Pydantic vs StructuredOutputParser?

- Pydantic:
  - Strong typing, validation, `BaseModel` classes.
- `StructuredOutputParser`:
  - Quick way to define field names + descriptions.
  - No Pydantic dependency; gives you a dict.
🧠 Big Picture – Which Parser to Use?

| Parser | Output Type | Best For |
|---|---|---|
| `JsonOutputParser` | dict / list | Generic JSON structures without strong typing |
| `PydanticOutputParser` | Pydantic model | Validated, typed data for backend logic |
| `StructuredOutputParser` | dict (fixed keys) | Simple named fields, quick schema |
| `StrOutputParser` | plain string | Normal text generation / single-step content |