Friday, November 28, 2025

Structured Output & Multi-step Chains with HuggingFace + OpenAI



🧩 Tutorial: Structured Output & Multi-step Chains with HuggingFace + OpenAI (LangChain)

In this tutorial you’ll learn:

  • How to get JSON output from LLMs with JsonOutputParser

  • How to get strongly-typed objects using PydanticOutputParser

  • How to do multi-step prompting (report ➜ summary) with:

    • Manual steps (HuggingFace)

    • A single chain pipeline (OpenAI)

  • How to use StructuredOutputParser + ResponseSchema for custom structured outputs

We’ll use:

  • google/gemma-2-2b-it via HuggingFaceEndpoint

  • ChatOpenAI for OpenAI models

Assumes:

  • .env contains your keys (HuggingFace, OpenAI)

  • pip install langchain langchain-core langchain-openai langchain-huggingface pydantic python-dotenv


1. JSON Output Using JsonOutputParser (HuggingFace + Gemma)

🎯 Goal

Ask the model for 5 facts about a topic and get them as proper JSON instead of free-text paragraphs.

💻 Code

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

# Parser that expects JSON
parser = JsonOutputParser()

template = PromptTemplate(
    template='Give me 5 facts about {topic} \n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser

result = chain.invoke({'topic': 'black hole'})

print(result)

๐Ÿ” What’s happening?

  • JsonOutputParser:

    • Provides get_format_instructions() → instructions like:
      “Return output as a JSON object …”

  • PromptTemplate:

    • Adds these instructions into the prompt via {format_instruction}.

  • chain = template | model | parser:

    • Builds a mini pipeline:

      1. Format the prompt

      2. Call Gemma

      3. Parse response as JSON

⏰ When to use this?

  • When you need raw JSON, not just text:

    • Lists of facts

    • Attributes of a product

    • Configuration-like data

❓ Why is it useful?

  • Easier to use in code:

    • You get a Python dict/list directly.

    • No need to manually json.loads() loosely formatted text.

🧾 Example result (shape, not exact):

[
  "Black holes are regions of spacetime where gravity is so strong that nothing can escape.",
  "They are formed from the remnants of massive stars that have collapsed.",
  "The boundary of a black hole is called the event horizon.",
  "Supermassive black holes are found at the center of many galaxies.",
  "Black holes can grow by absorbing matter and merging with other black holes."
]

2. Strongly Typed Output with PydanticOutputParser (HuggingFace + Gemma)

🎯 Goal

Generate a fictional person (name, age, city) with type constraints.

💻 Code

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

load_dotenv()

# HuggingFace LLM (Gemma)
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

# Pydantic model for structured output
class Person(BaseModel):
    name: str = Field(description='Name of the person')
    age: int = Field(gt=18, description='Age of the person')
    city: str = Field(description='Name of the city the person belongs to')

parser = PydanticOutputParser(pydantic_object=Person)

template = PromptTemplate(
    template='Generate the name, age and city of a fictional {place} person.\n{format_instruction}',
    input_variables=['place'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser

final_result: Person = chain.invoke({'place': 'Sri Lankan'})

print(final_result)
print(final_result.name, final_result.age, final_result.city)

๐Ÿ” What’s happening?

  • Person(BaseModel) defines the schema:

    • age: int with gt=18 (must be > 18).

  • PydanticOutputParser:

    • Generates detailed format instructions (JSON format, field names/types).

    • Parses model output into a Person instance.

  • You get runtime validation:

    • If the model returns invalid data, Pydantic will raise an error.
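You can see the same failure mode with plain Pydantic (a standalone sketch; inside the chain, the parser surfaces this as a parsing/validation error):

```python
from pydantic import BaseModel, Field, ValidationError

class Person(BaseModel):
    name: str = Field(description='Name of the person')
    age: int = Field(gt=18, description='Age of the person')
    city: str = Field(description='Name of the city the person belongs to')

# age=12 violates gt=18, so Pydantic rejects the data
try:
    Person(name='Kamal Perera', age=12, city='Colombo')
except ValidationError as e:
    print('rejected:', e.errors()[0]['loc'])  # rejected: ('age',)
```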

⏰ When to use this?

  • In backend services where:

    • LLM result feeds directly into DB or API response

    • You need guaranteed fields & types

❓ Why is this better than plain JSON?

  • You get:

    • Validation

    • Intellisense / autocomplete

    • Clean, typed Python objects

🧾 Example result (approx):

name='Kamal Perera' age=29 city='Colombo'

You can also do:

print(final_result.model_dump())

to get a plain dict.


3. Two-Step Prompting (Report ➜ Summary) with HuggingFace (Manual Steps)

🎯 Goal

  1. Generate a detailed report on a topic

  2. Then generate a 5-line summary of that report

💻 Code

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

# Step 1: create and send first prompt
prompt1 = template1.invoke({'topic': 'black hole'})
result = model.invoke(prompt1)

# Step 2: create summary prompt using result.content
prompt2 = template2.invoke({'text': result.content})
result1 = model.invoke(prompt2)

print(result1.content)


๐Ÿ” What’s happening?

  1. First prompt:

    • "Write a detailed report on black hole" → Gemma → long explanation.

  2. Second prompt:

    • "Write a 5 line summary on the following text:\n<report>" → Gemma → short summary.

You’re manually:

  • Building prompts via template.invoke

  • Calling model.invoke for each step

⏰ When to use this manual style?

  • When experimenting, learning, or debugging each step.

  • When you want to inspect intermediate results (e.g. the full report) separately.

❓ Why is it useful?

  • Shows clearly how multi-step reasoning works.

  • You can log intermediate outputs, save them, etc.


4. Same Multi-Step Prompting as a Single Chain (OpenAI + StrOutputParser)

Now do the same, but as a single pipeline using the | operator.

💻 Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

model = ChatOpenAI()

# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text:\n{text}',
    input_variables=['text']
)

parser = StrOutputParser()

# Build the chain:
# topic → template1 → model → string → template2 → model → string
chain = template1 | model | parser | template2 | model | parser

result = chain.invoke({'topic': 'black hole'})

print(result)

๐Ÿ” What’s happening?

  • template1 receives {'topic': 'black hole'} and formats prompt.

  • model generates a report.

  • parser converts that to a plain string.

  • template2 uses that string as {text}.

  • model generates the summary.

  • last parser returns the summary as a string.

All in one line:

template1 | model | parser | template2 | model | parser
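The pipe is effectively left-to-right function composition. A plain-Python sketch with toy stand-ins (illustrative only, not LangChain internals):

```python
# toy stand-ins for the real chain stages
def make_report_prompt(inputs):          # dict -> prompt string
    return f"Write a detailed report on {inputs['topic']}"

def fake_model(prompt):                  # prompt -> pretend LLM reply
    return f"[model output for: {prompt}]"

def make_summary_prompt(text):           # report text -> summary prompt
    return f"Write a 5 line summary on the following text:\n{text}"

def chain(inputs):
    # template1 | model | template2 | model, written as nested calls
    return fake_model(make_summary_prompt(fake_model(make_report_prompt(inputs))))

print(chain({'topic': 'black hole'}))
```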

⏰ When to use a chain like this?

  • When you don’t need to inspect intermediate results.

  • When building reusable components (e.g. a “report+summary” service).

❓ Why is this pattern nice?

  • Clear, declarative pipeline.

  • Easy to reuse in apps, Streamlit, FastAPI, etc.


5. Structured Output With StructuredOutputParser + ResponseSchema (HuggingFace)

This is another way to define structure (older but still useful API).

🎯 Goal

Ask for 3 facts about a topic, each in its own named field.

💻 Code

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

load_dotenv()

# HuggingFace model
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

schema = [
    ResponseSchema(name='fact_1', description='Fact 1 about the topic'),
    ResponseSchema(name='fact_2', description='Fact 2 about the topic'),
    ResponseSchema(name='fact_3', description='Fact 3 about the topic'),
]

parser = StructuredOutputParser.from_response_schemas(schema)

template = PromptTemplate(
    template='Give 3 facts about {topic}.\n{format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)

chain = template | model | parser

result = chain.invoke({'topic': 'black hole'})

print(result)

๐Ÿ” What’s happening?

  • ResponseSchema defines named fields:

    • fact_1, fact_2, fact_3

  • StructuredOutputParser:

    • Generates instructions telling the LLM how to format output.

    • Parses the result into a dict like:

{
  'fact_1': '...',
  'fact_2': '...',
  'fact_3': '...',
}

⏰ When to use this?

  • When you want specific, named outputs that aren’t naturally modeled as a list.

  • Good for:

    • sections like intro, body, conclusion

    • title, description, keywords

❓ Pydantic vs StructuredOutputParser?

  • Pydantic:

    • Strong typing, validation, BaseModel classes.

  • StructuredOutputParser:

    • Quick way to define field names + descriptions.

    • No Pydantic dependency; gives you a dict.


🧠 Big Picture – Which Parser to Use?

Parser → Output Type → Best For

  • JsonOutputParser → dict / list → Generic JSON structures without strong typing
  • PydanticOutputParser → Pydantic model → Validated, typed data for backend logic
  • StructuredOutputParser → dict (fixed keys) → Simple named fields, quick schema
  • StrOutputParser → plain string → Normal text generation / single-step content


Structured Data, Pydantic & LangChain Structured Outputs

🧩 Tutorial: Structured Data, Pydantic & LangChain Structured Outputs

In this tutorial, you’ll learn:

  • What JSON Schema is and how it relates to models

  • How to define data models using Pydantic BaseModel

  • How to use TypedDict for typed dictionaries

  • How to get structured JSON output from LLMs using:

    • Raw JSON schema (Python dict)

    • Pydantic BaseModel

    • TypedDict + Annotated

  • How to use structured output with:

    • ChatOpenAI

    • ChatHuggingFace (TinyLlama endpoint)


1. JSON Schema Basics (Concept Level)

You started with:

{
    "title": "student",
    "description": "schema about students",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name"]
}

๐Ÿ” What is this?

This is a JSON Schema:

  • type: "object" → it describes an object

  • properties:

    • name: string

    • age: integer

  • required: ["name"] → name must be present; age is optional.

🕒 When is JSON Schema used?

  • To validate JSON payloads (APIs, configurations).

  • To describe the structure of data for tools and LLMs.

  • As a contract between systems (backend ↔ frontend, services).
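To see the "contract" idea concretely, here is a tiny hand-rolled validator for the schema above (stdlib only; real projects typically use a library such as jsonschema):

```python
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name"]
}

PY_TYPES = {"string": str, "integer": int}

def check(payload, schema):
    # collect human-readable violations instead of raising
    errors = [f"missing required field: {key}"
              for key in schema["required"] if key not in payload]
    for key, spec in schema["properties"].items():
        if key in payload and not isinstance(payload[key], PY_TYPES[spec["type"]]):
            errors.append(f"{key} should be a {spec['type']}")
    return errors

print(check({"name": "nitish", "age": 35}, schema))  # []
print(check({"age": "35"}, schema))                  # two violations
```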

❓ Why do we care here?

  • LLMs can be guided to output JSON matching a schema.

  • Libraries like LangChain and Pydantic internally map to/from JSON schema.


2. Pydantic BaseModel – Strongly Typed Student Model

💻 Code

from pydantic import BaseModel, EmailStr, Field
from typing import Optional

class Student(BaseModel):
    name: str = 'nitish'
    age: Optional[int] = None
    email: EmailStr
    cgpa: float = Field(
        gt=0,
        lt=10,
        default=5,
        description='A decimal value representing the cgpa of the student'
    )

new_student = {'age': '32', 'email': 'abc@gmail.com'}

student = Student(**new_student)

student_dict = student.model_dump()  # model -> plain dict

print(student_dict['age'])

student_json = student.model_dump_json()  # model -> JSON string

๐Ÿ” What is happening?

  • Student is a Pydantic model with:

    • name: str → default "nitish"

    • age: Optional[int] → may be None, but here "32" will be converted to 32

    • email: EmailStr → validates proper email format

    • cgpa: float with:

      • gt=0, lt=10 → must be between 0 and 10

      • default 5

      • helpful description

  • Student(**new_student):

    • age is "32" (string) → Pydantic converts to 32 (int)

    • email is validated

    • name uses default "nitish"

    • cgpa uses default 5.0

  • model_dump_json() creates JSON string like:

    {"name": "nitish", "age": 32, "email": "abc@gmail.com", "cgpa": 5.0}
    

🕒 When to use BaseModel?

  • Whenever you want:

    • Validation of input data

    • Type safety inside Python

    • JSON <-> Python object conversion
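The JSON ↔ Python round trip looks like this with a slimmed-down model (no email field, so the sketch stays dependency-free):

```python
from pydantic import BaseModel

class Student(BaseModel):
    name: str
    age: int

s = Student(name='nitish', age=32)

json_str = s.model_dump_json()                    # Python object -> JSON string
restored = Student.model_validate_json(json_str)  # JSON string -> validated object

print(json_str)
print(restored == s)  # True
```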

❓ Why is this important for LLM work?

  • LLMs output free-form text by default.

  • With Pydantic + LangChain, you can ask them to output structured, valid data that matches your model.


3. TypedDict – Typed Dictionaries Without Validation

💻 Code

from typing import TypedDict

class Person(TypedDict):
    name: str
    age: int

new_person: Person = {'name': 'nitish', 'age': 35}

print(new_person)

๐Ÿ” What is TypedDict?

  • A way to define the expected shape of a dictionary for type checkers.

  • No runtime validation like Pydantic; just static typing help.
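A quick demonstration of that difference — a wrong type sails through at runtime and would only be caught by mypy or another static checker:

```python
from typing import TypedDict

class Person(TypedDict):
    name: str
    age: int

# a static type checker flags the str age; plain Python accepts it
bad_person: Person = {'name': 'nitish', 'age': 'thirty-five'}  # type: ignore

print(type(bad_person['age']).__name__)  # str
```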

🕒 When to use TypedDict?

  • When you:

    • Want type hints

    • Don’t need heavy validation

    • Prefer lightweight types (no Pydantic overhead)

❓ Why is this relevant?

  • LangChain’s with_structured_output can work with:

    • JSON Schema dict

    • Pydantic BaseModel

    • TypedDict (+ Annotated descriptions)

This gives you flexibility depending on your style.


4. Structured Output from ChatOpenAI Using Raw JSON Schema

Here you passed a JSON schema dict to with_structured_output.

💻 Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv()

model = ChatOpenAI()

# schema
json_schema = {
  "title": "Review",
  "type": "object",
  "properties": {
    "key_themes": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Write down all the key themes discussed in the review in a list"
    },
    "summary": {
      "type": "string",
      "description": "A brief summary of the review"
    },
    "sentiment": {
      "type": "string",
      "enum": ["pos", "neg"],
      "description": "Return the sentiment of the review: pos (positive) or neg (negative)"
    },
    "pros": {
      "type": ["array", "null"],
      "items": {"type": "string"},
      "description": "Write down all the pros inside a list"
    },
    "cons": {
      "type": ["array", "null"],
      "items": {"type": "string"},
      "description": "Write down all the cons inside a list"
    },
    "name": {
      "type": ["string", "null"],
      "description": "Write the name of the reviewer"
    }
  },
  "required": ["key_themes", "summary", "sentiment"]
}

structured_model = model.with_structured_output(json_schema)

result = structured_model.invoke(
    """I recently upgraded to the Samsung Galaxy S24 Ultra, and I must say, it’s an absolute powerhouse! ... Review by Nitish Singh"""
)

print(result)

๐Ÿ” What is happening?

  • with_structured_output(json_schema):

    • Tells the model: “Your output must follow this JSON schema.”

  • invoke(review_text):

    • LLM reads your review.

    • Extracts:

      • key_themes: list[str]

      • summary: string

      • sentiment: "pos" or "neg"

      • pros: list[str] or null

      • cons: list[str] or null

      • name: string or null

  • result will be a Python dict matching the schema.

🕒 When to use raw JSON schema?

  • When:

    • You’re comfortable with JSON Schema

    • You want full schema control (for tools, OpenAPI, etc.)

    • You’re not tied to Python type system only

❓ Why is this great?

  • You get machine-usable structured data from an LLM in one step.

  • No need to write brittle regex or JSON-parsing hacks.

🧾 Example result (approx):

{
  "key_themes": [
    "Powerful performance",
    "High-quality camera",
    "Long battery life",
    "S-Pen usefulness",
    "Heavy device and size",
    "Bloatware in One UI",
    "High price"
  ],
  "summary": "The reviewer is very impressed with the Galaxy S24 Ultra's performance, camera, and battery life, but dislikes the bulky design, Samsung bloatware, and high price.",
  "sentiment": "pos",
  "pros": [
    "Fast Snapdragon 8 Gen 3 processor",
    "Excellent 200MP camera with great zoom",
    "Strong battery life with fast charging",
    "S-Pen support for notes and sketches"
  ],
  "cons": [
    "Heavy and uncomfortable for one-handed use",
    "Bloatware in One UI",
    "Very expensive price tag"
  ],
  "name": "Nitish Singh"
}

5. Structured Output with Pydantic BaseModel + HuggingFace (TinyLlama)

Now the same idea, but using Pydantic model and HuggingFace LLM.

💻 Code

from dotenv import load_dotenv
from typing import Optional, Literal
from pydantic import BaseModel, Field
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

# schema
class Review(BaseModel):
    key_themes: list[str] = Field(
        description="Write down all the key themes discussed in the review in a list"
    )
    summary: str = Field(description="A brief summary of the review")
    sentiment: Literal["pos", "neg"] = Field(
        description="Return the sentiment of the review: pos (positive) or neg (negative)"
    )
    pros: Optional[list[str]] = Field(
        default=None,
        description="Write down all the pros inside a list"
    )
    cons: Optional[list[str]] = Field(
        default=None,
        description="Write down all the cons inside a list"
    )
    name: Optional[str] = Field(
        default=None,
        description="Write the name of the reviewer"
    )

structured_model = model.with_structured_output(Review)

result = structured_model.invoke(
    """I recently upgraded to the Samsung Galaxy S24 Ultra, and I must say, it’s an absolute powerhouse! ... Review by Nitish Singh"""
)

print(result)

๐Ÿ” What is happening?

  • Review(BaseModel) defines the Python data model.

  • with_structured_output(Review):

    • LangChain internally converts this to JSON schema.

    • Forces the LLM to return data that can be parsed as Review.

  • Result is a Review instance (Pydantic object), not just dict.

So you can do:

print(result.summary)
print(result.sentiment)
print(result.name)

🕒 When to use Pydantic + structured output?

  • When you:

    • Want validation & type hints

    • Work in a Python backend

    • Need to plug result straight into your code

❓ Why is this powerful?

  • End-to-end pipeline:

    • Raw text → LLM → Pydantic model → direct use in database / APIs


6. Structured Output with Pydantic + ChatOpenAI

Same Pydantic Review model, but using OpenAI instead of HuggingFace.

💻 Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from typing import Optional, Literal
from pydantic import BaseModel, Field

load_dotenv()

model = ChatOpenAI()

# schema
class Review(BaseModel):
    key_themes: list[str] = Field(description="Write down all the key themes discussed in the review in a list")
    summary: str = Field(description="A brief summary of the review")
    sentiment: Literal["pos", "neg"] = Field(description="Return the sentiment of the review: pos (positive) or neg (negative)")
    pros: Optional[list[str]] = Field(default=None, description="Write down all the pros inside a list")
    cons: Optional[list[str]] = Field(default=None, description="Write down all the cons inside a list")
    name: Optional[str] = Field(default=None, description="Write the name of the reviewer")

structured_model = model.with_structured_output(Review)

result = structured_model.invoke(
    """I recently upgraded to the Samsung Galaxy S24 Ultra, and I must say, it’s an absolute powerhouse! ... Review by Nitish Singh"""
)

print(result)

๐Ÿ” What’s different?

  • Same idea, different provider.

  • You still get a Review object.

Example usage:

print(result.key_themes)
print(result.pros)
print(result.cons)
print(result.name)

7. Structured Output with TypedDict + Annotated (ChatOpenAI)

Here’s a more lightweight type approach.

💻 Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from typing import TypedDict, Annotated, Optional, Literal

load_dotenv()

model = ChatOpenAI()

# schema
class Review(TypedDict):
    key_themes: Annotated[list[str], "Write down all the key themes discussed in the review in a list"]
    summary: Annotated[str, "A brief summary of the review"]
    sentiment: Annotated[Literal["pos", "neg"], "Return the sentiment of the review: pos (positive) or neg (negative)"]
    pros: Annotated[Optional[list[str]], "Write down all the pros inside a list"]
    cons: Annotated[Optional[list[str]], "Write down all the cons inside a list"]
    name: Annotated[Optional[str], "Write the name of the reviewer"]

structured_model = model.with_structured_output(Review)

result = structured_model.invoke(
    """I recently upgraded to the Samsung Galaxy S24 Ultra, and I must say, it’s an absolute powerhouse! ... Review by Nitish Singh"""
)

print(result['name'])

๐Ÿ” What is happening?

  • Review is a TypedDict with Annotated descriptions.

  • with_structured_output(Review):

    • Uses typing + annotations to build the schema.

  • result is a plain dict, but type checkers know its structure.

Example result:

{
  "key_themes": [...],
  "summary": "...",
  "sentiment": "pos",
  "pros": [...],
  "cons": [...],
  "name": "Nitish Singh"
}

print(result['name'])  # "Nitish Singh"

🕒 When to use this?

  • When you:

    • Want static typing but don’t need Pydantic

    • Prefer minimal dependencies

    • Still want structured outputs from LLM

❓ Why use TypedDict + Annotated?

  • Lightweight

  • Works nicely with mypy / type-checkers

  • You still get descriptions for the LLM to follow
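Those Annotated descriptions are ordinary metadata that can be read back at runtime, which is how libraries can turn them into a schema (a minimal sketch):

```python
from typing import TypedDict, Annotated, get_type_hints

class Review(TypedDict):
    summary: Annotated[str, "A brief summary of the review"]

hints = get_type_hints(Review, include_extras=True)
print(hints['summary'].__metadata__)  # ('A brief summary of the review',)
```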


8. Big Picture: Which Structured Output Style to Use?

Approach → Type → Runtime Validation → Best For

  • Raw JSON Schema (dict) → dict → No (LLM constrained only) → Multi-language / tool-level schema
  • BaseModel (Pydantic) → class → ✅ Yes → Python backends, APIs, DB integration
  • TypedDict + Annotated → dict type → ❌ No → Lightweight typing, fast, simple

9. Why Structured Output Matters for LLM Apps

Without structured output:

  • You get free text → must parse manually

  • More chances of errors (missing fields, invalid JSON, etc.)

With structured output:

  • LLM output → auto-validated object/dict

  • You can directly:

    • Save to DB

    • Return in API

    • Feed into next processing step

This is critical for production-grade AI features where you need reliable data, not just pretty text.




Below is a clean, professional comparison table showing the differences between:

  • JSON Schema (dict)

  • Pydantic BaseModel

  • TypedDict + Annotated

when used with LangChain structured outputs.


📊 Comparison Table — Structured Output Methods in LangChain

Feature / Aspect → JSON Schema (dict) | Pydantic BaseModel | TypedDict + Annotated

  • Definition Type → Python dictionary describing a JSON schema | Python class extending BaseModel | Python TypedDict with Annotated descriptions
  • Runtime Validation → ❌ No (LLM must comply) | ✅ Yes (strict validation by Pydantic) | ❌ No
  • Output Type → dict | Pydantic model instance | dict
  • Error Handling if Output Invalid → ❌ You must manually check | ✅ Pydantic raises validation errors | ❌ No built-in guarantees
  • Best Use Case → Tooling, API schema, cross-language systems | Backend apps needing clean, validated objects | Lightweight typing with minimal overhead
  • Ease of Use → ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐
  • Flexibility / Customization → ⭐⭐⭐⭐⭐ (full JSON schema control) | ⭐⭐⭐⭐ (rich field types) | ⭐⭐⭐ (simple types only)
  • Type Safety → ❌ No | ✅ Strong | ⚠️ Static only (type checkers)
  • Performance → Fast (no validation) | Slightly slower (validation overhead) | Fast (no validation)
  • Works With → All LangChain models | All LangChain models | All LangChain models
  • Ideal For → Multi-language systems, OpenAPI, strict schema control | Python apps, APIs, DB pipelines | Quick typing, simple extraction tasks
  • Description Support → Medium (via description fields) | Strong (via Field(description=...)) | Strong (via Annotated)
  • Nested Complex Structures → ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ (less flexible)
  • Strictness → Low | High | Medium
  • Use in Production → ⚠️ Only if model output is reliable | ✅ Yes, recommended | ⚠️ For simple use cases
  • Requires External Library → ❌ No | ✅ Yes (Pydantic) | ❌ No
  • Automatic JSON Serialization → Manual | Built-in (model_dump_json()) | Manual

🧭 Summary in Simple Words

1. JSON Schema → Specification

  • Best when you need a standard schema

  • Great for cross-language use

  • No validation → LLM must obey

2. Pydantic BaseModel → Strict Validation

  • Ensures correct & clean structured output

  • Perfect for backends, APIs, databases

  • Most reliable for production

3. TypedDict + Annotated → Lightweight

  • No validation, faster

  • Good for simple tasks

  • Best when you want type hints but don’t want heavy models


๐Ÿ… Which One Should YOU Use?

Need Choose
Production app, strict typing Pydantic BaseModel
Tool integration / OpenAPI / external systems JSON Schema
Lightweight & fast TypedDict + Annotated
Most predictable results Pydantic BaseModel


Saturday, November 22, 2025

Text Chunking in LangChain

✂️ Tutorial: Text Chunking in LangChain

(Recursive Splitter, Character Splitter, Language-Aware Splitter, Semantic Chunker)

Chunking is the most important step in building a RAG system.

In this tutorial you will learn:

  • How to chunk PDFs using CharacterTextSplitter

  • How to chunk Markdown using language-aware splitters

  • How to chunk Python code safely

  • How to chunk text semantically using embeddings

  • When to use which splitter and why


🔥 1. Splitting PDFs — CharacterTextSplitter

💻 Code

from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('dl-curriculum.pdf')
docs = loader.load()

splitter = CharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=0,
    separator=''
)

result = splitter.split_documents(docs)

print(result[1].page_content)

๐Ÿ” WHAT is happening?

  • Load a PDF → each page is a Document

  • Break each page into 200-character chunks

  • No overlap between chunks

⏰ WHEN to use this?

  • For simple text (plain text, PDFs)

  • When structure is not important

  • When you want fast and simple chunking

❓ WHY useful?

  • Many LLM pipelines need small chunks for:

    • embeddings

    • vector databases

    • retrieval

🧾 Example Output (approx)

"Deep Learning has become one of the most exciting areas... (partial text)"
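Fixed-size character chunking is simple enough to sketch in plain Python (illustrative only, not LangChain's implementation):

```python
def split_by_characters(text, chunk_size=200, chunk_overlap=0):
    # slide a fixed-size window over the text; overlap repeats the tail
    # of one chunk at the head of the next
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_by_characters('x' * 450, chunk_size=200)
print([len(c) for c in chunks])  # [200, 200, 50]
```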

📘 2. RecursiveCharacterTextSplitter — Smart Chunking

This one tries to split intelligently:

  • First by paragraphs

  • Then by sentences

  • Then by words

  • And falls back safely
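That strategy can be sketched in a few lines (a simplification — the real splitter also merges small pieces back up to chunk_size):

```python
def recursive_split(text, separators=('\n\n', '\n', ' ', ''), chunk_size=50):
    # try the coarsest separator first; recurse with finer separators
    # on any piece that is still too large
    if len(text) <= chunk_size or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    if sep == '':
        # last resort: hard cut by characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    pieces = [p for p in text.split(sep) if p]
    if len(pieces) == 1:
        return recursive_split(text, rest, chunk_size)
    out = []
    for piece in pieces:
        out.extend(recursive_split(piece, separators, chunk_size))
    return out

print(recursive_split('one paragraph\n\nanother much longer paragraph here', chunk_size=20))
```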


2A. Markdown Chunking — Language-aware

💻 Code

from langchain.text_splitter import RecursiveCharacterTextSplitter, Language

text = """
# Project Name: Smart Student Tracker

A simple Python-based project to manage and track student data...
"""

splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.MARKDOWN,
    chunk_size=200,
    chunk_overlap=0,
)

chunks = splitter.split_text(text)

print(len(chunks))
print(chunks[0])

๐Ÿ” WHAT?

  • Understands Markdown structure

  • Keeps headings + sections together

  • Avoids breaking code blocks incorrectly

⏰ WHEN?

  • GitHub READMEs

  • Notes in Markdown

  • Documentation

❓ WHY?

  • Better accuracy for RAG because chunk boundaries follow logical sections.

🧾 Example Output

1
# Project Name: Smart Student Tracker

A simple Python-based project...

2B. Splitting Python Code — Language.PYTHON

💻 Code

from langchain.text_splitter import RecursiveCharacterTextSplitter, Language

text = """
class Student:
    def __init__(self, name, age, grade):
        self.name = name
        self.age = age
        ...
"""

splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON,
    chunk_size=300,
    chunk_overlap=0,
)

chunks = splitter.split_text(text)

print(len(chunks))
print(chunks[1])

๐Ÿ” WHAT?

  • Safely splits Python code

  • Keeps functions, classes, and expressions together

⏰ WHEN?

  • Code RAG

  • AI assistants for programming

  • LLM-based debugging

❓ WHY?

  • Code must not be broken mid-line or mid-block

  • Helps LLM understand context better

🧾 Example Output

    def is_passing(self):
        return self.grade >= 6.0

🧠 3. Semantic Chunking — Using Embeddings

(The smartest chunker)

💻 Code

from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings
from dotenv import load_dotenv

load_dotenv()

text_splitter = SemanticChunker(
    OpenAIEmbeddings(),
    breakpoint_threshold_type="standard_deviation",
    breakpoint_threshold_amount=3
)

sample = """
Farmers were working hard...
The Indian Premier League (IPL) is the biggest cricket league...
Terrorism is a big danger...
"""

docs = text_splitter.create_documents([sample])
print(len(docs))
print(docs)

๐Ÿ” WHAT?

  • Looks at meaning, not characters

  • Uses embeddings → finds topic shifts

  • Creates chunks where semantic changes occur

Example:

  • Farming paragraph → Chunk 1

  • IPL paragraph → Chunk 2

  • Terrorism paragraph → Chunk 3
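The breakpoint idea can be sketched with toy two-dimensional "embeddings": measure the distance between consecutive sentences and split where it spikes (illustrative only; SemanticChunker uses real embeddings and the configured threshold):

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

# toy embeddings for 4 consecutive sentences: a topic shift after sentence 2
emb = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
dists = [cosine_distance(emb[i], emb[i + 1]) for i in range(len(emb) - 1)]

# split where the distance jumps well above the others
split_at = dists.index(max(dists)) + 1
print(split_at)  # 2 -> sentences 0-1 form chunk 1, sentences 2-3 form chunk 2
```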

⏰ WHEN?

  • Long articles

  • Mixed-topic documents

  • Web scraping

  • RAG applications needing high accuracy

❓ WHY?

  • Avoids mixing unrelated topics

  • Helps retrieval return the best possible chunk

🧾 Example Output

3
[Document(page_content='Farmers were working...'), Document(...), Document(...)]

📗 4. Simple Text Splitting with RecursiveCharacterTextSplitter

💻 Code

from langchain.text_splitter import RecursiveCharacterTextSplitter

text = """
Space exploration has led to incredible scientific discoveries...
"""

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=0,
)

chunks = splitter.split_text(text)

print(len(chunks))
print(chunks)

๐Ÿ” WHAT?

  • General-purpose text splitter

  • Tries small separators → large separators → fallback to characters

⏰ WHEN?

  • Normal articles

  • Blogs

  • Wikipedia text

  • Anything non-code, non-Markdown

❓ WHY?

  • Best balance between simplicity and intelligence

  • Most commonly used splitter in RAG systems

🧾 Example Output

1
['Space exploration has led...']

⭐ Which Text Splitter Should You Use?

Splitter → Best For → Why

  • CharacterTextSplitter → PDFs, raw text → Fast but dumb splitting
  • RecursiveCharacterTextSplitter → General-purpose chunking → Most reliable and balanced
  • Language-aware Splitter → Markdown, Python, HTML → Understands syntax & structure
  • SemanticChunker → Mixed-topic large docs → Best RAG retrieval accuracy

🎯 Summary

After this tutorial you can:

  • Split PDFs into pages and chunks

  • Split Markdown and Python safely

  • Use semantic chunking with embeddings

  • Decide which splitter is ideal for your project

Chunking is the backbone of RAG, and now you understand it properly.


LangChain Document Loaders (CSV, PDF, Text, Web) + LLM Processing



๐Ÿ“˜ Tutorial: LangChain Document Loaders (CSV, PDF, Text, Web) + LLM Processing

In this tutorial, you will learn:

  • How to load CSV files

  • How to load PDF files

  • How to load all PDFs from a directory

  • How to load TEXT files

  • How to load webpages (HTML)

  • How to use LLMs to summarize, answer questions, inspect metadata, etc.

You’ll also understand:

  • WHAT each loader does

  • WHEN to use it

  • WHY it’s important


1️⃣ Loading CSV Files — CSVLoader

๐Ÿ’ป Code

from langchain_community.document_loaders import CSVLoader

loader = CSVLoader(file_path='Social_Network_Ads.csv')

docs = loader.load()

print(len(docs))
print(docs[1])

๐Ÿ” WHAT is happening?

  • CSVLoader loads CSV rows as individual Documents

  • Each document has:

    • page_content → the row content

    • metadata → row index & file info

⏰ WHEN to use this?

  • When your data is in tabular form:

    • Sales CSV

    • Ads CSV

    • Training dataset

    • Any spreadsheet exported as CSV

❓ WHY use CSVLoader?

  • Converts structured data into LangChain Document objects

  • Easy to send rows into LLMs for:

    • summarization

    • quality checks

    • classification

    • insights

๐Ÿงพ Example Output (approx.)

403
Document(
  page_content="Age: 40, Salary: 59000, Purchased: No",
  metadata={'source': 'Social_Network_Ads.csv', 'row': 1}
)
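The row-to-Document conversion can be approximated with the standard library. This is a rough sketch of what CSVLoader produces, using plain dicts instead of Document objects and an in-memory CSV (the filename `ads.csv` and the `", "` joiner are assumptions; the real loader joins fields with newlines).

```python
import csv
import io

# Hypothetical in-memory CSV standing in for Social_Network_Ads.csv
raw = "Age,Salary,Purchased\n35,42000,Yes\n40,59000,No\n"

# Roughly what CSVLoader does: one "document" per row, with
# "key: value" pairs as content and the row index as metadata.
docs = []
for i, row in enumerate(csv.DictReader(io.StringIO(raw))):
    content = ", ".join(f"{k}: {v}" for k, v in row.items())
    docs.append({"page_content": content,
                 "metadata": {"source": "ads.csv", "row": i}})

print(len(docs))   # 2
print(docs[1])     # the second row, with row index 1 in its metadata
```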

2️⃣ Loading Multiple PDFs from a Folder — DirectoryLoader

๐Ÿ’ป Code

from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader

loader = DirectoryLoader(
    path='books',
    glob='*.pdf',
    loader_cls=PyPDFLoader
)

docs = loader.lazy_load()

for document in docs:
    print(document.metadata)

๐Ÿ” WHAT is happening?

  • DirectoryLoader scans the folder:

    /books
      ├─ book1.pdf
      ├─ book2.pdf
      └─ book3.pdf
    
  • Loads all PDFs using PyPDFLoader

⏰ WHEN to use?

  • When processing:

    • E-book collections

    • Research papers

    • PDF-based knowledge bases

    • Multiple invoices

❓ WHY useful?

  • Automates reading entire directories

  • Perfect for large-scale document ingestion pipelines

๐Ÿงพ Example Metadata Output

{'source': 'books/book1.pdf', 'page': 0}
{'source': 'books/book1.pdf', 'page': 1}
...
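The directory-scanning half of DirectoryLoader is just glob matching, which can be shown with the standard library alone. This sketch builds a throwaway folder so it is self-contained; the file names are made up for illustration.

```python
from pathlib import Path
import tempfile

# Build a throwaway "books" folder so the sketch is self-contained.
books = Path(tempfile.mkdtemp()) / "books"
books.mkdir()
for name in ("book1.pdf", "book2.pdf", "notes.txt"):
    (books / name).touch()

# The core of what DirectoryLoader does before handing each file to
# loader_cls: match the glob pattern against the directory contents.
pdfs = sorted(p.name for p in books.glob("*.pdf"))
print(pdfs)   # ['book1.pdf', 'book2.pdf'] (notes.txt is filtered out)
```

Each matched path is then passed to the configured `loader_cls` (PyPDFLoader above), which is why only `*.pdf` files end up as Documents.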

3️⃣ Loading a Single PDF — PyPDFLoader

๐Ÿ’ป Code

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('dl-curriculum.pdf')

docs = loader.load()

print(len(docs))
print(docs[0].page_content)
print(docs[1].metadata)

๐Ÿ” WHAT happens?

  • Each page becomes a Document

  • page_content = text from that page

  • metadata = page number, source file

⏰ WHEN to use this?

  • When you want page-level analysis:

    • Summaries per page

    • Extracting answers

    • Finding chapters

❓ WHY use PyPDFLoader?

  • Most PDFs can’t be processed by LLMs directly

  • This loader extracts text safely & accurately

๐Ÿงพ Example Output

36
"Deep Learning Curriculum...(full text)"
{'source': 'dl-curriculum.pdf', 'page': 1}

4️⃣ Loading TXT Files — TextLoader + Summarization

๐Ÿ’ป Code

from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv

load_dotenv()

model = ChatOpenAI()

prompt = PromptTemplate(
    template='Write a summary for the following poem - \n {poem}',
    input_variables=['poem']
)

parser = StrOutputParser()

loader = TextLoader('cricket.txt', encoding='utf-8')

docs = loader.load()

print(type(docs))
print(len(docs))
print(docs[0].page_content)
print(docs[0].metadata)

chain = prompt | model | parser

print(chain.invoke({'poem': docs[0].page_content}))

๐Ÿ” WHAT happens?

  • TextLoader loads the file into a single Document

  • Then we feed the content into an LLM chain for summarization

⏰ WHEN to use?

  • When processing plain-text:

    • poems

    • articles

    • scripts

    • notes

❓ WHY useful?

  • Text files are very common for:

    • datasets

    • chat logs

    • scraped info

๐Ÿงพ Example Output (summary)

The poem celebrates the thrill, passion, and joy of cricket...

5️⃣ Loading Website Content — WebBaseLoader

๐Ÿ’ป Code

from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv

load_dotenv()

model = ChatOpenAI()

prompt = PromptTemplate(
    template='Answer the following question \n {question} from the following text - \n {text}',
    input_variables=['question','text']
)

parser = StrOutputParser()

url = 'https://www.flipkart.com/apple-macbook-air-m2-16-gb-256-gb-ssd-macos-sequoia-mc7x4hn-a/p/itmdc5308fa78421'
loader = WebBaseLoader(url)

docs = loader.load()

chain = prompt | model | parser

print(chain.invoke({'question': 'What is the product we are talking about?', 'text': docs[0].page_content}))

๐Ÿ” WHAT happens?

  • WebBaseLoader scrapes HTML, removes tags, extracts readable text

  • You now have product details as a Document

⏰ WHEN to use?

  • For pulling data from:

    • product pages

    • blogs

    • documentation

    • news articles

❓ WHY useful?

  • You can automatically create:

    • summaries

    • Q&A bots

    • research assistants

    • scraping + LLM analysis pipelines

๐Ÿงพ Example Answer Output

The product discussed is an Apple MacBook Air M2 (16GB | 256GB SSD).
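The "scrape HTML, remove tags, keep readable text" step can be demonstrated with the standard library's `html.parser`. This is a bare-bones sketch of the idea (WebBaseLoader itself fetches the page and typically uses BeautifulSoup); the sample HTML string is made up.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Bare-bones version of what WebBaseLoader does after fetching
    a page: drop the tags, keep the readable text."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())

html = "<html><body><h1>MacBook Air M2</h1><p>16 GB RAM, 256 GB SSD</p></body></html>"
extractor = TextExtractor()
extractor.feed(html)
print(" ".join(extractor.parts))   # MacBook Air M2 16 GB RAM, 256 GB SSD
```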

๐Ÿ“Œ Summary — Which Loader to Use?

Loader          | Best For    | Why?
----------------+-------------+----------------------------------
CSVLoader       | CSV files   | Converts rows → Documents
TextLoader      | TXT files   | Simple & reliable text extraction
PyPDFLoader     | Single PDFs | Page-by-page documents
DirectoryLoader | Many PDFs   | Automated ingestion
WebBaseLoader   | Websites    | Scrapes HTML → Text


Advanced LangChain Runnables – Sequence, Parallel, Branch, Passthrough & Lambda



๐Ÿ”— Tutorial: Advanced LangChain Runnables – Sequence, Parallel, Branch, Passthrough & Lambda

In this tutorial you’ll learn how to:

  • Use RunnableSequence for multi-step flows

  • Use RunnableParallel to get multiple outputs from one run

  • Use RunnablePassthrough to forward inputs inside a chain

  • Use RunnableBranch to add if/else logic

  • Use RunnableLambda to plug in your own Python functions

We’ll use ChatOpenAI everywhere, assuming you already set up:

  • OPENAI_API_KEY in .env

  • pip install langchain langchain-openai python-dotenv


1. Sequential Chain with RunnableSequence – Joke → Explanation

Let’s start with the simplest: do one thing after another.

๐Ÿ’ป Code

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv
from langchain.schema.runnable import RunnableSequence

load_dotenv()

prompt1 = PromptTemplate(
    template='Write a joke about {topic}',
    input_variables=['topic']
)

prompt2 = PromptTemplate(
    template='Explain the following joke - {text}',
    input_variables=['text']
)

model = ChatOpenAI()
parser = StrOutputParser()

chain = RunnableSequence(
    prompt1,      # build joke prompt
    model,        # generate joke
    parser,       # get joke as string
    prompt2,      # build explanation prompt from text
    model,        # generate explanation
    parser        # get explanation as string
)

print(chain.invoke({'topic': 'AI'}))

๐Ÿ” What is happening?

  1. prompt1 + {'topic': 'AI'}"Write a joke about AI".

  2. model → generates the joke.

  3. parser → converts AIMessage → plain string.

  4. prompt2 takes that string as {text}"Explain the following joke - <joke>".

  5. model → explains the joke.

  6. Last parser → returns explanation string.

So the flow is:

input dict → joke prompt → LLM → joke text → explain prompt → LLM → explanation text

๐Ÿ•’ When to use this?

  • When you need multi-step processing:

    • Generate → then explain

    • Analyze → then summarize

    • Extract → then transform

❓ Why is this useful?

  • You build structured flows instead of single prompts.

  • Easy to debug & extend (e.g., add more steps later).

๐Ÿงพ Example output (will vary)

This joke plays on the common fear that AI will take over human jobs. By making AI itself the one "applying for a job," it flips the perspective and makes the situation humorous instead of scary.

2. Conditional Summarization with RunnableBranch + RunnableSequence

If the report is too long, summarize it.
If it’s short, just return it as is.

๐Ÿ’ป Code

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv
from langchain.schema.runnable import (
    RunnableSequence,
    RunnableBranch,
    RunnablePassthrough,
)

load_dotenv()

prompt1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

prompt2 = PromptTemplate(
    template='Summarize the following text \n {text}',
    input_variables=['text']
)

model = ChatOpenAI()
parser = StrOutputParser()

# First step: generate report text (string)
report_gen_chain = prompt1 | model | parser

# Second step: if report too long, summarize. Otherwise, return as-is
branch_chain = RunnableBranch(
    (lambda x: len(x.split()) > 300, prompt2 | model | parser),
    RunnablePassthrough()  # default branch: just pass text through
)

# Final flow: report generation → branching logic
final_chain = RunnableSequence(report_gen_chain, branch_chain)

print(final_chain.invoke({'topic': 'Russia vs Ukraine'}))

๐Ÿ” What is happening?

  • report_gen_chain outputs a string (the report).

  • branch_chain receives that string as x:

    • If len(x.split()) > 300 → run summarization chain.

    • Else → RunnablePassthrough() → return the original report.

๐Ÿ•’ When to use RunnableBranch?

  • When your logic depends on the content:

    • Length of text

    • Sentiment label

    • Category / language

❓ Why is this powerful?

  • You combine:

    • LLMs for content generation

    • Python conditions for routing

  • It’s like if/else inside the LangChain pipeline.

๐Ÿ”Ž Example behavior

  • If model writes a huge report → you’ll see a summary.

  • If it’s somewhat short → you’ll see the original detailed report.
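The routing logic above can be reduced to plain Python to make the branch explicit. In this sketch the `summarize` function is a stand-in for the real `prompt2 | model | parser` chain, so no API calls are needed.

```python
def summarize(text):
    # Stand-in for the prompt2 | model | parser summarization chain:
    # here we just keep the first 10 words as a fake "summary".
    return "SUMMARY: " + " ".join(text.split()[:10]) + "..."

def branch(report):
    # Same condition as the RunnableBranch: summarize only long reports.
    return summarize(report) if len(report.split()) > 300 else report

short_report = "The talks ended without agreement."
long_report = "word " * 400   # 400 words, well over the threshold

print(branch(short_report))      # returned unchanged (the passthrough case)
print(branch(long_report)[:8])   # SUMMARY:
```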


3. Joke + Word Count in Parallel – RunnableParallel + RunnableLambda

Generate a joke and compute its word count at the same time.

๐Ÿ’ป Code

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv
from langchain.schema.runnable import (
    RunnableSequence,
    RunnableLambda,
    RunnablePassthrough,
    RunnableParallel,
)

load_dotenv()

def word_count(text: str) -> int:
    return len(text.split())

prompt = PromptTemplate(
    template='Write a joke about {topic}',
    input_variables=['topic']
)

model = ChatOpenAI()
parser = StrOutputParser()

# Step 1: generate the joke (string)
joke_gen_chain = RunnableSequence(prompt, model, parser)

# Step 2: from that joke string, compute two things in parallel:
parallel_chain = RunnableParallel({
    'joke': RunnablePassthrough(),        # just forward the joke text
    'word_count': RunnableLambda(word_count)  # apply python function
})

# Full chain: joke → parallel processing
final_chain = RunnableSequence(joke_gen_chain, parallel_chain)

result = final_chain.invoke({'topic': 'AI'})

final_result = "{} \nword count - {}".format(
    result['joke'],
    result['word_count']
)

print(final_result)

๐Ÿ” What is happening?

  1. joke_gen_chain:

    • Input: {'topic': 'AI'}

    • Output: "some generated joke" (string)

  2. parallel_chain:

    • Receives that joke text as input.

    • RunnablePassthrough() → returns the joke unchanged.

    • RunnableLambda(word_count) → calls your Python function on the joke.

  3. Final result is a dict:

{
  "joke": "<joke text>",
  "word_count": 17
}

Then you format it into a string to print.

๐Ÿ•’ When to use RunnableParallel?

  • When you want to compute multiple views of the same thing:

    • text + metadata (e.g., length, sentiment)

    • raw text + extracted title

    • answer + explanation

❓ Why is RunnableLambda useful?

  • It lets you plug in arbitrary Python logic into a LangChain pipeline.

  • Great for small utilities like word count, regex cleaning, etc.

๐Ÿงพ Example output (will vary)

Why did the AI go to therapy? Because it had too many unresolved loops and couldn’t process its feelings properly!
word count - 24
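Conceptually, RunnableParallel applies several callables to the same input and collects the results under named keys. This toy `run_parallel` helper is a hypothetical illustration of that fan-out, not LangChain's implementation (which can also run branches concurrently).

```python
def run_parallel(branches, value):
    # Apply every branch to the same input; collect results by key.
    return {name: fn(value) for name, fn in branches.items()}

joke = "Why did the AI cross the road? To optimize the chicken."
result = run_parallel({
    "joke": lambda x: x,                     # like RunnablePassthrough
    "word_count": lambda x: len(x.split()),  # like RunnableLambda(word_count)
}, joke)

print(result["word_count"])   # 11
print(result["joke"])         # the unchanged joke text
```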

4. Tweet + LinkedIn Post in Parallel – RunnableParallel + RunnableSequence

Same topic, two platforms: Twitter & LinkedIn content together.

๐Ÿ’ป Code

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv
from langchain.schema.runnable import RunnableSequence, RunnableParallel

load_dotenv()

prompt1 = PromptTemplate(
    template='Generate a tweet about {topic}',
    input_variables=['topic']
)

prompt2 = PromptTemplate(
    template='Generate a Linkedin post about {topic}',
    input_variables=['topic']
)

model = ChatOpenAI()
parser = StrOutputParser()

parallel_chain = RunnableParallel({
    'tweet': RunnableSequence(prompt1, model, parser),
    'linkedin': RunnableSequence(prompt2, model, parser)
})

result = parallel_chain.invoke({'topic': 'AI'})

print("Tweet:\n", result['tweet'])
print("\nLinkedIn:\n", result['linkedin'])

๐Ÿ” What is happening?

  • Both sub-chains:

    • Take the same input: {'topic': 'AI'}

    • Build different prompts (tweet vs LinkedIn)

    • Use same model and parser.

  • RunnableParallel returns:

{
  "tweet": "<short tweet>",
  "linkedin": "<longer professional post>"
}

๐Ÿ•’ When to use this pattern?

  • Multi-channel content generation:

    • Tweet + LinkedIn + Email subject

    • Title + Meta description + Social caption

❓ Why is it nice?

  • All logic is declarative:

    • You just describe what outputs you want.

    • LangChain handles passing input through all branches.

๐Ÿงพ Example output (roughly):

Tweet:
AI isn’t here to replace humans—it’s here to amplify our potential. The real power is in humans + AI working together. ๐Ÿค๐Ÿค– #AI #FutureOfWork

LinkedIn:
Artificial Intelligence is transforming the way we work, learn, and build products. But it’s not about replacing humans—it’s about augmenting our abilities. Teams that learn how to collaborate with AI will move faster, make better decisions, and unlock new opportunities. Now is the right time to upskill, experiment, and think about how AI can enhance value in your domain, not just automate tasks.

5. Joke + Explanation in Parallel (Generated Once, Used Twice)

Pattern: generate once → reuse output in multiple paths.

Let’s slightly improve the earlier “joke + explanation” parallel example so it’s robust.

๐Ÿ’ป Code

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from dotenv import load_dotenv
from langchain.schema.runnable import (
    RunnableSequence,
    RunnableParallel,
    RunnablePassthrough,
    RunnableLambda,
)

load_dotenv()

prompt_joke = PromptTemplate(
    template='Write a joke about {topic}',
    input_variables=['topic']
)

prompt_explain = PromptTemplate(
    template='Explain the following joke - {text}',
    input_variables=['text']
)

model = ChatOpenAI()
parser = StrOutputParser()

# Step 1: generate joke (string)
joke_gen_chain = RunnableSequence(prompt_joke, model, parser)

# Helper: map joke string → {"text": joke} for the explanation prompt
to_explain_input = RunnableLambda(lambda joke_text: {"text": joke_text})

# Step 2: in parallel, keep original joke & generate explanation
parallel_chain = RunnableParallel({
    'joke': RunnablePassthrough(),
    'explanation': RunnableSequence(
        to_explain_input,
        prompt_explain,
        model,
        parser
    )
})

# Final chain: joke → parallel (joke + explanation)
final_chain = RunnableSequence(joke_gen_chain, parallel_chain)

result = final_chain.invoke({'topic': 'cricket'})

print("Joke:\n", result['joke'])
print("\nExplanation:\n", result['explanation'])

๐Ÿ” What is happening?

  1. joke_gen_chain"some cricket joke" (string).

  2. parallel_chain receives that joke string:

    • joke: RunnablePassthrough() → returns joke unchanged.

    • explanation:

      • RunnableLambda converts string → {"text": joke} (what prompt_explain expects).

      • prompt_explain builds: "Explain the following joke - <joke>"

      • model + parser produce explanation string.

Result:

{
  "joke": "<cricket joke>",
  "explanation": "<explanation of the joke>"
}

๐Ÿ•’ When to use this pattern?

  • When one step’s output should be:

    • Returned as-is

    • Also sent into another chain for extra processing

❓ Why use RunnableLambda here?

  • Because PromptTemplate expects a dict input with {"text": ...}.

  • RunnableLambda lets you reshape data in between steps.
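This reshaping step is ordinary function composition. The sketch below mimics it without LangChain: `fake_prompt` is a hypothetical stand-in for `prompt_explain.format`, and the lambda plays the role of the RunnableLambda.

```python
# The RunnableLambda from above: reshape a bare string into the dict
# shape that the next step (a prompt template) expects.
to_explain_input = lambda joke_text: {"text": joke_text}

def fake_prompt(inputs):
    # Stand-in for prompt_explain.format(**inputs)
    return "Explain the following joke - " + inputs["text"]

joke = "Why did the bowler bring string? To tie the match."
print(fake_prompt(to_explain_input(joke)))
```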


6. Summary – Runnable Patterns You Now Know

RunnableSequence

  • What: Step-by-step chain (A → B → C).

  • When: You want a fixed pipeline of transformations.

  • Why: Clear, readable, composable flows.

RunnableParallel

  • What: Run multiple branches from the same input.

  • When: Need multiple outputs (tweet + LinkedIn, joke + word count).

  • Why: Saves mental complexity & keeps business logic clean.

RunnablePassthrough

  • What: Just forwards the input to output.

  • When: You want the original value plus something derived from it.

  • Why: Simple way to keep original data in parallel chains.

RunnableBranch

  • What: Conditional routing (if / elif / else).

  • When: Your next step depends on content (length, sentiment, category).

  • Why: Lets you blend LLMs with deterministic logic.

RunnableLambda

  • What: Wraps a Python function into the chain.

  • When: You need custom logic (word count, reshaping dictionaries, etc.).

  • Why: Flexible bridge between LangChain and normal Python.

Building Your Own Mini-LangChain (Custom LLM, Prompt Template, and Chain)



๐Ÿงช Tutorial: Building Your Own Mini-LangChain (Custom LLM, Prompt Template, and Chain)

In this tutorial, you will learn how LangChain works internally by building:

  • A fake LLM (NakliLLM)

  • A prompt template formatter (NakliPromptTemplate)

  • A mini chain class (NakliLLMChain) that works like PromptTemplate | LLM | Parser

This helps you understand:

  • What LangChain is doing behind the scenes

  • How chaining works

  • How prompts + models combine together

Perfect for beginners!


1️⃣ Step 1 – Create a Fake LLM (NakliLLM)

import random

class NakliLLM:

    def __init__(self):
        print('LLM created')

    def predict(self, prompt):

        response_list = [
            'Delhi is the capital of India',
            'IPL is a cricket league',
            'AI stands for Artificial Intelligence'
        ]

        # Return RANDOM response, ignoring prompt
        return {'response': random.choice(response_list)}

WHAT is this?

A mock LLM that returns random answers. It does not use OpenAI or API keys.

WHEN to use this?

  • When teaching LangChain concepts

  • When testing LLM pipelines without paying for tokens

  • When prototyping LLM flow offline

WHY is this useful?

It helps you understand how LangChain chains work without depending on real LLMs.

▶️ Expected output when initialized:

LLM created

2️⃣ Step 2 – Create a Prompt Template Class (like LangChain’s PromptTemplate)

class NakliPromptTemplate:

    def __init__(self, template, input_variables):
        self.template = template
        self.input_variables = input_variables

    def format(self, input_dict):
        return self.template.format(**input_dict)

WHAT?

A simple template engine that replaces placeholders:

"Write a {length} poem about {topic}"

with actual values.

WHEN?

Always before calling an LLM — because LLMs need a properly formatted prompt as a string.

WHY?

It teaches how LangChain’s PromptTemplate works without the complexity.


3️⃣ Step 3 – Use Our Prompt Template

template = NakliPromptTemplate(
    template='Write a {length} poem about {topic}',
    input_variables=['length', 'topic']
)

prompt = template.format({'length': 'short', 'topic': 'india'})
print(prompt)

Output:

Write a short poem about india

WHAT?

Prompt preparation.

WHEN?

Before sending to ANY LLM.

WHY?

Prompts must be final strings before models use them.


4️⃣ Step 4 – Call The Fake LLM

llm = NakliLLM()

llm.predict(prompt)

Output example:

LLM created
{'response': 'AI stands for Artificial Intelligence'}

WHAT?

We pass a prompt → LLM returns a fake response.

WHEN?

Whenever you need LLM output.

WHY?

This simulates real LLM behavior in a toy environment.


5️⃣ Step 5 – Build a Mini Chain (NakliLLMChain)

Now we create the class:

class NakliLLMChain:

    def __init__(self, llm, prompt_template):
        self.llm = llm
        self.prompt_template = prompt_template

    def run(self, input_dict):
        # 1. format the prompt
        formatted_prompt = self.prompt_template.format(input_dict)

        # 2. pass formatted prompt to LLM
        result = self.llm.predict(formatted_prompt)

        # 3. return output
        return result['response']

WHAT is the chain?

A simple wrapper for:

PromptTemplate -> LLM -> Output

Exactly like:

prompt | model | StrOutputParser

WHEN is a chain used?

When you want to:

  • Combine prompt creation + LLM call

  • Reuse the same flow repeatedly

  • Organize code cleanly

WHY chain is important?

This is the core concept of LangChain — simple, modular pipelines.


6️⃣ Step 6 – Use the Chain

template = NakliPromptTemplate(
    template='Write a {length} poem about {topic}',
    input_variables=['length', 'topic']
)

llm = NakliLLM()

chain = NakliLLMChain(llm, template)

result = chain.run({'length': 'short', 'topic': 'india'})
print(result)

Sample Output:

LLM created
AI stands for Artificial Intelligence

๐Ÿ“Œ Full Concept Summary (Easy For Beginners)

Component           | What?                  | When?               | Why?
--------------------+------------------------+---------------------+--------------------------------
NakliLLM            | Fake language model    | Offline testing     | Learn chain flow without OpenAI
NakliPromptTemplate | Formats prompt text    | Before any LLM call | Reusable & clean prompts
NakliLLMChain       | Pipeline wrapper       | Repeated tasks      | Core LangChain concept
predict()           | Simulates model output | During execution    | Easy testing
template.format()   | Fills placeholders     | Before LLM call     | Converts dict → final string


LangChain Runnable Chains (Simple, Sequential, Parallel, Branching)



๐Ÿ”— Tutorial: LangChain Runnable Chains (Simple, Sequential, Parallel, Branching)

In this tutorial, you’ll learn how to:

  • Build a simple chain: Prompt → Model → OutputParser

  • Build a sequential chain: one LLM step feeds into another

  • Run parallel chains using RunnableParallel (notes + quiz at same time)

  • Use branching logic (RunnableBranch) to react differently based on model output (positive/negative feedback)

  • Visualize chains using chain.get_graph().print_ascii()

✅ This assumes your environment, requirements.txt, and .env are already set up like in your previous tutorial.


1. Simple Chain – Prompt → Model → OutputParser

✅ Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

prompt = PromptTemplate(
    template='Generate 5 interesting facts about {topic}',
    input_variables=['topic']
)

model = ChatOpenAI()

parser = StrOutputParser()

chain = prompt | model | parser

result = chain.invoke({'topic': 'cricket'})

print(result)

chain.get_graph().print_ascii()

๐Ÿ” What is happening?

  • PromptTemplate → builds the prompt text:
    "Generate 5 interesting facts about cricket"

  • ChatOpenAI → sends that prompt to OpenAI chat model.

  • StrOutputParser → takes the LLM response and returns it as a plain string (instead of AIMessage).

The chain:

PromptTemplate -> ChatOpenAI -> StrOutputParser

๐Ÿ•’ When to use this pattern?

  • Whenever you need one simple step:

    • Generate ideas

    • Write short content

    • Transform text (summaries, paraphrasing, etc.)

❓ Why is it useful?

  • Keeps your pipeline clean and composable.

  • You can later plug this chain into bigger chains.

  • | (pipe) makes the flow easy to read: input → model → output.

๐Ÿงพ Example Output (will vary)

1. Cricket originated in England and has been played since the 16th century.
2. Test cricket is the longest format of the game, lasting up to five days.
3. Sachin Tendulkar holds the record for the most runs in international cricket.
4. The Cricket World Cup is held every four years and features One Day International matches.
5. The Indian Premier League (IPL) is one of the richest and most popular T20 leagues in the world.

The print_ascii() graph will look like:

           +----------------+
           | PromptTemplate |
           +----------------+
                    |
            +----------------+
            |   ChatOpenAI   |
            +----------------+
                    |
           +-------------------+
           |  StrOutputParser  |
           +-------------------+

2. Sequential Chain – Report → Summary

First generate a detailed report, then summarize it into 5 bullet points.

✅ Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

prompt1 = PromptTemplate(
    template='Generate a detailed report on {topic}',
    input_variables=['topic']
)

prompt2 = PromptTemplate(
    template='Generate a 5 pointer summary from the following text \n {text}',
    input_variables=['text']
)

model = ChatOpenAI()
parser = StrOutputParser()

chain = prompt1 | model | parser | prompt2 | model | parser

result = chain.invoke({'topic': 'Unemployment in India'})

print(result)

chain.get_graph().print_ascii()

๐Ÿ” What is happening step-by-step?

  1. prompt1 + topic → full prompt:
    "Generate a detailed report on Unemployment in India"

  2. model → generates the report (long text).

  3. parser → turns AIMessage → string.

  4. prompt2 → takes that report string as {text} and asks:
    "Generate a 5 pointer summary from the following text ..."

  5. model → generates 5-point summary.

  6. Final parser → returns summary as string.

So the chain is:

topic
  └─> prompt1
        └─> model
             └─> parser (report string)
                     └─> prompt2
                             └─> model
                                  └─> parser (summary string)

๐Ÿ•’ When to use this pattern?

  • When you want multi-step logic:

    • First: detailed reasoning

    • Second: short user-friendly summary

  • Very common in:

    • Document analysis

    • Multi-stage content generation

    • Reasoning then simplification

❓ Why is it important?

  • It shows how LangChain lets you compose LLM calls like Lego blocks.

  • Each stage can refine or transform previous output.

๐Ÿงพ Example Output (summary, will vary)

1. Unemployment in India is influenced by population growth, skill mismatch, and structural issues in the economy.
2. Rural areas face seasonal and disguised unemployment, while urban regions struggle with educated unemployment.
3. Automation and lack of industrial growth contribute to limited job creation in formal sectors.
4. Government initiatives like MGNREGA, Skill India, and Make in India aim to address unemployment, but face implementation challenges.
5. Long-term solutions require improving education quality, promoting entrepreneurship, and supporting labor-intensive industries.

print_ascii() will show a longer chain graph from PromptTemplate through ChatOpenAI and StrOutputParser twice.


3. Parallel Chains – Notes + Quiz at the Same Time (RunnableParallel)

Generate notes and quiz questions from the same text in parallel, then merge them.

✅ Code

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain.schema.runnable import RunnableParallel

load_dotenv()

model1 = ChatOpenAI()
model2 = ChatAnthropic(model_name='claude-3-7-sonnet-20250219')

prompt1 = PromptTemplate(
    template='Generate short and simple notes from the following text \n {text}',
    input_variables=['text']
)

prompt2 = PromptTemplate(
    template='Generate 5 short question answers from the following text \n {text}',
    input_variables=['text']
)

prompt3 = PromptTemplate(
    template='Merge the provided notes and quiz into a single document \n notes -> {notes} and quiz -> {quiz}',
    input_variables=['notes', 'quiz']
)

parser = StrOutputParser()

parallel_chain = RunnableParallel({
    'notes': prompt1 | model1 | parser,
    'quiz': prompt2 | model2 | parser
})

merge_chain = prompt3 | model1 | parser

chain = parallel_chain | merge_chain

text = """
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outliers detection.

The advantages of support vector machines are:
Effective in high dimensional spaces.
Still effective in cases where number of dimensions is greater than the number of samples.
Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Versatile: different Kernel functions can be specified for the decision function.

The disadvantages include:
Risk of overfitting with many features and few samples.
No direct probability estimates without extra computation.
"""

result = chain.invoke({'text': text})

print(result)

chain.get_graph().print_ascii()

๐Ÿ” What is happening?

  1. Parallel step with RunnableParallel:

    • 'notes' key:

      • prompt1 + text → ChatOpenAI → notes string

    • 'quiz' key:

      • prompt2 + text → ChatAnthropic → quiz string

    Result of parallel_chain.invoke({...}) is:

    {
      "notes": "...generated notes...",
      "quiz": "...generated questions..."
    }
    
  2. Merge step:

    • prompt3 takes {notes} and {quiz} from this dict.

    • Asks model1 (OpenAI) to merge these into a single document.

So the full chain looks like:

           text
             |
     +-------------------+
     |  RunnableParallel |
     +-------------------+
        |             |
     notes chain    quiz chain
        \             /
         \           /
        merged by prompt3 -> model1 -> parser

๐Ÿ•’ When to use RunnableParallel?

  • When you want to generate multiple things from the same input:

    • Notes + quiz

    • Summary + title + hashtags

    • SEO description + keywords + social post

❓ Why is it powerful?

  • You can:

    • Use different models for different tasks.

    • Run conceptually parallel steps in one high-level chain.

  • It fits how real apps work: multiple outputs for the same user input.

๐Ÿงพ Example Output (simplified, will vary)

Notes:
- SVMs are supervised learning methods used for classification, regression, and outlier detection.
- They work well in high-dimensional spaces.
- They use only a subset of training points (support vectors), making them memory-efficient.
- Different kernel functions can be used to adapt to various data types.

Quiz:
1. What are support vector machines primarily used for?
2. Why are SVMs effective in high-dimensional spaces?
3. What are support vectors in an SVM?
4. Name one advantage of using kernel functions in SVM.
5. What is one disadvantage of SVMs when the number of features is very large?

4. Branching Chains – Different Logic for Positive/Negative Feedback

Step 1: Classify sentiment (positive / negative).
Step 2: Based on sentiment, pick different response prompt (branch).

✅ Code

from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, PydanticOutputParser
from langchain_core.runnables import RunnableBranch, RunnableLambda
from pydantic import BaseModel, Field
from typing import Literal

load_dotenv()

model = ChatOpenAI()

parser = StrOutputParser()

class Feedback(BaseModel):
    sentiment: Literal['positive', 'negative'] = Field(
        description='Give the sentiment of the feedback'
    )

parser2 = PydanticOutputParser(pydantic_object=Feedback)

prompt1 = PromptTemplate(
    template=(
        'Classify the sentiment of the following feedback text into positive or negative\n'
        '{feedback}\n'
        '{format_instruction}'
    ),
    input_variables=['feedback'],
    partial_variables={'format_instruction': parser2.get_format_instructions()}
)

classifier_chain = prompt1 | model | parser2

prompt2 = PromptTemplate(
    template='Write an appropriate response to this positive feedback:\n{feedback}',
    input_variables=['feedback']
)

prompt3 = PromptTemplate(
    template='Write an appropriate response to this negative feedback:\n{feedback}',
    input_variables=['feedback']
)

branch_chain = RunnableBranch(
    (lambda x: x.sentiment == 'positive', prompt2 | model | parser),
    (lambda x: x.sentiment == 'negative', prompt3 | model | parser),
    RunnableLambda(lambda x: "Could not determine sentiment.")
)

chain = classifier_chain | branch_chain

print(chain.invoke({'feedback': 'This is a beautiful phone'}))

chain.get_graph().print_ascii()

๐Ÿ” What is happening?

4.1. PydanticOutputParser + Feedback model

class Feedback(BaseModel):
    sentiment: Literal['positive', 'negative'] = Field(...)
  • This defines the expected output structure: a JSON-like object with one field: sentiment.

parser2.get_format_instructions() gives instructions like:

“Respond in JSON with fields: sentiment: 'positive' or 'negative' ...”

So prompt1 tells the model how to output structured data.

classifier_chain = prompt1 | model | parser2:

  • prompt1 → classification instructions + format instructions

  • model → returns some text

  • parser2 → parses that text into a Feedback object

Result of classifier_chain.invoke(...) is something like:

Feedback(sentiment='positive')
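To see what `parser2` produces without calling a model, the same validation can be reproduced with Pydantic alone: the parser essentially feeds the model's raw JSON text into the `Feedback` schema. A minimal sketch (the JSON string here is an assumed example of what the model might return):

```python
from typing import Literal
from pydantic import BaseModel, Field

class Feedback(BaseModel):
    sentiment: Literal['positive', 'negative'] = Field(
        description='Give the sentiment of the feedback'
    )

# PydanticOutputParser does essentially this with the model's raw JSON output:
fb = Feedback.model_validate_json('{"sentiment": "positive"}')
print(fb)  # sentiment='positive'
```

If the model returned anything outside the `Literal['positive', 'negative']` values, validation would raise an error instead of silently passing bad data downstream.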

4.2. Branching with RunnableBranch

branch_chain = RunnableBranch(
    (lambda x: x.sentiment == 'positive', prompt2 | model | parser),
    (lambda x: x.sentiment == 'negative', prompt3 | model | parser),
    RunnableLambda(lambda x: "Could not determine sentiment.")
)
  • It checks conditions in order:

    • If x.sentiment == 'positive' → run positive feedback chain.

    • Else if x.sentiment == 'negative' → run negative feedback chain.

    • Else → fallback RunnableLambda → static string.

Combined:

chain = classifier_chain | branch_chain

So full flow:

  1. Classify feedback → Feedback(sentiment='positive' | 'negative')

  2. Branch to the appropriate response template and model.

๐Ÿ•’ When to use this pattern?

  • When your workflow depends on model output:

    • Sentiment → positive/negative path

    • Category → “billing” vs “technical support”

    • Language → English response vs Bangla response

❓ Why is it powerful?

  • You can implement conditional logic inside the chain itself.

  • Combines LLM decisions with deterministic control flow.

๐Ÿงพ Example Output

Input:

chain.invoke({'feedback': 'This is a beautiful phone'})

Possible output:

Thank you so much for your kind words! We’re glad to hear that you find the phone beautiful and enjoyable to use. If you have any questions or need help exploring more features, we’re always here to assist. ๐Ÿ˜Š

If the feedback were:

"This phone keeps hanging and the battery drains too fast."

It would likely go through the negative branch and produce an apology + support-style response.


5. Summary – What You’ve Learned in This Tutorial

๐Ÿงฉ Patterns Covered

  1. Simple chain – PromptTemplate | ChatOpenAI | StrOutputParser

    • What: Basic “prompt → LLM → string” pipeline

    • When: Any one-step generation or transformation

    • Why: Building block for everything else

  2. Sequential chain – multi-step reasoning (report → summary)

    • What: One LLM’s output feeds into another prompt

    • When: You want staged processing (detailed → simplified, raw → cleaned)

    • Why: Reflects real-world pipelines

  3. Parallel chain (RunnableParallel) – notes & quiz together

    • What: Multiple chains executed conceptually in parallel, returning a dict

    • When: Need multiple outputs from one input (notes, quiz, tags, etc.)

    • Why: Efficient composition & cleaner code

  4. Branching chain (RunnableBranch) – logic based on sentiment

    • What: Conditional routing based on model output (via Pydantic parser)

    • When: Different flows for positive/negative, categories, etc.

    • Why: Mixes LLM intelligence with explicit control flow

  5. PydanticOutputParser – structured, typed model output

    • What: Parse LLM response into a typed object (Feedback)

    • When: You want predictable, machine-readable output

    • Why: Makes downstream logic safer and cleaner

  6. Graph visualization (get_graph().print_ascii())

    • What: ASCII diagram of your chain

    • When: Debugging or teaching how a chain is wired

    • Why: Helps beginners understand the execution flow.
