LangChain
LangChain is a framework for building applications powered by language models. It provides:
- Composability: Build complex AI workflows by chaining simple components.
- Abstraction: Common interfaces for different LLMs, vector stores, etc.
- Memory: Maintain context across interactions.
- Tools: Connect LLMs to external data and capabilities.
Core Architecture
Models: wrappers around LLMs that provide a consistent interface:
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";

// All models share the same interface
const openAIModel = new ChatOpenAI({
  modelName: "gpt-4",
  temperature: 0.7,
});

const anthropicModel = new ChatAnthropic({
  modelName: "claude-3-sonnet-20240229",
  temperature: 0.7,
});

// Same method works for both
const response = await openAIModel.invoke("Hello, how are you?");
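// response is an AIMessage; response.content holds the reply text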
Messages - Communication Format
LangChain uses a message-based system because modern chat models (GPT-4, Claude) are designed to work with conversations, not just single prompts. The message system mirrors how the models are actually trained and how their APIs work.
Messages are structured objects that represent individual turns in a conversation. Each message has:
- A role
- Content
- Optional metadata
A prompt is a template that generates messages: in effect, a message factory.
import { HumanMessage, SystemMessage, AIMessage } from "@langchain/core/messages";

const messages = [
  new SystemMessage("You are a helpful assistant"),
  new HumanMessage("What's the weather like?"),
  new AIMessage("I don't have access to real-time weather data..."),
];
Prompts - Template System
Prompts are templates with dynamic variable substitution:
import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromTemplate(
  "You are a {role}. Answer this question: {question}"
);

const formattedPrompt = await prompt.format({
  role: "scientist",
  question: "What is photosynthesis?",
});
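// format() returns the filled-in prompt as a single string;
// use formatMessages() (shown below) to get structured message objects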
const chatPrompt = ChatPromptTemplate.fromMessages([
  ["system", "You are {role}. Always be {personality}."],
  ["human", "{query}"],
]);

// Generates multiple messages
const messages = await chatPrompt.formatMessages({
  role: "a chef",
  personality: "enthusiastic and encouraging",
  query: "How do I make pasta?",
});

// Result:
// [
//   SystemMessage { content: "You are a chef. Always be enthusiastic and encouraging." },
//   HumanMessage { content: "How do I make pasta?" }
// ]
Chains - Composition Pattern: connect components in sequence (see the .pipe() examples below).
Output Parsers - Structure Results
import { z } from "zod";
import { StructuredOutputParser } from "langchain/output_parsers";

const parser = StructuredOutputParser.fromZodSchema(
  z.object({
    name: z.string(),
    age: z.number(),
    interests: z.array(z.string()),
  })
);

// Parser generates format instructions for the LLM
const formatInstructions = parser.getFormatInstructions();
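A minimal sketch of how the parser fits into a chain, assuming model is a chat model instance like the ChatOpenAI above (the prompt wording and input text are illustrative):

const extractionPrompt = ChatPromptTemplate.fromTemplate(
  "Extract details about the person.\n{formatInstructions}\n{text}"
);

// Output parsers are Runnables, so they can be piped like any component
const extractionChain = extractionPrompt.pipe(model).pipe(parser);

// Returns a typed object: { name: string; age: number; interests: string[] }
const person = await extractionChain.invoke({
  formatInstructions,
  text: "Alice is 30 and enjoys hiking and chess.",
});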
Runnable: the foundational abstraction in LangChain; it represents any component that can process input and produce output.
invoke() - Single Request, Complete Response
Processes one input and returns the complete output once processing is done. Use when:
- You need the full response before proceeding
- You have simple request-response patterns
- You don't need real-time feedback
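A minimal example, assuming model is a chat model like the ChatOpenAI above:

// Blocks until the complete response is available
const fullResponse = await model.invoke("Summarize LangChain in one sentence.");
console.log(fullResponse.content);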
stream() - Single Request, Streaming Response
Use when:
- Real-time UI updates
- Large responses that take time
- Better user experience with immediate feedback
- Processing data as it arrives
const model = new ChatOpenAI();

// Streaming response - chunks arrive as generated
const stream = await model.stream("Write a long story about space");
for await (const chunk of stream) {
  process.stdout.write(chunk.content); // Print each chunk as it arrives
}
batch() - Multiple Requests, Parallel Processing
Use when:
- Processing multiple items efficiently
- Performing bulk operations
- You have rate limits but want parallelism
- Analyzing multiple documents simultaneously
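For example, with the same model as above (the prompts are illustrative):

// Runs the requests concurrently and returns results in input order
const responses = await model.batch([
  "Translate 'hello' to French",
  "Translate 'hello' to Spanish",
  "Translate 'hello' to German",
]);
responses.forEach((r) => console.log(r.content));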
Runnables can be chained using .pipe():
const chain = prompt
  .pipe(model)
  .pipe(outputParser)
  .pipe(customProcessor);

// All three methods work on chains
await chain.invoke(input);
await chain.stream(input);
await chain.batch(inputs);
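The customProcessor above stands in for any extra Runnable. One way to turn a plain function into a Runnable is RunnableLambda (a sketch; the uppercasing step is just an illustration):

import { RunnableLambda } from "@langchain/core/runnables";

// Wraps a plain function so it can participate in a chain
const customProcessor = RunnableLambda.from((text: string) =>
  text.toUpperCase()
);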
Everything in LangChain implements the Runnable interface (shown here in simplified form):
interface Runnable {
  invoke(input: any): Promise<any>;
  stream(input: any): AsyncGenerator<any>;
  batch(inputs: any[]): Promise<any[]>;
}
Putting It All Together
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// 1. Create components
const model = new ChatOpenAI({ temperature: 0 });
const prompt = ChatPromptTemplate.fromTemplate(
  "Tell me a joke about {topic}"
);
const outputParser = new StringOutputParser();

// 2. Chain them together
const chain = prompt.pipe(model).pipe(outputParser);

// 3. Run the chain
const joke = await chain.invoke({ topic: "programming" });
// Input → Prompt → Model → Parser → Output
Memory - Conversation Context
Memory in LangChain is both the raw messages and additional context or summaries built from them: Memory = Conversation History + State Management. It is a system that:
- Stores the conversation history (messages)
- Manages how that history is used
- Can transform/summarize the history
- Decides what to remember and what to forget
Types of Memory:
- BufferMemory - Stores Everything
- BufferWindowMemory - Limited History
- ConversationSummaryMemory - Compressed History
- ConversationSummaryBufferMemory - Hybrid Approach
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";

const memory = new BufferMemory();
const chain = new ConversationChain({
  llm: model,
  memory: memory,
});

// First interaction
await chain.call({ input: "My name is Alice" });

// Second interaction remembers the first
await chain.call({ input: "What's my name?" }); // Knows it's Alice
Document Processing Pipeline - for RAG (Retrieval-Augmented Generation):
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { RetrievalQAChain } from "langchain/chains";

// 1. Load the source document
const loader = new TextLoader("./document.txt");
const docs = await loader.load();

// 2. Split into chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);

// 3. Create embeddings and store
const embeddings = new OpenAIEmbeddings();
const vectorStore = await MemoryVectorStore.fromDocuments(
  chunks,
  embeddings
);

// 4. Create retrieval chain
const retriever = vectorStore.asRetriever();
const chain = RetrievalQAChain.fromLLM(model, retriever);
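Querying the chain then looks like this (the question text is illustrative):

// Retrieves relevant chunks and asks the model to answer from them
const answer = await chain.call({
  query: "What is the document about?",
});
console.log(answer.text);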
Agents - Dynamic Decision Making
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { Calculator } from "langchain/tools/calculator";

const tools = [new Calculator()];

const agent = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "zero-shot-react-description",
});

const result = await agent.call({
  input: "What is 25% of 180?",
});
// Agent decides to use the calculator tool