LCEL
Today, we will delve into one of LangChain's core modules: chains. Chains are a central concept in LangChain because they embody the framework's core design philosophy: connecting components to build complex LLM applications in a clear and straightforward way.
In LangChain, the creation of components and the construction of LLM applications are somewhat decoupled. So far, we have studied components such as prompt templates, LLM models, and output parsers. Each of these components plays a role in building an LLM application, and they are linked together in a pre-defined sequence. If one component needs to be swapped out, for example because we find a better LLM model, we can simply change the model instance being used without modifying other parts of the application.
This approach of connecting components to build chains makes the construction of LLM applications highly flexible and controllable.
LangChain's chain-building approach has undergone an upgrade, shifting from using class-based methods (XXXChain) to constructing LLM chains using LCEL (LangChain Expression Language). Today, we will start with a comparison between traditional chains and LCEL, detailing the advantages and usage of LCEL, and further exploring the underlying principles.
Traditional Chains vs. LCEL Chains
Before diving deeply into LCEL, let's start with an example to understand what LCEL is and how it differs from traditional chains.
Constructing a Traditional Complex LLM Chain
Suppose we need to create three teaching Q&A LLM chains: one for answering physics questions, one for math, and another for general topics.
python
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI
llm = OpenAI(temperature=0)
math_prompt_template = """
You are an expert in math.
Respond to the following question:
Question: {question}
Answer:
"""
physics_prompt_template = """
You are an expert in physics.
Respond to the following question:
Question: {question}
Answer:
"""
other_prompt_template = """
Respond to the following question:
Question: {question}
Answer:
"""
math_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(math_prompt_template))
physics_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(physics_prompt_template))
other_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(other_prompt_template))
Now, we need to select different LLM chains based on the type of question a user inputs: if it's a math question, we choose math_chain; if it's a physics question, we choose physics_chain; and for other types of questions, we use other_chain.
To achieve this, we create an additional LLM chain to identify the type of question.
python
classify_prompt_template = """
Given the user question below, classify it as either being about `Math`, `Physics`, or `Other`.
Do not respond with more than one word.
<question>
{question}
</question>
Classification:
"""
classify_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(classify_prompt_template))
print(classify_chain.invoke({"question": "Tell me a joke"})["text"].strip())
# >> Other
print(classify_chain.invoke({"question": "What is 1+1?"})["text"].strip())
# >> Math
print(classify_chain.invoke({"question": "What is the acceleration due to gravity?"})["text"].strip())
# >> Physics
Next, we use classify_chain for routing: it classifies the user's input and directs the question to the corresponding LLM chain.
python
user_input = "What is 1+1?"
question_type = classify_chain.invoke({"question": user_input})["text"].strip()
if question_type == "Math":
    chain = math_chain
elif question_type == "Physics":
    chain = physics_chain
else:
    chain = other_chain
print(chain.invoke({"question": user_input}))
With the addition of if/else branches, the code becomes increasingly complex. With only three branches it is still manageable, but as the number of chains grows, the code becomes harder to maintain: it is difficult to track how the chains connect, and modifying the logic in one place risks breaking subsequent stages.
Constructing a Complex LLM Chain Using LCEL
Now, let's rewrite the above LLM chain using LCEL syntax.
python
llm = OpenAI(temperature=0)
# The prompt templates remain the same as above
math_prompt_template = """xxxx"""
physics_prompt_template = """xxxx"""
other_prompt_template = """xxxx"""
classify_prompt_template = """xxxx"""
# Constructing four LLM chains using LCEL
from langchain_core.output_parsers import StrOutputParser
math_chain = PromptTemplate.from_template(math_prompt_template) | llm | StrOutputParser()
physics_chain = PromptTemplate.from_template(physics_prompt_template) | llm | StrOutputParser()
other_chain = PromptTemplate.from_template(other_prompt_template) | llm | StrOutputParser()
classify_chain = PromptTemplate.from_template(classify_prompt_template) | llm | StrOutputParser()
# Defining the logic for answering questions
from langchain_core.runnables import RunnableBranch
from operator import itemgetter
answer_chain = RunnableBranch(
    (lambda x: "math" in x["topic"].lower(), math_chain),
    (lambda x: "physics" in x["topic"].lower(), physics_chain),
    other_chain
)
# Combining the final chain
final_chain = {"topic": classify_chain, "question": itemgetter("question")} | answer_chain
print(final_chain.invoke({"question": user_input}))
Using LCEL seems more complex, right?
The syntax of LCEL distinctly moves away from traditional Python: components and even chains are connected using the pipe operator |. Additionally, new concepts such as Runnable are introduced, raising the learning curve.
However, this approach makes the data flow visualizable, clearly showing where each component or chain leads. This "composability" strengthens flexibility and control over LLM application construction.
If the only benefit were a clearer flow, the increased learning curve might not be worth it for many. But LCEL also promises a seamless transition from prototype to production without code changes, from simple "prompt + LLM" chains to the most complex configurations.
Why Use LCEL for Chain Building?
Supports Streaming
In some scenarios, stream processing can greatly enhance user experience. For example, in chat applications, the model's response can be streamed to the user character by character, rather than waiting for the entire response. Chains built with LCEL support streaming natively.
python
chain = prompt_template | llm | output_parser
for chunk in chain.stream({"input": "test"}):
    print(chunk, end="", flush=True)
Supports Batch Processing
LCEL can handle multiple inputs simultaneously and return multiple outputs, which is much more efficient than repeatedly invoking the chain. Each stage of an LCEL chain can also support batch processing, a common need when pulling data from multiple external sources for subsequent processing.
python
chain = prompt_template | llm | output_parser
result_list = chain.batch([{"input": "test1"}, {"input": "test2"}])
# >> ["output1", "output2"]
Supports Asynchronous Operations
Traditional chain construction requires a lot of async/await code to achieve asynchronous processing, complicating the code. LCEL provides an asynchronous version of every synchronous call (invoke, stream, batch), making asynchronous processing much easier.
python
chain = prompt_template | llm | output_parser
await chain.ainvoke({"input": "test"})
Supports Fallback Logic
For better robustness, backup LLM chains can be used when the primary chain encounters an issue. In traditional approaches, this requires using try/except to handle exceptions. LCEL supports fallback logic, simplifying the code.
python
chain = prompt_template | llm1 | output_parser
fallback_chain = prompt_template | llm2 | output_parser
chain_with_fallback = chain.with_fallbacks([fallback_chain])
chain_with_fallback.invoke({"input": "test"})
Furthermore, every component/chain built with LCEL is integrated with LangSmith for easy debugging and testing. LangChain continues to enhance LCEL's capabilities, making LLM application development simpler and more convenient. You don't need to worry about the execution mode of the chain—whether batch processing or asynchronous operation—just connect the components according to the business logic, and the chain's code will not require changes. The more complex the LLM chain, the more pronounced LCEL's advantages become.
LCEL Implementation Principles
When using a framework like LCEL (LangChain Expression Language), it is crucial to understand its underlying implementation. Without this knowledge, it can be challenging to troubleshoot issues during development and debugging, as errors may be unclear, and the way to fix them may be elusive.
The Mystery Behind the Pipe Operator
The first step in understanding LCEL is to figure out the meaning behind the pipe operator ("|").
In fact, this is still native Python functionality. The pipe operator "|" is essentially the bitwise OR operator in Python. When the Python interpreter encounters "|", such as in the expression A | B, it calls the A.__or__ method with B as the parameter. In other words, the following two expressions are equivalent:
python
A | B
A.__or__(B)
Thus, we can override the __or__ method to implement custom pipeline handling logic.
python
class Runnable:
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        def chained_func(*args, **kwargs):
            return other(self.func(*args, **kwargs))
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)
In the code above, we have created a Runnable class, which essentially acts as a wrapper. Let's see its most basic usage:
python
def add(x, y):
    return x + y
add_runner = Runnable(add)
print(add_runner(1, 2)) # Output: 3
In this example:
- We define an add function and then wrap it into a Runnable object.
- When we instantiate the Runnable object, the __init__ method assigns the provided add function to self.func.
- Calling add_runner(1, 2) effectively invokes add_runner.__call__(1, 2), which calls the add function internally.
Now, let's increase the complexity and see what happens when two Runnable objects are connected using the pipe operator.
python
def add(x, y):
    return x + y

add_runner = Runnable(add)

def double(x):
    return x * 2

double_runner = Runnable(double)

result_runner = add_runner | double_runner
print(result_runner(1, 2))  # Output: 6
Here's what happens:
- add_runner and double_runner are instances of the Runnable class, wrapping the add and double functions, respectively.
- add_runner | double_runner is equivalent to add_runner.__or__(double_runner), returning a new Runnable object.
- Calling result_runner(1, 2) executes chained_func, where:
  - self.func is add_runner.func, and other is double_runner.
  - First, it calls add(1, 2), returning 3.
  - It then passes the result (3) to double_runner, resulting in double(3), which returns 6.
As we can see, Runnable already possesses the basic capability of passing data through a pipeline.
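Because __or__ always returns a new Runnable, pipelines compose naturally; a third stage can be appended in the same way. Here is a small extension of the toy class above:
python
def stringify(x):
    return f"Result: {x}"

stringify_runner = Runnable(stringify)

# add -> double -> stringify, all connected with the pipe operator
pipeline = add_runner | double_runner | stringify_runner
print(pipeline(1, 2))  # Output: Result: 6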
Runnable
LangChain achieves its custom pipeline functionality by overriding the __or__ method in its Runnable class. However, unlike the example above, Runnable.__or__ returns an instance of a subclass called RunnableSequence.
By understanding Runnable and RunnableSequence, we can easily grasp the execution flow of an LCEL chain: each component in the LCEL chain is a Runnable object, while the entire LCEL chain is a RunnableSequence.
python
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI
from langchain.schema.runnable import Runnable, RunnableSequence
llm = OpenAI()
print(isinstance(llm, Runnable)) # Output: True
list_parser = CommaSeparatedListOutputParser()
print(isinstance(list_parser, Runnable)) # Output: True
prompt_template = PromptTemplate(
    template="List some {subject}.\n",
    input_variables=["subject"],
)
print(isinstance(prompt_template, Runnable)) # Output: True
chain = prompt_template | llm | list_parser
print(isinstance(chain, RunnableSequence)) # Output: True
The Runnable class defines six methods: invoke, stream, batch, ainvoke, astream, and abatch, supporting stream processing, batch processing, and asynchronous execution. Each component inherits from Runnable and implements these six methods.
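For example, the prompt_template and list_parser created above already expose this shared interface; calling invoke or batch on them directly (no API call needed) shows the common contract:
python
# Each component exposes the same interface, not just full chains.
print(prompt_template.invoke({"subject": "fruits"}))
# >> a PromptValue wrapping the text "List some fruits."

print(list_parser.invoke("apple, banana, cherry"))
# >> ['apple', 'banana', 'cherry']

print(list_parser.batch(["red, green", "cat, dog"]))
# >> [['red', 'green'], ['cat', 'dog']]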
RunnableSequence represents a sequence of multiple Runnable objects. It contains the attributes first, middle, and last: first and last hold the first and last components, while middle is a list holding the components in between. The RunnableSequence class also overrides the six methods to execute the Runnable objects in sequence.
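Continuing with the chain built above, these attributes can be inspected directly; a quick sketch (note that middle is a list):
python
chain = prompt_template | llm | list_parser
print(chain.first)   # the PromptTemplate component
print(chain.middle)  # a list containing the OpenAI LLM
print(chain.last)    # the CommaSeparatedListOutputParser
print(chain.steps)   # all components in order: [first, *middle, last]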
For example, let's look at the RunnableSequence.invoke method:
python
class RunnableSequence(RunnableSerializable[Input, Output]):
    ...
    def invoke(...):
        # `step` represents each component instance in the chain
        for i, step in enumerate(self.steps):
            input = step.invoke(input, ...)
Here's a simplified execution equivalent:
python
chain = component1 | component2 | component3
output = chain.invoke(input)
# Equivalent to:
output1 = component1.invoke(input)
output2 = component2.invoke(output1)
output = component3.invoke(output2)
Thus, when an LCEL chain is executed, each component in the chain sequentially calls its corresponding method (invoke/stream/batch), passing the output to the next component, until the final result is reached.
Understanding this mechanism clarifies how LCEL chains facilitate modular and flexible LLM application development.
Common Runnables in LangChain
LangChain provides a variety of utility classes based on Runnable to enrich the functionality of LCEL (LangChain Expression Language). These classes significantly reduce code complexity when constructing complex chains. Let's explore some common ones.
RunnablePassthrough
RunnablePassthrough is a simple Runnable wrapper used to pass data directly through the chain. When positioned at the first step of the chain, it represents the user's input; when in a middle step, it represents the output of the previous component.
RunnablePassthrough is particularly useful when adjusting the input to the next component, such as in RAG (Retrieval-Augmented Generation) scenarios where you need to first retrieve related data based on the user's input, then pass the retrieved results along with the original question to the downstream prompt template.
python
from langchain.schema.runnable import RunnablePassthrough
...
# The retriever is used to search for data based on user input.
retriever = vectorstore.as_retriever()
# Prompt template to be filled with user input and retrieved data.
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt_template = ChatPromptTemplate.from_template(template)
retrieval_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)
retrieval_chain.invoke("How to write a Juejin booklet?")
In the example above:
- The user's question is passed to both retriever and RunnablePassthrough().
- The retriever completes the search and assigns the result to context.
- The retrieved context and the original question are then passed together to the prompt template.
RunnableParallel
RunnableParallel is designed for parallel execution of multiple Runnable objects, allowing components/chains to be wrapped and executed concurrently. This is particularly useful for running certain steps, or multiple chains, in parallel.
Executing Multiple Chains Concurrently
When the same input needs to be processed by multiple chains, these chains can be combined into a single chain using RunnableParallel for concurrent execution.
python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
joke_chain = ChatPromptTemplate.from_template("tell me a joke about {topic}") | llm
poem_chain = (
    ChatPromptTemplate.from_template("write a 2-line poem about {topic}") | llm
)
map_chain = RunnableParallel(joke=joke_chain, poem=poem_chain)
map_chain.invoke({"topic": "bear"})
The result would be something like:
python
{
    "joke": AIMessage(content="Why don't bears wear shoes?\n\nBecause they have bear feet!"),
    "poem": AIMessage(content="In the wild's embrace, bear roams free,\nStrength and grace, a majestic decree.")
}
In this example, map_chain executes joke_chain and poem_chain concurrently, improving execution efficiency.
Partial Component Parallelism
Sometimes, the user's question needs to be searched across multiple data sources before passing the results and the question to a prompt template.
In such cases, multiple retrievers can be wrapped in a RunnableParallel for concurrent retrieval.
python
from langchain.schema.runnable import RunnableParallel, RunnablePassthrough
...
# Retriever 1
retriever1 = vectorstore1.as_retriever()
# Retriever 2
retriever2 = vectorstore2.as_retriever()
# Prompt template to fill with user input and retrieved data.
template = """Answer the question based only on the following context:
{context1}
{context2}
Question: {question}
"""
prompt_template = ChatPromptTemplate.from_template(template)
multi_retrieval_chain = (
    RunnableParallel({
        "context1": retriever1,
        "context2": retriever2,
        "question": RunnablePassthrough()
    })
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)
multi_retrieval_chain.invoke("How to write a Juejin booklet?")
Implicit Conversion
In an LCEL chain, dictionaries are implicitly converted to RunnableParallel, meaning the above multi_retrieval_chain can be simplified as follows:
python
multi_retrieval_chain = (
    {
        "context1": retriever1,
        "context2": retriever2,
        "question": RunnablePassthrough()
    }
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)
RunnableBranch
RunnableBranch is used for multi-branch sub-chain scenarios, providing routing capabilities similar to if/else statements. You can create multiple branches in a RunnableBranch and assign a sub-chain to each branch. During execution, RunnableBranch iterates through the branches in order and uses the first branch whose condition is met.
The usage of RunnableBranch was demonstrated earlier in the "Constructing a Complex LLM Chain Using LCEL" section, which you can revisit after going through the features explained here.
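As a standalone illustration, here is a minimal sketch that routes on a simple keyword check instead of an LLM classification:
python
from langchain_core.runnables import RunnableBranch, RunnableLambda

branch = RunnableBranch(
    # Each branch is a (condition, runnable) pair, checked in order
    (lambda x: "math" in x["topic"].lower(), RunnableLambda(lambda x: "handled by math_chain")),
    (lambda x: "physics" in x["topic"].lower(), RunnableLambda(lambda x: "handled by physics_chain")),
    # The final positional argument is the default branch
    RunnableLambda(lambda x: "handled by other_chain"),
)

print(branch.invoke({"topic": "Math"}))     # >> handled by math_chain
print(branch.invoke({"topic": "History"}))  # >> handled by other_chain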
RunnableLambda
RunnableLambda is a powerful Runnable class in LangChain that allows any regular Python function or callable object to be converted into a Runnable object, enabling the use of streaming, batching, asynchronous processing, and other Runnable features.
python
from langchain_core.runnables import RunnableLambda
def add_one(x: int) -> int:
    return x + 1
# Convert the function into a Runnable object
runnable_add_one = RunnableLambda(add_one)
print(runnable_add_one.invoke(5)) # Output: 6
This conversion makes any function usable as a part of an LCEL chain. For example, you can integrate external systems by defining a custom function that reports the output to a logging system:
python
from langchain_core.runnables import RunnableLambda
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
def report_to_log(llm_answer: str):
    print(f"Reporting {llm_answer} to log system")
    return llm_answer
prompt = PromptTemplate.from_template("answer the question: {question}\n\n answer:")
chain = prompt | OpenAI() | RunnableLambda(report_to_log) | StrOutputParser()
chain.invoke({"question": "What's 2 + 2?"})
Note: The function wrapped by RunnableLambda can only have one input parameter.
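If a step needs several values, the usual workaround is to accept a single dict and unpack it inside the function; a minimal sketch:
python
from langchain_core.runnables import RunnableLambda

def add(inputs: dict) -> int:
    # Multiple values are bundled into one dict argument
    return inputs["x"] + inputs["y"]

runnable_add = RunnableLambda(add)
print(runnable_add.invoke({"x": 2, "y": 3}))  # Output: 5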
Summary
LangChain supports two approaches to building chains: the traditional class-based approach and using LCEL syntax. LCEL uses the pipe operator "|" to connect different components, offering stronger composability and support for streaming, batching, asynchronous processing, and more, making LLM application development more flexible and simpler.
LCEL remains native Python syntax, with the pipe operator enabling data passing through the overridden __or__ method. Understanding Runnable and RunnableSequence is key to grasping LCEL's execution flow: each component is a Runnable object, while the chain itself is a RunnableSequence.
To further simplify LLM chain construction, a series of utility classes based on Runnable were developed, such as RunnablePassthrough, RunnableParallel, RunnableBranch, and RunnableLambda. Mastering these classes can significantly improve LLM application development efficiency.