
LCEL

Today, we will delve into one of LangChain's core modules: chains. Chains are a central concept in LangChain because they embody the framework's design philosophy: connecting components to build complex LLM applications in a clear, straightforward way.

In LangChain, the creation of components and the construction of LLM applications are somewhat decoupled. So far, we have studied components such as prompt templates, LLM models, and output parsers. Each of these components plays a role in building an LLM application, linking together in a predefined sequence. If one component underperforms, say we find a better LLM model, we can simply swap out the model instance without modifying other parts of the application.

This approach of connecting components to build chains makes the construction of LLM applications highly flexible and controllable.

LangChain's chain-building approach has undergone an upgrade, shifting from using class-based methods (XXXChain) to constructing LLM chains using LCEL (LangChain Expression Language). Today, we will start with a comparison between traditional chains and LCEL, detailing the advantages and usage of LCEL, and further exploring the underlying principles.

Traditional Chains vs. LCEL Chains

Before diving deeply into LCEL, let's start with an example to understand what LCEL is and how it differs from traditional chains.

Constructing a Traditional Complex LLM Chain

Suppose we need to create three teaching Q&A LLM chains: one for answering physics questions, one for math, and another for general topics.

python
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)

math_prompt_template = """
You are an expert in math.
Respond to the following question:
Question: {question}
Answer:
"""

physics_prompt_template = """
You are an expert in physics.
Respond to the following question:
Question: {question}
Answer:
"""

other_prompt_template = """
Respond to the following question:
Question: {question}
Answer:
"""

math_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(math_prompt_template))
physics_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(physics_prompt_template))
other_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(other_prompt_template))

Now, we need to select different LLM chains based on the type of question a user inputs: if it's a math question, we choose math_chain; if it's a physics question, we choose physics_chain; and for other types of questions, we use other_chain.

To achieve this, we create an additional LLM chain to identify the type of question.

python
classify_prompt_template = """
Given the user question below, classify it as either being about `Math`, `Physics`, or `Other`.
Do not respond with more than one word.

<question>
{question}
</question>

Classification:
"""
classify_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(classify_prompt_template))
# LLMChain.invoke returns a dict; the generated text is under the "text" key
print(classify_chain.invoke({"question": "Tell me a joke"})["text"].strip())
# >> Other
print(classify_chain.invoke({"question": "What is 1+1?"})["text"].strip())
# >> Math
print(classify_chain.invoke({"question": "What is the acceleration due to gravity?"})["text"].strip())
# >> Physics

Next, we use classify_chain for routing, classifying the user's input and directing the question to the corresponding LLM chain.

python
user_input = "What is 1+1?"
question_type = classify_chain.invoke({"question": user_input})["text"].strip()
if question_type == "Math":
    chain = math_chain
elif question_type == "Physics":
    chain = physics_chain
else:
    chain = other_chain
print(chain.invoke({"question": user_input}))

With the addition of if/else branches, the code becomes increasingly complex. Three branches are still manageable, but as the number of chains grows, the code becomes harder to maintain: it is difficult to trace how the chains connect, and changing the routing logic risks breaking the stages that follow.

Constructing a Complex LLM Chain Using LCEL

Now, let's rewrite the above LLM chain using LCEL syntax.

python
llm = OpenAI(temperature=0)
# The prompt templates remain the same as above
math_prompt_template = """xxxx"""
physics_prompt_template = """xxxx"""
other_prompt_template = """xxxx"""  
classify_prompt_template = """xxxx"""

# Constructing four LLM chains using LCEL
from langchain_core.output_parsers import StrOutputParser

math_chain = PromptTemplate.from_template(math_prompt_template) | llm | StrOutputParser()
physics_chain = PromptTemplate.from_template(physics_prompt_template) | llm | StrOutputParser()
other_chain = PromptTemplate.from_template(other_prompt_template) | llm | StrOutputParser()
classify_chain = PromptTemplate.from_template(classify_prompt_template) | llm | StrOutputParser()

# Defining the logic for answering questions
from langchain_core.runnables import RunnableBranch
from operator import itemgetter
answer_chain = RunnableBranch(
    (lambda x: "math" in x["topic"].lower(), math_chain),
    (lambda x: "physics" in x["topic"].lower(), physics_chain),
    other_chain
)

# Combining the final chain
final_chain = {"topic": classify_chain, "question": itemgetter("question")} | answer_chain
print(final_chain.invoke({"question": user_input}))

Using LCEL seems more complex, right?

The syntax of LCEL distinctly moves away from traditional Python, with components and even chains connected using the pipe operator |. Additionally, new concepts such as Runnable are introduced, raising the learning curve.

However, this approach makes the data flow explicit: you can see at a glance where each component or chain feeds into. This composability strengthens flexibility and control over LLM application construction.

If the only benefit was a clearer flow, the increased learning curve might not be worth it for many. But LCEL's promise extends to enabling the seamless transition of prototypes to production without code changes, from simple "prompt + LLM" chains to the most complex configurations.

Why Use LCEL for Chain Building?

Supports Streaming

In some scenarios, streaming can greatly enhance the user experience. For example, in chat applications, the model's response can be streamed to the user token by token rather than arriving only once the entire response is complete. Chains built with LCEL support streaming natively.

python
chain = prompt_template | llm | output_parser
for chunk in chain.stream({"input": "test"}):
    print(chunk, end="", flush=True)

Supports Batch Processing

LCEL can handle multiple inputs simultaneously and return multiple outputs, which is much more efficient than repeatedly invoking the chain. Each stage of an LCEL chain can also support batch processing, a common need when pulling data from multiple external sources for subsequent processing.

python
chain = prompt_template | model | output_parser
result_list = chain.batch([{"input": "test1"}, {"input": "test2"}])
# >> ["output1", "output2"]

Supports Asynchronous Operations

Traditional chain construction requires a lot of async/await code to achieve asynchronous processing, complicating the code. LCEL provides asynchronous versions for every synchronous call (invoke, stream, batch), making asynchronous processing much easier.

python
chain = prompt_template | llm | output_parser
# `await` must appear inside an async function (or a REPL/notebook that allows top-level await)
await chain.ainvoke({"input": "test"})

Supports Fallback Logic

For better robustness, backup LLM chains can be used when the primary chain encounters an issue. In traditional approaches, this requires using try/except for handling exceptions. LCEL supports fallback logic, simplifying the code.

python
chain = prompt_template | llm1 | output_parser
fallback_chain = prompt_template | llm2 | output_parser
# with_fallbacks returns a new runnable; it does not modify `chain` in place
chain_with_fallback = chain.with_fallbacks([fallback_chain])
chain_with_fallback.invoke({"input": "test"})

Furthermore, every component/chain built with LCEL is integrated with LangSmith for easy debugging and testing. LangChain continues to enhance LCEL's capabilities, making LLM application development simpler and more convenient. You don't need to worry about the execution mode of the chain—whether batch processing or asynchronous operation—just connect the components according to the business logic, and the chain's code will not require changes. The more complex the LLM chain, the more pronounced LCEL's advantages become.
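
For example, enabling LangSmith tracing is just a matter of setting a couple of environment variables before running any chain. A minimal sketch, assuming you already have a LangSmith account and API key:

python
import os

# Assumption: you have a LangSmith account; the API key below is a placeholder
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"

# From here on, every invocation of an LCEL chain is recorded as a trace in LangSmith,
# with one nested run per component, without any changes to the chain code itself.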

LCEL Implementation Principles

When using a framework like LCEL (LangChain Expression Language), it is crucial to understand its underlying implementation. Without this knowledge, it can be challenging to troubleshoot issues during development and debugging, as errors may be unclear, and the way to fix them may be elusive.

The Mystery Behind the Pipe Operator

The first step in understanding LCEL is to figure out the meaning behind the pipe operator ("|").

In fact, this is still native Python functionality. The pipe operator "|" is essentially the bitwise OR operator in Python. When the Python interpreter encounters "|", such as in the expression A | B, it will call the A.__or__ method with B as the parameter. In other words, the following two expressions are equivalent:

python
A | B
A.__or__(B)

Thus, we can override the __or__ method to implement a custom pipeline handling logic.

python
class Runnable:
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        def chained_func(*args, **kwargs):
            return other(self.func(*args, **kwargs))
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

In the code above, we have created a Runnable class, which essentially acts as a wrapper. Let's see its most basic usage:

python
def add(x, y):
    return x + y

add_runner = Runnable(add)
print(add_runner(1, 2))  # Output: 3

In this example:

  • We define an add function and then wrap it into a Runnable object.
  • When we instantiate the Runnable object, the __init__ method assigns the provided add function to self.func.
  • Calling add_runner(1, 2) effectively invokes add_runner.__call__(1, 2), which calls the add function internally.

Now, let's increase the complexity and see what happens when two Runnable objects are connected using the pipe operator.

python
def add(x, y):
    return x + y

add_runner = Runnable(add)

def double(x):
    return x * 2

double_runner = Runnable(double)

result_runner = add_runner | double_runner
print(result_runner(1, 2))  # Output: 6

Here's what happens:

  1. add_runner and double_runner are instances of the Runnable class, wrapping the add and double functions, respectively.
  2. add_runner | double_runner is equivalent to add_runner.__or__(double_runner), returning a new Runnable object.
  3. Calling result_runner(1, 2) executes chained_func, where:
  • self.func is add_runner.func, and other is double_runner.
  • First, it calls add(1, 2), returning 3.
  • It then passes the result (3) to double_runner, resulting in double(3), which returns 6.

As we can see, Runnable already possesses the basic capability of passing data through a pipeline.

Runnable

LangChain achieves its custom pipeline functionality by overriding the __or__ method in the Runnable class. However, unlike the example above, Runnable.__or__ returns an instance of RunnableSequence, a Runnable subclass that represents the chained sequence.

By understanding Runnable and RunnableSequence, we can easily grasp the execution flow of an LCEL chain. Each component in the LCEL chain is a Runnable object, while the entire LCEL chain is a RunnableSequence.

python
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI
from langchain.schema.runnable import Runnable, RunnableSequence

llm = OpenAI()
print(isinstance(llm, Runnable))  # Output: True

list_parser = CommaSeparatedListOutputParser()
print(isinstance(list_parser, Runnable))  # Output: True

prompt_template = PromptTemplate(
    template="List some {subject}.\n",
    input_variables=["subject"],
)
print(isinstance(prompt_template, Runnable))  # Output: True

chain = prompt_template | llm | list_parser
print(isinstance(chain, RunnableSequence))  # Output: True

The Runnable class defines six methods: invoke, stream, batch, ainvoke, astream, and abatch, supporting functionalities such as stream processing, batch processing, and asynchronous execution. Each component inherits from Runnable and implements these six methods.
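
For example, the chain built above already exposes all six entry points without any extra code. A minimal sketch (the asynchronous variants must be awaited inside an async function):

python
# Synchronous entry points
print(chain.invoke({"subject": "fruits"}))                           # one input -> one output
print(chain.batch([{"subject": "fruits"}, {"subject": "colors"}]))   # many inputs -> many outputs
for chunk in chain.stream({"subject": "fruits"}):                    # incremental chunks
    print(chunk)

# Asynchronous counterparts (call inside an async function)
# await chain.ainvoke({"subject": "fruits"})
# await chain.abatch([{"subject": "fruits"}, {"subject": "colors"}])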

RunnableSequence represents a sequence of multiple Runnable objects. It exposes attributes named first, middle, and last: first and last are single Runnable instances, and middle is a list holding the Runnables in between, as the sketch below shows. The RunnableSequence class also overrides the six methods to execute the Runnable objects in order.
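
Sticking with the chain built above, we can inspect these attributes directly (a quick sketch; the exact objects printed depend on your components):

python
chain = prompt_template | llm | list_parser

print(chain.first)   # the PromptTemplate
print(chain.middle)  # [llm] — a list of everything between first and last
print(chain.last)    # the CommaSeparatedListOutputParser
print(chain.steps)   # [first, *middle, last]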

For example, let's look at the RunnableSequence.invoke method:

python
class RunnableSequence(RunnableSerializable[Input, Output]):
    ...
    def invoke(...):
        # self.steps is [first, *middle, last]; each `step` is a component in the chain
        for i, step in enumerate(self.steps):
            input = step.invoke(input, ...)

Here's a simplified execution equivalent:

python
chain = component1 | component2 | component3
output = chain.invoke(input)

# Equivalent to:
output1 = component1.invoke(input)
output2 = component2.invoke(output1)
output = component3.invoke(output2)

Thus, when an LCEL chain is executed, each component in the chain sequentially calls its corresponding method (invoke/stream/batch), passing the output to the next component, until the final result is reached.

Understanding this mechanism clarifies how LCEL chains facilitate modular and flexible LLM application development.

Common Runnables in LangChain

LangChain provides a variety of utility classes based on Runnable to enrich the functionality of LCEL (LangChain Expression Language). These classes significantly simplify the code complexity when constructing complex chains. Let's explore some common classes.

RunnablePassthrough

RunnablePassthrough is a simple Runnable wrapper used for directly passing data through the chain. When positioned at the first step of the chain, it represents the user's input; when in a middle step, it represents the output from the previous component.

RunnablePassthrough is particularly useful when adjusting the input to the next component, such as in RAG (Retrieval-Augmented Generation) scenarios where you need to first retrieve related data based on user input, then pass the retrieved results along with the original question to the downstream prompt template.

python
from langchain.schema.runnable import RunnablePassthrough
...
# The retriever is used to search for data based on user input.
retriever = vectorstore.as_retriever()

# Prompt template to be filled with user input and retrieved data.
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt_template = ChatPromptTemplate.from_template(template)

retrieval_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)

retrieval_chain.invoke("How to write a Juejin booklet?")

In the example above:

  1. The user's question is passed to both retriever and RunnablePassthrough().
  2. The retriever completes the search and assigns the result to context.
  3. The retrieved context and the original question are then passed together to the prompt template.

(Figure lcel1.webp: data flow of retrieval_chain — the question feeds both the retriever and RunnablePassthrough before reaching the prompt template)

RunnableParallel

RunnableParallel is designed for parallel execution of multiple Runnable objects, allowing components/chains to be wrapped and executed concurrently. This is particularly useful for enabling parallel processing for certain steps or multiple chains.

Executing Multiple Chains Concurrently

When the same input needs to be processed by multiple chains, these chains can be combined into a single chain using RunnableParallel for concurrent execution.

python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
joke_chain = ChatPromptTemplate.from_template("tell me a joke about {topic}") | llm
poem_chain = (
    ChatPromptTemplate.from_template("write a 2-line poem about {topic}") | llm
)

map_chain = RunnableParallel(joke=joke_chain, poem=poem_chain)

map_chain.invoke({"topic": "bear"})

The result would be something like:

json
{
  "joke": AIMessage(content="Why don't bears wear shoes?\n\nBecause they have bear feet!"),
  "poem": AIMessage(content="In the wild's embrace, bear roams free,\nStrength and grace, a majestic decree.")
}

In this example, map_chain executes joke_chain and poem_chain concurrently, improving execution efficiency.

Partial Component Parallelism

Sometimes, the user's question needs to be searched across multiple data sources before passing the results and the question to a prompt template.

In such cases, multiple retrievers can be wrapped in a RunnableParallel for concurrent retrieval.

(Figure lcel2.webp: two retrievers queried in parallel, with their results and the question merged into the prompt template)

python
from langchain.schema.runnable import RunnablePassthrough
...
# Retriever 1
retriever1 = vectorstore1.as_retriever()
# Retriever 2
retriever2 = vectorstore2.as_retriever()

# Prompt template to fill with user input and retrieved data.
template = """Answer the question based only on the following context:
{context1}

{context2}

Question: {question}
"""
prompt_template = ChatPromptTemplate.from_template(template)

multi_retrieval_chain = (
    RunnableParallel({
        "context1": retriever1, 
        "context2": retriever2,
        "question": RunnablePassthrough()
    })
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)

multi_retrieval_chain.invoke("How to write a Juejin booklet?")

Implicit Conversion

In an LCEL chain, dictionaries are implicitly converted to RunnableParallel, meaning the above multi_retrieval_chain can be simplified as follows:

python
multi_retrieval_chain = (
    {
        "context1": retriever1, 
        "context2": retriever2,
        "question": RunnablePassthrough()
    }
    | prompt_template
    | OpenAI()
    | StrOutputParser()
)

RunnableBranch

RunnableBranch is used for multi-branch sub-chain scenarios, providing routing capabilities similar to if/else statements. You can create multiple branches in a RunnableBranch and assign a sub-chain to each branch. During execution, RunnableBranch iterates through the branches in order and uses the first branch that meets the condition.

The usage of RunnableBranch was already demonstrated in the earlier section "Constructing a Complex LLM Chain Using LCEL"; you can revisit that example now that the routing behavior has been explained. A minimal standalone sketch follows.
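
Here is a small, self-contained sketch that runs without any LLM, using RunnableLambda sub-chains in place of real LLM chains (the branch contents are purely illustrative):

python
from langchain_core.runnables import RunnableBranch, RunnableLambda

branch = RunnableBranch(
    # (condition, runnable) pairs, checked in order
    (lambda x: x["topic"] == "math", RunnableLambda(lambda x: "math branch")),
    (lambda x: x["topic"] == "physics", RunnableLambda(lambda x: "physics branch")),
    # the final positional argument is the default branch
    RunnableLambda(lambda x: "other branch"),
)

print(branch.invoke({"topic": "math"}))     # >> math branch
print(branch.invoke({"topic": "history"}))  # >> other branch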

RunnableLambda

RunnableLambda is a powerful Runnable class in LangChain that allows any regular Python function or callable object to be converted into a Runnable object, enabling the use of streaming, batching, asynchronous processing, and other features of Runnable.

python
from langchain_core.runnables import RunnableLambda
def add_one(x: int) -> int:
    return x + 1

# Convert the function into a Runnable object
runnable_add_one = RunnableLambda(add_one)
print(runnable_add_one.invoke(5))  # Output: 6

This conversion makes any function usable as a part of an LCEL chain. For example, you can integrate external systems by defining a custom function that reports the output to a logging system:

python
from langchain_core.runnables import RunnableLambda
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def report_to_log(llm_answer: str):
    print(f"Reporting {llm_answer} to log system")
    return llm_answer

prompt = PromptTemplate.from_template("answer the question: {question}\n\n answer:")
chain = prompt | OpenAI() | RunnableLambda(report_to_log) | StrOutputParser()
chain.invoke({"question": "What's 2 + 2?"})

Note: the function wrapped by RunnableLambda must accept a single argument. To pass several values, bundle them into one dict, as in the sketch below.
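
A quick sketch (the multiply function and its key names are purely illustrative):

python
from langchain_core.runnables import RunnableLambda

def multiply(inputs: dict) -> int:
    # All values arrive packed into the single dict argument
    return inputs["a"] * inputs["b"]

runnable_multiply = RunnableLambda(multiply)
print(runnable_multiply.invoke({"a": 3, "b": 4}))  # Output: 12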

Summary

LangChain supports two approaches to building chains: the traditional class-based approach and using LCEL syntax. LCEL uses the pipe operator "|" to connect different components, offering stronger composability and support for streaming, batching, asynchronous processing, and more, making LLM application development more flexible and simpler.

LCEL remains native Python syntax, with the pipe operator enabling data passing through the overridden __or__ method. Understanding Runnable and RunnableSequence is key to grasping LCEL's execution flow—each component is a Runnable object, while the chain is a RunnableSequence.

To further simplify LLM chain construction, a series of utility classes based on Runnable—such as RunnablePassthrough, RunnableParallel, RunnableBranch, and RunnableLambda—were developed. Mastering these classes can significantly enhance LLM application development efficiency.
