LangChain (0.0.340) Official Documentation, Part 11: Agents: Agent Types

Published: December 29, 2023

I. Quick Start

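We can think of agents as a way of giving an LLM (large language model) access to "tools". Just as a person uses a calculator for arithmetic or a Google search to look something up, an agent lets the LLM do the same kind of thing.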

Agents are LLMs that can use tools such as a calculator, search, or code execution.

1.1 Concepts

Before we start, let's go over a few concepts:

  1. Agent

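An Agent is the chain that decides which action to take next. It is driven by a language model and a prompt. The inputs to an Agent are:

  • Tools: descriptions of the available tools
  • User input: the high-level objective
  • intermediate_steps (List[Tuple[AgentAction, Any]]): the (action, tool output) pairs already executed in pursuit of the objective. They are passed to later iterations so that the agent knows what work it has already done. The observation is typed as Any for maximum flexibility; in practice it is usually a string.

The output of an Agent is either the next action to take (AgentAction) or the final response to send to the user (AgentFinish).

  • AgentAction: a dataclass representing the action the agent should take. It has a tool attribute (the name of the tool to invoke) and a tool_input attribute (the input to that tool).

  • AgentFinish: a dataclass representing the final response to return to the user. It has a return_values attribute, a dictionary of values to return. It usually contains a single key, output, whose value is a string.

Different Agents also use different prompting styles, different ways of encoding inputs, and different ways of parsing outputs. LangChain ships with many built-in Agents; see Agent Types for details.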

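  2. Tools
    Tools are functions that an Agent can call. Two things matter when designing them: giving the Agent the right tools, and describing those tools in the right way. LangChain ships with many built-in tools; see the tools integrations section.

  3. Toolkits
    Many tasks require an Agent to combine several related tools, so LangChain introduces the concept of Toolkits. LangChain provides many built-in toolkits for getting started quickly; see the toolkits integrations section.

  4. AgentExecutor
    The agent executor is the runtime for an agent. It calls the Agent, executes the actions it chooses, passes the action outputs back to the Agent, and repeats. Along the way, the AgentExecutor also handles a number of complications, such as the Agent selecting a tool that does not exist, tool errors, Agent output that cannot be parsed into a valid tool invocation, and logging. In pseudocode, the process looks like this: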

next_action = agent.get_action(...)      # get the Agent's next action
while next_action != AgentFinish:        # loop until the Agent returns a final response
    # execute the action chosen by the Agent and collect the observation
    observation = run(next_action)
    # pass the previous action and its observation back so the Agent can decide based on the current state
    next_action = agent.get_action(..., next_action, observation)
return next_action                       # return the final response
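
To make the pseudocode concrete, here is a minimal runnable sketch of the same loop in plain Python. It does not use LangChain: the AgentAction and AgentFinish dataclasses are simplified stand-ins, and scripted_agent, run_agent, and the word_length tool are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union


@dataclass
class AgentAction:
    tool: str        # name of the tool to call
    tool_input: Any  # input passed to that tool


@dataclass
class AgentFinish:
    return_values: Dict[str, Any]  # usually {"output": "..."}


Step = Tuple[AgentAction, Any]  # (action, observation)


def scripted_agent(user_input: str, steps: List[Step]) -> Union[AgentAction, AgentFinish]:
    """Hypothetical stand-in for an LLM-driven agent."""
    if not steps:
        # Nothing has been done yet: request the word-length tool.
        return AgentAction(tool="word_length", tool_input=user_input)
    _, observation = steps[-1]
    return AgentFinish(return_values={"output": f"The word has {observation} letters."})


def run_agent(agent, tools: Dict[str, Callable[[Any], Any]], user_input: str) -> Dict[str, Any]:
    steps: List[Step] = []
    next_action = agent(user_input, steps)
    while not isinstance(next_action, AgentFinish):
        # Execute the chosen tool and record the (action, observation) pair.
        observation = tools[next_action.tool](next_action.tool_input)
        steps.append((next_action, observation))
        next_action = agent(user_input, steps)
    return next_action.return_values


result = run_agent(scripted_agent, {"word_length": len}, "educa")
print(result["output"])
```

A real runtime would also need to handle unknown tools, tool errors, and unparseable output, which this sketch deliberately leaves out.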

The AgentExecutor class is the main agent runtime in LangChain; for usage examples, see "Examples using AgentExecutor". Other, more experimental runtimes are also supported, such as Plan-and-execute Agent, Baby AGI, and Auto GPT.

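1.2 Basic Example

Below we build an Agent from scratch with LCEL (LangChain Expression Language), which is a good way to understand the Agent framework. This involves building the Agent itself, defining a custom tool, and running the Agent and the tool in a custom loop. Finally, we show how the standard LangChain AgentExecutor simplifies execution.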

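1.2.1 Setting Up LangSmith

Before returning user-facing output, an Agent performs a series of intermediate steps, which makes debugging difficult. LangSmith helps with exactly this. When an Agent is built with LCEL, LangSmith traces it automatically. When we invoke the AgentExecutor, we get full visibility into not only the intermediate steps the Agent plans but also the inputs and outputs of the tools.

To enable LangSmith, we only need to set the following environment variables: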

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="<your-api-key>"
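1.2.2 Creating Agents with LCEL

In this example, we use OpenAI Function Calling to create a custom Agent that computes the length of a word. This is generally the most reliable way to create an Agent.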

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
llm.invoke("how many letters in the word educa?")
AIMessage(content='There are 6 letters in the word "educa".')

As we can see, the model struggles to count the letters in the string "educa": it answers 6 instead of 5.

  1. Define a custom tool
    Define a function named get_word_length and mark it as a tool (one that LangChain can use) with the @tool decorator.
from langchain.agents import tool

@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)

tools = [get_word_length]
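  2. Create the prompt
    Create a chat prompt template containing a system message, the user input, and a message placeholder that holds tool calls and their outputs.

Because OpenAI Function Calling models are fine-tuned for tool use, they are good at it and we do not need to write elaborate instructions for the LLM. The prompt only needs two variables: input and agent_scratchpad. The former is the string entered by the user; the latter is the sequence of messages containing the agent's previous tool calls and the corresponding tool outputs.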

from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, but bad at calculating lengths of words.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

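MessagesPlaceholder can be thought of as a slot for messages that are stored and passed along between steps, such as tool calls or the chat memory (see "Memory in LLMChain"). You can even use it to control which messages are rendered during a conversation (see "Types of MessagePromptTemplate").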
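
As a rough illustration of what a message placeholder does, the plain-Python sketch below expands a system/user/placeholder template into a flat message list. It is not LangChain's implementation: messages are simplified to (role, content) tuples, and build_messages is a hypothetical helper written for this example.

```python
from typing import Any, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content); a simplified stand-in for message objects


def build_messages(variables: Dict[str, Any]) -> List[Message]:
    """Expand a system/user/placeholder template into a flat message list."""
    messages: List[Message] = [
        ("system", "You are very powerful assistant, but bad at calculating lengths of words."),
        ("user", "{input}".format(input=variables["input"])),
    ]
    # The placeholder contributes zero or more whole messages, not formatted text.
    messages.extend(variables["agent_scratchpad"])
    return messages


first_turn = build_messages({"input": "how many letters in educa?", "agent_scratchpad": []})
second_turn = build_messages(
    {
        "input": "how many letters in educa?",
        "agent_scratchpad": [("ai", "call get_word_length(word='educa')"), ("function", "5")],
    }
)
print(len(first_turn), len(second_turn))
```

On the first turn the scratchpad is empty, so the prompt has only two messages; once a tool has been called, the earlier call and its output are spliced in after the user message.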

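  3. Bind the tools
    Use format_tool_to_openai_function to convert each tool into the OpenAI function format, then use bind to attach the resulting functions to the llm so that they are passed in on every call.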
from langchain.tools.render import format_tool_to_openai_function

llm_with_tools = llm.bind(functions=[format_tool_to_openai_function(t) for t in tools])
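
For intuition about the OpenAI function format, the sketch below builds a schema of that general shape (name, description, and a JSON-Schema-style parameters object) from an ordinary Python function. It is an assumption-laden illustration, not the output of format_tool_to_openai_function: to_function_schema and _JSON_TYPES are hypothetical names, and the real helper may emit additional fields.

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python annotations to JSON Schema type names (illustrative only).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_function_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Describe a Python function in an OpenAI-function-style dictionary."""
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are mandatory
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)


schema = to_function_schema(get_word_length)
print(schema)
```

The description comes from the docstring, which is one reason tool docstrings deserve care: they are what the model reads when deciding whether to call the tool.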
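  4. Create the agent
    Use LCEL to compose the remaining components: format_to_openai_function_messages formats the intermediate steps into messages suitable for sending to the model, and OpenAIFunctionsAgentOutputParser parses the model's output message into an AgentAction or AgentFinish.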
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)

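Each call to the agent takes a dictionary with two keys, input and intermediate_steps, which hold the user input and the results of the intermediate steps. The lambda functions map these values onto the "input" and "agent_scratchpad" variables in the first step of the agent (a dictionary in LCEL means its entries are computed in parallel).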
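
The data flow through that first step can be sketched in plain Python. The helpers map_inputs and pipe below are hypothetical and only mimic the shape of the composition; unlike LCEL, this sketch evaluates the dictionary entries one after another rather than in parallel.

```python
from typing import Any, Callable, Dict


def map_inputs(mapping: Dict[str, Callable[[dict], Any]]) -> Callable[[dict], dict]:
    """Apply every callable in `mapping` to the same input dictionary."""
    def run(x: dict) -> dict:
        return {key: fn(x) for key, fn in mapping.items()}
    return run


def pipe(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Feed the output of each stage into the next one, left to right."""
    def run(value: Any) -> Any:
        for stage in stages:
            value = stage(value)
        return value
    return run


first_step = map_inputs(
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: [f"{a} -> {o}" for a, o in x["intermediate_steps"]],
    }
)


def render(values: dict) -> str:
    # Stand-in for the prompt stage: turn the mapped variables into text.
    return f"{values['input']} | scratchpad={values['agent_scratchpad']}"


chain = pipe(first_step, render)
text = chain({"input": "how many letters in educa?", "intermediate_steps": [("get_word_length", 5)]})
print(text)
```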

  5. Test
agent.invoke({"input": "how many letters in the word educa?", "intermediate_steps": []})
AgentActionMessageLog(tool='get_word_length', tool_input={'word': 'educa'}, log="\nInvoking: `get_word_length` with `{'word': 'educa'}`\n\n\n", message_log=[AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{\n  "word": "educa"\n}', 'name': 'get_word_length'}})])

The output is an AgentActionMessageLog that names the tool to call (get_word_length) and carries the corresponding message log.

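1.2.3 Running with a Custom Runtime

That was only the first step; we still need to write a runtime for the agent. The simplest approach is to invoke the agent in a loop until it returns an AgentFinish.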

from langchain_core.agents import AgentFinish

user_input = "how many letters in the word educa?"
intermediate_steps = []
while True:
    output = agent.invoke(
        {
            "input": user_input,
            "intermediate_steps": intermediate_steps,
        }
    )
    if isinstance(output, AgentFinish):
        final_result = output.return_values["output"]
        break
    else:
        print(f"TOOL NAME: {output.tool}")
        print(f"TOOL INPUT: {output.tool_input}")
        tool = {"get_word_length": get_word_length}[output.tool]
        observation = tool.run(output.tool_input)
        intermediate_steps.append((output, observation))
print(final_result)
TOOL NAME: get_word_length
TOOL INPUT: {'word': 'educa'}
There are 5 letters in the word "educa"
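1.2.4 Running with AgentExecutor

AgentExecutor simplifies this process and adds features such as error handling, early stopping, and tracing. We only need to bundle the agent and the tools when creating the AgentExecutor, then call invoke to process a single input.

In addition, verbose=True enables verbose mode, which prints more detail during execution, such as the steps taken, the tool calls and their outputs, and the agent's replies.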

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "how many letters in the word educa?"})
> Entering new AgentExecutor chain...

Invoking: `get_word_length` with `{'word': 'educa'}`


5There are 5 letters in the word "educa".

> Finished chain.

{'input': 'how many letters in the word educa?',
 'output': 'There are 5 letters in the word "educa".'}
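
To give a feel for the kind of safeguards an executor adds on top of a bare loop, here is a plain-Python sketch with an iteration cap and handling for unknown tools and tool errors. It is not how AgentExecutor is implemented: run_with_safeguards, confused_agent, the dictionary-based decisions, and the messages it returns are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_with_safeguards(
    agent: Callable[[str, List[Tuple[dict, Any]]], dict],
    tools: Dict[str, Callable[[Any], Any]],
    user_input: str,
    max_iterations: int = 3,
) -> str:
    steps: List[Tuple[dict, Any]] = []
    for _ in range(max_iterations):
        decision = agent(user_input, steps)
        if decision["type"] == "finish":
            return decision["output"]
        tool = tools.get(decision["tool"])
        if tool is None:
            # Report the mistake back to the agent instead of crashing.
            observation = f"{decision['tool']} is not a valid tool, try one of {sorted(tools)}."
        else:
            try:
                observation = tool(decision["tool_input"])
            except Exception as exc:  # surface tool errors as observations
                observation = f"Tool error: {exc}"
        steps.append((decision, observation))
    return "Agent stopped: iteration limit reached."


def confused_agent(user_input: str, steps: List[Tuple[dict, Any]]) -> dict:
    """Hypothetical agent that first picks a tool that does not exist."""
    if not steps:
        return {"type": "action", "tool": "count_letters", "tool_input": user_input}
    last_observation = steps[-1][1]
    if isinstance(last_observation, str):  # an error message came back
        return {"type": "action", "tool": "get_word_length", "tool_input": user_input}
    return {"type": "finish", "output": f"{user_input} has {last_observation} letters"}


tools = {"get_word_length": len}
recovered = run_with_safeguards(confused_agent, tools, "educa", max_iterations=3)
stopped = run_with_safeguards(confused_agent, tools, "educa", max_iterations=2)
print(recovered)
print(stopped)
```

With three iterations the agent recovers from its bad first choice; with two it is cut off before it can finish, which is the early-stopping behavior in miniature.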
1.2.5 Adding memory

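At this point the agent works, but it is stateless: it remembers nothing about previous interactions. Adding memory solves this. We only need to:

  • add a memory variable to the prompt
  • keep track of the chat history
  1. Add a memory variable to the prompt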
from langchain.prompts import MessagesPlaceholder

MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, but bad at calculating lengths of words.",
        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
  2. Set up a list to track the chat history
from langchain_core.messages import AIMessage, HumanMessage

chat_history = []
  3. Add these pieces to the agent chain. Note that the input dictionary of the first component now has an extra "chat_history" key.
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        ),
        "chat_history": lambda x: x["chat_history"],
    }
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
  4. At run time, track the inputs and outputs as chat history
input1 = "how many letters in the word educa?"
result = agent_executor.invoke({"input": input1, "chat_history": chat_history})
chat_history.extend(
    [
        HumanMessage(content=input1),
        AIMessage(content=result["output"]),
    ]
)
agent_executor.invoke({"input": "is that a real word?", "chat_history": chat_history})
> Entering new AgentExecutor chain...

Invoking: `get_word_length` with `{'word': 'educa'}`


5There are 5 letters in the word "educa".

> Finished chain.


> Entering new AgentExecutor chain...
No, "educa" is not a real word in English.

> Finished chain.

{'input': 'is that a real word?',
 'chat_history': [HumanMessage(content='how many letters in the word educa?'),
  AIMessage(content='There are 5 letters in the word "educa".')],
 'output': 'No, "educa" is not a real word in English.'}

The runs above can be inspected in the LangSmith trace.

To have each turn added to the chat history automatically, enable memory when creating the AgentExecutor; see "Custom LLM Agent" and "Zep" for details.

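from langchain.memory import ConversationBufferMemory

# memory_key must match the prompt's MessagesPlaceholder variable, and
# return_messages=True makes the memory return message objects rather than one string
memory = ConversationBufferMemory(memory_key=MEMORY_KEY, return_messages=True)
agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True)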

1.3 Next Steps

1.4 Agent Types

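The following Agent Types are available in LangChain:

  • Zero-shot ReAct: the most general-purpose agent. It uses the ReAct framework to decide which tool to use, based solely on each tool's description.

  • Structured tool chat: supports multi-input tools and more structured interaction with the user. A tool's args_schema specifies the structure and types of the inputs it requires, which helps keep the interaction between agent and tool accurate and consistent. This is useful for more complex tool usage.

  • Conversational: compared with the standard ReAct Agent, the Conversational Agent's prompt is more conversational, which makes it better suited to dialogue.

  • Self-ask with search: an agent that answers questions by looking up relevant information through a search tool.

  • ReAct document store: uses the ReAct framework to interact with a document store. Two tools must be provided: a "Search" tool for finding a document, and a "Lookup" tool for looking up a term in the most recently found document.

  • XML Agent: an Agent that uses XML to format tool calls and final answers.

There is also the OpenAI family:

  • OpenAI Functions: an Agent driven by the function-calling feature of the OpenAI API. Certain OpenAI models (such as gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called and to return the arguments needed to call it.

  • OpenAI Multi Functions Agent: use this Agent to build an AI assistant with multiple functions.

  • OpenAI Tools: can call several tools in parallel (parallel function calling). You can use OpenAI Tools to extend an Agent, for example with DuckDuckGo search or Bearly's code interpreter.

  • OpenAI Assistants: a way of building AI assistants with the OpenAI API. An assistant has instructions and can use models, tools, and knowledge to respond to user queries. The Assistants API supports three types of tools: code interpreter, retrieval, and function calling. You can interact with OpenAI Assistants using OpenAI tools or custom tools. With OpenAI tools only, you can invoke the assistant directly and get the final answer. With custom tools, you can run the assistant and tool-execution loop with the built-in AgentExecutor, or easily write your own executor.

OpenAI Functions and OpenAI Multi Functions Agent focus on handling user queries through function calling. OpenAI Tools offers a way to call several tools in parallel. OpenAI Assistants is a higher-level approach that combines models, tools, and knowledge to build more complex AI assistants.

For detailed AgentType usage examples, see "Examples using AgentType".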

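II. ReAct

This section shows how to create a ReAct agent with LangChain, first driven by an LLM and then by a chat model.

2.1 Using an LLM

2.1.1 Building the Agent Manually with LCEL
  1. Import the model and the tools ("serpapi" and "llm-math")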
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
  2. Create the agent with LCEL
from langchain import hub
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import ReActSingleInputOutputParser
from langchain.tools.render import render_text_description

prompt = hub.pull("hwchase17/react")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)

llm_with_stop = llm.bind(stop=["\nObservation"])

# Build the agent pipeline: input mapping, prompt, LLM, and output parser
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | llm_with_stop
    | ReActSingleInputOutputParser()
)
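  • prompt = ...: pulls a predefined prompt from the hub, then fills in the text description of the tools (rendered with render_text_description) and the comma-separated tool names.
  • partial: partially formats a prompt template (see "Partial prompt templates"). Here it presets the values of the tools and tool_names variables.
  • llm.bind(stop=["\nObservation"]): binds a stop sequence to the LLM instance, so the model stops generating as soon as its output would contain "\nObservation" (for bind, see "Binding runtime arguments").
  • format_log_to_str: formats the log of intermediate steps into a single string.
  • ReActSingleInputOutputParser: parses the output of the ReAct agent into an AgentAction or an AgentFinish. It expects the output in one of the following formats: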
# AgentAction
Thought: agent thought here
Action: search
Action Input: what is the temperature in SF?

# AgentFinish
Thought: agent thought here
Final Answer: The temperature is 100 degrees
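
The two formats above can be told apart with a few lines of plain Python. The sketch below is a simplified, hypothetical parser (parse_react_output and its dictionary return values are not LangChain's API) that only illustrates the idea: look for a final answer first, otherwise extract the tool name and its input.

```python
import re
from typing import Dict

FINAL_ANSWER = "Final Answer:"
# "Action: <tool>" followed on the next line by "Action Input: <input>".
ACTION_PATTERN = re.compile(r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL)


def parse_react_output(text: str) -> Dict[str, str]:
    """Classify ReAct-style text as a tool call or a final answer."""
    if FINAL_ANSWER in text:
        return {"type": "finish", "output": text.split(FINAL_ANSWER)[-1].strip()}
    match = ACTION_PATTERN.search(text)
    if match is None:
        raise ValueError(f"Could not parse agent output: {text!r}")
    return {
        "type": "action",
        "tool": match.group(1).strip(),
        "tool_input": match.group(2).strip(),
    }


action = parse_react_output(
    "Thought: agent thought here\nAction: search\nAction Input: what is the temperature in SF?"
)
finish = parse_react_output(
    "Thought: agent thought here\nFinal Answer: The temperature is 100 degrees"
)
print(action)
print(finish)
```

Stopping generation at "\nObservation" matters here: without it, the model may invent its own observation after the Action Input line, and that invented text would end up inside the parsed tool input.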
  3. Create and run the agent executor
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)
> Entering new AgentExecutor chain...
 I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.
Action: Search
Action Input: "Leo DiCaprio girlfriend"model Vittoria Ceretti I need to find out Vittoria Ceretti's age
Action: Search
Action Input: "Vittoria Ceretti age"25 years I need to calculate 25 raised to the 0.43 power
Action: Calculator
Action Input: 25^0.43Answer: 3.991298452658078 I now know the final answer
Final Answer: Leo DiCaprio's girlfriend is Vittoria Ceretti and her current age raised to the 0.43 power is 3.991298452658078.

> Finished chain.


{'input': "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
 'output': "Leo DiCaprio's girlfriend is Vittoria Ceretti and her current age raised to the 0.43 power is 3.991298452658078."}
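2.1.2 Building the Agent Automatically with AgentType

You can also use an off-the-shelf agent type to simplify this, for example by passing AgentType.ZERO_SHOT_REACT_DESCRIPTION when initializing the agent executor with initialize_agent.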

agent_executor = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)
> Entering new AgentExecutor chain...
 I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.
Action: Search
Action Input: "Leo DiCaprio girlfriend"
Observation: model Vittoria Ceretti
Thought: I need to find out Vittoria Ceretti's age
Action: Search
Action Input: "Vittoria Ceretti age"
Observation: 25 years
Thought: I need to calculate 25 raised to the 0.43 power
Action: Calculator
Action Input: 25^0.43
Observation: Answer: 3.991298452658078
Thought: I now know the final answer
Final Answer: Leo DiCaprio's girlfriend is Vittoria Ceretti and her current age raised to the 0.43 power is 3.991298452658078.

> Finished chain.


{'input': "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
 'output': "Leo DiCaprio's girlfriend is Vittoria Ceretti and her current age raised to the 0.43 power is 3.991298452658078."}

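2.2 Using Chat Models

2.2.1 Building the Agent Manually

The examples above all drive the ReAct agent with an LLM, but you can also use a chat model. The main difference is that with a chat model the agent's actions are encoded as JSON: chat models are somewhat harder to steer, and JSON makes it possible to enforce the output format.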

from langchain.chat_models import ChatOpenAI
from langchain.agents.output_parsers import ReActJsonSingleInputOutputParser

prompt = hub.pull("hwchase17/react-json")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)

chat_model = ChatOpenAI(temperature=0)
chat_model_with_stop = chat_model.bind(stop=["\nObservation"])

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | chat_model_with_stop
    | ReActJsonSingleInputOutputParser()
)

As you can see, only the prompt, the model, and the output parser differ; everything else stays the same.

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)
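
The JSON encoding used in this subsection can be illustrated with a small stand-alone parser. This is a hedged sketch rather than the logic of ReActJsonSingleInputOutputParser: parse_json_action is a hypothetical helper, and it assumes the model replies with a JSON blob, optionally wrapped in a markdown code fence, containing "action" and "action_input" keys.

```python
import json
import re
from typing import Any, Dict

FENCE = "`" * 3  # built programmatically so the marker never appears literally in this example
BLOB_PATTERN = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_action(text: str) -> Dict[str, Any]:
    """Extract a JSON action blob and classify it as a tool call or a final answer."""
    match = BLOB_PATTERN.search(text)
    payload = match.group(1) if match else text  # fall back to the raw text
    blob = json.loads(payload.strip())
    if blob["action"] == "Final Answer":
        return {"type": "finish", "output": blob["action_input"]}
    return {"type": "action", "tool": blob["action"], "tool_input": blob["action_input"]}


reply = (
    "Action:\n"
    + FENCE
    + 'json\n{"action": "Search", "action_input": "Vittoria Ceretti age"}\n'
    + FENCE
)
parsed = parse_json_action(reply)
final = parse_json_action('{"action": "Final Answer", "action_input": "25 years"}')
print(parsed)
print(final)
```

Because the action is a structured object rather than free text, malformed output fails loudly in json.loads instead of being silently misread.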
2.2.2 Building the Agent Automatically

You can also use an off-the-shelf agent type:

agent = initialize_agent(
    tools, chat_model, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run(
    "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
)

2.3 ReAct document store

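A document store is a database or storage system for storing and managing documents. It is a kind of non-relational (NoSQL) database, meaning it needs no predefined schema or fixed table structure. Document stores typically offer features such as document indexing, full-text search, a query language, replication, and sharding, and they are widely used in content management systems, blogging platforms, social media, logging, and real-time analytics. Common examples include MongoDB, Couchbase, and Elasticsearch.

We can create a ReAct agent that uses LangChain's tools and models to run search and lookup operations against a document store.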

from langchain.agents import AgentType, Tool, initialize_agent
from langchain.agents.react.base import DocstoreExplorer
from langchain.docstore import Wikipedia
from langchain.llms import OpenAI

docstore = DocstoreExplorer(Wikipedia())
tools = [
    Tool(
        name="Search",
        func=docstore.search,
        description="useful for when you need to ask with search",
    ),
    Tool(
        name="Lookup",
        func=docstore.lookup,
        description="useful for when you need to ask with lookup",
    ),
]

llm = OpenAI(temperature=0, model_name="gpt-3.5-turbo-instruct")
react = initialize_agent(tools, llm, agent=AgentType.REACT_DOCSTORE, verbose=True)
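  • DocstoreExplorer: a LangChain helper class for working with a document store. It provides methods such as search and lookup for searching and looking things up in the store.
  • Wikipedia(): creates a Wikipedia document store instance that loads and processes Wikipedia pages.
  • Two tool objects are created, one for search and one for lookup. Each has a name, a function (func), and a description.
  • initialize_agent initializes a ReAct Docstore agent, which uses the LLM to search the document store and look up terms in it.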
question = "Author David Chanoff has collaborated with a U.S. Navy admiral who served as the ambassador to the United Kingdom under which President?"
react.run(question)
> Entering new AgentExecutor chain...

Thought: I need to search David Chanoff and find the U.S. Navy admiral he collaborated with. Then I need to find which President the admiral served under.

Action: Search[David Chanoff]

Observation: David Chanoff is a noted author of non-fiction work. His work has typically involved collaborations with the principal protagonist of the work concerned. His collaborators have included; Augustus A. White, Joycelyn Elders, Đoàn Văn Toại, William J. Crowe, Ariel Sharon, Kenneth Good and Felix Zandman. He has also written about a wide range of subjects including literary history, education and foreign for The Washington Post, The New Republic and The New York Times Magazine. He has published more than twelve books.
Thought: The U.S. Navy admiral David Chanoff collaborated with is William J. Crowe. I need to find which President he served under.

Action: Search[William J. Crowe]

Observation: William James Crowe Jr. (January 2, 1925 – October 18, 2007) was a United States Navy admiral and diplomat who served as the 11th chairman of the Joint Chiefs of Staff under Presidents Ronald Reagan and George H. W. Bush, and as the ambassador to the United Kingdom and Chair of the Intelligence Oversight Board under President Bill Clinton.
Thought: William J. Crowe served as the ambassador to the United Kingdom under President Bill Clinton, so the answer is Bill Clinton.

Action: Finish[Bill Clinton]

> Finished chain.
'Bill Clinton'
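
The division of labor between the Search and Lookup tools in this section can be sketched with an in-memory stand-in. ToyDocstoreExplorer below is hypothetical and much simpler than DocstoreExplorer: it assumes documents are plain strings whose paragraphs are separated by blank lines, search returns the first paragraph of a document and remembers it, and lookup returns the first paragraph of the remembered document that mentions a term.

```python
from typing import Dict, Optional


class ToyDocstoreExplorer:
    """In-memory sketch of a search-then-lookup helper."""

    def __init__(self, documents: Dict[str, str]) -> None:
        self.documents = documents
        self.current: Optional[str] = None  # text of the most recently found document

    def search(self, title: str) -> str:
        text = self.documents.get(title)
        if text is None:
            self.current = None
            return f"Could not find [{title}]."
        self.current = text
        return text.split("\n\n")[0]  # summary: first paragraph only

    def lookup(self, term: str) -> str:
        if self.current is None:
            raise ValueError("Call search() successfully before lookup().")
        hits = [p for p in self.current.split("\n\n") if term.lower() in p.lower()]
        return hits[0] if hits else "No results"


docstore = ToyDocstoreExplorer(
    {
        "William J. Crowe": (
            "William J. Crowe was a United States Navy admiral and diplomat.\n\n"
            "He served as the ambassador to the United Kingdom under President Bill Clinton."
        )
    }
)
summary = docstore.search("William J. Crowe")
detail = docstore.lookup("ambassador")
print(summary)
print(detail)
```

Lookup depends on state left behind by the previous search, which is why both tools must be provided together.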

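III. Conversational

Other agents are usually optimized for using tools to work out the best response, which is not ideal in a conversational setting where you may also want the agent to chat with the user. Unlike the standard ReAct Agent, the Conversational Agent uses a more conversational prompt, which makes it better suited to dialogue.

SerpAPI is an API service that returns results from search engines such as Google and Bing. LangChain provides the SerpAPIWrapper utility for talking to SerpAPI. When SerpAPIWrapper is loaded into an agent as a tool, the agent can run search queries through search.run and use the results as one of its inputs. It can then generate responses based on live search results to answer the user's question or provide related information.

Before using SerpAPIWrapper, install the google-search-results package and configure serpapi_api_key. It supports the following operations:

  • Search queries: the run method sends a query to SerpAPI and parses the returned results.
  • Custom parameters: parameters can be passed to customize SerpAPI's behavior, such as the search engine, location, and language.
  • Asynchronous operation: SerpAPIWrapper supports async use through the arun and aresults methods.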
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langchain.utilities import SerpAPIWrapper

search = SerpAPIWrapper()
llm = OpenAI(temperature=0)
tools = [
    Tool(
        name="Current Search",
        func=search.run,
        description="useful for when you need to answer questions about current events or the current state of the world",
    ),
]

3.1 Using an LLM

3.1.1 Building the Agent Manually with LCEL
from langchain import hub
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import ReActSingleInputOutputParser
from langchain.tools.render import render_text_description

prompt = hub.pull("hwchase17/react-chat")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)
llm_with_stop = llm.bind(stop=["\nObservation"])

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
        "chat_history": lambda x: x["chat_history"],
    }
    | prompt
    | llm_with_stop
    | ReActSingleInputOutputParser()
)
from langchain.agents import AgentExecutor

memory = ConversationBufferMemory(memory_key="chat_history")
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, memory=memory)
agent_executor.invoke({"input": "hi, i am bob"})["output"]

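ConversationBufferMemory is a memory type that stores the chat history. It keeps the conversation's messages in a buffer, and each turn is appended automatically. The memory_key parameter sets the key under which the chat history is read and written. During execution, an agent or chain that uses the same key can read and write the history, which makes continuous multi-turn conversation possible.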

> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? No
Final Answer: Hi Bob, nice to meet you! How can I help you today?

> Finished chain.
'Hi Bob, nice to meet you! How can I help you today?'
agent_executor.invoke({"input": "whats my name?"})["output"]
> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? No
Final Answer: Your name is Bob.

> Finished chain.
'Your name is Bob.'
agent_executor.invoke({"input": "what are some movies showing 9/21/2023?"})["output"]
> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? Yes
Action: Current Search
Action Input: Movies showing 9/21/2023['September 2023 Movies: The Creator • Dumb Money • Expend4bles • The Kill Room • The Inventor • The Equalizer 3 • PAW Patrol: The Mighty Movie, ...'] Do I need to use a tool? No
Final Answer: According to current search, some movies showing on 9/21/2023 are The Creator, Dumb Money, Expend4bles, The Kill Room, The Inventor, The Equalizer 3, and PAW Patrol: The Mighty Movie.

> Finished chain.
'According to current search, some movies showing on 9/21/2023 are The Creator, Dumb Money, Expend4bles, The Kill Room, The Inventor, The Equalizer 3, and PAW Patrol: The Mighty Movie.'
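
The way a buffer memory and its memory_key tie into each call in this subsection can be sketched without LangChain. Everything below is hypothetical: ToyBufferMemory, invoke_with_memory, and toy_agent only imitate the read-before, write-after pattern described for ConversationBufferMemory.

```python
from typing import Callable, Dict, List


class ToyBufferMemory:
    """Keeps every turn in a buffer and exposes it under `memory_key`."""

    def __init__(self, memory_key: str = "chat_history") -> None:
        self.memory_key = memory_key
        self.lines: List[str] = []

    def load(self) -> Dict[str, str]:
        return {self.memory_key: "\n".join(self.lines)}

    def save(self, user_input: str, output: str) -> None:
        self.lines.append(f"Human: {user_input}")
        self.lines.append(f"AI: {output}")


def invoke_with_memory(agent: Callable[[dict], str], memory: ToyBufferMemory, user_input: str) -> str:
    inputs = {"input": user_input, **memory.load()}  # read the history before the call
    output = agent(inputs)
    memory.save(user_input, output)                  # append the new turn afterwards
    return output


def toy_agent(inputs: dict) -> str:
    """Hypothetical agent that can only answer from the chat history."""
    if "name" in inputs["input"] and "i am bob" in inputs["chat_history"]:
        return "Your name is Bob."
    return "Hi, nice to meet you!"


memory = ToyBufferMemory(memory_key="chat_history")
first = invoke_with_memory(toy_agent, memory, "hi, i am bob")
second = invoke_with_memory(toy_agent, memory, "whats my name?")
print(first)
print(second)
```

If the memory were created with a different memory_key, the agent would look up "chat_history" and find nothing, which is why the key must match the variable the prompt expects.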
3.1.2 Building the Agent Automatically with AgentType

You can also use an off-the-shelf agent type to simplify this:

agent_executor = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
)

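3.2 Using a Chat Model

As in the previous chapter, the main differences when using a chat model are the prompt and the output parser.

3.2.1 Building the Agent Manually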
from langchain import hub
from langchain.chat_models import ChatOpenAI

prompt = hub.pull("hwchase17/react-chat-json")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)

chat_model = ChatOpenAI(temperature=0, model="gpt-4")
chat_model_with_stop = chat_model.bind(stop=["\nObservation"])
from langchain.agents.format_scratchpad import format_log_to_messages
from langchain.agents.output_parsers import JSONAgentOutputParser

# We need some extra steering, or the model sometimes forgets how to respond
TEMPLATE_TOOL_RESPONSE = """TOOL RESPONSE: 
---------------------
{observation}

USER'S INPUT
--------------------

Okay, so what is the response to my last comment? If using information obtained from the tools you must mention it explicitly without mentioning the tool names - I have forgotten all TOOL RESPONSES! Remember to respond with a markdown code snippet of a json blob with a single action, and NOTHING else - even if you just want to respond to the user. Do NOT respond with anything except a JSON snippet no matter what!"""

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_messages(
            x["intermediate_steps"], template_tool_response=TEMPLATE_TOOL_RESPONSE
        ),
        "chat_history": lambda x: x["chat_history"],
    }
    | prompt
    | chat_model_with_stop
    | JSONAgentOutputParser()
)
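  • TEMPLATE_TOOL_RESPONSE: a template string with an {observation} placeholder into which the tool's observation is inserted. The template steers the format and content of the agent's next response, and keeps the chat model from forgetting how to respond over a long conversation (for example, how to reply correctly to an earlier comment or to a tool's output).
  • format_log_to_messages: converts the log of intermediate steps into a list of chat messages. Its optional template_tool_response parameter is the template string used to wrap each tool observation.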
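
A rough plain-Python picture of that conversion: each intermediate step becomes a pair of messages, the agent's own output followed by a user-style message that wraps the observation in the template. The helper steps_to_messages, the shortened TOOL_RESPONSE_TEMPLATE, and the (role, content) tuples are simplifications made up for this sketch.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)

# Shortened, hypothetical version of a tool-response template.
TOOL_RESPONSE_TEMPLATE = "TOOL RESPONSE:\n{observation}\n\nRespond with a single JSON action."


def steps_to_messages(
    steps: List[Tuple[str, str]], template: str = TOOL_RESPONSE_TEMPLATE
) -> List[Message]:
    """Turn (agent_log, observation) pairs into alternating chat messages."""
    messages: List[Message] = []
    for agent_log, observation in steps:
        messages.append(("ai", agent_log))  # what the agent said
        messages.append(("human", template.format(observation=observation)))  # what the tool returned
    return messages


messages = steps_to_messages(
    [('{"action": "Current Search", "action_input": "movies"}', "The Creator, Dumb Money")]
)
for role, content in messages:
    print(role, "->", content)
```

Wrapping every observation in the same reminder is what keeps the formatting instructions close to the end of the prompt on each turn.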
from langchain.agents import AgentExecutor

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, memory=memory)
agent_executor.invoke({"input": "hi, i am bob"})["output"]
> Entering new AgentExecutor chain...
```json
{
    "action": "Final Answer",
    "action_input": "Hello Bob, how can I assist you today?"
}
```

> Finished chain.
'Hello Bob, how can I assist you today?'
agent_executor.invoke({"input": "what are some movies showing 9/21/2023?"})["output"]
> Entering new AgentExecutor chain...
```json
{
    "action": "Current Search",
    "action_input": "movies showing on 9/21/2023"
}
```['September 2023 Movies: The Creator • Dumb Money • Expend4bles • The Kill Room • The Inventor • The Equalizer 3 • PAW Patrol: The Mighty Movie, ...']```json
{
    "action": "Final Answer",
    "action_input": "Some movies that are showing on 9/21/2023 include 'The Creator', 'Dumb Money', 'Expend4bles', 'The Kill Room', 'The Inventor', 'The Equalizer 3', and 'PAW Patrol: The Mighty Movie'."
}
```

> Finished chain.
"Some movies that are showing on 9/21/2023 include 'The Creator', 'Dumb Money', 'Expend4bles', 'The Kill Room', 'The Inventor', 'The Equalizer 3', and 'PAW Patrol: The Mighty Movie'."
3.2.2 Building the Agent Automatically
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chat_model = ChatOpenAI(temperature=0)
agent_chain = initialize_agent(
    tools,
    chat_model,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
)

IV. Structured tool chat

!pip install playwright
!playwright install

The following code is meant to run in a Jupyter notebook:

from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI

# This import is required only for jupyter notebooks, since they have their own eventloop
import nest_asyncio
from langchain.agents.agent_toolkits import PlayWrightBrowserToolkit
from langchain.tools.playwright.utils import (
    create_async_playwright_browser,  # A synchronous browser is available, though it isn't compatible with jupyter.
)

nest_asyncio.apply()   

async_browser = create_async_playwright_browser()
browser_toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)
tools = browser_toolkit.get_tools()
tools
[ClickTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 NavigateTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 NavigateBackTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 ExtractTextTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 ExtractHyperlinksTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 GetElementsTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>),
 CurrentWebPageTool(async_browser=<Browser type=<BrowserType name=chromium executable_path=/root/.cache/ms-playwright/chromium-1091/chrome-linux/chrome> version=120.0.6099.28>)]
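  • nest_asyncio.apply(): nest_asyncio is a module that allows nested asyncio event loops, which is needed in Jupyter notebooks because they already run an event loop of their own. apply() patches asyncio so that nested loops can run in the current notebook.
  • create_async_playwright_browser(): creates an asynchronous Playwright browser instance.
  • browser_toolkit = ...: creates a PlayWrightBrowserToolkit instance from the browser.
  • browser_toolkit.get_tools(): returns the browser tools in the toolkit. Each tool defines an args_schema internally, which specifies the structure and types of the inputs it requires.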
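
To show what an argument schema buys you, the sketch below validates a tool's input against a declared schema and renders a description that includes the expected arguments. It is only an illustration: LangChain tools use Pydantic models for args_schema, whereas this example uses a plain dictionary of names to types, and NAVIGATE_SCHEMA, validate_args, and describe_tool are hypothetical.

```python
from typing import Any, Dict

# Hypothetical schema for a browser navigation tool: one required string argument.
NAVIGATE_SCHEMA: Dict[str, type] = {"url": str}


def validate_args(schema: Dict[str, type], tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Check that tool_input has exactly the declared arguments with the declared types."""
    missing = [name for name in schema if name not in tool_input]
    if missing:
        raise ValueError(f"Missing arguments: {missing}")
    unexpected = [name for name in tool_input if name not in schema]
    if unexpected:
        raise ValueError(f"Unexpected arguments: {unexpected}")
    for name, expected in schema.items():
        if not isinstance(tool_input[name], expected):
            raise TypeError(f"Argument {name!r} must be of type {expected.__name__}")
    return tool_input


def describe_tool(name: str, description: str, schema: Dict[str, type]) -> str:
    """Render a one-line description that also lists argument names and types."""
    args = {arg: expected.__name__ for arg, expected in schema.items()}
    return f"{name}: {description}, args: {args}"


valid = validate_args(NAVIGATE_SCHEMA, {"url": "https://blog.langchain.dev"})
line = describe_tool("navigate_browser", "Navigate a browser to the specified URL", NAVIGATE_SCHEMA)
print(line)
```

Putting the argument names and types into the tool description is what lets the model produce a structured action_input object instead of a single string.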

4.1 Building the Agent with LCEL

from langchain import hub
from langchain.tools.render import render_text_description_and_args
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import JSONAgentOutputParser

prompt = hub.pull("hwchase17/react-multi-input-json")
prompt = prompt.partial(
    tools=render_text_description_and_args(tools),
    tool_names=", ".join([t.name for t in tools]),
)

llm = ChatOpenAI(temperature=0)
llm_with_stop = llm.bind(stop=["Observation"])

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | llm_with_stop
    | JSONAgentOutputParser()
)
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = await agent_executor.ainvoke(
    {"input": "Browse to blog.langchain.dev and summarize the text, please."}
)
print(response["output"])
> Entering new AgentExecutor chain...
Action:
```
{
  "action": "navigate_browser",
  "action_input": {
    "url": "https://blog.langchain.dev"
  }
}
```
Navigating to https://blog.langchain.dev returned status code 200Action:
```
{
  "action": "extract_text",
  "action_input": {}
}
```

LangChain LangChain Home GitHub Docs By LangChain Release Notes Write with Us Sign in Subscribe The official LangChain blog. Subscribe now Login Featured Posts Announcing LangChain Hub Using LangSmith to Support Fine-tuning Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications Sep 20 Peering Into the Soul of AI Decision-Making with LangSmith 10 min read Sep 20 LangChain + Docugami Webinar: Lessons from Deploying LLMs with LangSmith 3 min read Sep 18 TED AI Hackathon Kickoff (and projects we’d love to see) 2 min read Sep 12 How to Safely Query Enterprise Data with LangChain Agents + SQL + OpenAI + Gretel 6 min read Sep 12 OpaquePrompts x LangChain: Enhance the privacy of your LangChain application with just one code change 4 min read Load more LangChain © 2023 Sign up Powered by GhostAction:
```
{
  "action": "Final Answer",
  "action_input": "The LangChain blog features posts on topics such as using LangSmith for fine-tuning, AI decision-making with LangSmith, deploying LLMs with LangSmith, and more. It also includes information on LangChain Hub and upcoming webinars. LangChain is a platform for debugging, testing, evaluating, and monitoring LLM applications."
}
```

> Finished chain.
The LangChain blog features posts on topics such as using LangSmith for fine-tuning, AI decision-making with LangSmith, deploying LLMs with LangSmith, and more. It also includes information on LangChain Hub and upcoming webinars. LangChain is a platform for debugging, testing, evaluating, and monitoring LLM applications.

4.2 Building the Agent with AgentType

llm = ChatOpenAI(temperature=0)  # Also works well with Anthropic models
agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

response = await agent_chain.ainvoke(
    {"input": "Browse to blog.langchain.dev and summarize the text, please."}
)
print(response["output"])

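The agent type is set to STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION, which makes this a structured tool chat agent: it can work with multi-input tools, filling in and validating their input arguments according to each tool's args_schema.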

V. Self-ask with search

The Self-ask with search agent can use an API such as SerpAPIWrapper as its search engine, looking up relevant information to answer a question.

from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import OpenAI
from langchain.utilities import SerpAPIWrapper

llm = OpenAI(temperature=0)
search = SerpAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="useful for when you need to ask with search",
    )
]

5.1 Building the Agent with LCEL

from langchain import hub
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import SelfAskOutputParser

prompt = hub.pull("hwchase17/self-ask-with-search")
llm_with_stop = llm.bind(stop=["\nIntermediate answer:"])
agent = (
    {
        "input": lambda x: x["input"],
        # Use some custom observation_prefix/llm_prefix for formatting
        "agent_scratchpad": lambda x: format_log_to_str(
            x["intermediate_steps"],
            observation_prefix="\nIntermediate answer: ",
            llm_prefix="",
        ),
    }
    | prompt
    | llm_with_stop
    | SelfAskOutputParser()
)
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke(
    {"input": "What is the hometown of the reigning men's U.S. Open champion?"}
)
> Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?Men's US Open Tennis Champions Novak Djokovic earned his 24th major singles title against 2021 US Open champion Daniil Medvedev, 6-3, 7-6 (7-5), 6-3. The victory ties the Serbian player with the legendary Margaret Court for the most Grand Slam wins across both men's and women's singles.
Follow up: Where is Novak Djokovic from?Belgrade, Serbia
So the final answer is: Belgrade, Serbia

> Finished chain.

{'input': "What is the hometown of the reigning men's U.S. Open champion?",
 'output': 'Belgrade, Serbia'}
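The trace above relies on two mechanics: the stop sequence cuts the model off before it invents its own "Intermediate answer", and the parser then decides whether the turn is a follow-up query or the final answer. The sketch below imitates both with plain string handling. It is a simplified stand-in, not the code of SelfAskOutputParser; `apply_stop` and `parse_self_ask` are hypothetical helpers, and the raw model text is made up for the demonstration.

```python
FOLLOW_UP = "Follow up:"
FINAL = "So the final answer is:"


def apply_stop(text: str, stop: str) -> str:
    """Truncate text at the first occurrence of the stop sequence, if any."""
    index = text.find(stop)
    return text if index == -1 else text[:index]


def parse_self_ask(text: str):
    """Classify one model turn as a follow-up search or a final answer."""
    last_line = text.strip().split("\n")[-1]
    if FINAL in last_line:
        return ("finish", last_line.split(FINAL, 1)[1].strip())
    if FOLLOW_UP in last_line:
        return ("search", last_line.split(FOLLOW_UP, 1)[1].strip())
    raise ValueError(f"could not parse model output: {text!r}")


# Made-up raw completion: without a stop sequence the model may answer itself.
raw = (
    " Yes.\n"
    "Follow up: Who is the reigning men's U.S. Open champion?\n"
    "Intermediate answer: (text the model would otherwise hallucinate)"
)
turn = apply_stop(raw, "\nIntermediate answer:")
print(parse_self_ask(turn))
print(parse_self_ask("So the final answer is: Belgrade, Serbia"))
```

In the real agent, a "search" result is routed to the tool named "Intermediate Answer", and its output is appended to the scratchpad after the observation prefix.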

5.2 Building the Agent with AgentType

self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
> Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Men's US Open Tennis Champions Novak Djokovic earned his 24th major singles title against 2021 US Open champion Daniil Medvedev, 6-3, 7-6 (7-5), 6-3. The victory ties the Serbian player with the legendary Margaret Court for the most Grand Slam wins across both men's and women's singles.

Follow up: Where is Novak Djokovic from?
Intermediate answer: Belgrade, Serbia
So the final answer is: Belgrade, Serbia

> Finished chain.

'Belgrade, Serbia'

6. OpenAI tools

from langchain.agents import AgentExecutor, AgentType, Tool, initialize_agent
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.tools import BearlyInterpreterTool, DuckDuckGoSearchRun
from langchain.tools.render import format_tool_to_openai_tool

lc_tools = [DuckDuckGoSearchRun(), BearlyInterpreterTool(api_key="...").as_tool()]
oai_tools = [format_tool_to_openai_tool(tool) for tool in lc_tools]
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-1106")
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm.bind(tools=oai_tools)
    | OpenAIToolsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=lc_tools, verbose=True)
agent_executor.invoke(
    {"input": "What's the average of the temperatures in LA, NYC, and SF today?"}
)

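7. OpenAI functions

Certain OpenAI models (such as gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called and to respond with the inputs that should be passed to that function. The OpenAI Functions Agent is designed for these models.

This section is similar to Chapter 1.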

! pip install openai google-search-results
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import LLMMathChain
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SerpAPIWrapper, SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
search = SerpAPIWrapper()
llm_math_chain = LLMMathChain.from_llm(llm=llm, verbose=True)
db = SQLDatabase.from_uri("sqlite:///../../../../../notebooks/Chinook.db")
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events. You should ask targeted questions",
    ),
    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="useful for when you need to answer questions about math",
    ),
    Tool(
        name="FooBar-DB",
        func=db_chain.run,
        description="useful for when you need to answer questions about FooBar. Input should be in the form of a question containing full context",
    ),
]
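  • Search tool: wraps search.run and answers questions about current events.
  • Calculator tool: wraps llm_math_chain.run and answers math questions.
  • FooBar-DB tool: db_chain is a SQLDatabaseChain object that runs SQL database queries for the agent. The tool wraps db_chain.run and answers questions about the FooBar database.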
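As a rough picture of what happens once the model decides to call one of these tools: its reply carries a function name plus a JSON string of arguments, and the runtime looks the name up and calls the matching Python function. The sketch below shows that dispatch step with the standard library only. The `calculator` function, the `TOOLS` registry, the `expression` argument name, and `dispatch` are hypothetical; this is not LangChain's implementation.

```python
import json


def calculator(expression: str) -> str:
    """Toy stand-in tool: only understands 'a ** b' power expressions."""
    base, exponent = expression.split("**")
    return str(float(base) ** float(exponent))


# Hypothetical registry mapping function names to Python callables.
TOOLS = {"Calculator": calculator}


def dispatch(function_call: dict) -> str:
    """Run the tool named in a function-call message with its JSON arguments."""
    name = function_call["name"]
    if name not in TOOLS:
        raise KeyError(f"model asked for an unknown tool: {name!r}")
    # The arguments arrive as a JSON-encoded string, not as a dict.
    arguments = json.loads(function_call["arguments"])
    return TOOLS[name](**arguments)


message = {"name": "Calculator", "arguments": '{"expression": "2 ** 10"}'}
print(dispatch(message))
```

The returned string plays the role of the observation that is sent back to the model on the next iteration.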

7.1 Building the Agent with LCEL

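# The formatter and output parser are used in the agent pipeline below
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.tools.render import format_tool_to_openai_function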

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

llm_with_tools = llm.bind(functions=[format_tool_to_openai_function(t) for t in tools])
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)
> Entering new AgentExecutor chain...

Invoking: `Search` with `Leo DiCaprio's girlfriend`


['Blake Lively and DiCaprio are believed to have enjoyed a whirlwind five-month romance in 2011. The pair were seen on a yacht together in Cannes, ...']
Invoking: `Calculator` with `0.43`


> Entering new LLMMathChain chain...
0.43
```text
0.43
```
...numexpr.evaluate("0.43")...

Answer: 0.43
> Finished chain.
Answer: 0.43I'm sorry, but I couldn't find any information about Leo DiCaprio's current girlfriend. As for raising her age to the power of 0.43, I'm not sure what her current age is, so I can't provide an answer for that.

> Finished chain.
{'input': "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
 'output': "I'm sorry, but I couldn't find any information about Leo DiCaprio's current girlfriend. As for raising her age to the power of 0.43, I'm not sure what her current age is, so I can't provide an answer for that."}

7.2 Building the Agent with AgentType

agent_executor = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True)
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)

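8. OpenAI Multi Functions Agent

The OpenAI Multi Functions Agent is an agent type that works with multiple function tools and decides which of them to call to answer a question. As the debug log in this section shows, a single model response can request several tool calls at once. You can configure and add different function tools as needed for a particular problem.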

!pip install openai google-search-results
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SerpAPIWrapper

import getpass
import os

os.environ["SERPAPI_API_KEY"] = getpass.getpass()
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
search = SerpAPIWrapper()
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Useful when you need to answer questions about current events. You should ask targeted questions.",
    ),
]

mrkl = initialize_agent(
    tools, llm, agent=AgentType.OPENAI_MULTI_FUNCTIONS, verbose=True
)
# Do this so we can see exactly what's going on under the hood
from langchain.globals import set_debug

set_debug(True)
mrkl.run("What is the weather in LA and SF?")
[chain/start] [1:chain:AgentExecutor] Entering Chain run with input:
{
  "input": "What is the weather in LA and SF?"
}
[llm/start] [1:chain:AgentExecutor > 2:llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "System: You are a helpful AI assistant.\nHuman: What is the weather in LA and SF?"
  ]
}
[llm/end] [1:chain:AgentExecutor > 2:llm:ChatOpenAI] [2.91s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "",
        "generation_info": null,
        "message": {
          "content": "",
          "additional_kwargs": {
            "function_call": {
              "name": "tool_selection",
              "arguments": "{\n  \"actions\": [\n    {\n      \"action_name\": \"Search\",\n      \"action\": {\n        \"tool_input\": \"weather in Los Angeles\"\n      }\n    },\n    {\n      \"action_name\": \"Search\",\n      \"action\": {\n        \"tool_input\": \"weather in San Francisco\"\n      }\n    }\n  ]\n}"
            }
          },
          "example": false
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      "prompt_tokens": 81,
      "completion_tokens": 75,
      "total_tokens": 156
    },
    "model_name": "gpt-3.5-turbo-0613"
  },
  "run": null
}
[tool/start] [1:chain:AgentExecutor > 3:tool:Search] Entering Tool run with input:
"{'tool_input': 'weather in Los Angeles'}"
[tool/end] [1:chain:AgentExecutor > 3:tool:Search] [608.693ms] Exiting Tool run with output:
"Mostly cloudy early, then sunshine for the afternoon. High 76F. Winds SW at 5 to 10 mph. Humidity59%."
[tool/start] [1:chain:AgentExecutor > 4:tool:Search] Entering Tool run with input:
"{'tool_input': 'weather in San Francisco'}"
[tool/end] [1:chain:AgentExecutor > 4:tool:Search] [517.475ms] Exiting Tool run with output:
"Partly cloudy this evening, then becoming cloudy after midnight. Low 53F. Winds WSW at 10 to 20 mph. Humidity83%."
[llm/start] [1:chain:AgentExecutor > 5:llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "System: You are a helpful AI assistant.\nHuman: What is the weather in LA and SF?\nAI: {'name': 'tool_selection', 'arguments': '{\\n  \"actions\": [\\n    {\\n      \"action_name\": \"Search\",\\n      \"action\": {\\n        \"tool_input\": \"weather in Los Angeles\"\\n      }\\n    },\\n    {\\n      \"action_name\": \"Search\",\\n      \"action\": {\\n        \"tool_input\": \"weather in San Francisco\"\\n      }\\n    }\\n  ]\\n}'}\nFunction: Mostly cloudy early, then sunshine for the afternoon. High 76F. Winds SW at 5 to 10 mph. Humidity59%.\nAI: {'name': 'tool_selection', 'arguments': '{\\n  \"actions\": [\\n    {\\n      \"action_name\": \"Search\",\\n      \"action\": {\\n        \"tool_input\": \"weather in Los Angeles\"\\n      }\\n    },\\n    {\\n      \"action_name\": \"Search\",\\n      \"action\": {\\n        \"tool_input\": \"weather in San Francisco\"\\n      }\\n    }\\n  ]\\n}'}\nFunction: Partly cloudy this evening, then becoming cloudy after midnight. Low 53F. Winds WSW at 10 to 20 mph. Humidity83%."
  ]
}
[llm/end] [1:chain:AgentExecutor > 5:llm:ChatOpenAI] [2.33s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "The weather in Los Angeles is mostly cloudy with a high of 76°F and a humidity of 59%. The weather in San Francisco is partly cloudy in the evening, becoming cloudy after midnight, with a low of 53°F and a humidity of 83%.",
        "generation_info": null,
        "message": {
          "content": "The weather in Los Angeles is mostly cloudy with a high of 76°F and a humidity of 59%. The weather in San Francisco is partly cloudy in the evening, becoming cloudy after midnight, with a low of 53°F and a humidity of 83%.",
          "additional_kwargs": {},
          "example": false
        }
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      "prompt_tokens": 307,
      "completion_tokens": 54,
      "total_tokens": 361
    },
    "model_name": "gpt-3.5-turbo-0613"
  },
  "run": null
}
[chain/end] [1:chain:AgentExecutor] [6.37s] Exiting Chain run with output:
{
  "output": "The weather in Los Angeles is mostly cloudy with a high of 76°F and a humidity of 59%. The weather in San Francisco is partly cloudy in the evening, becoming cloudy after midnight, with a low of 53°F and a humidity of 83%."
}
'The weather in Los Angeles is mostly cloudy with a high of 76°F and a humidity of 59%. The weather in San Francisco is partly cloudy in the evening, becoming cloudy after midnight, with a low of 53°F and a humidity of 83%.'
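In the log above, the model's single `tool_selection` call carries a JSON `arguments` string whose `actions` list names two searches. The following standard-library sketch unpacks that structure into (tool, input) pairs. The JSON mirrors the log; the helper name `parse_tool_selection` is hypothetical and this is not the agent's actual parsing code.

```python
import json

# Arguments string of the tool_selection call, mirroring the debug log.
ARGUMENTS = """
{
  "actions": [
    {"action_name": "Search", "action": {"tool_input": "weather in Los Angeles"}},
    {"action_name": "Search", "action": {"tool_input": "weather in San Francisco"}}
  ]
}
"""


def parse_tool_selection(arguments: str):
    """Return one (tool name, tool input) pair per requested action."""
    payload = json.loads(arguments)
    return [
        (action["action_name"], action["action"]["tool_input"])
        for action in payload["actions"]
    ]


for tool_name, tool_input in parse_tool_selection(ARGUMENTS):
    print(tool_name, "->", tool_input)
```

Each pair corresponds to one `[tool/start]` entry in the log, which is why both searches run before the model is called a second time.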

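To keep the agent from getting stuck in an overly long loop, we can set the max_iterations parameter, which caps the number of iterations. By default, an agent that hits this limit returns a fixed string as its output. Alternatively, you can set early_stopping_method to "generate", in which case the agent makes one final pass through the LLM after reaching the limit to produce a meaningful output.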

mrkl = initialize_agent(
    tools,
    llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True,
    max_iterations=2,
    early_stopping_method="generate",
)

mrkl.run("What is the weather in NYC today, yesterday, and the day before?")

(Output log omitted.)

'Today in NYC, the weather is currently 85°F with a southeast wind of 4 mph. The humidity is at 78% and there is 81% cloud cover. There is no rain expected today.\n\nYesterday in NYC, the maximum temperature was 81°F at 1:51 pm, and the minimum temperature was 72°F at 7:17 pm.\n\nFor the day before yesterday, I do not have the specific weather information.'
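The difference between the two stopping behaviours can be sketched with a toy agent loop. This is a simplified model under stated assumptions, not AgentExecutor's code: `run_agent`, `plan`, `summarize`, the "force" label for the fixed-string behaviour, and the `STOPPED` message are names chosen for this illustration.

```python
STOPPED = "Stopped: iteration limit reached."  # illustrative fixed message


def run_agent(plan, tools, question, max_iterations,
              early_stopping_method="force", summarize=None):
    """Run a plan/act loop for at most max_iterations tool calls."""
    steps = []
    for _ in range(max_iterations):
        decision = plan(question, steps)
        if decision[0] == "finish":
            return decision[1]
        _, tool_name, tool_input = decision
        observation = tools[tool_name](tool_input)
        steps.append((tool_name, tool_input, observation))
    # The limit was reached without a final answer.
    if early_stopping_method == "generate":
        if summarize is None:
            raise ValueError("'generate' needs a summarize callable")
        # One last pass that turns the collected steps into an answer.
        return summarize(question, steps)
    if early_stopping_method == "force":
        return STOPPED
    raise ValueError(f"unknown early_stopping_method: {early_stopping_method!r}")


def never_done(question, steps):
    """Toy planner that always asks for another tool call."""
    return ("call", "echo", f"query {len(steps) + 1}")


tools = {"echo": lambda text: text.upper()}
print(run_agent(never_done, tools, "q", max_iterations=2))
print(run_agent(never_done, tools, "q", 2, "generate",
                lambda q, s: f"answer built from {len(s)} observations"))
```

With max_iterations=2 the loop collects two observations in both runs; only the second run converts them into an answer instead of a fixed message.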

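9. OpenAI assistants

The OpenAI Assistants API lets you build AI assistants inside your own applications. An assistant has instructions and can draw on models, tools, and knowledge to answer user queries. The Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling.

You can interact with an OpenAI assistant using OpenAI tools or custom tools. When using only OpenAI tools, you can invoke the assistant directly and get the final answer. When using custom tools, you can run the assistant-and-tool execution loop with the built-in AgentExecutor, or easily write your own executor.

The examples below show the different ways of interacting with assistants. As a simple example, let's build a math tutor that can write and run code.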

9.1 OpenAI tools

from langchain.agents.openai_assistant import OpenAIAssistantRunnable

interpreter_assistant = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)
output = interpreter_assistant.invoke({"content": "What's 10 - 4 raised to the 2.7"})
output
[ThreadMessage(id='msg_qgxkD5kvkZyl0qOaL4czPFkZ', assistant_id='asst_0T8S7CJuUa4Y4hm1PF6n62v7', content=[MessageContentText(text=Text(annotations=[], value='The result of the calculation \\(10 - 4^{2.7}\\) is approximately \\(-32.224\\).'), type='text')], created_at=1700169519, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_aH3ZgSWNk3vYIBQm3vpE8tr4', thread_id='thread_9K6cYfx1RBh0pOWD8SxwVWW9')]
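The question "10 - 4 raised to the 2.7" is ambiguous in plain English. The assistant read it as 10 - 4^2.7, which is also what Python's operator precedence gives. A quick local check of both readings (the variable names are just for illustration):

```python
# Exponentiation binds tighter than subtraction: 10 - (4 ** 2.7).
as_answered = 10 - 4 ** 2.7

# The other possible reading: (10 - 4) raised to the 2.7.
alternative = (10 - 4) ** 2.7

print(round(as_answered, 3))  # -32.224, matching the assistant's answer
print(alternative)
```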

9.2 using custom tools

The following reproduces this functionality using the E2B sandbox runtime tool:

!pip install e2b duckduckgo-search
import getpass
from langchain.tools import DuckDuckGoSearchRun, E2BDataAnalysisTool

tools = [E2BDataAnalysisTool(api_key=getpass.getpass()), DuckDuckGoSearchRun()]

agent = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant e2b tool",
    instructions="You are a personal math tutor. Write and run code to answer math questions. You can also search the internet.",
    tools=tools,
    model="gpt-4-1106-preview",
    as_agent=True,
)
  • E2BDataAnalysisTool: a data analysis tool; its API key is read with getpass.getpass()
  • DuckDuckGoSearchRun: a web search tool

9.2.1 using AgentExecutor

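OpenAIAssistantRunnable is compatible with AgentExecutor: because it acts as an agent, it can be passed directly to AgentExecutor as the agent argument. AgentExecutor handles the execution process, calling the tools that are invoked and sending their outputs back to the Assistants API. It also comes with built-in LangSmith tracing.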

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools)
agent_executor.invoke({"content": "What's the weather in SF today divided by 2.7"})
{'content': "What's the weather in SF today divided by 2.7",
 'output': "The search results indicate that the weather in San Francisco is 67 °F. Now I will divide this temperature by 2.7 and provide you with the result. Please note that this is a mathematical operation and does not represent a meaningful physical quantity.\n\nLet's calculate 67 °F divided by 2.7.\nThe result of dividing the current temperature in San Francisco, which is 67 °F, by 2.7 is approximately 24.815.",
 'thread_id': 'thread_hcpYI0tfpB9mHa9d95W7nK2B',
 'run_id': 'run_qOuVmPXS9xlV3XNPcfP8P9W2'}
9.2.2 using custom execution

With LCEL we can easily write our own execution loop for running the assistant, which gives us full control over execution.

agent = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant e2b tool",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=tools,
    model="gpt-4-1106-preview",
    as_agent=True,
)
from langchain_core.agents import AgentFinish


def execute_agent(agent, tools, input):
    tool_map = {tool.name: tool for tool in tools}
    response = agent.invoke(input)
    while not isinstance(response, AgentFinish):
        tool_outputs = []
        for action in response:
            tool_output = tool_map[action.tool].invoke(action.tool_input)
            print(action.tool, action.tool_input, tool_output, end="\n\n")
            tool_outputs.append(
                {"output": tool_output, "tool_call_id": action.tool_call_id}
            )
        response = agent.invoke(
            {
                "tool_outputs": tool_outputs,
                "run_id": action.run_id,
                "thread_id": action.thread_id,
            }
        )

    return response
response = execute_agent(agent, tools, {"content": "What's 10 - 4 raised to the 2.7"})
print(response.return_values["output"])
e2b_data_analysis {'python_code': 'result = 10 - 4 ** 2.7\nprint(result)'} {"stdout": "-32.22425314473263", "stderr": "", "artifacts": []}

\( 10 - 4^{2.7} \) equals approximately -32.224.

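9.3 Using an existing thread

An existing thread is a conversation thread (session) that was already created in an earlier exchange. To continue in that thread, pass its thread_id in the input dictionary when invoking the agent, as in the execute_agent call below. Reusing a thread keeps the conversation coherent: the assistant can take the earlier interactions into account, so it understands the user's intent better and gives more accurate and useful answers.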

next_response = execute_agent(
    agent,
    tools,
    {"content": "now add 17.241", "thread_id": response.return_values["thread_id"]},
)
print(next_response.return_values["output"])
e2b_data_analysis {'python_code': 'result = 10 - 4 ** 2.7 + 17.241\nprint(result)'} {"stdout": "-14.983253144732629", "stderr": "", "artifacts": []}

\( 10 - 4^{2.7} + 17.241 \) equals approximately -14.983.
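Why "now add 17.241" works can be pictured with a toy in-memory model of threads: a new message sent with an existing thread id is appended to that thread's history, so the earlier question is still in context. This sketch only illustrates the idea. It is not the Assistants API, and `ThreadStore` and its `send` method are hypothetical.

```python
import uuid


class ThreadStore:
    """Toy model: each thread id maps to the list of messages sent so far."""

    def __init__(self):
        self._threads = {}

    def send(self, content, thread_id=None):
        """Append a message; create a new thread when no id is given."""
        if thread_id is None:
            thread_id = f"thread_{uuid.uuid4().hex[:8]}"
            self._threads[thread_id] = []
        elif thread_id not in self._threads:
            raise KeyError(f"unknown thread: {thread_id!r}")
        self._threads[thread_id].append(content)
        # Return the id plus a copy of everything visible in this thread.
        return thread_id, list(self._threads[thread_id])


store = ThreadStore()
thread_id, history = store.send("What's 10 - 4 raised to the 2.7")
_, history = store.send("now add 17.241", thread_id=thread_id)
print(history)
```

Sending the second message without the thread id would start a fresh thread in which "now add 17.241" has nothing to refer to.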

9.4 Using an existing assistant

You can also use an existing assistant directly: just pass assistant_id when initializing the OpenAIAssistantRunnable object.

agent = OpenAIAssistantRunnable(assistant_id="<ASSISTANT_ID>", as_agent=True)
Source: https://blog.csdn.net/qq_56591814/article/details/135040694