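1. Solving the Game of 24 with Tree of Thoughts in LangGraph

Introduction

The Game of 24 is a math puzzle: combine four given numbers using addition, subtraction, multiplication, and division so that the result equals 24. This tutorial shows how to build a simple Game of 24 solver with LangGraph.

Implementation

Defining the data structures

First, define the basic data structures that represent operators and equations.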

import operator
from typing import List, Literal, Union, NamedTuple, Optional

from pydantic import BaseModel, Field

OperatorType = Literal["+", "-", "*", "/"]
TokenType = Union[float, OperatorType]


class Equation(BaseModel):
    """A formula that combines the provided numbers to reach the target value of 24."""

    tokens: List[TokenType] = Field(
        description="The stack of tokens and operators in reverse Polish notation. Example: [3, 4, '+', -1, '*'] evaluates to (3 + 4) * -1 = -7.",
    )

    def compute(self) -> float:
        # Map each operator symbol to the function that implements it.
        op_funcs = {
            "+": operator.add,
            "-": operator.sub,
            "*": operator.mul,
            "/": operator.truediv,
        }
        stack = []
        for token in self.tokens:
            if isinstance(token, float):
                # Operands are pushed onto the stack.
                stack.append(token)
            else:
                # The right operand is popped first, then the left one.
                b, a = stack.pop(), stack.pop()
                stack.append(op_funcs[token](a, b))
        return stack[0]

Testing the Equation class

A few examples exercise the compute method of the Equation class.

# Sample expressions
test_cases = [
    [3.0, 4.0, "+", 2.0, "*"],
    [5.0, 1.0, 2.0, "+", 4.0, "*", "+", 3.0, "-"],
    [10.0, 2.0, "/", 5.0, "*"],
    [7.0, 2.0, 3.0, "*", "-"],
    [3.0, 4.0, "+", 8.0, "/"],
]

for tokens in test_cases:
    equation = Equation(tokens=tokens)
    result = equation.compute()
    print(f"Tokens: {tokens}\nComputed: {result}")

Output:

Tokens: [3.0, 4.0, '+', 2.0, '*']
Computed: 14.0
Tokens: [5.0, 1.0, 2.0, '+', 4.0, '*', '+', 3.0, '-']
Computed: 14.0
Tokens: [10.0, 2.0, '/', 5.0, '*']
Computed: 25.0
Tokens: [7.0, 2.0, 3.0, '*', '-']
Computed: 1.0
Tokens: [3.0, 4.0, '+', 8.0, '/']
Computed: 0.875
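
Reverse Polish token lists can be hard to read at a glance. The following is a minimal sketch, using only the standard library, of a helper that renders a token list as a parenthesized infix string. The name to_infix is hypothetical and not part of the code above; it mirrors the pop order used in Equation.compute.

```python
from typing import List, Union

Token = Union[float, str]


def to_infix(tokens: List[Token]) -> str:
    """Render a reverse Polish token list as a fully parenthesized infix string."""
    stack: List[str] = []
    for token in tokens:
        if isinstance(token, (int, float)):
            # Operands are pushed as text; the :g format drops a trailing ".0".
            stack.append(f"{token:g}")
        else:
            # Same pop order as Equation.compute: right operand first.
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} {token} {right})")
    return stack[0]


print(to_infix([3.0, 4.0, "+", 2.0, "*"]))  # ((3 + 4) * 2)
print(to_infix([7.0, 2.0, 3.0, "*", "-"]))  # (7 - (2 * 3))
```

Printing the infix form next to the computed value makes it easier to spot an operand order mistake in a subtraction or division.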

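Generating equations with an LLM

Next, a large language model (LLM) generates candidate equations.

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


# GuessEquations is used below but is not defined in the original listing.
# The two fields are inferred from the printed output; the field
# descriptions are assumptions.
class GuessEquations(BaseModel):
    """Submit multiple equations as guesses."""

    reasoning: str = Field(description="The reasoning behind the submitted guesses.")
    equations: List[Equation] = Field(description="The list of equations to submit as guesses.")


prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are playing the Game of 24. Using the provided numbers, create an equation that evaluates to 24.\n"
            "Submit exactly {k} guesses for this round.",
        ),
        ("user", "Solve the 24 game for these numbers: {problem}.{candidate}"),
    ],
).partial(candidate="")

# Configure the LLM
llm = ChatOpenAI(
    temperature=1,
    model="GLM-4-PLUS",
    openai_api_key="your_api_key",
    openai_api_base="https://open.bigmodel.cn/api/paas/v4/",
    max_tokens=512,
)

# Ask the model to return a GuessEquations object instead of free text.
bound_llm = llm.with_structured_output(GuessEquations)
solver = prompt | bound_llm

response = solver.invoke({"k": "3", "problem": "1 1 4 6"})
print(response)

Output: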

GuessEquations(reasoning='I started by adding the two 1s to make 2, then multiplied by 4 to get 8, and finally multiplied by 6 to reach 24.', equations=[Equation(tokens=[1.0, 1.0, '+', 4.0, '*', 6.0])])
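
Note that the token list in the sample output above ends with the operand 6.0 and has no final operator. Evaluating it leaves two values on the stack (8.0 and 6.0), and compute returns the bottom one, 8.0, rather than 24. The following is a minimal sketch of a check that catches such incomplete expressions before they are evaluated. The name is_well_formed is hypothetical and not part of the code above.

```python
from typing import List, Union

OPERATORS = {"+", "-", "*", "/"}


def is_well_formed(tokens: List[Union[float, str]]) -> bool:
    """Return True if the RPN token list reduces to exactly one value."""
    depth = 0  # number of values that would be on the stack
    for token in tokens:
        if isinstance(token, str) and token in OPERATORS:
            if depth < 2:
                return False  # operator without two operands
            depth -= 1  # two operands consumed, one result pushed
        else:
            depth += 1
    return depth == 1


print(is_well_formed([1.0, 1.0, "+", 4.0, "*", 6.0]))       # False
print(is_well_formed([1.0, 1.0, "+", 4.0, "*", 6.0, "*"]))  # True
```

The check only counts stack depth, so it does not evaluate anything and cannot raise on division by zero.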

Defining candidates and scoring

A Candidate class represents a candidate solution, and a scoring function rates how good each candidate is.

class Candidate(NamedTuple):
    candidate: Equation
    score: Optional[float] = None
    feedback: Optional[str] = None

    def __str__(self):
        try:
            computed = self.candidate.compute()
        except Exception as e:
            computed = f"Invalid equation: {self.candidate.tokens}; Error: {repr(e)}"
        return f"Equation({self.candidate.tokens}) = {computed} (Reward: {self.score})"


class ScoredCandidate(Candidate):
    candidate: Equation
    score: float
    feedback: str


def compute_score(problem: str, candidate: Candidate) -> ScoredCandidate:
    numbers = list(map(int, problem.split()))
    # Check that the candidate uses all 4 numbers, each exactly once.
    used_numbers = [
        token for token in candidate.candidate.tokens if isinstance(token, float)
    ]
    if sorted(used_numbers) != sorted(numbers):
        score = 0
        feedback = "The equation must use all 4 numbers exactly once."
        return ScoredCandidate(candidate=candidate.candidate, score=score, feedback=feedback)
    try:
        result = candidate.candidate.compute()
        # Score the candidate by how close its result is to the target value of 24.
        score = 1 / (1 + abs(24 - result))
        feedback = f"Result: {result}"
    except Exception as e:
        score = 0
        feedback = f"Invalid equation. Error: {repr(e)}"
    return ScoredCandidate(candidate=candidate.candidate, score=score, feedback=feedback)
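
To get a feel for the scoring formula 1 / (1 + |24 - result|), the short sketch below evaluates it on its own for a few results. The function name closeness_score and its target parameter are illustrative; the formula itself is the one used in compute_score.

```python
def closeness_score(result: float, target: float = 24.0) -> float:
    """Score in (0, 1]: 1.0 for an exact hit, smaller as the gap grows."""
    return 1 / (1 + abs(target - result))


for result in (24.0, 23.0, 15.0, 12.0):
    print(result, closeness_score(result))
```

An exact result of 24 scores 1.0 and a result that is off by one scores 0.5, so the score falls quickly for near misses and flattens out for results that are far from the target.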

Testing the scoring function

A few examples exercise the scoring function.

# Test the compute_score function
problems_and_candidates = [
    # Case 1: valid expression
    ("3 4 2 1", Candidate(Equation(tokens=[3.0, 4.0, "+", 2.0, "*", 1.0, "+"]))),
    # Case 2: invalid expression (does not use all the numbers)
    ("5 6 7 8", Candidate(Equation(tokens=[5.0, 6.0, "+", 7.0, "*"]))),
    # Case 3: the result is not equal to 24
    ("2 3 4 5", Candidate(Equation(tokens=[2.0, 3.0, "+", 4.0, "+", 5.0, "+"]))),
    # Case 4: invalid expression (stack error; the number check rejects it first)
    ("10 20 30 40", Candidate(Equation(tokens=[10.0, "+"]))),
    # Case 5: valid but more complex expression
    ("8 3 1 2", Candidate(Equation(tokens=[8.0, 3.0, "-", 1.0, "+", 2.0, "*"]))),
]

for problem, candidate in problems_and_candidates:
    scored_candidate = compute_score(problem, candidate)
    print(f"Problem: {problem}")
    print(f"Candidate: {candidate}")
    print(f"Score: {scored_candidate.score}")
    print(f"Feedback: {scored_candidate.feedback}")
    print("-" * 50)

Output:

Problem: 3 4 2 1
Candidate: Equation([3.0, 4.0, '+', 2.0, '*', 1.0, '+']) = 15.0 (Reward: None)
Score: 0.1
Feedback: Result: 15.0
--------------------------------------------------
Problem: 5 6 7 8
Candidate: Equation([5.0, 6.0, '+', 7.0, '*']) = 77.0 (Reward: None)
Score: 0
Feedback: The equation must use all 4 numbers exactly once.
--------------------------------------------------
Problem: 2 3 4 5
Candidate: Equation([2.0, 3.0, '+', 4.0, '+', 5.0, '+']) = 14.0 (Reward: None)
Score: 0.09090909090909091
Feedback: Result: 14.0
--------------------------------------------------
Problem: 10 20 30 40
Candidate: Equation([10.0, '+']) = Invalid equation: [10.0, '+']; Error: IndexError('pop from empty list') (Reward: None)
Score: 0
Feedback: The equation must use all 4 numbers exactly once.
--------------------------------------------------
Problem: 8 3 1 2
Candidate: Equation([8.0, 3.0, '-', 1.0, '+', 2.0, '*']) = 12.0 (Reward: None)
Score: 0.07692307692307693
Feedback: Result: 12.0
--------------------------------------------------
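
In a Tree of Thoughts search, scores like these are typically used to decide which candidates are worth expanding in the next round. The following is a minimal sketch of such a pruning step, assuming a simple top-k (beam) selection. The names Scored, prune, and beam_size are illustrative and do not come from the code above; the scores are the ones printed in the test output.

```python
from typing import List, NamedTuple


class Scored(NamedTuple):
    label: str    # human-readable form of the equation
    score: float  # value produced by a scoring function


def prune(candidates: List[Scored], beam_size: int = 2) -> List[Scored]:
    """Keep the beam_size highest-scoring candidates."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:beam_size]


candidates = [
    Scored("(3 + 4) * 2 + 1", 0.1),
    Scored("2 + 3 + 4 + 5", 1 / 11),
    Scored("(8 - 3 + 1) * 2", 1 / 13),
    Scored("10 +", 0.0),
]

for kept in prune(candidates):
    print(kept.label, kept.score)
```

With a beam size of 2, only the two candidates closest to 24 survive, and the invalid expression with a score of 0 is dropped.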

Reference: https://langchain-ai.github.io/langgraph/tutorials/tot/tot/
If you have any questions, feel free to ask in the comments.