使用 ChatGPT API 构建系统课程要点总结

Building Systems with the ChatGPT API 课程链接：https://learn.deeplearning.ai/chatgpt-building-system/

第一节简介

介绍了两种 LLM 的情况：Base LLM 使用监督学习进行训练，其开发周期相当漫长，而使用 Instruction tuned LLM 开发 prompt-based AI 则可以将开发过程极大程度缩短。

第二节 Language Models, the Chat Format and Tokens

讲到了 LLM 的 tokenizor 机制，导致 AI 看到的英文是 sub-word 级别而不是单个字母，进而导致 AI 没办法完成将一个单词按字母顺序倒序输出等这类字母基本的任务。我认为在中文中也会遇到类似的问题，我使用 tiktokenizor 对中文做过测试，实践表明有些中文被一整个切割，而有些可能被切为好几份。

然后讲到了 Chat Format，将对话分为三个角色 system、user、assistant。其中涉及对于 assistant 的角色风格和行为的设定最好放到 system 中。

第三节 Classification

讲到了使用 GPT 对用户的问题进行分类的实践，例子是以一个客服角色对用户问题进行二级分类并要求 GPT 以JSON 格式返回。
其中强调了 delimiter （分隔符）的作用，将需要被分类的用户问题用 delimiter 包裹效果会更好。

import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']


def get_completion_from_messages(messages, 
                                 model="gpt-3.5-turbo", 
                                 temperature=0, 
                                 max_tokens=500):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens,
    )
    return response.choices[0].message["content"]


delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with \
{
      
      delimiter} characters.
Classify each query into a primary category \
and a secondary category. 
Provide your output in json format with the \
keys: primary and secondary.

Primary categories: Billing, Technical Support, \
Account Management, or General Inquiry.

Billing secondary categories:
Unsubscribe or upgrade
Add a payment method
Explanation for charge
Dispute a charge

Technical Support secondary categories:
General troubleshooting
Device compatibility
Software updates

Account Management secondary categories:
Password reset
Update personal information
Close account
Account security

General Inquiry secondary categories:
Product information
Pricing
Feedback
Speak to a human

"""


user_message = f"""\
I want you to delete my profile and all of my user data"""
messages =  [  
    {
    
    'role':'system', 
     'content': system_message},    
    {
    
    'role':'user', 
     'content': f"{
      
      delimiter}{
      
      user_message}{
      
      delimiter}"},  
] 
response = get_completion_from_messages(messages)
print(response)


user_message = f"""\
Tell me more about your flat screen tvs"""
messages =  [  
{
    
    'role':'system', 
 'content': system_message},    
{
    
    'role':'user', 
 'content': f"{
      
      delimiter}{
      
      user_message}{
      
      delimiter}"},  
] 
response = get_completion_from_messages(messages)
print(response)

第四节 Moderation

讲到可以使用 openai 的 Moderation API 对 GPT 生成的内容进行审核，会返回一系列的打分，主要是四类：仇恨（威胁）、自残、色情、暴力。

然后讲到可以构建 Prompt 让 GPT 帮助对用户的输入进行判断是否为 prompt injection，以及一些预处理，比如可以预先移除用户的输入中可能存在的 delimiter 等。

第五节 Chain of Thought Reasoning

提到了一个思维链的用例，GPT 作为一个导购分了 5 个步骤进行思考：

第一步：判断用户的问题是否为询问一个具体商品或者一个商品类别；
第二步：如果用户在问具体的商品则识别商品在不在给定的列表中，然后给了个列表；
第三步：如果消息包含了在上面列表的商品则列出用户在其消息中所做的任何假设；
第四步：如果用户做出了任何假设，请根据产品信息确定该假设是否成立；
第五步：首先礼貌地纠正客户的错误假设（如果可行的话），仅在提到或引用到列表中的 5 个提供的产品的时候（因为这些事商店中唯一销售的 5 个产品），礼貌地回答客户。

然后提到在 Prompt 中要求了 LLM 使用 delimiter 对每一步进行分割，那么在实际输出给用户看的时候可以将输出切分然后输出最后一段即可。

第六节 Chaining Prompts

提到了可以对 Prompts 进行链式处理，用以替代 CoT 进行开发。
其好处包括：

More Focused 更加专注（Break down a complex task 将一个复杂的任务拆分了）
Context Limitations 上下文限制（Max tokens for input and output 最大的输入输出token长度）
Reduced Costs 更少的费用（Pay per token 按token付费）

以及更便于开发和调试等优点。
讲解了一个使用多个 prompts 来处理复杂任务的例子：

第一步：提取相关产品和类别名称（使用了GPT）
第二步：检索提取的产品和类别的详细产品信息（根据上一步的返回从结构化数据中查取）
第三步：根据详细的产品信息生成用户查询的答案（详细的产品信息为上一步给出的 JSON）

其中最后一步的代码如下：

system_message = f"""
You are a customer service assistant for a \
large electronic store. \
Respond in a friendly and helpful tone, \
with very concise answers. \
Make sure to ask the user relevant follow up questions.
"""
user_message_1 = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
messages =  [  
{
    
    'role':'system',
 'content': system_message},   
{
    
    'role':'user',
 'content': user_message_1},  
{
    
    'role':'assistant',
 'content': f"""Relevant product information:\n\
 {
      
      product_information_for_user_message_1}"""},   
]
final_response = get_completion_from_messages(messages)
print(final_response)

第七节 Check outputs

这一节讲了检查模型输出的几个办法，如使用 openai 的Moderation api 审核内容是否有害，也可以写个Promp让GPT根据提供的产品信息检查输出是否真实，案例中的代码如下：

system_message = f"""
You are an assistant that evaluates whether \
customer service agent responses sufficiently \
answer customer questions, and also validates that \
all the facts the assistant cites from the product \
information are correct.
The product information and user and customer \
service agent messages will be delimited by \
3 backticks, i.e. ```.
Respond with a Y or N character, with no punctuation:
Y - if the output sufficiently answers the question \
AND the response correctly uses product information
N - otherwise

Output a single letter only.
"""
customer_message = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
product_information = """{ "name": "SmartX ProPhone", "category": "Smartphones and Accessories", "brand": "SmartX", "model_number": "SX-PP10", "warranty": "1 year", "rating": 4.6, "features": [ "6.1-inch display", "128GB storage", "12MP dual camera", "5G" ], "description": "A powerful smartphone with advanced camera features.", "price": 899.99 } { "name": "FotoSnap DSLR Camera", "category": "Cameras and Camcorders", "brand": "FotoSnap", "model_number": "FS-DSLR200", "warranty": "1 year", "rating": 4.7, "features": [ "24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses" ], "description": "Capture stunning photos and videos with this versatile DSLR camera.", "price": 599.99 } { "name": "CineView 4K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-4K55", "warranty": "2 years", "rating": 4.8, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "A stunning 4K TV with vibrant colors and smart features.", "price": 599.99 } { "name": "SoundMax Home Theater", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-HT100", "warranty": "1 year", "rating": 4.4, "features": [ "5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth" ], "description": "A powerful home theater system for an immersive audio experience.", "price": 399.99 } { "name": "CineView 8K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-8K65", "warranty": "2 years", "rating": 4.9, "features": [ "65-inch display", "8K resolution", "HDR", "Smart TV" ], "description": "Experience the future of television with this stunning 8K TV.", "price": 2999.99 } { "name": "SoundMax Soundbar", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-SB50", "warranty": "1 year", "rating": 4.3, "features": [ "2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth" ], "description": "Upgrade your TV's audio with this sleek and powerful soundbar.", "price": 199.99 } { "name": "CineView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-OLED55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "Experience true blacks and vibrant colors with this OLED TV.", "price": 1499.99 }"""
q_a_pair = f"""
Customer message: ```{
      
      customer_message}```
Product information: ```{
      
      product_information}```
Agent response: ```{
      
      final_response_to_customer}```

Does the response use the retrieved information correctly?
Does the response sufficiently answer the question

Output Y or N
"""
messages = [
    {
    
    'role': 'system', 'content': system_message},
    {
    
    'role': 'user', 'content': q_a_pair}
]

response = get_completion_from_messages(messages, max_tokens=1)
print(response)

another_response = "life is like a box of chocolates"
q_a_pair = f"""
Customer message: ```{
      
      customer_message}```
Product information: ```{
      
      product_information}```
Agent response: ```{
      
      another_response}```

Does the response use the retrieved information correctly?
Does the response sufficiently answer the question?

Output Y or N
"""
messages = [
    {
    
    'role': 'system', 'content': system_message},
    {
    
    'role': 'user', 'content': q_a_pair}
]

response = get_completion_from_messages(messages)
print(response)

第八节 Build an End-to-End System

展示了一个链式Prompt处理用户请求的系统的完整案例

import os
import openai
import sys
sys.path.append('../..')
import utils

import panel as pn  # GUI
pn.extension()

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0, max_tokens=500):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens, 
    )
    return response.choices[0].message["content"]


def process_user_message(user_input, all_messages, debug=True):
    delimiter = "```"
    
    # Step 1: Check input to see if it flags the Moderation API or is a prompt injection
    response = openai.Moderation.create(input=user_input)
    moderation_output = response["results"][0]

    if moderation_output["flagged"]:
        print("Step 1: Input flagged by Moderation API.")
        return "Sorry, we cannot process this request."

    if debug: print("Step 1: Input passed moderation check.")
    
    category_and_product_response = utils.find_category_and_product_only(user_input, utils.get_products_and_category())
    #print(print(category_and_product_response)
    # Step 2: Extract the list of products
    category_and_product_list = utils.read_string_to_list(category_and_product_response)
    #print(category_and_product_list)

    if debug: print("Step 2: Extracted list of products.")

    # Step 3: If products are found, look them up
    product_information = utils.generate_output_string(category_and_product_list)
    if debug: print("Step 3: Looked up product information.")

    # Step 4: Answer the user question
    system_message = f"""
    You are a customer service assistant for a large electronic store. \
    Respond in a friendly and helpful tone, with concise answers. \
    Make sure to ask the user relevant follow-up questions.
    """
    messages = [
        {
    
    'role': 'system', 'content': system_message},
        {
    
    'role': 'user', 'content': f"{
      
      delimiter}{
      
      user_input}{
      
      delimiter}"},
        {
    
    'role': 'assistant', 'content': f"Relevant product information:\n{
      
      product_information}"}
    ]

    final_response = get_completion_from_messages(all_messages + messages)
    if debug:print("Step 4: Generated response to user question.")
    all_messages = all_messages + messages[1:]

    # Step 5: Put the answer through the Moderation API
    response = openai.Moderation.create(input=final_response)
    moderation_output = response["results"][0]

    if moderation_output["flagged"]:
        if debug: print("Step 5: Response flagged by Moderation API.")
        return "Sorry, we cannot provide this information."

    if debug: print("Step 5: Response passed moderation check.")

    # Step 6: Ask the model if the response answers the initial user query well
    user_message = f"""
    Customer message: {
      
      delimiter}{
      
      user_input}{
      
      delimiter}
    Agent response: {
      
      delimiter}{
      
      final_response}{
      
      delimiter}

    Does the response sufficiently answer the question?
    """
    messages = [
        {
    
    'role': 'system', 'content': system_message},
        {
    
    'role': 'user', 'content': user_message}
    ]
    evaluation_response = get_completion_from_messages(messages)
    if debug: print("Step 6: Model evaluated the response.")

    # Step 7: If yes, use this answer; if not, say that you will connect the user to a human
    if "Y" in evaluation_response:  # Using "in" instead of "==" to be safer for model output variation (e.g., "Y." or "Yes")
        if debug: print("Step 7: Model approved the response.")
        return final_response, all_messages
    else:
        if debug: print("Step 7: Model disapproved the response.")
        neg_str = "I'm unable to provide the information you're looking for. I'll connect you with a human representative for further assistance."
        return neg_str, all_messages

user_input = "tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also what tell me about your tvs"
response,_ = process_user_message(user_input,[])
print(response)


def collect_messages(debug=False):
    user_input = inp.value_input
    if debug: print(f"User Input = {
      
      user_input}")
    if user_input == "":
        return
    inp.value = ''
    global context
    #response, context = process_user_message(user_input, context, utils.get_products_and_category(),debug=True)
    response, context = process_user_message(user_input, context, debug=False)
    context.append({
    
    'role':'assistant', 'content':f"{
      
      response}"})
    panels.append(
        pn.Row('User:', pn.pane.Markdown(user_input, width=600)))
    panels.append(
        pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={
    
    'background-color': '#F6F6F6'})))
 
    return pn.Column(*panels)

剩余内容

剩下的部分是对于LLM输出进行评估的办法，以及总结。