Serving the Ziya LLM with a FastAPI Interface
The Ziya LLM performs reasonably well, but how do you deploy its model files as a service of your own? The tutorial code is below.
I. Environment Setup
Python version 3.7 (note: recent transformers releases may require Python 3.8 or later)
transformers, latest version
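For reference, the dependencies can be installed with pip. This is a minimal sketch inferred from the imports in the server code below; no versions are pinned in the original:

pip install fastapi uvicorn pydantic transformers torch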
II. Ziya FastAPI Service Code
1. Server code
The code is as follows (example):
import uvicorn
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, LlamaForCausalLM

app = FastAPI()

# Request body: a single "text" field holding the user prompt
class Query(BaseModel):
    text: str

device = torch.device("cuda")
# device_map="auto" lets transformers place the 13B weights across available GPUs
model = LlamaForCausalLM.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1', device_map="auto")
tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1')

@app.post("/generate_travel_plan/")
async def generate_travel_plan(query: Query):
    # Wrap the user prompt in Ziya's dialogue template
    inputs = '<human>:' + query.text.strip() + '\n<bot>:'
    input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
    generate_ids = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        top_p=0.85,
        temperature=1.0,
        repetition_penalty=1.0,
        eos_token_id=2,
        bos_token_id=1,
        pad_token_id=0)
    # Note: the decoded string includes the prompt and any special tokens
    output = tokenizer.batch_decode(generate_ids)[0]
    return {"result": output}

if __name__ == "__main__":
    # Bind address of the serving machine; change as needed for your network
    uvicorn.run(app, host="192.168.138.218", port=7861)
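Assuming the server code above is saved as ziya_api.py (a hypothetical filename, not given in the original), the service can be started directly; the __main__ block launches uvicorn on the configured host and port:

python ziya_api.py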
2. Client code
The code is as follows (example):
import requests

# Must point at the host/port the server binds to (the original mixes
# 192.168.138.218 and 192.168.138.210; use your server's actual address)
url = "http://192.168.138.218:7861/generate_travel_plan/"
# Prompt: "Write me a travel plan for a trip to Xi'an"
query = {"text": "帮我写一份去西安的旅游计划"}

response = requests.post(url, json=query)
if response.status_code == 200:
    result = response.json()
    print("Generated travel plan:", result["result"])
else:
    print("Error:", response.status_code, response.text)
3. Postman / curl invocation
curl --location 'http://192.168.138.218:7861/generate_travel_plan/' \
--header 'accept: application/json' \
--header 'Content-Type: application/json' \
--data '{"text":""}'
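On success the endpoint returns a JSON body of the form {"result": "<generated text>"}, matching the dict returned by generate_travel_plan above; fill in the "text" field with an actual prompt before sending the request.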
Summary
That covers today's content: this article walked through building a FastAPI service for the Ziya LLM.
If you want to join a discussion group on training vertical-domain LLMs, send me a private message.