Official Scrapyd documentation: https://scrapyd.readthedocs.io/en/latest/api.html
API
The following section describes the available resources in the Scrapyd JSON API.
1、daemonstatus.json
To check the load status of a service.
- Supported Request Methods:
GET
Example request:
curl http://localhost:6800/daemonstatus.json
Example response:
{ "status": "ok", "running": "0", "pending": "0", "finished": "0", "node_name": "node-name" }
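Since the response is plain JSON, any HTTP client can consume it. A minimal Python sketch that parses the sample body above (the actual request via `urllib.request.urlopen` against a running Scrapyd is omitted here):

```python
import json

# Sample daemonstatus.json body copied from the example response above.
body = '{ "status": "ok", "running": "0", "pending": "0", "finished": "0", "node_name": "node-name" }'

status = json.loads(body)

# Treat the service as idle when nothing is running or pending.
idle = status["running"] == "0" and status["pending"] == "0"
print(idle)  # True
```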
2、addversion.json
Add a version to a project, creating the project if it doesn’t exist.
- Supported Request Methods:
POST
- Parameters:
- project (string, required) - the project name
- version (string, required) - the project version
- egg (file, required) - a Python egg containing the project’s code
Example request:
$ curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F [email protected]
Example response:
{"status": "ok", "spiders": 3}
Note
Scrapyd uses the distutils LooseVersion to interpret the version numbers you provide.
The latest version for a project will be used by default whenever necessary.
schedule.json and listspiders.json allow you to explicitly set the desired project version.
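To illustrate the note above: LooseVersion (since removed from the standard library along with distutils) roughly splits a version string into numeric and non-numeric runs and compares them piecewise. A rough stand-alone approximation of that ordering, not Scrapyd's actual code, assuming all versions share the same shape (e.g. `rNNN` as in the example):

```python
import re

def version_key(v):
    # Split into digit and non-digit runs; compare digit runs numerically,
    # e.g. "r99" -> ("r", 99), "r156" -> ("r", 156).
    return tuple(int(p) if p.isdigit() else p for p in re.findall(r"\d+|\D+", v))

versions = sorted(["r156", "r99"], key=version_key)
print(versions)  # ['r99', 'r156']
```

This is why `r156` is treated as newer than `r99` even though plain string comparison would order them the other way.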
3、schedule.json
Schedule a spider run (also known as a job), returning the job id.
- Supported Request Methods:
POST
- Parameters:
- project (string, required) - the project name
- spider (string, required) - the spider name
- setting (string, optional) - a Scrapy setting to use when running the spider
- jobid (string, optional) - a job id used to identify the job, overrides the default generated UUID
- _version (string, optional) - the version of the project to use
- any other parameter is passed as spider argument
Example request:
$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider
Example response:
{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}
Example request passing a spider argument (arg1) and a setting (DOWNLOAD_DELAY):
$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
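The same request can be issued from Python. A minimal sketch that builds the form-encoded body curl sends (each `-d` flag becomes a key/value pair; actually POSTing it to a running Scrapyd is left out):

```python
from urllib.parse import urlencode

# Mirrors the curl example above: project, spider, one setting, one
# spider argument. Note the "=" inside the setting value is percent-encoded.
body = urlencode({
    "project": "myproject",
    "spider": "somespider",
    "setting": "DOWNLOAD_DELAY=2",
    "arg1": "val1",
})
print(body)
```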
Note
Spiders scheduled with Scrapyd should accept an arbitrary number of keyword arguments, since Scrapyd passes internally generated arguments to the spider being scheduled.
4、cancel.json
New in version 0.15.
Cancel a spider run (also known as a job). If the job is pending, it will be removed. If the job is running, it will be terminated.
- Supported Request Methods:
POST
- Parameters:
- project (string, required) - the project name
- job (string, required) - the job id
Example request:
$ curl http://localhost:6800/cancel.json -d project=myproject -d job=6487ec79947edab326d6db28a2d86511e8247444
Example response:
{"status": "ok", "prevstate": "running"}
5、listprojects.json
Get the list of projects uploaded to this Scrapy server.
- Supported Request Methods:
GET
- Parameters: none
Example request:
$ curl http://localhost:6800/listprojects.json
Example response:
{"status": "ok", "projects": ["myproject", "otherproject"]}
6、listversions.json
Get the list of versions available for some project. The versions are returned in order, the last one is the currently used version.
- Supported Request Methods:
GET
- Parameters:
- project (string, required) - the project name
Example request:
$ curl http://localhost:6800/listversions.json?project=myproject
Example response:
{"status": "ok", "versions": ["r99", "r156"]}
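Because the versions are returned in order, picking the version Scrapyd will use by default is just taking the last list element. A small sketch against the sample body above:

```python
import json

# Sample listversions.json body from the example response above.
body = '{"status": "ok", "versions": ["r99", "r156"]}'

# Versions are ordered; the last entry is the currently used version.
latest = json.loads(body)["versions"][-1]
print(latest)  # r156
```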
7、listspiders.json
Get the list of spiders available in the last (unless overridden) version of some project.
- Supported Request Methods:
GET
- Parameters:
- project (string, required) - the project name
- _version (string, optional) - the version of the project to examine
Example request:
$ curl http://localhost:6800/listspiders.json?project=myproject
Example response:
{"status": "ok", "spiders": ["spider1", "spider2", "spider3"]}
8、listjobs.json
New in version 0.15.
Get the list of pending, running and finished jobs of some project.
- Supported Request Methods:
GET
- Parameters:
- project (string, optional) - restrict results to project name
Example request:
$ curl http://localhost:6800/listjobs.json?project=myproject | python -m json.tool
Example response:
{
"status": "ok",
"pending": [
{
"project": "myproject", "spider": "spider1",
"id": "78391cc0fcaf11e1b0090800272a6d06"
}
],
"running": [
{
"id": "422e608f9f28cef127b3d5ef93fe9399",
"project": "myproject", "spider": "spider2",
"start_time": "2012-09-12 10:14:03.594664"
}
],
"finished": [
{
"id": "2f16646cfcaf11e1b0090800272a6d06",
"project": "myproject", "spider": "spider3",
"start_time": "2012-09-12 10:14:03.594664",
"end_time": "2012-09-12 10:24:03.594664"
}
]
}
Note
All job data is kept in memory and will be reset when the Scrapyd service is restarted. See issue 12.
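A finished-job entry carries both timestamps, so a job's duration can be computed directly from the response. A sketch using the sample entry above (the timestamp layout is taken from that example):

```python
from datetime import datetime

# Sample finished-job entry from the listjobs.json response above.
job = {
    "id": "2f16646cfcaf11e1b0090800272a6d06",
    "project": "myproject", "spider": "spider3",
    "start_time": "2012-09-12 10:14:03.594664",
    "end_time": "2012-09-12 10:24:03.594664",
}

FMT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp layout used in the example
duration = datetime.strptime(job["end_time"], FMT) - datetime.strptime(job["start_time"], FMT)
print(duration.total_seconds())  # 600.0
```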
9、delversion.json
Delete a project version. If there are no more versions available for a given project, that project will be deleted too.
- Supported Request Methods:
POST
- Parameters:
- project (string, required) - the project name
- version (string, required) - the project version
Example request:
$ curl http://localhost:6800/delversion.json -d project=myproject -d version=r99
Example response:
{"status": "ok"}
10、delproject.json
Delete a project and all its uploaded versions.
- Supported Request Methods:
POST
- Parameters:
- project (string, required) - the project name
Example request:
$ curl http://localhost:6800/delproject.json -d project=myproject
Example response:
{"status": "ok"}
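All GET endpoints above follow the same `http://localhost:6800/<endpoint>?<query>` shape, so a tiny helper can centralize URL construction. A sketch that only builds the URL (the hostname is the default from the examples; fetching and error handling are omitted):

```python
from urllib.parse import urlencode

BASE = "http://localhost:6800"  # default Scrapyd address used in the examples

def scrapyd_url(endpoint, **params):
    """Build the URL for a Scrapyd GET endpoint such as listjobs.json."""
    query = urlencode(params)
    return f"{BASE}/{endpoint}" + (f"?{query}" if query else "")

print(scrapyd_url("listjobs.json", project="myproject"))
# http://localhost:6800/listjobs.json?project=myproject
```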