通过scrapyd定时调度爬虫,post远程开启服务器上的scrapy爬虫

wylc123 1年前 ⋅ 1078 阅读
#!usr/bin/env python
# -*- coding: utf-8 -*-
#!文件类型: python
#!创建时间: 2021-08-05 17:18
#!作者: SongBin
#!文件名称: sipderSchedule.py
#!简介:通过scrapyd定时调度爬虫
#!来源:https://www.daxueyiwu.com
import datetime
import threading
import time
import urllib.parse  # explicit: urlencode lives here; don't rely on urllib.request importing it
import urllib.request

import schedule


def aiqSpider():
    """Kick off the 'aiq' spider on the remote scrapyd server, then print the
    project's job list.

    Performs an HTTP POST to scrapyd's ``schedule.json`` endpoint followed by a
    GET to ``listjobs.json``. Side effects only (network + console output);
    returns None. Raises ``urllib.error.URLError`` if the server at
    192.168.20.150:6800 is unreachable.
    """
    # Start the spider
    print((datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S') + "#############AiqSipder调度程序开始运行###########"))
    url = "http://192.168.20.150:6800/schedule.json"
    header = {"User-Agent": "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.7.62 Version/11.01"}
    formData = {'project': 'aiqSpider', 'spider': 'aiq'}
    # scrapyd expects a form-encoded POST body; urlencode gives str, so encode to bytes
    data = urllib.parse.urlencode(formData).encode("utf-8")
    request = urllib.request.Request(url, data=data, headers=header)
    # Context managers ensure the HTTP connections are closed even on error
    # (the original leaked both response objects).
    with urllib.request.urlopen(request) as schedule_resp:
        print(schedule_resp.read().decode("utf-8"))

    # Check the job log
    # listjobs.json is a plain GET request
    myproject = "aiqSpider"
    requrl = "http://192.168.20.150:6800/listjobs.json?project=" + myproject
    with urllib.request.urlopen(requrl) as jobs_resp:
        # read() returns bytes; decode so the JSON prints readably instead of a bytes repr
        res = jobs_resp.read().decode("utf-8")
    print(res)
    print((datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S') + "#############AiqSipder调度程序运行结束###########"))
def job():
    """Scheduled-job entry point: run one spider dispatch cycle."""
    aiqSpider()
def run_threaded(job_func):
    """Run *job_func* on a fresh background thread.

    Keeps the scheduler's polling loop responsive: a long-running spider
    dispatch does not block subsequent ``schedule.run_pending()`` calls.
    """
    threading.Thread(target=job_func).start()

# Register the spider dispatch twice daily; run_threaded keeps the polling
# loop below free while a job executes. NOTE: these registrations run at
# import time, not only when executed as a script.
schedule.every().day.at("00:01").do(run_threaded, job)
schedule.every().day.at("10:00").do(run_threaded, job)
if __name__ == '__main__':
    # aiqSpider()  # uncomment for a one-off manual run
  # Poll every 10 s; schedule fires any registered job whose time has come.
  while True:
    schedule.run_pending()
    time.sleep(10)

相关文章推荐

全部评论: 0

    我有话说: