Good morning, good afternoon, or good evening, everyone ~ welcome to this article
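Capturing dynamically loaded data (packet capture)
Sending requests with requests
Python 3.8 to run the code
PyCharm 2022.3 as the editor
requests, installed with: pip install requests
I. Approach (requirements) analysis
1. Find the real source of the data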
https://www.kuaishou.com/graphql
II. Code implementation
1. Send the request
2. Get the response data
3. Parse the data
4. Save the data
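To make the "parse the data" step concrete, here is a minimal sketch that pulls the video titles, video links, and the next page cursor out of one decoded response. The function name parse_page and the sample payload are made up for illustration; the sample only mimics the fields this article reads (feeds, photo, caption, photoUrl, pcursor), and the real response contains many more.

```python
def parse_page(json_data):
    """Return (videos, next_cursor) from one decoded GraphQL response.

    videos is a list of (caption, photoUrl) tuples.
    """
    page = json_data['data']['visionProfilePhotoList']
    videos = [
        (feed['photo']['caption'], feed['photo']['photoUrl'])
        for feed in page['feeds']
    ]
    return videos, page['pcursor']


# Hypothetical sample payload, shaped like the fields used in this article.
sample = {
    'data': {
        'visionProfilePhotoList': {
            'pcursor': 'no_more',
            'feeds': [
                {'photo': {'caption': 'demo',
                           'photoUrl': 'https://example.com/a.mp4'}},
            ],
        }
    }
}

videos, next_cursor = parse_page(sample)
print(videos, next_cursor)
```

Keeping the parsing in its own function means it can be checked against a saved response without sending any request.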
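Import modules
import requests  # third-party library for sending requests; must be installed separately
import os

# Create the output folder if it does not exist yet
if not os.path.exists('video'):
    os.mkdir('video')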
Simulate a browser
headers = {
'Cookie': 'kpf=PC_WEB; clientid=3; did=web_279f6644708643f6590253c172444317; userId=3293066791; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABq79HDdbVzT-1D0e7qqksHMYNpTLBY2Am2JRy4o9P5cKoGvQmhDaYmSyUEY7DtdpZy1zwi8UciRmwS9YR1QqkBEo450JX-ICZc6IqLgnKSLZSNHu-DZNr5_vNGt4VqfmGI9aDMaldlAwjOE7poXhWnFQ-rygoTnWX2ns_MQ9WMhLQ5ynmVahud0Rgltew4UzOVG5C7E1x4mRYhcKodRkHnhoSTdCMiCqspRXB3AhuFugv61B-IiCqOqQ5DfDB1wZLV7Ai7aTCi9-2surn8lQUoOqKBslLXSgFMAE; kuaishou.server.web_ph=7162254888b4f9d8a92a1d2c71c253720a5e; kpn=KUAISHOU_VISION',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
}
Request URL
url = 'https://www.kuaishou.com/graphql'
pcursor = ""
while True:
    json = {
        "operationName": "visionProfilePhotoList",
        "variables": {
            "userId": "3x4rustdcf45vti",
            "pcursor": pcursor,
            "page": "profile"
        },
"query":"fragment photoContent on PhotoEntity {\n __typename\n id\n duration\n caption\n originCaption\n likeCount\n viewCount\n commentCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n musicBlocked\n}\n\nfragment recoPhotoFragment on recoPhotoEntity {\n __typename\n id\n duration\n caption\n originCaption\n likeCount\n viewCount\n commentCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n musicBlocked\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n ...recoPhotoFragment\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n tags {\n type\n name\n __typename\n }\n __typename\n}\n\nquery visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n"
    }
    response = requests.post(url=url, json=json, headers=headers)
    json_data = response.json()
    feeds = json_data['data']['visionProfilePhotoList']['feeds']
    # Cursor for the next page; the value "no_more" marks the last page
    pcursor = json_data['data']['visionProfilePhotoList']['pcursor']
    for feed in feeds:
        photoUrl = feed['photo']['photoUrl']  # video URL
        caption = feed['photo']['caption']  # video title
        print(caption, photoUrl)
        # Uncomment to download each video into the video folder
        # video_data = requests.get(photoUrl).content
        # with open(f'video/{caption}.mp4', mode='wb') as f:
        #     f.write(video_data)
    if pcursor == "no_more":
        break
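One thing to watch when saving: the video title is used directly as the file name, and titles can contain characters such as / or ? that are not valid in file names, or can be very long. Below is a small sketch of one way to clean the title first. The helper names safe_filename and video_path and the 80-character limit are my own assumptions, not something from the site or from requests.

```python
import os
import re


def safe_filename(caption, max_length=80):
    """Turn a video title into a string that is safe to use as a file name."""
    # Replace characters that Windows rejects in file names, plus control
    # characters such as newlines, with an underscore.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', caption).strip()
    # Trim to a reasonable length, then drop trailing dots and spaces.
    cleaned = cleaned[:max_length].rstrip('. ')
    # Fall back to a placeholder when nothing usable is left.
    return cleaned or 'untitled'


def video_path(caption, folder='video'):
    """Build the output path for a video inside the given folder."""
    return os.path.join(folder, safe_filename(caption) + '.mp4')


print(video_path('my/first video?'))
```

In the download step, open(video_path(caption), mode='wb') could then take the place of the f-string path. Note that two videos with the same title would still overwrite each other.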
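Finally, thank you for reading my article ~ this flight ends here 🛬
I hope this article was helpful 🎉 and that you learned a little something ~
Even the stars in hiding 🍥 are working hard to shine, so keep going too (let's keep at it together).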