Python Study Notes: Implementing a Simple Crawler (Part 2)

2025/12/29 1:27:44 Source: https://blog.csdn.net/shenxiaomo1688/article/details/146446976

Task: scrape the most popular programming courses on Bilibili.

URL: 编程-哔哩哔哩_bilibili (the Bilibili search results page for "编程", i.e. "programming")

Open the page's HTML in the browser's developer tools, as shown in the figure below:

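Every title sits in an h3 tag with class="bili-video-card__info--tit". The code below extracts them. Note that the Chinese keyword in the URL is already percent-encoded in the program.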
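As a quick check on that escaping, the sketch below reproduces the encoded keyword with Python's standard library. The `params` dictionary is only an illustration built from two of the query parameters visible in the URL, not a complete description of the Bilibili search API.

```python
from urllib.parse import quote, unquote, urlencode

keyword = "编程"

# quote() encodes the text as UTF-8 and writes every byte as %XX
encoded = quote(keyword)
print(encoded)            # %E7%BC%96%E7%A8%8B
print(unquote(encoded))   # 编程

# urlencode() applies the same escaping to a whole query string.
# Illustrative subset of the parameters seen in the URL above.
params = {"keyword": keyword, "order": "show"}
query = urlencode(params)
print("https://search.bilibili.com/all?" + query)
```

The encoded form matches the `keyword=` value in the script's URL, so the URL can be pasted into the program as-is. Alternatively, `requests.get` accepts a `params=` dictionary and performs this encoding itself.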

import requests
from bs4 import BeautifulSoup

url = 'https://search.bilibili.com/all?keyword=%E7%BC%96%E7%A8%8B&from_source=banner_search&order=show&duration=0&tids_1=0'

# Set the request headers to mimic a browser visit
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Send the GET request
response = requests.get(url, headers=headers)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the HTML content
    soup = BeautifulSoup(response.text, "html.parser")
    # Find all <h3> tags that carry the title class
    h3_tags = soup.find_all("h3", class_="bili-video-card__info--tit")
    # Walk through the <h3> tags and read each title attribute
    for index, h3_tag in enumerate(h3_tags, start=1):
        title = h3_tag.get("title")
        if title:
            print(f"Title {index}: {title}")
        else:
            print(f"Title {index}: no title attribute")
else:
    print("Request failed, status code:", response.status_code)

Output:

Note: a page's markup does not stay the same forever, so before scraping always press F12 and inspect the current HTML structure.
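Because the markup can change, it may help to keep the extraction step in a small function that can be tried on saved HTML without hitting the network. The following is a minimal sketch under that assumption: `extract_titles` and `SAMPLE_HTML` are hypothetical names, and the sample markup only imitates the structure described above rather than reproducing real Bilibili output.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet imitating the structure described in this post;
# it is not real Bilibili markup.
SAMPLE_HTML = """
<div class="bili-video-card">
  <h3 class="bili-video-card__info--tit" title="Python basics">Python basics</h3>
</div>
<div class="bili-video-card">
  <h3 class="bili-video-card__info--tit">C++ in one hour</h3>
</div>
"""

def extract_titles(html, css_class="bili-video-card__info--tit"):
    """Return the titles of all <h3> tags that carry css_class."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for h3 in soup.find_all("h3", class_=css_class):
        # Prefer the title attribute; fall back to the visible text
        titles.append(h3.get("title") or h3.get_text(strip=True))
    return titles

titles = extract_titles(SAMPLE_HTML)
if not titles:
    # An empty result usually means the class name no longer matches
    print("No matching <h3> tags found; re-check the page with F12.")
for index, title in enumerate(titles, start=1):
    print(f"Title {index}: {title}")
```

If the site renames the class, only the `css_class` argument needs updating, and an empty result is a hint to inspect the page again.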
