HTML parsing libraries for C#
A web search turns up the following libraries:
- SGMLReader (not updated in a long time)
- html-agility-pack (active)
- AngleSharp (active)
- CsQuery (not updated in a long time)
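I chose AngleSharp.
Reason: elements can be selected the same way as in JavaScript, using CSS selectors.
Install the AngleSharp NuGet package, version 1.2.0-beta.431.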
using System;
using AngleSharp.Html.Parser;

namespace HtmlParse01;

class Program
{
    static void Main(string[] args)
    {
        // HTML fragment to parse ("编码" is the cell label, meaning "code").
        var htmlPart = @"<td><div class='cell'><div>编码</div></div>
</td>
<td><div class='cell'><div><div>1234567</div></div></div>
</td>";

        var parser = new HtmlParser();
        var htmlDocument = parser.ParseDocument(htmlPart);

        // Select every element with class "cell", like document.querySelectorAll in JS.
        var cellDivs = htmlDocument.QuerySelectorAll(".cell");
        Console.WriteLine(cellDivs.Length);

        // Print the trimmed text of each matched element.
        foreach (var cellDiv in cellDivs)
        {
            Console.WriteLine(cellDiv.TextContent.Trim());
        }
    }
}
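Result: the program prints the number of matched .cell elements, followed by the trimmed text of each one.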
You can also manipulate the DOM the same way as in native JavaScript; see the following article:
https://www.jb51.net/article/251499.htm
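As a rough illustration of the JS-like DOM operations mentioned above, here is a minimal sketch. It is not taken from the linked article: the class name DomDemo, the sample HTML, the data-kind attribute, and the highlight class are all made up for the example. It assumes the same AngleSharp package as above.

```csharp
using System;
using AngleSharp.Html.Parser;

class DomDemo
{
    static void Main()
    {
        // Sample markup invented for this sketch.
        var parser = new HtmlParser();
        var document = parser.ParseDocument("<div class='cell'>1234567</div>");

        // Like document.querySelector(".cell") in JS; returns null if nothing matches.
        var cell = document.QuerySelector(".cell");
        if (cell is null)
        {
            return;
        }

        // Like element.setAttribute, element.classList.add and element.textContent in JS.
        cell.SetAttribute("data-kind", "code");
        cell.ClassList.Add("highlight");
        cell.TextContent = "7654321";

        // Like document.createElement followed by parent.appendChild in JS.
        var note = document.CreateElement("span");
        note.TextContent = "updated";
        cell.AppendChild(note);

        // Serialize the modified element back to HTML.
        Console.WriteLine(cell.OuterHtml);
    }
}
```

The member names mirror the browser DOM API in PascalCase, so JavaScript DOM code can usually be ported almost line by line.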
References
https://github.com/jamietre/CsQuery
https://www.nuget.org/packages/HtmlAgilityPack/
https://github.com/MindTouch/SGMLReader
https://html-agility-pack.net/
https://github.com/AngleSharp/AngleSharp
https://scrapingant.com/blog/parse-html-dot-net
https://www.jb51.net/article/251499.htm