How to filter/search tags in an HTML document using HtmlAgilityPack
Ways to filter HTML elements based on filter criteria
The list below shows different filter criteria (XPath expressions). Each one is assigned to the sFilterCriteria variable, which is later used for filtering.
1) string sFilterCriteria="//elementName";
Examples:
· string sFilterCriteria = "//div";
· string sFilterCriteria = "//img";
· string sFilterCriteria = "//a";
Description: Selects all <elementName> elements in the given HTML document.
2) string sFilterCriteria="//elementName[@AttributeName]";
Examples:
· string sFilterCriteria = "//div[@id]";
· string sFilterCriteria = "//img[@alt]";
· string sFilterCriteria = "//p[@style]";
Description: Selects all <elementName> elements in the given HTML document that have an AttributeName attribute, whatever its value.
3) string sFilterCriteria="//elementName[@AttributeName='AttributeValue']";
Examples:
· string sFilterCriteria = "//div[@id='div1']";
· string sFilterCriteria = "//a[@href='MrGST']";
· string sFilterCriteria = "//img[@alt='title']";
Description: Selects all <elementName> elements in the given HTML document whose AttributeName attribute has the value AttributeValue.
4) string sFilterCriteria="//*[@AttributeName]";
Examples:
· string sFilterCriteria = "//*[@id]";
· string sFilterCriteria = "//*[@href]";
· string sFilterCriteria = "//*[@face]";
· string sFilterCriteria = "//*[@alt]";
· string sFilterCriteria = "//*[@src]";
Description: Selects all HTML elements, regardless of tag name, that have an AttributeName attribute in the given HTML document.
5) string sFilterCriteria="//*[@AttributeName='AttributeValue']";
Examples:
· string sFilterCriteria = "//*[@id='div1']";
· string sFilterCriteria = "//*[@href='MrGST']";
Description: Selects all HTML elements, regardless of tag name, whose AttributeName attribute has the value AttributeValue in the given HTML document.
6) Conditional filter criteria (conditions combined with and, or, or functions such as contains)
Examples:
· string sFilterCriteria = "//img[@src and (@width or @height)]";
· string sFilterCriteria = "//span[@lang='EN-US' and @style]";
· string sFilterCriteria = "//*[contains(@style, 'Wingding')]";
[Note: In the last example, contains checks whether the text 'Wingding' is present in the style attribute. The comparison is case-sensitive.]
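The filter criteria listed above can be tried against a small document to see how many elements each one selects. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The class name FilterCriteriaDemo, the SampleHtml markup, and the CountMatches helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class FilterCriteriaDemo
{
    // Hypothetical sample markup, used only for this illustration.
    public const string SampleHtml =
        "<html><body>" +
        "<div id='div1'><a href='MrGST' target='_blank'>link</a></div>" +
        "<div><img src='a.png' width='10' alt='title'><img src='b.png'></div>" +
        "<span lang='EN-US' style='font-family:Wingdings'>x</span>" +
        "<p style='color:red'>text</p>" +
        "</body></html>";

    // Returns the number of nodes matching the XPath expression.
    // SelectNodes returns null when nothing matches, so that case maps to 0.
    public static int CountMatches(string html, string sFilterCriteria)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nc = doc.DocumentNode.SelectNodes(sFilterCriteria);
        return nc == null ? 0 : nc.Count;
    }

    public static void Main()
    {
        string[] criteria =
        {
            "//div",                               // by element name
            "//div[@id]",                          // element with an attribute
            "//*[@id='div1']",                     // any element, attribute value
            "//img[@src and (@width or @height)]", // combined conditions
            "//span[@lang='EN-US' and @style]",
            "//*[contains(@style, 'Wingding')]"    // substring test on attribute
        };

        foreach (string sFilterCriteria in criteria)
        {
            Console.WriteLine(sFilterCriteria + " -> " +
                CountMatches(SampleHtml, sFilterCriteria));
        }
    }
}
```

Returning 0 for a null result keeps the caller free of null checks; whether that is appropriate depends on whether "no match" needs to be distinguished from an empty collection in your code.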
Running the filter criteria on an HtmlDocument
Use the code below to get the list of HTML elements in the document that match the filter criteria sFilterCriteria.
Create the HTML document object
// Requires a reference to HtmlAgilityPack and: using HtmlAgilityPack;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml("<html>…………</html>");
Apply the search filter criteria sFilterCriteria
HtmlNodeCollection nc = doc.DocumentNode.SelectNodes(sFilterCriteria);
if (nc != null) // SelectNodes returns null when nothing matches
{
    foreach (HtmlNode node in nc)
    {
        // Logic body
    }
}
Example:
string sFilterCriteria = "//a[@target]";
HtmlNodeCollection nc = doc.DocumentNode.SelectNodes(sFilterCriteria);
if (nc != null)
{
    foreach (HtmlNode node in nc)
    {
        // Logic body
    }
}
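To show what the logic body might contain, here is a minimal end-to-end sketch that collects the href value of every link that has a target attribute. It assumes the HtmlAgilityPack NuGet package is referenced. The class TargetLinkLister, the method GetTargetedHrefs, and the sample markup in Main are hypothetical and serve only as an illustration.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class TargetLinkLister
{
    // Returns the href of every <a> element that carries a target attribute.
    public static List<string> GetTargetedHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();
        string sFilterCriteria = "//a[@target]";
        HtmlNodeCollection nc = doc.DocumentNode.SelectNodes(sFilterCriteria);

        if (nc != null) // null means no element matched
        {
            foreach (HtmlNode node in nc)
            {
                // The second argument is the default used when href is absent.
                hrefs.Add(node.GetAttributeValue("href", string.Empty));
            }
        }
        return hrefs;
    }

    public static void Main()
    {
        // Hypothetical sample markup: only the first link has a target attribute.
        string html =
            "<html><body>" +
            "<a href='one.html' target='_blank'>One</a>" +
            "<a href='two.html'>Two</a>" +
            "</body></html>";

        foreach (string href in GetTargetedHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

Inside the loop, other members of HtmlNode such as Name, InnerText, and OuterHtml can be read in the same way, depending on what the logic body needs.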