
How to find filters

The open and api commands often return a large data set, and it can be hard to tell which filters will isolate the rows you want. For example, if we are trying to get all headlines from a news site, how do we know which filters describe a headline?

A good pattern for finding the filters that describe a particular element is to use the find command to look for a specific piece of text in our data.

For example, if one of the headlines is "Liverpool wins the Champion's League", we can first isolate the elements on the page that contain that text, then pick out the specific row/element that holds the headline. Our query would look something like this:

... || find "Liverpool wins the Champion's League"

Then we can look through that row's columns for values that describe the element. Maybe in this case the row has an attributes.class column where the value is headline and a nodeName column where the value is H2. Our filter would look something like:

... || filter "attributes.class == 'headline' and nodeName == 'H2'"

If we remove the find command and keep only the filter, we will likely get back all headlines, since they should share the same class and node name.
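The find-then-filter pattern can be sketched outside the query language as well. The Python below is only an illustration of the idea, not how the find or filter commands are implemented: the rows, the `find` and `filter_rows` helpers, and the `text` column are all hypothetical stand-ins for a page that has been flattened into rows.

```python
# Hypothetical rows, standing in for page elements flattened into a table.
rows = [
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Local bakery opens second shop"},
    {"nodeName": "A", "attributes.class": "nav-link", "text": "Sport"},
    {"nodeName": "P", "attributes.class": "summary",
     "text": "Liverpool wins the Champion's League after extra time"},
]


def find(rows, needle):
    """Keep rows where any column value contains the needle (stand-in for find)."""
    return [row for row in rows
            if any(needle in str(value) for value in row.values())]


def filter_rows(rows, conditions):
    """Keep rows whose columns equal every value in conditions (stand-in for filter)."""
    return [row for row in rows
            if all(row.get(col) == val for col, val in conditions.items())]


# Step 1: narrow the data to rows that mention a headline we already know.
matches = find(rows, "Liverpool wins the Champion's League")

# Step 2: inspect the matches and pick the row that is the headline itself
# (here the H2, not the summary paragraph), then reuse its column values.
headline_row = matches[0]
conditions = {
    "attributes.class": headline_row["attributes.class"],
    "nodeName": headline_row["nodeName"],
}

# Step 3: apply those values to the full data set, without the text search.
headlines = filter_rows(rows, conditions)
for row in headlines:
    print(row["text"])
```

The text search is only used to discover which column values matter. Once they are known, the filter alone is applied to the whole data set.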

Sometimes it takes a few attempts to find the combination of filters that captures every element you are interested in. For example, the main headline may have a slightly different class or node name, in which case its condition needs to be added to our filter expression with an or.
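As a rough illustration of that iteration, the Python sketch below compares a first attempt with a widened condition. It assumes a hypothetical main headline that uses an H1 with a main-headline class; those values, the rows, and the `text` column are made up for the example and would differ on a real site.

```python
# Hypothetical rows: the main headline is marked up differently from the rest.
rows = [
    {"nodeName": "H1", "attributes.class": "main-headline",
     "text": "Election results announced"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Local bakery opens second shop"},
    {"nodeName": "A", "attributes.class": "nav-link", "text": "Sport"},
]


def is_regular_headline(row):
    """The condition discovered from the first known headline."""
    return row["attributes.class"] == "headline" and row["nodeName"] == "H2"


def is_main_headline(row):
    """A second condition (assumed values) discovered by searching for the main headline."""
    return row["attributes.class"] == "main-headline" and row["nodeName"] == "H1"


# First attempt: misses the main headline.
first_attempt = [row["text"] for row in rows if is_regular_headline(row)]

# Second attempt: the two conditions joined with "or".
second_attempt = [row["text"] for row in rows
                  if is_regular_headline(row) or is_main_headline(row)]

print(len(first_attempt), "headlines found at first")
print(len(second_attempt), "headlines found after adding the or clause")
```

Comparing the result against headlines you can see on the page is a quick way to spot a missing case like this one.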