How to find filters
The api command often generates a large data set, and it can be tricky to work out which filters pick out the rows we want. For example, if we are trying to get all headlines from a news site, how do we know which filters describe a headline?
A good pattern for finding the filters that describe a particular element is to use the find command to look for a specific piece of text in our data.
For example, if one of the headlines is "Liverpool wins the Champion's League", we could first isolate just the elements on the page containing that text, then find the specific row/element containing the headline. Our query would look something like this:
... || find "Liverpool wins the Champion's League"
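To make the idea concrete, here is a minimal Python sketch of what this step does conceptually. It is an analogy, not the tool's implementation: the rows, the "text" column, and the find helper are hypothetical, and it assumes find keeps any row in which some column value contains the search text.

```python
# Hypothetical rows standing in for the flattened page data.
rows = [
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "P", "attributes.class": "summary",
     "text": "A late goal settled the final."},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Markets rally after rate decision"},
]


def find(rows, needle):
    """Keep rows where any column value contains the needle (assumed substring match)."""
    return [
        row for row in rows
        if any(needle in str(value) for value in row.values())
    ]


matches = find(rows, "Liverpool wins the Champion's League")

# Inspect every column of the matching rows to spot values worth filtering on.
for row in matches:
    for column, value in row.items():
        print(f"{column}: {value}")
```

The point of the step is the inspection at the end: once only a handful of rows are left, it is easy to read off which column values distinguish the element.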
Then we can look through the columns of the matching row for values that describe the element. Maybe in this case the row has an attributes.class column where the value is headline and a nodeName column where the value is H2. Our filter would look something like:
... || filter "attributes.class == 'headline' and nodeName == 'H2'"
If we now remove the find command and keep only the filter, we will likely get back all the headlines.
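The same filter can be sketched in Python to show why dropping the text search generalizes the result. This is an illustration under assumptions: the rows and the "text" column are made up, and is_headline simply mirrors the filter expression above.

```python
# Hypothetical rows standing in for the flattened page data.
rows = [
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "P", "attributes.class": "summary",
     "text": "A late goal settled the final."},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Markets rally after rate decision"},
    {"nodeName": "A", "attributes.class": "nav-link",
     "text": "Sport"},
]


def is_headline(row):
    """Mirror of: attributes.class == 'headline' and nodeName == 'H2'."""
    return row.get("attributes.class") == "headline" and row.get("nodeName") == "H2"


# No text search here: the filter alone selects every row shaped like the sample.
headlines = [row["text"] for row in rows if is_headline(row)]
print(headlines)
```

Because the filter describes the shape of the element rather than its text, it matches the headline we searched for and every other row with the same class and node name.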
Sometimes you will need a few attempts to find the combination of filters that captures all the elements you are interested in. For example, the main headline may have slightly different attributes, which then need to be added as an or condition in our filter expression.
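One way to run this trial-and-error loop is to compare each candidate filter against a few headlines you know are on the page. The Python sketch below illustrates that check. Everything in it is hypothetical, including the main-headline class and the H1 node name used for the main headline.

```python
# Hypothetical rows: the main headline uses different markup from the rest.
rows = [
    {"nodeName": "H1", "attributes.class": "main-headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Markets rally after rate decision"},
    {"nodeName": "P", "attributes.class": "summary",
     "text": "A late goal settled the final."},
]

# Headlines we can see on the page and expect the filter to return.
expected = {
    "Liverpool wins the Champion's League",
    "Markets rally after rate decision",
}


def narrow(row):
    """First attempt: attributes.class == 'headline' and nodeName == 'H2'."""
    return row.get("attributes.class") == "headline" and row.get("nodeName") == "H2"


def widened(row):
    """Second attempt: the first filter, or the (assumed) main-headline markup."""
    return narrow(row) or (
        row.get("attributes.class") == "main-headline" and row.get("nodeName") == "H1"
    )


def missed(predicate):
    """Return the expected headlines that the given filter fails to capture."""
    captured = {row["text"] for row in rows if predicate(row)}
    return expected - captured


missed_by_narrow = missed(narrow)
missed_by_widened = missed(widened)
print("narrow filter misses:", missed_by_narrow)
print("widened filter misses:", missed_by_widened)
```

When a filter misses a known headline, running find on that headline's text shows which column values it has, and those values become the extra or branch.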