Create RSS feed from any web page using Yahoo Pipes

In this post, I’m going to walk through a basic example of using Yahoo Pipes to fetch a web page (you are free to use any page you want, assuming it allows Yahoo Pipes) and then create an RSS feed from it so you can read it in your favorite RSS reader.

As an example, I’m going to create an RSS feed from the HorribleSubs website (horriblesubs.org), which I’ve been using (for myself only) to keep track of their Gintama releases easily. (I read that they’re planning a total makeover of their site, so I guess it’s okay to use them as an example.)

Yahoo Pipes HorribleSubs RSS Output example using Fetch Page module

Before anything else, please see the source of the pipe used in this example. You’ll need to be logged in to Yahoo to view it or to create a new pipe.

Update 1: Here’s the updated version of the pipe, which is used for their new domain (horriblesubs.info) and their new site design. The old pipe is left in place in case you want to compare the two, and also because the screenshots used here are based on the old pipe. As you can see, the process itself is still the same, with some adjustments.

Update 2: As of June 2012 it is still working (that was the last time I checked their website; since Gintama ended, I don’t check it anymore). I also noticed that they now publish their own RSS feed, so if you only want this example for their feed, I recommend grabbing the official one instead, because I saw on their page that they’re planning to redesign their website.

Update 3: As of November 2012 the pipe from update 1 is still working (if you need to see a working example), and I have added images to this post.

1. The first thing you need to do is examine the source of the page you’re going to fetch, to see where you should start cutting and how the items are separated.

For example, in this case the content I’m going to pick is wrapped in a div ( <div id="tab3" class="boxcontent"> ) and the items are separated by a <br/> tag, so I just need to enter those into the Fetch Page module.

Target HTML Source

and here is what it looks like on the Yahoo Pipes side

Example of Yahoo Pipes fetch page module
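
If you want to see the idea outside of Yahoo Pipes, here is a minimal Python sketch of what step 1 does: keep only the text between a start and an end marker, then split it into items on a delimiter. This is not how Yahoo Pipes is implemented; the function name cut_and_split, the </div> end marker, and the sample markup are assumptions made up for illustration.

```python
def cut_and_split(html, start, end, delimiter):
    """Keep the text between start and end, then split it into items."""
    begin = html.find(start)
    if begin == -1:
        return []  # start marker not found, nothing to cut
    begin += len(start)
    stop = html.find(end, begin)
    if stop == -1:
        stop = len(html)  # no end marker: keep everything after start
    chunk = html[begin:stop]
    # Drop empty pieces, e.g. the one left after a trailing delimiter.
    return [part.strip() for part in chunk.split(delimiter) if part.strip()]


# Made-up sample markup; the real page source will look different.
SAMPLE_PAGE = (
    '<html><body><div id="tab3" class="boxcontent">'
    '<a href="http://example.com/a.torrent">Show - 01</a><br/>'
    '<a href="http://example.com/b.torrent">Show - 02</a><br/>'
    '</div><div id="footer">ignored</div></body></html>'
)

items = cut_and_split(
    SAMPLE_PAGE, '<div id="tab3" class="boxcontent">', "</div>", "<br/>"
)
for item in items:
    print(item)
```

Each printed line is one item, still containing its HTML, which is why the later steps clean it up.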

2. In this step, I’m stripping unneeded HTML tags and other content that I deemed unnecessary, using the Regex (regular expression) module, as you can see in the pipe source. There’s no single regex rule to rule them all (it depends on your needs), so you’ll need to experiment with the regex parts yourself.

Cleaning up stuff using the regex module
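
As a rough illustration of this kind of cleanup, the sketch below applies a list of regex rules in order, similar in spirit to stacking rules in the Regex module. The rules themselves (dropping img tags, replacing &nbsp;, collapsing whitespace) are hypothetical examples, not the ones used in the pipe, so treat them as a starting point for your own experiments.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs, applied in order.
CLEANUP_RULES = [
    (r"<img\b[^>]*>", ""),  # drop image tags
    (r"&nbsp;", " "),       # turn non-breaking space entities into spaces
    (r"\s+", " "),          # collapse runs of whitespace
]


def clean(content, rules=CLEANUP_RULES):
    """Apply each regex rule to the content, one after another."""
    for pattern, replacement in rules:
        content = re.sub(pattern, replacement, content, flags=re.IGNORECASE)
    return content.strip()


# Made-up input for illustration.
raw = '<img src="icon.png"> <a href="http://example.com/a.torrent">Show&nbsp;-   01</a> '
cleaned = clean(raw)
print(cleaned)
```

The order of the rules matters: whitespace is collapsed last so that spaces introduced by the earlier rules are tidied up too.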

3. Now, so that I can process each item separately, I’m using the Rename module to map the cleaned-up content to title, description, and link, which become the RSS feed item’s title, description, and link respectively.

Mapping items on Yahoo Pipes

and here’s the output

Yahoo Pipes items mapping
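
In code terms, the mapping in step 3 amounts to copying the same cleaned-up content into three named fields, so each can be processed separately afterwards. A tiny sketch, where map_item and the sample content are made up for illustration:

```python
def map_item(content):
    """Copy one piece of cleaned-up content into the three RSS fields."""
    return {"title": content, "description": content, "link": content}


# Made-up sample content for illustration.
contents = [
    '<a href="http://example.com/a.torrent">Show - 01</a>',
    '<a href="http://example.com/b.torrent">Show - 02</a>',
]

mapped = [map_item(content) for content in contents]
for item in mapped:
    print(item)
```

At this point all three fields are identical, which matches the output shown above.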

4. The Regex module is used once again, this time to strip the HTML tags from the title (to differentiate it from the description) and to reduce the link to the target URL only (in this case, the torrent link). That way, clicking the title in your RSS reader takes you directly to the target URL. (Note: compare the output image below with the one above.)

Another example of using regex module to clean up stuff

and here’s the output

Yahoo Pipes mapping cleaned up
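
The same two cleanups can be sketched in Python: one regex removes every tag from the title, and another pulls the href value out of the link. The patterns below are assumptions that fit the made-up sample markup (double-quoted href attributes); the real pipe may use different rules.

```python
import re


def finalize(item):
    """Strip tags from the title and reduce the link to its target URL."""
    title = re.sub(r"<[^>]+>", "", item["title"]).strip()
    match = re.search(r'href="([^"]+)"', item["link"])
    link = match.group(1) if match else ""  # empty if no href was found
    return {"title": title, "description": item["description"], "link": link}


# Made-up sample item, with all three fields still identical.
snippet = '<a href="http://example.com/a.torrent">Show - 01</a>'
item = {"title": snippet, "description": snippet, "link": snippet}

result = finalize(item)
print(result["title"])
print(result["link"])
```

The description is left untouched, so it still carries the original HTML.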

5. Finally, connect it to the Pipe Output module. To get it as RSS, just copy the Get as RSS link from your pipe, and you’re done.

Yahoo Pipes Fetch Page example
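
For comparison, here is a minimal sketch of turning mapped items into an RSS 2.0 document using Python’s standard library. Yahoo Pipes does this for you via Get as RSS; the function build_rss, the feed metadata, and the sample item are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_rss(feed_title, feed_link, feed_description, items):
    """Build a bare-bones RSS 2.0 document from title/link/description items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for item in items:
        node = ET.SubElement(channel, "item")
        for field in ("title", "link", "description"):
            # ElementTree escapes any HTML inside the text automatically.
            ET.SubElement(node, field).text = item[field]
    return ET.tostring(rss, encoding="unicode")


# Made-up feed metadata and item for illustration.
feed_xml = build_rss(
    "Example releases",
    "http://example.com/",
    "Items scraped from an example page",
    [
        {
            "title": "Show - 01",
            "link": "http://example.com/a.torrent",
            "description": '<a href="http://example.com/a.torrent">Show - 01</a>',
        }
    ],
)
print(feed_xml)
```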

Also, keep in mind the usage limits imposed by Yahoo Pipes, quoted below:

200 runs (of a given Pipe) in 10 minutes
200 runs (of any Pipe) from an IP in 10 minutes
If you exceed the 200 runs in a 10 minute block, your Pipe will be 999’ed for a hour.

Because of these limits, you should make sure to cache your output before using it (unless perhaps you’re the only person who uses the pipe you’ve created, though it’s still better to cache it).
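
One simple way to cache is to remember the last result and only fetch again after a fixed time has passed. The sketch below assumes a 10 minute lifetime and a stand-in fake_fetch function in place of a real request to your pipe’s RSS link; the class name TimedCache and both of those choices are made up for illustration.

```python
import time


class TimedCache:
    """Call fetch() at most once per ttl_seconds and reuse the result."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


calls = []


def fake_fetch():
    # Stand-in for downloading the pipe output; records each call.
    calls.append(1)
    return "<rss>...</rss>"


cache = TimedCache(fake_fetch, ttl_seconds=600)  # 600 s = 10 minutes (assumed)
first = cache.get()
second = cache.get()  # served from the cache, no second fetch
print(len(calls))
```

With a cache like this, your visitors hit your cached copy rather than the pipe itself, no matter how many of them there are.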

3 comments on “Create RSS feed from any web page using Yahoo Pipes”

  1. Thanks for the tutorial. FYI there’s a new way to fetch the exact HTML content you need using XPath. It makes things so much easier (and generally more reliable) than trying to extract using the old BEGIN and END fields.