Resource description: A simple, flexible, and extensible RSS and Atom parser for Ruby. Based on the popular SimpleRSS library, but with many nice extra features. (retired)
## DESCRIPTION:
A simple, flexible, and extensible RSS and Atom parser for Ruby. Based on the popular SimpleRSS library, but with many nice extra features.
## FEATURES/PROBLEMS:
* Parse RSS 0.91, 0.92, 1.0, and 2.0
* Parse Atom
* Parse all tags by default, or choose the tags you want to parse
* Access all attributes and content as if they were methods
* Access all values of tags that can appear multiple times
* Delicious syntactic sugar that makes it simple to get the data you want
## SYNOPSIS:
The API is similar to SimpleRSS:

    require 'rubygems'
    require 'feedme'
    require 'open-uri'
    # URI.open is required on Ruby 3.0+, where Kernel#open no longer opens URLs
    rss = FeedMe.parse URI.open('http://slashdot.org/index.rdf')
    rss.version # => 1.0
    rss.channel.title # => "Slashdot"
    rss.channel.link # => "http://slashdot.org/"
    rss.items.first.link # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236&from=rss"
Because the parser reads Atom feeds as easily as RSS feeds, there are aliases that allow more Atom-like access:

    rss.feed.title # => "Slashdot"
    rss.feed.link # => "http://slashdot.org/"
    rss.entries.first.link # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236&from=rss"
Under the covers, all element values are stored in arrays. This means that you can access all content for an element that appears multiple times (e.g. category):

    rss.items.first.category_array # => ["News for Nerds", "Technology"]
    rss.items.first.category # => "News for Nerds"
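The array-backed storage described above can be pictured with a small stand-alone sketch. This is not FeedMe's actual implementation: the `ArrayBackedNode` class and its `add`, `first_value`, and `all_values` methods are hypothetical names chosen for illustration. It only shows why a single-value accessor and an all-values accessor can coexist when every tag maps to an array.

```ruby
# Hypothetical sketch, not FeedMe's actual implementation.
class ArrayBackedNode
  def initialize
    # Every tag maps to an Array, even if the tag appears only once.
    @values = Hash.new { |hash, key| hash[key] = [] }
  end

  # Append a value; repeated tags accumulate in document order.
  def add(tag, value)
    @values[tag.to_sym] << value
    self
  end

  # First value for the tag, or nil if the tag never appeared.
  def first_value(tag)
    @values.fetch(tag.to_sym, []).first
  end

  # Copy of every value recorded for the tag.
  def all_values(tag)
    @values.fetch(tag.to_sym, []).dup
  end
end

item = ArrayBackedNode.new
item.add(:category, "News for Nerds").add(:category, "Technology")
puts item.first_value(:category)
puts item.all_values(:category).inspect
```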
You can access attributes as well as tag values:

    rss.items.first.guid.isPermaLink # => "true"
    rss.items.first.guid.content # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236"
FeedMe also adds some syntactic sugar that makes it easy to get the information you want:

    rss.items.first.category? # => true
    rss.items.first.category_count # => 2
    rss.items.first.guid_value # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236"
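Suffix-based sugar of this kind is commonly built on `method_missing`. The sketch below assumes that approach; it is an illustration only, and the `SugarNode` class is a hypothetical stand-in rather than FeedMe's code. It covers the `?`, `_count`, and `_array` suffixes and leaves out `_value`.

```ruby
# Hypothetical sketch of suffix-based sugar; not FeedMe's actual code.
class SugarNode
  # values maps a tag name (Symbol) to an Array of values.
  def initialize(values)
    @values = values
  end

  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("?")
      # tag?  => does the tag exist at all?
      @values.key?(key.chomp("?").to_sym)
    elsif key.end_with?("_count")
      # tag_count  => how many times the tag appeared
      @values.fetch(key.delete_suffix("_count").to_sym, []).size
    elsif key.end_with?("_array")
      # tag_array  => every value of the tag
      @values.fetch(key.delete_suffix("_array").to_sym, [])
    elsif @values.key?(name)
      # tag  => first value only
      @values[name].first
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.match?(/(\?|_count|_array)\z/) || @values.key?(name) || super
  end
end

item = SugarNode.new(category: ["News for Nerds", "Technology"])
puts item.category?
puts item.category_count
puts item.category
```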
There are two different parsers that you can use, depending on your needs. The default parser is "promiscuous," meaning that it parses all tags. There is also a strict parser that only parses tags specified in a list. Here is how you create the different types of parsers:

    FeedMe.parse(source) # parse using the default (promiscuous) parser
    FeedMe::ParserBuilder.new.parse(source) # equivalent to the previous line
    FeedMe.parse_strict(source)
    FeedMe::StrictParserBuilder.new.parse(source) # only parse certain tags
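The difference between the two modes comes down to whether a tag list filters what is kept. The sketch below shows only that filtering step on already-extracted tag/value pairs, with no XML parsing involved. The `collect_tags` method and its `allowed:` keyword are hypothetical names and not part of FeedMe.

```ruby
# Hypothetical sketch of tag filtering only; not FeedMe's parser.
# pairs is an Array of [tag, value]; allowed is nil (keep everything)
# or an Array of tags to keep.
def collect_tags(pairs, allowed: nil)
  pairs.each_with_object({}) do |(tag, value), result|
    next if allowed && !allowed.include?(tag)
    (result[tag] ||= []) << value
  end
end

pairs = [
  [:title, "Slashdot"],
  [:link, "http://slashdot.org/"],
  [:some_custom_tag, "ignored by the strict mode below"]
]

promiscuous = collect_tags(pairs)                        # keeps every tag
strict      = collect_tags(pairs, allowed: [:title, :link]) # keeps listed tags only

puts promiscuous.keys.inspect
puts strict.keys.inspect
```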
The FeedMe class methods and the parser builder constructors also accept an options hash. Options are also passed on to the Parser constructor. Currently, only two options are available:
1. :empty_string_for_nil => false # return the empty string instead of a nil value
2. :error_on_missing_key => false # raise an error if a specified key or virtual method does not exist (otherwise nil is returned)
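
How the two options might interact on a lookup can be sketched as follows. This is an assumption-laden illustration, not FeedMe's code: the `lookup` method is hypothetical, the use of `KeyError` is an assumption (the source does not say which error is raised), and so is the choice to let `:error_on_missing_key` take precedence when both options are set.

```ruby
# Hypothetical sketch of option handling; not FeedMe's actual code.
def lookup(values, key, options = {})
  if !values.key?(key) && options[:error_on_missing_key]
    # Assumption: the error option wins over the empty-string option.
    raise KeyError, "no such key: #{key}"
  end
  value = values[key]
  # Replace nil with "" only when the caller asked for it.
  value.nil? && options[:empty_string_for_nil] ? "" : value
end

feed = { title: "Slashdot" }

puts lookup(feed, :title)
puts lookup(feed, :author).inspect
puts lookup(feed, :author, empty_string_for_nil: true).inspect

begin
  lookup(feed, :author, error_on_missing_key: true)
rescue KeyError => e
  puts "raised: #{e.message}"
end
```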
The strict parser can be extended by adding new tags to parse:

    builder = FeedMe::StrictParserBuilder.new
    builder.rss_tags << :some_new_tag
    builder.rss_item_tags << :'item+myrel' # parse an item that has a custom rel type
    # parse an extension tag, i.e. one that has a specific namespace
    # (use '_', not ':', to separate namespace from attribute name)
    builder.item_ext_tags << :feedburner_origLink
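The naming rule for extension tags (replace the `:` between namespace and name with `_`) can be shown with a one-line helper. The `tag_key` method is a hypothetical illustration of the convention, not a FeedMe function.

```ruby
# Hypothetical helper illustrating the '_' naming convention only.
def tag_key(qualified_name)
  # "feedburner:origLink" becomes :feedburner_origLink
  qualified_name.tr(":", "_").to_sym
end

puts tag_key("feedburner:origLink").inspect
```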
Either parser can be extended by adding aliases to existing tags:

    # now you can always access the updated date using :updated,
    # regardless of whether it's an RSS or Atom feed
    builder.aliases[:updated] = :pubDate
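One way to picture alias resolution: look the key up directly, and fall back to the aliased tag only when the key itself is absent. The sketch below assumes that fallback order; the `resolve` method and the sample date strings are hypothetical and not taken from FeedMe.

```ruby
# Hypothetical sketch of alias resolution; not FeedMe's actual code.
def resolve(values, key, aliases)
  return values[key] if values.key?(key)   # the tag exists under its own name
  target = aliases[key]                    # otherwise try the aliased tag
  target ? values[target] : nil
end

aliases = { updated: :pubDate }

# Made-up sample data for illustration.
rss_item  = { pubDate: "Mon, 29 Aug 2005 13:19:00 GMT" }
atom_item = { updated: "2005-08-29T13:19:00Z" }

puts resolve(rss_item, :updated, aliases)   # falls back to :pubDate
puts resolve(atom_item, :updated, aliases)  # found directly
```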
If you don't know ahead of time what type of feed you'll be parsing, you can tell FeedMe to always emulate RSS or Atom. These methods just add a bunch of aliases:

    builder.emulate_rss!
    builder.emulate_atom!
Transformations are another bit of syntactic sugar. They are modifications that can be applied to feed content. A default transformation is applied by appending '!' to the tag name:

    rss.entry.content # => Some great stuff
    rss.entry.content! # => Some great stuff
The default transformation can be changed:

    builder.default_transformation = [ :cleanHtml ]
Custom transformations are defined by mapping one or more transformation functions to a suffix:

    builder.transformations['clean'] = [ :cleanHtml ]
    rss.entry.content # => This is a bunch of text
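The idea of mapping a suffix to a list of functions applied in order can be sketched in plain Ruby. Everything here is an assumption made for illustration: `strip_tags` and `squeeze_space` are made-up stand-ins (FeedMe's `:cleanHtml` may behave differently), the regex-based tag stripping is deliberately naive, and `transform` is a hypothetical helper.

```ruby
# Hypothetical sketch of a transformation pipeline; not FeedMe's actual code.
strip_tags    = ->(text) { text.gsub(/<[^>]*>/, "") }      # naive HTML tag removal
squeeze_space = ->(text) { text.gsub(/\s+/, " ").strip }   # collapse whitespace

# A suffix maps to one or more functions, applied left to right.
transformations = { "clean" => [strip_tags, squeeze_space] }

def transform(text, steps)
  steps.reduce(text) { |current, step| step.call(current) }
end

raw = "<p>Some   <b>great</b> stuff</p>"
cleaned = transform(raw, transformations.fetch("clean"))
puts cleaned
```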
