Sunday, February 13, 2011

Parsing through XML elements in XmlReader

Hi all,

I'm building an application that needs to run through an XML feed but I'm having a little trouble with getting certain elements.

I'm using the Twitter feed and want to run through all the <item> elements. I can connect fine and get the content from the feed, but I can't figure out how to select only the item elements when I'm looping through reader.Read().

Thanks for your help!

  • The easiest way to do that is to use XPath. Example to follow.

     string xml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
    <rss version=""2.0"">
        <channel>
        <title>Twitter public timeline</title>
        <link>http://twitter.com/public_timeline</link>
        <description>Twitter updates from everyone!</description>
        <language>en-us</language>
        <ttl>40</ttl>
    
        <item>
          <title>yasu_kobayashi: rTwT: @junm : yayaya</title>
          <description>yasu_kobayashi: rTwT: @junm : yayaya</description>
          <pubDate>Tue, 28 Oct 2008 12:04:48 +0000</pubDate>
          <guid>http://twitter.com/yasu_kobayashi/statuses/978829930</guid>
          <link>http://twitter.com/yasu_kobayashi/statuses/978829930</link>
    
        </item><item>
          <title>FreeGroup: WikiFortio - foobar http://tinyurl.com/5gvttf</title>
          <description>FreeGroup: WikiFortio - foobar
          http://tinyurl.com/5gvttf</description>
          <pubDate>Tue, 28 Oct 2008 12:04:47 +0000</pubDate>
          <guid>http://twitter.com/FreeGroup/statuses/978829929</guid>
          <link>http://twitter.com/FreeGroup/statuses/978829929</link>
    
        </item></channel></rss>
            ";
                // Needs: using System; using System.IO; using System.Xml.XPath;
                XPathDocument doc = new XPathDocument(new StringReader(xml));
                XPathNavigator nav = doc.CreateNavigator();
    
                // Compile a standard XPath expression
    
                XPathExpression expr;
                expr = nav.Compile("/rss/channel/item");
                XPathNodeIterator iterator = nav.Select(expr);
    
                // Iterate on the node set
    
                try
                {
                    while (iterator.MoveNext())
                    {
                        XPathNavigator nav2 = iterator.Current.Clone();
                        nav2.MoveToChild("title","");
                        Console.WriteLine(nav2.Value);
                        nav2.MoveToParent();
                        nav2.MoveToChild("pubDate","");
                        Console.WriteLine(nav2.Value);
    
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
                Console.ReadKey();
    

    And here is a working version of jan's approach:

            XmlDocument doc2 = new XmlDocument();
            doc2.LoadXml(xml);
            XmlNode root = doc2.DocumentElement;
    
            foreach (XmlNode item in root.SelectNodes(@"/rss/channel/item"))
            {
                Console.WriteLine(item.SelectSingleNode("title").FirstChild.Value);
                Console.WriteLine(item.SelectSingleNode("pubDate").FirstChild.Value);
            }
    
  • An alternative:

    // Starts as in Vinko Vrsalovic's answer (same xml string),
    // and does not include decent error handling.
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xml); // XmlDocument has no constructor that takes a reader
    
    foreach (XmlNode item in doc.SelectNodes(@"/rss/channel/item"))
    {
      // Value is null for element nodes; InnerText returns the element's text
      Console.WriteLine(item.SelectSingleNode("title").InnerText);
      Console.WriteLine(item.SelectSingleNode("pubDate").InnerText);
    }
    

    I don't know if this code is slower or bad practice. Please do comment.

    I find it more readable than the other one using Navigator and Iterator.

    Edit: I use an XmlDocument. An XPathDocument, as in Vinko Vrsalovic's answer, doesn't support this way of working, but it is a lot faster (MSDN).

    AnthonyWJones : Unless you have a very large stream, which would make loading the content into a document undesirable, this would be a reasonable approach and a lot more readable than using an XmlReader.
    Vinko Vrsalovic : I like this approach too. But you can't use it with an XPathDocument; you need an XmlNode.
    jan : My mistake! I meant XmlDocument. But since I can imagine a Twitter RSS feed becoming very large, I'd go, in this case, for an implementation using XPathDocument too.
    From jan
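
For completeness, since the question asked about looping with reader.Read(): below is a minimal sketch of a streaming approach that never loads the whole feed. It is not taken from either answer. The class and method names (ItemStreamSketch, ReadItems) and the trimmed sample feed are made up for illustration, and it assumes LINQ to XML (System.Xml.Linq, .NET 3.5 or later) is available. Only one <item> is held in memory at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class ItemStreamSketch
{
    // Hypothetical helper: walks the feed with an XmlReader and returns
    // one (title, pubDate) pair per <item> element.
    public static List<KeyValuePair<string, string>> ReadItems(TextReader input)
    {
        var result = new List<KeyValuePair<string, string>>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    // ReadFrom consumes the whole <item> and leaves the reader on
                    // the node after </item>, so Read() must not be called here;
                    // otherwise an adjacent <item> (as in </item><item>) is skipped.
                    XElement item = (XElement)XNode.ReadFrom(reader);
                    result.Add(new KeyValuePair<string, string>(
                        (string)item.Element("title"),
                        (string)item.Element("pubDate")));
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return result;
    }

    static void Main()
    {
        // Trimmed stand-in for the feed; a real feed would come from a stream.
        string xml = @"<rss version=""2.0""><channel><title>Feed</title>
<item><title>first</title><pubDate>Tue, 28 Oct 2008 12:04:48 +0000</pubDate></item><item><title>second</title><pubDate>Tue, 28 Oct 2008 12:04:47 +0000</pubDate></item>
</channel></rss>";

        foreach (var pair in ReadItems(new StringReader(xml)))
        {
            Console.WriteLine(pair.Key);
            Console.WriteLine(pair.Value);
        }
    }
}
```

The channel's own <title> is ignored because only elements named item are materialized. For a small feed, the XmlDocument or XPathDocument versions above are probably easier to read, as the comments suggest.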
