Parsing is very slow #25

ferrous26 · 2011-09-17T02:20:40Z

There are a few reasons why:

Handsoap::XmlQueryFront::NokogiriDriver#to_s is very inefficient

The method uses a lot of literal strings that are constant and do not need to be modified. Using literal strings means they need to be #duped every time they are used in the method. Also, there are several #gsub calls where #gsub! could be called instead.

There is also a note about Nokogiri APIs being unstable, I'm not sure if this is the case anymore, but I overrode this method to just call #content on the backing Nokogiri node. I have something like this as a solution:

diff --git a/lib/handsoap/xml_query_front.rb b/lib/handsoap/xml_query_front.rb
index 3df435c..742d7e1 100644
--- a/lib/handsoap/xml_query_front.rb
+++ b/lib/handsoap/xml_query_front.rb
@@ -168,9 +168,8 @@ module Handsoap
       # Returns the underlying native element.
       #
       # You shouldn't need to use this, since doing so would void portability.
-      def native_element
-        @element
-      end
+      attr_reader :native_element
+
       # Returns the node name of the current element.
       def node_name
         raise NotImplementedError.new
@@ -350,13 +349,34 @@ module Handsoap
           element = @element.children.first
         end
         return if element.nil?
+        string = element.content
+
         # This looks messy because it is .. Nokogiri's interface is in a flux
         if element.kind_of?(Nokogiri::XML::CDATA)
-          element.serialize(:encoding => 'UTF-8').gsub(/^<!\[CDATA\[/, "").gsub(/\]\]>$/, "")
+          stirng.gsub!(EBEGIN_CDATA, BLANK_STRING)
+          string.gsub!(EEND_CDATA,   BLANK_STRING)
         else
-          element.serialize(:encoding => 'UTF-8').gsub('&lt;', '<').gsub('&gt;', '>').gsub('&quot;', '"').gsub('&apos;', "'").gsub('&amp;', '&')
+          string.gsub!(ELT,   LT)
+          string.gsub!(EGT,   GT)
+          string.gsub!(EQUOT, QUOT)
+          string.gsub!(EAPOS, APOS)
+          string.gsub!(EAMP,  AMP)
         end
-      end
+        string
+      end
+      EBEGIN_CDATA = /^<!\[CDATA\[/
+      EEND_CDATA   = /\]\]>$/
+      BLANK_STRING = ''
+      ELT          = '&lt;'
+      LT           = '<'
+      EGT          = '&gt;'
+      GT           = '>'
+      EQUOT        = '&quot;'
+      QUOT         = '"'
+      EAPOS        = '&apos;'
+      APOS         = "'"
+      EAMP         = '&amp;'
+      AMP          = '&'
     end
   end
 end

All the data transformers use #to_s

This is expensive since calling #to_s is expensive, but even if #to_s is fixed I do not think the other transformers need to unescape the escape sequences, do they?

I don't really have the time to fix this right now and also make sure I don't break the other drivers. :(

Using XPath is not very efficient for large data structures

Rewalking the XML subtree is expensive for big data structures. I'm not sure if this is a problem for Handsoap, but maybe a notice in the documentation should be added.

I have worked around all of these issues in a gem that uses handsoap: http://github.com/Marketcircle/jiraSOAP.

The text was updated successfully, but these errors were encountered:

troelskn · 2011-09-18T20:20:41Z

Hi,

Thanks for this. Most seems like quite good suggestions. Unfortunately I don't really have the time to maintain Handsoap at the moment, as the issue tracker here shows. If you'll provide a patch and a pull request, I'll get it merged into the main branch.

Or if you're interested in taking over maintenance of the library, let me know.

ferrous26 · 2011-09-24T02:27:32Z

Hi,

Unfortunately, I'm overcommitted as it is, so I can't take over maintainership. I will try to cleanup fixes for the points I've raised and then open a pull request in the not too distant future.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parsing is very slow #25

Parsing is very slow #25

ferrous26 commented Sep 17, 2011

troelskn commented Sep 18, 2011

ferrous26 commented Sep 24, 2011

Parsing is very slow #25

Parsing is very slow #25

Comments

ferrous26 commented Sep 17, 2011

troelskn commented Sep 18, 2011

ferrous26 commented Sep 24, 2011