Amazon

Monday, September 12, 2011

Processing a zipper structure

Today I was working on processing some XML that I parsed into a zipper structure. After processing, I needed to turn it back into XML. I know functions exist to do this but I wanted to see what I could come up with. So here's what I wrote:

(defn render-element [open-or-close node]
  (let [slash (if (= open-or-close :open)
                ""
                "/")]
    (if (map? node)
      (str "<" slash (name (:tag node)) ">")
      node)))

(defn traverse [root arrive-fn leave-fn]
  (loop [loc root
         nodes []
         direction :arrive]
    (let [node-fn (if (= direction :arrive) arrive-fn leave-fn)
          new-nodes (conj nodes (node-fn (zip/node loc)))]
      (if (and (= loc root) (= direction :leave))
        new-nodes
        (let [child-loc (zip/down loc)
              right-loc (zip/right loc)]
          (if (or (nil? child-loc) (= direction :leave))
            (if (nil? right-loc)
              (recur (zip/up loc) new-nodes :leave)
              (recur right-loc new-nodes :arrive))
            (recur child-loc new-nodes :arrive)))))))
gist


Let's take a look at this a little bit at a time.

(defn render-element [open-or-close node]
  (let [slash (if (= open-or-close :open)
                ""
                "/")]
    (if (map? node)
      (str "<" slash (name (:tag node)) ">")
      node)))
The render-element is pretty self explanatory. It takes a node and an :open or :close flag. Based on those flags it will return a rendered xml element. The function is not complete but it's enough to get the idea.


(defn traverse [root arrive-fn leave-fn]
The traverse function takes a zipper location and 2 one arg functions. The first renders a node when we first arrive at it and the second renders it when we're leaving.


(loop [loc root
         nodes []
         direction :arrive]
Here I initialize the loop bindings. I set the current location to the current zipper and initialize the nodes list to an empty vector. The nodes list is where we will accumulate everything we process. Finally, I set the direction to :arrive since we're just starting to process the zipper.


(let [node-fn (if (= direction :arrive) arrive-fn leave-fn)
          new-nodes (conj nodes (node-fn (zip/node loc)))]
I tend to use a lot of local bindings just to make things a little cleaner. Here, I create a shortcut to the appropriate render function and I go ahead and add the current node to the node list because I know I'll need it later.


(if (and (= loc root) (= direction :leave))
        new-nodes
The next step is to determine whether we're at the end or not. If the current location is the same as the initial location and the direction is :leave then we know we're done so we just return the list of rendered nodes.


(let [child-loc (zip/down loc)
              right-loc (zip/right loc)]
Again, some more local bindings just to make the remaining code cleaner. I know I'm going to need the leftmost child and the sibling to the right of the current location so I create a binding for those.


(if (or (nil? child-loc) (= direction :leave))
            (if (nil? right-loc)
              (recur (zip/up loc) new-nodes :leave)
              (recur right-loc new-nodes :arrive))
            (recur child-loc new-nodes :arrive)))))))
Ok, here's where we navigate the tree. Let's deal with the else clause first. If there are children to process then we loop, setting the current location to the leftmost child and we set the direction as :arrive. If there aren't any children or we're done processing all of them (indicated by a direction of :leave) then we see if there are any siblings to the right. Again, the else clause causes us to move to the sibling to the right and to loop to process it. If there isn't a sibling to the right then we move back up to the parent and set the direction to :leave indicating that we're all done with this subtree.

This seems to work pretty well. To try it out, try something like:

(traverse my-zip (partial render-element :open) (partial render-element :close))
Comments and/or suggestions are always welcome.

Wednesday, September 7, 2011

Leiningen, Maven and JDK7

A little while ago, I upgraded my machine at work to JDK7.  Everything seemed to be working fine until I needed to download a new artifact from our local Artifactory repository.  I kept getting messages that my connection was timing out.  After some investigation I figured out that JDK7 tries to use IPv6 by default and something on our local network didn't like that.  I found a System property, java.net.preferIPv4Stack that tells Java to try IPv4 first. I setup an environment variable

MAVEN_OPTS=-Djava.net.preferIPv4Stack=true
and everything seemed to work fine.

That is until yesterday when I tried to add a new dependency to a Leiningen project. Apparently, Leiningen doesn't utilize the MAVEN_OPTS variable. Instead, it uses

JAVA_OPTS=-Djava.net.preferIPv4Stack=true
Now everything is working again and life is good!.

Friday, September 2, 2011

Programming Praxis: Two String Exercises

Today's exercise has two parts. First, replace multiple occurrences of a character in a string with just the first occurrence. So, aaabbb becomes ab and abcbd becomes abcd.

The second part is to replace multiple consecutive spaces with one space. In this case a b stays unchanged while a   b becomes a b.

These kinds of problems are rather trivial in Clojure due to the rich set of functions you get out of the box.

;;Remove dups & mult spaces
(def s1 "aaabbb")
(def s2 "abcbd")
(def s3 "abcd")
(def s4 "a b");
(def s5 "a  bb    x b");

(defn remove-dup-chars [s]
  (apply str (distinct s)))

(defn remove-consecutive-spaces [s]
  (apply str (mapcat (fn [l] (if (= (first l) \space)
                               (distinct l)
                               l))
                     (partition-by
                      (partial = \space)
                      s))))

In the first case, we use distinct to remove the duplicates and the apply str to turn it back into a string.

In the second case, we partition the string breaking on strings. Then we mapcat using our function that looks at the sublists and removes duplicate spaces. Since mapcat flattens our sublists into 1 list we can just use apply str on the result to turn it back into a string.

gist