Amazon

Monday, September 12, 2011

Processing a zipper structure

Today I was working on processing some XML that I parsed into a zipper structure. After processing, I needed to turn it back into XML. I know functions exist to do this but I wanted to see what I could come up with. So here's what I wrote:

(defn render-element [open-or-close node]
  (let [slash (if (= open-or-close :open)
                ""
                "/")]
    (if (map? node)
      (str "<" slash (name (:tag node)) ">")
      node)))

(defn traverse [root arrive-fn leave-fn]
  (loop [loc root
         nodes []
         direction :arrive]
    (let [node-fn (if (= direction :arrive) arrive-fn leave-fn)
          new-nodes (conj nodes (node-fn (zip/node loc)))]
      (if (and (= loc root) (= direction :leave))
        new-nodes
        (let [child-loc (zip/down loc)
              right-loc (zip/right loc)]
          (if (or (nil? child-loc) (= direction :leave))
            (if (nil? right-loc)
              (recur (zip/up loc) new-nodes :leave)
              (recur right-loc new-nodes :arrive))
            (recur child-loc new-nodes :arrive)))))))
gist


Let's take a look at this a little bit at a time.

(defn render-element [open-or-close node]
  (let [slash (if (= open-or-close :open)
                ""
                "/")]
    (if (map? node)
      (str "<" slash (name (:tag node)) ">")
      node)))
The render-element is pretty self explanatory. It takes a node and an :open or :close flag. Based on those flags it will return a rendered xml element. The function is not complete but it's enough to get the idea.


(defn traverse [root arrive-fn leave-fn]
The traverse function takes a zipper location and 2 one arg functions. The first renders a node when we first arrive at it and the second renders it when we're leaving.


(loop [loc root
         nodes []
         direction :arrive]
Here I initialize the loop bindings. I set the current location to the current zipper and initialize the nodes list to an empty vector. The nodes list is where we will accumulate everything we process. Finally, I set the direction to :arrive since we're just starting to process the zipper.


(let [node-fn (if (= direction :arrive) arrive-fn leave-fn)
          new-nodes (conj nodes (node-fn (zip/node loc)))]
I tend to use a lot of local bindings just to make things a little cleaner. Here, I create a shortcut to the appropriate render function and I go ahead and add the current node to the node list because I know I'll need it later.


(if (and (= loc root) (= direction :leave))
        new-nodes
The next step is to determine whether we're at the end or not. If the current location is the same as the initial location and the direction is :leave then we know we're done so we just return the list of rendered nodes.


(let [child-loc (zip/down loc)
              right-loc (zip/right loc)]
Again, some more local bindings just to make the remaining code cleaner. I know I'm going to need the leftmost child and the sibling to the right of the current location so I create a binding for those.


(if (or (nil? child-loc) (= direction :leave))
            (if (nil? right-loc)
              (recur (zip/up loc) new-nodes :leave)
              (recur right-loc new-nodes :arrive))
            (recur child-loc new-nodes :arrive)))))))
Ok, here's where we navigate the tree. Let's deal with the else clause first. If there are children to process then we loop, setting the current location to the leftmost child and we set the direction as :arrive. If there aren't any children or we're done processing all of them (indicated by a direction of :leave) then we see if there are any siblings to the right. Again, the else clause causes us to move to the sibling to the right and to loop to process it. If there isn't a sibling to the right then we move back up to the parent and set the direction to :leave indicating that we're all done with this subtree.

This seems to work pretty well. To try it out, try something like:

(traverse my-zip (partial render-element :open) (partial render-element :close))
Comments and/or suggestions are always welcome.

Wednesday, September 7, 2011

Leiningen, Maven and JDK7

A little while ago, I upgraded my machine at work to JDK7.  Everything seemed to be working fine until I needed to download a new artifact from our local Artifactory repository.  I kept getting messages that my connection was timing out.  After some investigation I figured out that JDK7 tries to use IPv6 by default and something on our local network didn't like that.  I found a System property, java.net.preferIPv4Stack that tells Java to try IPv4 first. I setup an environment variable

MAVEN_OPTS=-Djava.net.preferIPv4Stack=true
and everything seemed to work fine.

That is until yesterday when I tried to add a new dependency to a Leiningen project. Apparently, Leiningen doesn't utilize the MAVEN_OPTS variable. Instead, it uses

JAVA_OPTS=-Djava.net.preferIPv4Stack=true
Now everything is working again and life is good!.

Friday, September 2, 2011

Programming Praxis: Two String Exercises

Today's exercise has two parts. First, replace multiple occurrences of a character in a string with just the first occurrence. So, aaabbb becomes ab and abcbd becomes abcd.

The second part is to replace multiple consecutive spaces with one space. In this case a b stays unchanged while a   b becomes a b.

These kinds of problems are rather trivial in Clojure due to the rich set of functions you get out of the box.

;;Remove dups & mult spaces
(def s1 "aaabbb")
(def s2 "abcbd")
(def s3 "abcd")
(def s4 "a b");
(def s5 "a  bb    x b");

(defn remove-dup-chars [s]
  (apply str (distinct s)))

(defn remove-consecutive-spaces [s]
  (apply str (mapcat (fn [l] (if (= (first l) \space)
                               (distinct l)
                               l))
                     (partition-by
                      (partial = \space)
                      s))))

In the first case, we use distinct to remove the duplicates and the apply str to turn it back into a string.

In the second case, we partition the string breaking on strings. Then we mapcat using our function that looks at the sublists and removes duplicate spaces. Since mapcat flattens our sublists into 1 list we can just use apply str on the result to turn it back into a string.

gist

Tuesday, August 30, 2011

Programming Praxis: Hamming Numbers

Today's problem is to generate the first 1000 numbers in the sequence of Hamming numbers. More information on Hamming numbers can be found at the Programming Praxis blog post.

When I first saw this problem my mind jumped right to an imperative way of solving it. Start building up a list of numbers, keep checking to see if I had a 1000, and then print the list. Although this would work, it didn't seem very Clojureish.

That's when I remebered lazy sequences. A lazy sequence will generate the numbers as I need them. Then I can just ask for a 1000 and it will generate a 1000 numbers (maybe a few more, but we'll never see these). Here's the code:

;;Hamming Numbers

(defn hamming-generator
  ([] (hamming-generator (sorted-set 1)))
  ([some-set] (let [n (first some-set)
                    s (disj some-set n)]
                (cons n
                      (lazy-seq (hamming-generator
                                 (conj s (* n 2) (* n 3) (* n 5))))))))

(def hamming-numbers (take 1000 (hamming-generator)))

The first line in the function is there to seed the function with a "1". The real meat of the function goes like this: I bind the first value in the sorted set to n and then I remove n from the sorted set and bind the new set to s. From there I create a seq with n at the front and a lazy-seq representing the future values at the end. I don't get a stack overflow with the recursive call because lazy-seq only calls hamming-generator when a new number is needed. lazy-seq caches the values from the previous calls to hamming-generator and passes the last generated set to hamming-generator to get the next value in the sequence. Nice and elegant and it does all the book keeping work for you!

gist

Friday, August 26, 2011

Programming Praxis: Reverse sublists

Today's problem should be especially easy in clojure given clojure's partition functions. Let's see if I'm right.

First we create a linked list using the native java class java.util.LinkedList. What's really neat here is that clojure's sequence functions all work on java's native collection classes. So the solution is super easy at that point. Partition the list into k elements and then mapcat that list using reverse.

;;Reverse k nodes

(def linked-list (java.util.LinkedList. [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]))

(defn reverse-sub-lists [l k]
  (mapcat reverse (partition-all k l)))

The more I use clojure the more I like it!

gist

Tuesday, August 23, 2011

Programming Praxis: Hett's problem 1.28

Today's Programming Praxis problem didn't really interest me so I decided to go back and do an older one. The problem I'm going to work on today is one of sorting lists of lists. So, given a list that contains other lists you must sort it two ways. First, by the length of each inner list and second by the frequency of that length occurring. See the original post for an example. Given clojure's sort functions this shouldn't be too hard.

The first one is especially easy:

;;Hett's Problem 1.28

(def test-list [[:a :b :c] [:d :e] [:f :g :h] [:d :e] [:i :j :k :l] [:m :n] [:o]])

(defn sort-by-length [some-list]
  (sort-by count some-list))
Just use the sort-by function passing it the count function. So for each inner list the sort-by receives it will sort by the value returned from applying count to that inner list.

The second sort is a little harder but not by much:

(defn sort-by-length-freq [some-list]
  (let [freq-map (frequencies (map count test-list))]
    (sort-by (fn [l] (freq-map (count l))) some-list)))
First we build a map that associates the count of each inner list with the number of times it occurs. So a count of 3 occurs twice, count of 2 occurs 3 times, count of 4 occurs once, and a count of 1 occurs once. We then use that to give sort-by a function that looks up the occurrences based on the count.

gist

Friday, August 19, 2011

Programming Praxis: First Non-Repeating Character

UPDATED based on comments. Thanks guys!

Today's problem is to find the first non-repeating character in a string. If there are no non-repeating characters then we should indicate that also.

Here's how I broke down the problem. First I decided to group the characters in the string by their identity and index them based on the character's position in the string. For the string "abbc" this would produce a map like this:

{0 [\a] 1 [\b \b] 3 [\c]}
From there, I sort based on the index and drop all the pairs whose second element has a count > 1 until I reach my answer.
;;first non-repeating character
(def test-str "aabcbcdeef")

(defn find-non-repeat [some-str]
  (let [x (first
           (drop-while (fn [x] (> (count (last x)) 1))
                       (sort (group-by (fn [c] (.indexOf some-str (str c)))
                                       some-str))))]
    (if (nil? x)
      "No non-repeating characters"
      (str (first (last x))))))

gist

Tuesday, August 16, 2011

Programming Praxis: Cyclic Sorted List

Today's problem is to insert an element into the correct place in a cyclic sorted list. A cyclic sorted list is an ordered list whose last element points to the first element. One thing to keep in mind is that you may be given any node in the list to start with. The problem doesn't say it but I'm going to assume that this list is singly linked.

The first thing I need to figure out is how to represent a cyclic sorted list in Clojure. I've decided to represent the list as a series of nodes. Each node is an atom that points to a two element vector. The first element is the value of the node and the second element is a function that returns the next node. I tried to make the second element an atom but this gave me a stack overflow error.

After that I just need functions to find the insert spot and to insert the value.

;;cyclic list insertion

(def start-node (atom [1 (fn [] start-node)]))

(defn insert-after [node val]
  (let [new-node (atom [val (last @node)])]
    (swap! node (fn [n] [(first n) (fn [] new-node)])))) 

(defn find-insert-spot [node val]
  (loop [cnode node
         ctr 0]
    (let [next-node ((last @cnode))
          cval (first @cnode)
          nval (first @next-node)]
      (if (= next-node node)
        cnode
        (if (or (<= cval val nval)
                (and (> cval nval) (or (>= val cval)
                                       (<= val nval))))
          cnode
          (recur next-node (inc ctr)))))))

(defn insert-val [node val]
  (insert-after (find-insert-spot node val) val))

(defn get-nth-after [start cnt]
  (loop [cnode start i cnt]
    (if (= i 0)
      cnode
      (recur ((last @cnode)) (dec i)))))

(defn ring-vals [start]
  (loop [cnode start
         acc []]
    (let [next-node ((last @cnode))
          cval (first @cnode)]
      (if (= next-node start)
        (conj acc cval)
        (recur next-node (conj acc cval))))))
gist

Friday, August 12, 2011

Programming Praxis: Word Breaks

Here's my solution to Programming Praxis: Word Breaks
;; word-breaks
(def dict #{"a" "apple" "applepie" "base" "baseball"
            "bat" "gift" "i" "pi" "pie" "twist"})

(defn permute [ss]
  (distinct 
   (mapcat identity 
           (let [cnt (count ss)
                 coll (take cnt (iterate rest ss))]
             (for [n (range 1 (inc cnt))]
               (mapcat (fn [x]
                         (map (fn [y] (apply str y))
                              (partition n x)))
                       coll))))))

(defn find-word-bounds [ss]
  (filter dict (permute ss)))
gist