We all make mistakes. We have all written bad pieces of code at some point of our lives. They might have been non-idiomatic, unreadable, or had bad performance.
This post catalogs all the little things which sit down at the bottom of your code, accumulate like dust, and when taken in as a whole, clog down the code base. It's important to remember code is read more than it is edited, and is not only a way for us to communicate with the computer, but with other programmers (including ourselves in the future).
Hopefully the advice in this post will help you recognize all those little code smells and write more readable, maintainable code.
Happy hacking.
Mapping
mapcat
To The Rescue
Sometimes you need to concat the results of mapping. Using mapcat
is an idiomatic option for this case:
;;; Don't Do (apply concat (map f xs)) => (mapcat f xs)
What Is It For?
Avoid trivial for
(list comprehensions)
;;; Don't Do (for [x xs] (f x)) => (map f xs)
And yes, I've seen a combination of both:
;;; Please Don't (apply concat (for [x xs] (f x)))
Trivial Lambda
There's no reason to wrap a function in an anonymous function:
;;; Don't Do (map #(f %) xs) => (map f xs)
Realize When Collections Want To Be Associative
If you find yourself scanning collections of maps looking for a map where a certain key has a certain value, your collection might be telling you it wants to be associative, not sequential:
(def people [{:person/name "Fred"} {:person/name "Ethel"} {:person/name "Lucy"}]) (defn person-in-people? [person people] (some #(= person (:person/name %)) people)) (person-in-people? "Fred" people);; => true (person-in-people? "Bob" people) ;; => nil
Instead, index the collection according to the parameter which interests you.
With group-by
:
(def collected-people (group-by :person/name people)) (contains? collected-people "Fred");; => true (contains? collected-people "Bob") ;; => false
With clojure.set/index
:
(def collected-people (clojure.set/index people [:person/name])) (contains? collected-people {:person/name "Fred"});; => true (contains? collected-people {:person/name "Bob"}) ;; => false
This code is readable: "Does collected-people
contain a person
who's name is Bob?".
An important distinction between group-by and index is that index's behavior is undefined when not dealing with sets. (Thanks to Jozef Wagner for pointing this out). If you're not working with sets, you might lose information!
If you repeatedly filter according to certain attributes, you might
want to index according to them, too, with group-by
or
clojure.set/index
.
An important distinction between group-by and index, is that group-by indexes according to a function, so you can presumably bin values as well, for example, by age or height, while index works specifically with keys.
Filtering
This problem is similar to a trivial lambda, just wrapped in negation.
Remember remove
is the negated case of filter
:
;;; Don't Do (filter #(not (p %)) xs) => (remove p xs) (remove #(not (p %)) xs) => (filter p xs)
Sequence navigation
Sometimes you need to dig into elements in a sequence. Plenty of users aren't familiar with the concise versions:
;;; Don't Do (first (first xs)) => (ffirst xs) (first (next xs)) => (fnext xs) or (second xs) (next (first xs)) => (nfirst xs) (next (next xs)) => (nnext xs)
Sort of like cadr
but makes more sense.
Emptiness
The concept of emptiness is tricky in Clojure. Did you know nil
is
also empty? What defines an empty sequence? How can we know if a lazy
unrealized sequence is empty? Do we count
it? What if it's infinite?
To begin with, let's look at what empty?
actually does:
(defn empty? "Returns true if coll has no items - same as (not (seq coll)). Please use the idiom (seq x) rather than (not (empty? x))" [coll] (not (seq coll)))
If you have eyes to see, you notice the docstring and know exactly where this is going:
Don't use (not (empty? x))
!
You've probably seen or written at least one of the following examples:
;;; Don't Do (when (not (empty? x)) ...) => (when (seq x) ...) (when-not (empty? x) ...) => (when (seq x) ...) (when (= 0 (count x)) ...) => (when (empty? x) ...) (when (< 0 (count x)) ...) => (when (seq x) ...)
Into into
into
is a pretty useful function, but one often abused.
The (mis)usage of into
can usually be broken to three distinct
cases:
Type Transformations
;;; Don't Do (into [] xs) => (vec xs) (into #{} xs) => (set xs)
Map Mapping
;;; Don't (into {} (map (fn [[k v]] [k (f v)]) m)) (into {} (for [[k v] m] [k (f v)])) ;;; Do (reduce-kv (fn [m k v] (assoc m k (f v))) {} m) ;;; Or faster* but less pretty (persistent! (reduce-kv (fn [m k v] (assoc! m k (f v))) (transient {}) m))
Don't write it out manually every time. Turn it into a function and throw it into one of half-dozen util namespaces.
NOTE: the benefits from the transient version increase with the size of the input. For very small maps, the non-transient version is faster.
Not Using The Transducer API
;;; Don't Do (into coll (map f xs)) => (into coll (map f) xs) (into coll (filter p xs)) => (into coll (filter p) xs)
Working With Maps
The Value Of Nothing
Clojure maps are collections, not slots. Combined with nil
's
meaning being "nothing", nil
values inside maps are confusing:
(get {:a nil} :a) ;; => nil (get {:b 111} :a) ;; => nil
Try to avoid inserting nil
values into a map.
If it is not a map you control, you can always prune it.
Both implementations are valid:
(defn assoc-some [m k v] (if (nil? v) m (assoc m k v))) (defn prune-nils [m] (reduce-kv assoc-some {} m)) (prune-nils {:a 1 :b nil :c 2 :d nil}) ;; => {:a 1, :c 2} (defn remove-nil [m k v] (if (nil? v) (dissoc m k) m)) (defn prune-nils [m] (reduce-kv remove-nil m m)) (prune-nils {:a 1 :b nil :c 2 :d nil}) ;; => {:a 1, :c 2}
Beware of merge
Where Performance Matters
merge
is clear and easy to work with, but has terrible performance.
If you're more interested in the performance side of things, watch this talk.
Keep in mind while reading the following sections that using merge
and select-keys
comes at a price. While the general advice is to
avoid doing things manually, every rule has an exception.
Be aware there are alternatives as well, but merge
still performs
poorly in general.
Thanks to Borkdude for highlighting this issue quickly.
Avoid Manual Merge
The below piece of code conditionally merges m2
into m1
when the
values in m2
are not nil
:
(let [m1 {} m1 (if (nil? (:k m2)) m1 (assoc m1 :k (:k m2)))])
Imagine it being done for every key in m2
. There can be 20 Keys.
Who can even make sense of the important parts of the code
afterwards?
Instead, combine nil pruning with merge, as two steps of understandable data transformations:
(merge m1 (prune-nils m2))
Avoid Manual Key Selection
I see this usually going hand-in-hand with manual merges:
{:a (:a m1) :b (:b m2)}
Again, this can usually involve pretty big maps. Instead, try to:
(merge (select-keys m1 [:a]) (select-keys m2 [:b]))
Conditional Build-Up
I often see this pattern repeating itself:
(defn foo [in] (let [m {:k0 (f0 in)} ;; mandatory m (if (p1 in) (assoc m :k1 (f1 in)) m) ;; optional m (if (p2 in) (assoc m :k2 (f2 in)) m)] m))
Instead, use cond->
:
(defn foo [in] (cond-> {:k0 (f0 in)} ;; mandatory (p1 in) (assoc :k1 (f1 in)) ;; optional (p2 in) (assoc :k2 (f2 in))))
This way, flow control turns into syntax and there's no state to keep track of.
Numbers!
Clojure has functions covering some common use cases when working with numbers which both perform and convey intent better.
Absolute Zero
;;; Don't Do (= 0 x) => (zero? x) (> x 0) => (pos? x) (< x 0) => (neg? x)
Small warning regarding zero?
: It expects its argument to be a
number and will throw otherwise. Checking for equality to 0 is more
permissive.
One Away
;;; Don't Do (+ 1 x) => (inc x) (- 1 x) => (dec x)
Truth Be Told
Same case with numbers, no need to compare to booleans and nil.
;;; Don't Do (= true x) => (true? x) (= false x) => (false? x) (= nil x) => (nil? x)
doall
doall
is a macro which forcefully realizes lazy sequences. It should
not be used in production.
See my previous posts regarding alternative to doall
in a single
threaded and a multi-threaded context.
on a side note, I hope you never see something like: (doall (doseq [x xs] ..))
.
It's wrong on two levels
doseq
is already strictdoseq
returns nothing, so there's nothing todoall
Style
Implicit do
blocks
Some expressions have implicit do
blocks in them, making it
unnecessary to use a do
block.
when
;;; Don't (when test (do expr1 expr2)) ;;; Do (when test expr1 expr2)
let
;;; Don't (let bindings (do expr1 expr2)) ;;; Do (let bindings expr1 expr2)
Function body
;;; Don't (fn [] (do expr1 expr2)) ;;; Do (fn [] expr1 expr2)
More
Same goes for try
, catch
, and any other macro with a type
signature & body
.
Threading
Avoid trivial threading:
(-> x f) => (f x) (-> x (f a)) => (f x a)
And remember to thread with style:
->
is for collections->>
is for sequences
Thread Instead Of A Deep Call Stack
Deep call stacks tie implementations together and make testing more difficult as the input space becomes larger instead of smaller by successive function calls.
They take atomic and understandable data transformation functions and make them opaque.
;;; Don't (defn h [x] ...) (defn g [x] ... (h x)) (defn f [x] ... (g x)) ;;; Do (-> x f g h)
some->
Nested when/let can be converted to some->
;;; Don't (when-let [x1 (f0 x0)] (when-let [x2 (f1 x1)] (when-let [x3 (f2 x2)] (f3 x3)))) ;;; Do (some-> x0 f0 f1 f2 f3)
Same goes for this structure:
;;; Don't (let [x (if (nil? x) nil (x f0)) x (if (nil? x) nil (x f1)) x (if (nil? x) nil (x f2))] (if (nil? x) nil (f3 x))) ;;; Do (some-> x0 f0 f1 f2 f3)
cond->
(let [x (if (p0 x) (f0 x) x) x (if (p1 x) (f1 x) x)] (if (p2 x) (f2 x) x))
(cond-> x (p0 x) f0 (p1 x) f1 (p2 x) f2)
The only subtle difference to keep in mind that the predicates can't depend on the result of the previous computation stage.
We can achieve that behavior with a slight modification to cond->
:
(defmacro cond*-> [expr & clauses] (assert (even? (count clauses))) (let [g (gensym) steps (map (fn [[test step]] `(if (-> ~g ~test) (-> ~g ~step) ~g)) (partition 2 clauses))] `(let [~g ~expr ~@(interleave (repeat g) (butlast steps))] ~(if (empty? steps) g (last steps))))) (cond*-> x p0 f0 p1 f1 p2 f2)
let
Shadowing
There's usually very little reason to shadow a binding inside a let
body:
(let [x (f0 x) x (f1 x) x (f2 x)] (f3 x))
Should be:
(-> x f0 f1 f2 f3)
Nested Forms
Plenty of macros with binding forms don't need to be nested:
;;; Don't (let [x 1] (let [y 2] [x y])) ;; Do (let [x 1 y 2] [x y])
A special case is that of nested iteration:
;;; Don't (doseq [x xs] (doseq [y ys] (println x y))) ;;; Do (doseq [x xs y ys] (println x y))
Trivial Mapping Wrappers
We can sometimes find the following form in code, of do-one-thing then do-one-thing-n-times:
(defn build-person [x] ...) (defn build-persons [xs] (map build-person xs))
It may not be immediately apparent why it's a code smell.
- It ties implementation to concretion. Now the entire code will be in the context of a sequence of things and not thinking about one thing at a time. We have to mentally jump back and forth between the collection of things and one thing.
- Fragile: we might end up adding more logic to
build-persons
, which makes us nominative liars, since the name doesn't match what it does anymore. This change now won't be reflected across the system by anyone using only the singular function. - Composes badly: had we wanted to compose several singular
functions, we could have just used a threading macro or
comp
and be done with it. To compose for a sequential use case we could have used transducers. Now we are tied to a sequential concretion. - Less reusable: due to all of the above, once a singular function has been dragged to the plural context, we decrease the chances of being able to reuse the singular from one side, and tie the plural down with further details which prevent usage in other contexts, or would necessitate ugly hacks, like mapping it on a sequence with one element.
Positional Arguments
Avoid An Explosion In Input Arguments
If your functions take a very large number of arguments, suspect they should be broken apart, because it's unlikely all arguments are used simultaneously.
(defn f [fizz buzz foo bar quux ohgod enough])
Prefer A map To Key-Value Rest Arguments
(defn f ([x y] (f x y :a 1 :b 2)) ([x y & {a :a b :b}] ,,,)) (f x y :a 3 :b 4) ;;; vs. (defn f ([x y] (f x y {:a 1 :b 2})) ([x y {a :a b :b}] ,,,)) (f x y {:a 3 :b 4})
The main reason for this is when you wish to pass the function
around as a higher order function you need to be mindful of the type
of the tail arguments, make sure they always get unpacked, and
remember to usually apply
everywhere. This is easily avoided by
using the second style.
Prefer Returning Maps Over Returning Positional Values
Using positional return values encodes meaning to indices, giving semantic or business meaning to indices/ordering. It's better to encode that meaning as explicit keywords:
(defn sieve [p xs] [(filter p xs) (remove p xs)]) (first (sieve even? (range 9))) ;; => (0 2 4 6 8) ;;; vs. (defn sieve [p xs] {:true (filter p xs) :false (remove p xs)}) (:true (sieve even? (range 9))) ;; => (0 2 4 6 8)
Afterword
This post and my advice are clearly opinionated. If you disagree with anything I've said, found a mistake, or think I missed important bits, feel free to shout at me on any of the social media platforms I'm on.
Translation
Korean - Thank you Lee Won Jun!