login

Dzen published 2 weeks ago, tagged with arch, bash, linux

Here’s for a small change of pace…

I’d like to talk about a tool I’ve all but forgotten I’m even using (and that’s a compliment to its stability and unobtrusiveness).

dzen is a great little application from the folks at suckless. It’s one of those do one thing and do it well types of tools. It’s probably not useful at all for anyone with a bloated –ahem, excuse me– featureful desktop environment or window manager (or both).

In my case, I’m using just XMonad with its beautiful simplicity. This means, of course, that there’s no out-of-the box… anything.

I’ve already covered some of this from an XMonad perspective, so this post is more about dzen’s general usefulness.

Volume

First up, a small visual notification when I adjust my volume:

ossvol screenshot

It fades in (implicitly thanks to xcompmgr) for just a second when I adjust my volume and gives me that nice, unobtrusive indication of the volume level.

The actual volume adjustment can be done in many alsa or oss specific ways; for my implementation, just see the script as it is live. Completely separate of that, however, we can just use dzen to show the notification:

level=$(get_it_from_alsa_or_oss)

# we use a fifo to buffer the repeated commands that are common with 
# volume adjustment
pipe='/tmp/volpipe'

# define some arguments passed to dzen to determine size and color.
dzen_args=( -tw 200 -h 25 -x 50 -y 50 -bg '#101010' )

# similarly for gdbar
gdbar_args=( -w 180 -h 7 -fg '#606060' -bg '#404040' )

# spawn dzen reading from the pipe (unless it's in mid-action already).
if [[ ! -e "$pipe" ]]; then
  mkfifo "$pipe"
  (dzen2 "${dzen_args[@]}" < "$pipe"; rm -f "$pipe") &
fi

# send the text to the fifo (and eventually to dzen). oss reports 
# something like "15.5" on a scale from 0 to 25 so we strip the decimals 
# and send gdbar an optional "upper limit" argument
(echo ${level/.*/} 25 | gdbar "${gdbar_args[@]}"; sleep 1) >> "$pipe"

Pretty easy, and about as light-weight as you can get.

Status bar

Little known fact: you can use the ubiquitous conky to feed a simple statusbar via dzen. This means you can also use dzen escapes in your TEXT block to do cool things:

dzen screenshot

My statusbar has the following “features”

  • Shows CPU/Mem/Network
  • The time, of course
  • Shows “Now playing” information from MPD
  • The music state (playing/paused) can be clicked to toggle it
  • The track title, when clicked, will advance

And here’s the conkyrc to achieve it:

background no
out_to_console yes
out_to_x no
override_utf8_locale yes
update_interval 1
total_run_times 0
mpd_host 192.168.0.5
mpd_port 6600

TEXT
[ ^ca(1, mpc toggle)${mpd_status}^ca()

  ${if_mpd_playing}- ${mpd_elapsed}/${mpd_length}$endif ]

  ^fg(\#909090)^ca(1, mpc next)${mpd_title}^ca()^fg() by

  ^fg(\#909090)${mpd_artist}^fg() from

  ^fg(\#909090)${mpd_album}^fg()

  Cpu: ^fg(\#909090)${cpu}%^fg()

  Mem: ^fg(\#909090)${memperc}%^fg()

  Net: ^fg(\#909090)${downspeedf eth0} / ${upspeedf eth0}^fg()

  ${time %a %b %d %H:%M}
Line breaks added for clarity.

The most interesting part is the clickable areas: ^ca( ... )some text^ca() defines an area of “some text” that can be clicked. The two arguments inside the first parens are are “which mouse button” and “what command to run”. Pretty simple and damn convenient.

Then all you’ve got to do is call this from your startup script:

$ conky -c ~/path/to/that | dzen2 -p -other -args

The -p option just means “persist” so the dzen will never close.

Wrap-up

This was just two examples of some uses for a simple “pipe some text in and see it” GUI toolkit – there are plenty others.

This echoes one of the great things about open-source: something like this is so small, so simple, it could never have survived marketing meetings, planning sessions or cost-benefit analyses – but here it is, and I find it oh-so-very-useful.

Git Submodule Config published 3 weeks ago, tagged with git, linux

Git submodules are pretty great. They allow you to have nested git repositories so that modular parts of your app can exist as separate repos but be worked with as one file tree. Another benefit is that when submodules are pushed to github they appear simply as links to the repos they represent.

If your not familiar, go ahead and google then come back – how submodules work overall is not the point of this post.

One of the ways I use submodules is to take modular pieces of my dotfiles repo and separate them out into single purpose, independently clonable repos for oh-my-zsh, vim and screen. A level down, inside the vim submodule itself, I use additional submodules in accordance with tpope’s awesome pathogen plugin to bundle the various vim plugins I use. At both of these levels there exist submodules of which I am the author and an active developer.

When I work on these submodules, I like to do so from within the parent repo (vs independently in some other directory). This is especially important in vim so that I can test out my changes immediately. If I didn’t do this, I would have to hack on the submodule, commit, push, go into the vim repo’s copy and pull – all before seeing the affects (Bret Victor would not be very happy with that workflow).

What this means is the submodule must be added with a pushable remote. And since I like to push using ssh keys and not enter my github credentials each time, I use the git@github url for that. Problem is, when someone wants to clone my repo (that’s what it’s there for), they’re unable to recursively clone those submodules because they don’t have access to them using that url. For that to work, I would’ve had to have added the submodules using the https protocol which allows anonymous checkouts.

As it turns out, due to the unexpected (but perfectly reasonable) behavior of a git submodule add command, I can actually have my cake and eat it too.

You see, when you do a git submodule add <url> <directory>, it writes that url to .gitmodules. This is the file and url that’s used when you clone this repo anywhere else and init the submodules within. But this is not the url that’s used when you actually try to push or pull from within the submodules!

In addition to .gitmodules, the url of the remote also gets written into the .git/config of the submodule as the origin remote (this is just normal clone behavior). This is the url that’s used for push/pull from within the submodule. If you think about it, it makes perfect sense: you’re in a valid git repo; when executing a push, you wouldn’t expect it to use anything but the remote that was defined and stored in your .git/config.

In some versions of git, I find that a submodule’s .git is actually a file pointing to a .git/modules/name/ directory in the parent repo.

Finally, the url/directory mapping for the submodule also gets written into the parent repo’s .git/config. What purpose does that serve? If you figure it out, let me know…

So (however unlikely this is) if you find yourself in the same situation as I, this is how you do that:

$ git add submodule https://github.com/you/repo some/dir
$ git commit -m 'added submodule repo'
$ cd some/dir
$ git remote rm origin
$ git remote add origin git@github.com:you/repo

Now anyone who clones (recursively) will get the anonymous checkout version (as defined in .gitmodules), but the origin remote in the local submodule has been changed to the git@github version and is pushable using ssh keys.

Test Driven Development published on Oct 2, tagged with linux, ruby, work

With my recent job shift, I’ve found myself in a much more sophisticated environment than I’m used to with respect to Software Engineering.

At my last position, there wasn’t much existing work in the X++ realm; We were breaking new ground, no one cared about elegance; if you got the thing working – more power to you.

Here, it’s slightly different.

People here are working in a sane, documented, open-source world; and they’re good. Everyone is acutely aware of what’s good design and what’s not. There’s a focus on elegant code, industry standards, solid OOP principles, and most importantly, we practice Test Driven Development.

I’m completely new to this method for development, and I gotta say, it’s quite nice.

Now, I’m not going to say that this is the be-all-end-all of development styles (I’m a functional, strictly-typed, compiler-checked code guy at heart), but I do find it quite interesting – and effective.

So why not do a write-up on it?

Test Framework

The prerequisite for doing anything in TDD is a good test framework. Luckily, ruby is pretty strong in this area. The way it works is the following:

You subclass Test::Unit and define methods that start with test_ where you execute system logic and make assertions about certain results; and then you run that class.

Ruby looks for those methods named as test_whatever and runs them “as tests”. Running a method as a test means that errors and failures (any of your assert methods returning false) will be logged and displayed at the end as part of the “test report”.

All of these test classes can be run automatically by a build-bot and (depending on your test coverage) you get good visibility into what’s working and what’s not.

This is super convenient and empowering in its own right. In a dynamic language like ruby, tests are the only way you have any level of confidence that your most recent code change doesn’t blow up in production.

So now that you’ve got this ability to write and run tests against your code base, here’s a wacky idea, write the tests first.

Test Driven

It’s amazing what this approach does to the design process.

I’ve always been the type that just starts coding. I’m completely comfortable throwing out 6 hours worth of code and starting over. I know my “first draft” isn’t going to be right (though it will be useful). I whole-heartedly believe in refactorings, etc. But most importantly, I need to code to sketch things out. It’s how I’ve always worked.

TDD is sort of the same thing. You do a “rough sketch” of the functionality you’ll add simply by writing tests that enforce that functionality.

You think of this opaque object – a black box. You don’t know how it does what it does, but you’re trying to test it doing it.

This automatically gives you an end-user perspective. You now focus solely on the interface, the input and the output.

This is a wise position to design from.

You also tend to design small self-contained pieces of functionality. Methods that don’t care about state, return the same output for a given input, and generally do one simple thing. Of course, you do this because these are the easiest kind of methods to test.

So, out of sheer laziness, you design a cohesive, easy to use, and completely simple interface, an API.

Now you just have to “plumb it up”. Hack until the tests pass, and you’re done. That might be an over-simplification, but it’s not off by much…

Come to think of it, this is exactly the type of design Haskell favors. With gratuitous use of undefined, the super-high-level logic of a Haskell program can be written out with named functions to “do the heavy lifting”. If you make these functions simple enough and give them descriptive enough names, they practically write themselves.

So that’s TDD (at least my take on it). So far, I like it.

Mairix published on Jul 3, tagged with arch, bash, linux, mutt

Mairix is a nice little utility for indexing and searching your emails. Its smooth integration with mutt is also a plus.

I used to use native mutt search, but it’s pretty slow. So far, mairix is giving me a good approximation of the google-powered search available in the web interface and it’s damn fast.

As I go through this setup, keep in mind the example config files are designed to work with my overall mutt setup; one which is described in two other posts here and here.

If you need a little context, checkout my mutt-config repo which has a fully functioning ~/.mutt, example files for the other apps involved (offlineimap, msmtprc, and now mairix), and any scripts the setup needs.

Mairix

First, of course, install mairix:

pacman -S mairix

Then, setup a ~/.mairixrc which defines where your mails are and their type as well as where to store the results and index. Here’s an example:

# where you keep your mail
base=/home/<you>/Mail

# colon separated list of maildirs to index.
#
# I have two accounts each in their own subfolder. the '...' means there 
# are subdirectories to search as well; it's like saying GMail/* and 
# GMX/*
maildir=GMail...:GMX...

# I omit gmail's archive folder so as to pevent duplicate hits
omit=GMail/all_mail

# search results will be copied to base/<this folder> for viewing in 
# mutt
mfolder=mfolder

# and the path to the index itself
database=/home/<you>/Mail/.mairix_database

With that in place, run mairix once to build the initial index. This first run will be slower but in my tests, subsequent rebuilds were almost instant.

In situations like these, I’ll usually add a verbose flag so I can be sure things are working as expected.

At this point, you could actually do some searching right from the commandline:

mairix some search term # search and populate mfolder
mutt -f mfolder         # open it in mutt

This wasn’t the usage I was after however, I’m typically already in mutt when I want to search my mails.

Mutt

My original script for this purpose was pretty simple. It prompted for the search term and ran it. The problem was you then needed a separate keybind to actually view the results.

Thankfully, Scott commented and provided a more advanced script which got around this issue. Many thanks to Scott and whomever wrote the script in the first place.

This version does some manual tty trickery to build its own prompt, read your input, execute the search and open the results. All from just one keybind.

I merged the two scripts together into what you see below. The main changes from Scott’s version are the following:

  1. I kept my clear, purge, search method rather than relying on cron to keep the index up to date.
  2. I removed the append-search functionality; not my use-case.
  3. I removed the <return> from the ^G trap; it was getting executed by mutt and opening the first message in the inbox after a cancelled search.
  4. I fixed it so that backspace works properly in the prompt.

So, here it is:

#!/bin/bash

read_from_config() {
  local key="$1" config="$HOME/.mairixrc"

  sed '/^'"$key"'=\([^ ]*\) *.*$/!d; s//\1/g' "$config"
}

read -r base    < <(read_from_config 'base')
read -r mfolder < <(read_from_config 'mfolder')

# prevent rm / further down...
[[ -z "$base$mfolder" ]] && exit 1

searchdir="$base/$mfolder"

set -f                          # disable globbing.
exec < /dev/tty 3>&1 > /dev/tty # restore stdin/stdout to the terminal,
                                # fd 3 goes to mutt's backticks.
saved_tty_settings=$(stty -g)   # save tty settings before modifying
                                # them

# trap <Ctrl-G> to cancel search
trap '
  printf "\r"; tput ed; tput rc
  printf "/" >&3
  stty "$saved_tty_settings"
  exit
' INT TERM

# put the terminal in cooked mode. Set eof to <return> so that pressing
# <return> doesn't move the cursor to the next line. Disable <Ctrl-Z>
stty icanon echo -ctlecho crterase eof '^M' intr '^G' susp ''

set $(stty size) # retrieve the size of the screen
tput sc          # save cursor position
tput cup "$1" 0  # go to last line of the screen
tput ed          # clear and write prompt
tput sgr0
printf 'Mairix search for: '

# read from the terminal. We can't use "read" because, there won't be
# any NL in the input as <return> is eof.
search=$(dd count=1 2>/dev/null)

# clear the folder and execute a fresh search
( rm -rf "$searchdir"
  mairix -p
  mairix $search
) &>/dev/null

# fix the terminal
printf '\r'; tput ed; tput rc
stty "$saved_tty_settings"

# to be executed by mutt when we return
printf "<change-folder-readonly>=$mfolder<return>" >&3

A non-trivial macro provides the interface to the script. It sets a variable called my_cmd to the output of the script, which should be the actual change-folder command, then executes it.

macro generic ,s "<enter-command>set my_cmd = \`$HOME/.mutt/msearch\`<return><enter-command>push \$my_cmd<return>" "search messages"

I’ve gotten used to “comma-keybinds” from setting that as my localleader in vim. It’s nice because it very rarely conflicts with anything existing and it’s quite fast to type.

One downside which I’ve been unable to fix (and believe me, I’ve tried!) is that if you press ^G to cancel a search but you’ve typed a few letters into the prompt, mutt will read those letters as commands (via the push) and execute them.

The only thing I could do is prefix those characters with something. I’ve decided to use /. That makes mutt see it as a normal search which you can execute or ^G again to cancel. Annoying, but better than mutt flailing around executing rando commands…

I haven’t had the time yet to learn all the tricks, but here are some of the more useful-looking searches from man mairix:

Useful searches

   t:word                             Match word in the To: header.

   c:word                             Match word in the Cc: header.

   f:word                             Match word in the From: header.

   s:word                             Match word in the Subject: header.

   m:word                             Match word in the Message-ID: 

                                      header.

   b:word                             Match word in the message body 
                                      (text or html!)

   d:[start-datespec]-[end-datespec]  Match messages with Date: headers 
                                      lying in the specific range.

Multiple body parts may be grouped together, if a match in any of them 
is sought.

   tc:word  Match word in either the To: or Cc: headers (or both).

   bs:word  Match word in either the Subject: header or the message body 
            (or both).

   The a: search pattern is an abbreviation for tcf:; i.e. match the 
   word in the To:, Cc: or From: headers.  ("a" stands for "address" in 
   this case.)

The "word" argument to the search strings can take various forms.

   ~word        Match messages not containing the word.  

   word1,word2  This matches if both the words are matched in the 
                specified message part.

   word1/word2  This matches if either of the words are matched in the 
                specified message part.

   substring=   Match any word containing substring as a substring

   substring=N  Match any word containing substring, allowing up to N 
                errors in the match.

   ^substring=  Match any word containing substring as a substring, with 
                the requirement that substring occurs at the beginning 
                of the matched word.

Happy searching!

Pacprune published on Jun 11, tagged with linux, bash, arch

A fairly long time ago, there was a thread on the Arch forums about clearing your pacman cache.

Pacman’s normal -Sc will remove all versions of any packages that are no longer installed and -Scc will clear that plus old versions of packages that are still installed.

The poster wanted a way to run -Scc but also keep the last 1 or 2 versions back from installed. There was no support for this in pacman directly, so a bit of a bash-off ensued.

I wrote a pretty crappy script which I posted there, it laid around in my ~/.bin collecting dust for a while, but I recently rewrote it. I’m pretty proud of the result for its effectiveness and succinctness, so I think it deserves a little discussion.

The methodology of the two versions is the same, but this new version leans heavily on good ol’ unix shell-scripting principles to provide the exact same functionality in way less code, memory, and time.

Approach

The first approach discussed on the thread was to parse filenames for package and version, then do a little sort-grepping to figure out which versions to keep and which versions to discard. This method is fast, but provably inaccurate if a package name contains numbers on the end.

I went a different way.

For each package, pull the .PKGINFO file out of the archive, parse the pkgname and pkgversion variables out of it, then do the same sort-grepping to figure out what to discard.

My first implementation of this algorithm was really bad. I’d parse and write pkgname|pkgversion to a file in /tmp. Then I’d grep unique package names using -m to return at most the number of versions you want to keep (of each package) and store that in another file. I’d then walk those files and rm the packages.

Ick.

Needs moar unix

The aforementioned ugliness, plus some configuration and error checking weighed in at 162 lines of code, used two files, and was dirt slow. I decided to re-attack the problem with a unix mindset.

In a nutshell: write small units that do one thing and communicate via simple text streams.

The first unit this script needs is a parser. It should accept a list of packages (relative file paths) on stdin, parse and output two space-separated values on stdout: name and path. The path will be needed by the next unit down the line, so we need to pass it through.

parse() {
  local package opt

  while read -r package; do
    case "$package" in
      *gz) opt='-qxzf' ;;
      *xz) opt='-qxJf' ;;
    esac

    bsdtar -O $opt "$package" .PKGINFO |\
        awk -v package="$package" '/^pkgname/ { printf("%s %s\n", $3, package) }'
  done
}

11 lines and damn fast. Thank god for bsdtar’s -q option. It tells the extraction to stop after finding the file I’ve requested. Since the .PKGINFO file is usually the first thing in the archive, we barely do any work to get the values.

It’s also done completely in RAM by piping tar directly to awk.

Step two would be the actual pruning. Accept that same space-separated list on stdin and for any package versions beyond the ones we want to keep (the 3 most recent), echo the full path to the package file on stdout.

prune() {
  local name package last_seen='' num_seen=0

  while read -r name package; do
    [[ -n "$last_seen" ]] && [[ "$last_seen" != "$name" ]] && num_seen=0

    num_seen=$((num_seen+1))

    # print full path
    [[ $num_seen -gt $versions_to_keep ]] && readlink -f "$package"

    last_seen="$name"
  done
}

Just watch the list go by and count the number of packages for each name. I’m ensuring that the list is coming in reverse sorted already, so once we see the number of packages we want to keep, any same-named packages after that should be printed.

So simple.

This function can get away with being simple because it doesn’t take into account what’s actually installed on your system. It just keeps the most recent 3 versions of each unique package in the cache. Therefore, to do a full clean, run pacman -Sc first to remove all versions of uninstalled software. Then use this script to clear all but installed plus the two previous versions. This assumes the highest version in the cache is the installed version which may or may not be true in all cases.

All that’s left is to make that reverse sorted list and pipe it through.

find ./ -maxdepth 1 -type f -name '*.pkg.tar.[gx]z' | LC_ALL='C' sort -r | parse | prune

So the whole script (new version) weighs in at ~30 lines (with whitespace) and I claim it is exactly as feature-rich as the first version.

I know what you’re saying: there’s no definition of the cache, no optional safe-list vs actual-removing behavior, there’s no removing at all!

Well, you’re just not thinking unix.

$ cd /some/cache/of/packages
$ pacprune                  # as a normal user, just print the list that 
                            # should be removed -- totally safe.
$ pacprune | sudo xargs rm  # then do the actual removal

You’re free to get as fancy as you’d like too…

$ archiveit() { sudo mv "$@" ~/pkg_archive/; }
$ pacprune | xargs archiveit

And the only configuration is setting the versions_to_keep variable at the top of the script.

The script can be found in my scripts repo.