retro gaming
raspberry pi
3d printing
book review
google code in
design patters
engineering expo
massive attack

Jan 26 2017 vim grep sed

Search and Replace The VIM Way

Did you know that it is 2017 and the VIM editor still does not have a decent multi-file search and replacement mechanism?! While you can always roll your own, it’s rather cumbersome, and even though some would say this isn’t in the spirit of an editor such as VIM, a large community has emerged around extending it in ways to behave more like a traditional IDE.

Having written about doing something similar to this via the cmd line a while back, and having refactored a large amount of code recently that involved lots of renaming, I figured it was time to write a plugin to do just that, rename strings across source files, using grep and sed

Before we begin, it should be noted that this is of most use with a ‘rooting’ plugin like vim-rooter. By using this, you will ensure vim is always running in the root directory of the project you are working on, regardless of the file being modified. Thus all search & replace commands will be run relative to the top project dir.

To install vsearch, we use Vundle. Setup & installation of that is out of scope for this article, but I highly recommend familiarizing yourself with Vundle as it’s the best Vim plugin management system (in my opinion).

Once Vundle is installed, using vsearch is as simple as adding the following to your ~/.vim/vimrc:

Plugin ‘movitto/vim-vsearch’

Restart Vim and run :PluginInstall to install vsearch from github. Now you’re good to go!

vsearch provides two commands :VSearch and :VReplace.

VSearch simply runs grep and displays the results, without interrupting the buffer you are currently editing.

VReplace runs a search in a similar manner to VSearch but also performs and in-memory string replacement using the specified args. This is displayed to the user who is prompted for comfirmation. Upon receiving it, the plugin then executes sed and reports the results.

Nov 7 2016 aikido philosophy splix gaming

Lessons on Aikido and Life via Splix

Recently, I've stumbled upon splix, a new obsession game, with simple mechanics that unfold into a complex competitive challenge requiring fast reflexes and dynamic tactics.

Splix intro

At the core the rule set is very simple: - surround territory to claim it - do not allow other players to hit your tail (you lose... game over)

Splix overextended

While in your territory you have no tail, rendering you invulnerable, but during battles territory is always changing, and you don't want to get caught deep on an attack just to be surrounded by an enemy who swaps the territory alignment to his!

Splix deception

The simple dynamic yields an unbelievable amount of strategy & tactics to excel at while at the same time requiring quick calculation and planning. A foolheardy player will just rush into enemy territory to attempt to capture squares and attack his opponent but a smart player will bait his opponent into his sphere of influence through tactful strikes and misdirections.

Splix bait

Furthermore we see age old adages such as "better to run and fight another day" and the wisdom of pitting opponents against each other. Alliances are always shifting in splix, it simply takes a single tap from any other player to end your game. So while you may be momentarily coordinating with another player to surround and obliterate a third, watch your back as the alliance may dissove at the first opportunity (not to mention the possiblity of outside players appearing anytime!)

Splix alliance

All in all, I've found careful observation and quick action to yield the most successful results on the battlefield. The ideal kill is from behind an opponent who has periously invaded your territory deeply. Beyond this, lurking at the border so as the goad the enemy into a foolheardy / reckless attack is a robust tactic provided you have built up the relfexes and coordination to quickly move in and out of territory which is constantly changing. Make sure you don't fall suspect to your own trick and overpenetrate the enemy border!

Splix bait2

Another tactic to deal w/ an overly aggressive opponent is to slightly fallback into your safe zone to quickly return to the front afterwords, perhaps at a different angle or via a different route. Often a novice opponent will see the retreat as a sign of fear or weakness and become over confident, penetrating deep into your territory in the hopes of securing a large portion quickly. By returning to the front at an unexpected moment, you will catch the opponents off guard and be able to destroy them before they have a chance to retreat to their safe zone.

Splix draw out

Of course if the opponent employs the same strategy, a player can take a calculated risk and drive a distance into the enemy territory before returning to the safe zone. By paying attention to the percentage of visible territory which the player's vulnerability zone occupies and the relative position of the opponent, they should be able to safely guage the safe distance to which they can extend so as to ensure a safe return. Taking large amounts of territory quickly is psychologically damaging to an opponent, especially one undergoing attacks on multiple fronts.

Splix draw out2

If all else fails to overcome a strong opponent, a reasonable retreat followed by an alternate attack vector may result in success. Since in splix we know that an safe zone corresponds to only one enemy, if we can guage / guess where they are, we can attempt to alter the dynamics of the battle accordingly. If we see that an opponent has stretch far beyond the mass of his safe zone via a single / thin channel, we can attempt to cut them off, preventing a retreat without crossing your sphere of influence.

Splix changing

This dynamic becomes even more pronounced if we can encircle an opponent, and start slowly reducing his control of the board. By slowly but mechanically & gradually taking enemy territory we can drive an opponent in a desired direction, perhaps towards a wall or other player.

Splix tactics2

Regardless of the situation, the true strategist will always be shuffling his tactics and actions to adapt to the board and setup the conditions for guaranteed victory. At no point should another player be underestimated or trusted. Even a new player with little territory can pose a threat to the top of the leader board given the right conditions and timing. The victorious will stay clam in the heat of the the battle, and use careful observations, timing, and quick reflexes to win the game.

(<endnote> the game *requires* a keyboard, it can be played via smartphone (swapping) but the arrow keys yields the fastest feedback</endnode>)
May 12 2016 nethack android gaming

Nethack Encyclopedia Reduxd

I've been working on way too many projects recently... Alas, I was able to slip in some time to update the NetHack Encyclopedia app on the Android MarketPlace (first released nearly 5 years ago!).

Version 5.3 brings several features including new useful tools. The first is the Message Searcher that allows the user to quickly query the many cryptic game messages by substring & context. Additionally the Game Tracker has been implemented, faciliting player, item, and level identification in a persistant manner. Simply enter entity attributes as they are discovered and the tracker will deduce the remaining missing information based on its internal alogrithm. This is ontop of many enhancements to the backend including the incorporation of a searchable item database.

The logic of the application has been highly refactored & cleaned up, the code has come along ways since first being written. In large, I feel pretty comfortable with the Android platform at the current time, it has its nuances, but all platorms do, and it's pretty easy to go from concept to implementation.

As far as the game itself, I have a ways to go before retrieving the Amulet! It's quite a challenge, but you learn with every replay, and thus you get closer. Ascension will be mine! (someday)

Nethack 5.3 screen1 Nethack 5.3 screen2 Nethack 5.3 screen3 Nethack 5.3 screen4
Mar 29 2016 lvm storage

LVM Internals

This post is intended to detail the LVM internal disk layout including the thin-volume metadata structure. While documentation of the LVM user space management utilities is abundant, very little exists in the realm of on-disk layout & structures. Having just added support for this to CloudForms, I figure this would be a good opportunity to expand on this for future reference.

The LVM framework relies on the underlying 'device-mapper' library to map blocks on physical disks (called physical volumes) to custom constructs depending on the intent of the system administrator. Physical and other volumes can be sliced, mixed, and matched to form Logical Volumes, which are presented to the user as normal disks, but dispatch read / write operations to the underlying storage objects depending on configuration. The Physical and Logical Volumes are organized into Volume Groups for management purposes.


To analyze a LVM instance, one could start from the bottom up, inspecting each physical volume for the on disk metadata structures, reconstructuing and using them to lookup the blocks to read and write. Physical volumes may constitute any block device Linux normally presents, there are no special restrictions, and this way LVM managed volumes can be chained together. On a recent VM, /dev/sda2 was used for the LVM managed / and /home partitions on installation, after which I extended the logical volume pool to include /dev/sdb using the recent thin pool provisioning features (more on this below).

Examining /dev/sda2 we can find the LVM Disk Label which may reside on one of the first 4 512-byte sectors on the disk. The address of the Physical Volume Header is given from this, specifically:

  pv_header_address = label_header.sector.xl * SECTOR_SIZE (512 bytes) + label_header.offset_xl

The Physical Volume Header gives us the base information about the physical volume including disk data and metadata locations. This call all be read sequentially / incrementally from the addresses contained in the header. These data structures can be seen below:

SECTOR_SIZE         = 512

LVM_ID_LEN          = 8
LVM_TYPE_LEN        = 8
LVM_ID              = "LABELONE"

PV_ID_LEN           = 32
MDA_MAGIC_LEN       = 16
FMTT_MAGIC          = "\040\114\126\115\062\040\170\133\065\101\045\162\060\116\052\076"

LABEL_HEADER = BinaryStruct.new([
  "A#{LVM_ID_LEN}",       'lvm_id',
  'Q',                    'sector_xl',
  'L',                    'crc_xl',
  'L',                    'offset_xl',
  "A#{LVM_TYPE_LEN}",     'lvm_type'

# On disk physical volume header.
PV_HEADER = BinaryStruct.new([
  "A#{PV_ID_LEN}",        'pv_uuid',
  "Q",                    'device_size_xl'

# On disk disk location structure.
DISK_LOCN = BinaryStruct.new([
  "Q",                    'offset',
  "Q",                    'size'

# On disk metadata area header.
MDA_HEADER = BinaryStruct.new([
  "L",                    'checksum_xl',
  "A#{MDA_MAGIC_LEN}",    'magic',
  "L",                    'version',
  "Q",                    'start',
  "Q",                    'size'

# On disk raw location header, points to metadata.
RAW_LOCN = BinaryStruct.new([
  "Q",                    'offset',
  "Q",                    'size',
  "L",                    'checksum',
  "L",                    'filler'

The raw LVM metadata contents areas consists of simple JSON-like key / value data structs where objects, arrays, and primtive values (including strings) may be encoded. The top level of each extracted metadata contents will consist of a single key / value pair, the volume group name and encoded properties. From there logical and physical volumes are detailed. Sample metadata contents can be seen below:

fedora {
    id = "sOIQC3-75Rq-SQnT-0lfj-fgni-cU0i-Bnbeao"
    seqno = 11
    format = "lvm2"
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192
    max_lv = 0
    max_pv = 0
    metadata_copies = 0
    physical_volumes {
        pv0 {
            id = "ZDOhNU-09hz-rsd6-MrJH-20sN-ajcg-opqhDf"
            device = "/dev/sda2"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 19945472
            pe_start = 2048
            pe_count = 2434
        pv1 {
            id = "QT6OH2-1eCc-CyxL-vYkj-RJn3-vuFO-Jg9Qu2"
            device = "/dev/sdb"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 6291456
            pe_start = 2048
            pe_count = 767
    logical_volumes {
        swap {
            id = "iSNIOA-N4dh-qeYp-hAG9-mUGG-PFsL-MomHTO"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442463
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 256
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 0
        pool00 {
            id = "seRK1m-3AYe-0Y3N-TCvF-ABhh-7AKj-gX85eY"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442464
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1940
                type = "thin-pool"
                metadata = "pool00_tmeta"
                pool = "pool00_tdata"
                transaction_id = 1
                chunk_size = 128
                discards = "passdown"
                zero_new_blocks = 1
        root {
            id = "w0IcgL-HHnY-ptff-wZmT-3uQx-KBuu-Ep0YFq"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442464
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1815
                type = "thin"
                thin_pool = "pool00"
                transaction_id = 0
                device_id = 1
        lvol0_pmspare {
            id = "Sm33QK-HzFZ-Vo6b-6qBf-DsE2-uufG-T5EAG7"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442463
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 2
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 256
        pool00_tmeta {
            id = "JdOGun-8vt0-UdUI-I3Ju-aNjN-NurO-Yd7kan"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442464
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 2
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 2073
        pool00_tdata {
            id = "acemRb-wAqV-Nvwh-LR2L-LHyT-Lhvm-g3Wl3F"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "localhost"
            creation_time = 1454442464
            segment_count = 2
            segment1 {
                start_extent = 0
                extent_count = 1815
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 258
            segment2 {
                start_extent = 1815
                extent_count = 125
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 2075

This is all the data necessary to map logical volume lookups to normal / striped physical volume sectors. Note if there are multiple physical volumes in the volume group, be sure to extract all metadata from as mappings may result in blocks being cross-referenced.

To lookup a logical address, first determine which logical volume segment range the address falls into. Segments boundries are specified via extents whose size is given in the volume group metadata (note this is represnted in sectors, or 512-byte blocks). From there it's a simple matter of reading the blocks off the physical volume segments given by the specified stripe id and offset, being sure to correct map positional start / stop offsets.


The process gets a little more complicated for thinly provisioned volumes, which are a relatively new addition to the LVM framework where logical volumes marked as 'thin' do not directly map to physical extents but rather are pooled together via a mapping structure and shared data partition. This allows the centralized partition to grow / shrink on demand and the decoupling of pool properties from actual underlying data space availability.

To implement this each thin logical volume references a pool volume which in return references metadata and data volumes (as can be seen above). Addresses to be accessed from the thin volume are first processed using the pool metadata volume which contains an on disk BTree structure mapping thin volume blocks to data volume blocks.

The thin volume metadata superblock can be read off the metadata volume starting at address 0. This gives us the data & metadata space maps as well as the device details and data mapping trees allowing us to perform the actual address resolution. Device Details is a one level id -> device info BTree providing thin volume device information while the Data Map is a 2 level BTree mapping of device id -> device blocks -> data blocks.

Once this information is parsed from the pool metadata determine which data volume blocks to read for a given thin volume address is simply a matter of looking up the corresponding blocks via the data map and offsetting the start / ending positions accordingly. The complete thin volume metadata structures can be seen below:


  THIN_MAGIC = 27022010



  SUPERBLOCK = BinaryStruct.new([
   'L',                       'csum',
   'L',                       'flags_',
   'Q',                       'block',
   'A16',                     'uuid',
   'Q',                       'magic',
   'L',                       'version',
   'L',                       'time',

   'Q',                       'trans_id',
   'Q',                       'metadata_snap',

   "A#{SPACE_MAP_ROOT_SIZE}", 'data_space_map_root',
   "A#{SPACE_MAP_ROOT_SIZE}", 'metadata_space_map_root',

   'Q',                       'data_mapping_root',

   'Q',                       'device_details_root',

   'L',                       'data_block_size',     # in 512-byte sectors

   'L',                       'metadata_block_size', # in 512-byte sectors
   'Q',                       'metadata_nr_blocks',

   'L',                       'compat_flags',
   'L',                       'compat_ro_flags',
   'L',                       'incompat_flags'

  SPACE_MAP = BinaryStruct.new([
    'Q',                      'nr_blocks',
    'Q',                      'nr_allocated',
    'Q',                      'bitmap_root',
    'Q',                      'ref_count_root'

  DISK_NODE = BinaryStruct.new([
    'L',                      'csum',
    'L',                      'flags',
    'Q',                      'blocknr',

    'L',                      'nr_entries',
    'L',                      'max_entries',
    'L',                      'value_size',
    'L',                      'padding'
    #'Q',                      'keys'

  INDEX_ENTRY = BinaryStruct.new([
    'Q',                      'blocknr',
    'L',                      'nr_free',
    'L',                      'none_free_before'

  METADATA_INDEX = BinaryStruct.new([
    'L',                      'csum',
    'L',                      'padding',
    'Q',                      'blocknr'

  BITMAP_HEADER = BinaryStruct.new([
    'L',                      'csum',
    'L',                      'notused',
    'Q',                      'blocknr'

  DEVICE_DETAILS = BinaryStruct.new([
    'Q',                      'mapped_blocks',
    'Q',                      'transaction_id',
    'L',                      'creation_time',
    'L',                      'snapshotted_time'

  MAPPING_DETAILS = BinaryStruct.new([
    'Q',                       'value'

One can see this algorithm in action via this LVM parsing script extract from CloudForms. You will need to install the 'binary_struct' gem and run this script as a privileged user inorder to read the binary disks:

$ sudo gem install binary_struct
$ sudo ruby ruby lvm-parser.rb -d /dev/sda2 -d /dev/sdb

From there you can extract any info from the lvm metadata structures or data segments for further analysis.

"If it's on the Internet it must be true"
  -George Washington
Sep 26 2015 bitcoin projects sig315

Bitcoin Aware Barbers Pole

A few months back the guild hosted an Arduino Day workshop. The event was a great success, there was a large turnout, many neat presentations & demos, and much more. My project for the day was an Arduino controlled Barber's Pole that would poll data from the network and activate/deactivate multiple EL wires attached to it. Unfortunately due to a few technical issues the project took a bit longer than originally planned and had to be sidelined. But having recently finished it up, I present the Bitcoin Aware Barber's Pole, now on display at the guild!


The premise was straightforward, an Arduino Diecimila would be used in combination with the Ethernet Shield to retrieve data from the Internet and activate one of two el-wires mounted to a pole like object. For that several materials were considered but we ended up using pvc as it was easiest to work with and was what we were going for aesthetically. Since the EL wire is driven from an AC source we used two SPDT relays to activate the circuit based on the state of the Arduino's digital pin output. The constructed circuit was simple, incorporating the necessary components to handle flyback current.

Arduino pole Barber circuitry1

The software component of this project is what took the most time, due to several setbacks. Perhaps the biggest was the shortage of address space we had to work with, micro-controller platforms are notorious for this, but the Diecimila only gave us 16KB of flash memory to use, which after what I'm assuming is space for the bootloader, shared libraries, and other logic is reserved, amounts to ~14KB of memory for the user program and data. Contrast this to modern general purpose PCs, where you'd be hard pressed to find a system with less than 2GB of memory! This had many side effects including not having enough address space to load and use the Arduino HttpClient or Json libraries. Thus a very rudimentary HTTP parsing and request implementation was devised so as to serve the application's needs. All and all it was very simple but specialized, only handling the edge cases we needed an nothing else.

Of course the limited address space meant we were also limited in the amount of constants and variables we could use. Using the heap (also small on this platform) always introduces additional complexities / logic of its own so was avoided. Since each data source would require the metadata needed to access it, we decided to only poll one location and use it to activate either of the two el-wires depending on its state.

In all of this you may be asking why we just didn't order a newer Arduino chip with a bigger address space to which I reply what would I do with the one that I had?!?! Plus developing for platforms with memory & other restrictions introduces fun challenges of its own.


At one point we tried splitting the sketch into multiple modules via the Arduino IDE interface. This was done to try and better organize the project, in a more OOD fashion, but introduced more complexities than it was worth. From what I gather, most sketches are single module implementations, perhaps incorporating some external libraries via the standard mechanisms. When we attempted to deviate from this we noticed so weird behavior, perhaps as a result of the includes from centralized Arduino & supporting libraries being pulled into multiple modules. We didn't debug too far, as overall the application isn't that complex.

One of the last challenges we had to face was selecting the data to poll. Again due to the limited memory space, we could only store so much http response data. Additionally even any rudimentary parsing of JSON or other format would take a bit of logic which we didn't have the space for. Luckily we found Bitcoin Average which provides an awesome API for getting up-to-date Bitcoin market data. Not only do they provide a rich JSON over REST interface, but fields can be polled individually for their flat text values, for which we retrieve the BTC/USD market average every 5 minutes. When bitcoin goes up, the blue light is activated, when it goes down, the red light is turned on. Of course this value is a decimal and enabling floating point arithmetic consumes more memory. To avoid this, we parsed the integer and decimal portions of the currency separately and ran the comparisons individually (in sequence).

But unfortunately there was one last hiccup! While the Bitcoin Average documentation stated that HTTP was supported, but in fact querying their server via port 80 just resulted in a 301 redirect to HTTPS running on 443. Since even w/ more modern Arduino platforms w/ larger address spaces, HTTPS/SSL handling proves to be outright impossible due to the complexity of the algorithms, we had to devise a solution to be able to communicate with the server via http in order to retrieve the data. To do so we wrote & deployed a proxy that listens for http requests, issue a https request to bitcoin average and returned the result. This was simple enough to do w/ the Sinatra micro-framework as you can see below:

# HTTP -> HTTPS proxy
# Written to query the bitcoinaverage.com via http (only accessible by https).
# Run as a standard Rack / Sinatra application
# Author: Mo Morsi <mo@morsi.org>
# License: MIT

require 'sinatra'
require 'open-uri'

URL = "https://api.bitcoinaverage.com/ticker/USD/last"

get '/' do
  open(URL) do |content|

The final result was hosted on this server and the Arduino sketch was updated to use it. All in all the logic behind the Barber's Pole can be seen below:

//// Bitcoin Barber Shop Pole
//// Author: Mo Morsi <mo@morsi.org>
//// Arduino Controller Sketch
//// License: MIT
//// For use at the Syracuse Innovators Guild (sig315.org)

#include <SPI.h>
#include <Ethernet.h>

//// sketch parameters

byte mac[]                           = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
int port                             = 80;
char server[]                        = "projects.morsi.org";
char host[]                          = "Host: projects.morsi.org";
char request[]                       = "GET /barber/ HTTP/1.1";
char user_agent[]                    = "User-Agent: arduino-ethernet";
char close_connection[]              = "Connection: close";
char content_length_header[]         = "Content-Length";

char CR                              = '\r';
char NL                              = '\n';

unsigned long lastConnectionTime     = 0;
const unsigned long postingInterval  = 300000; // - every 5 mins
boolean lastConnected                = false;

const int  max_data                  = 32;
int  data_buffer_pos                 = 0;
char data_buffer[max_data];
int  content_length                  = -1;
boolean in_body                      = false;

int current_btc                      = 0;
int current_btc_decimal              = 0; // since were not using floats

const int blue_pin                   = 5;
const int red_pin                    = 7;
unsigned long lastLightingTime       = -1;
const unsigned long lightingInterval = 5000;

// arduino hook in points & config

EthernetClient client;

void setup() {





void loop() {

void pins_config(){
  pinMode(blue_pin, OUTPUT);
  pinMode(red_pin, OUTPUT);

void serial_config(){
  while (!Serial) { ; } // this check is only needed on the Leonardo

// network operations

void net(){
  else if(should_issue_request())

  lastConnected = client.connected();

void block(){
  for(;;) { ; }

boolean should_reset(){
  return !client.connected() && lastConnected;

void net_reset(){

boolean should_issue_request(){
  return !client.connected() && (millis() - lastConnectionTime > postingInterval);

void net_config(){
  if (Ethernet.begin(mac) == 0) {
    Serial.println("net failed");

void net_read(){
  if(client.available()) {
    char c = client.read();

void net_request(){
  if (client.connect(server, port)) {

    lastConnectionTime = millis();

  }else {

// data buffer management

void buffer_append(char c){
  data_buffer[data_buffer_pos] = c;
  data_buffer_pos += 1;
  if(data_buffer_pos >= max_data)
    data_buffer_pos = 0;

void buffer_reset(){
  data_buffer_pos = 0;

// moves last char in buffer to first, sets pos after
void buffer_cycle(){
  data_buffer[0]  = data_buffer[data_buffer_pos-1];
  data_buffer_pos = 1;

void buffer_print(){
  Serial.print("buf ");
  Serial.print(": ");
  for(int p = 0; p < data_buffer_pos; p++)

// http parsing / handling

// https://en.wikipedia.org/wiki/HTTP_message_body

int char_pos(char ch){
  for(int p = 1; p < data_buffer_pos; p++)
    if(data_buffer[p] == ch)
      return p;
  return -1;

int seperator_pos(){
  return char_pos(':');

int decimal_pos(){
  return char_pos('.');

boolean status_detected(){
  if(data_buffer_pos < 4) return false;
  int cr_pos    = data_buffer_pos - 3;
  int lf_pos    = data_buffer_pos - 2;
  int alpha_pos = data_buffer_pos - 1;

  // only upper case letters
  int alpha_begin = 65;
  int alpha_end   = 90;

  return data_buffer[cr_pos]    == CR          &&
         data_buffer[lf_pos]    == NL          &&
         data_buffer[alpha_pos] >= alpha_begin &&
         data_buffer[alpha_pos] <= alpha_end;

boolean header_detected(){
  if(data_buffer_pos < 5) return false;
  int cr_pos     = data_buffer_pos - 2;
  int lf_pos     = data_buffer_pos - 1;

  return seperator_pos()     != -1   &&
         data_buffer[cr_pos] == CR   &&
         data_buffer[lf_pos] == NL;

boolean is_header(char* name){
  int pos = 0;
  while(name[pos] != '\0'){
    if(name[pos] != data_buffer[pos])
      return false;
  return true;

boolean body_detected(){
  if(data_buffer_pos < 4) return false;
  int first_cr  = data_buffer_pos - 4;
  int first_lf  = data_buffer_pos - 3;
  int second_cr = data_buffer_pos - 2;
  int second_lf = data_buffer_pos - 1;

  return (data_buffer[first_cr]  == CR &&
          data_buffer[first_lf]  == NL &&
          data_buffer[second_cr] == CR &&
          data_buffer[second_lf] == NL);

int extract_content_length(){
  int value_pos = seperator_pos() + 1;
  char content[data_buffer_pos - value_pos];
  for(int p = value_pos; p < data_buffer_pos; p++)
    content[p-value_pos] = data_buffer[p];
  return atoi(content);

void process_headers(){

  else if(header_detected()){
      content_length = extract_content_length();

  else if(body_detected()){
    in_body = true;

int extract_new_btc(){
  int decimal  = decimal_pos();
  int buf_size = decimal == -1 ? data_buffer_pos - 1 : decimal;
  int iter_end = decimal == -1 ? data_buffer_pos     : decimal;

  char value[buf_size];
  for(int p = 0; p < iter_end; p++)
    value[p] = data_buffer[p];
  return atoi(value);

int extract_new_btc_decimal(){
  int decimal  = decimal_pos();
  if(decimal == -1 || decimal == data_buffer_pos - 1) return 0;
  int buf_size = data_buffer_pos - decimal - 1;
  int iter_start = decimal + 1;

  char value[buf_size];
  for(int p = iter_start; p < data_buffer_pos; p++)
    value[p - iter_start] = data_buffer[p];
  return atoi(value);

void process_body(){
  if(!in_body || data_buffer_pos < content_length) return;

  process_new_btc(extract_new_btc(), extract_new_btc_decimal());

  content_length = -1;
  in_body = false;

void process_response(){

// target specific data processing

void print_btc(int btc, int btc_decimal){

boolean value_increased(int new_btc, int new_btc_decimal){
  return new_btc > current_btc || (new_btc == current_btc && new_btc_decimal > current_btc_decimal);

boolean value_decreased(int new_btc, int new_btc_decimal){
  return new_btc < current_btc || (new_btc == current_btc && new_btc_decimal < current_btc_decimal);

void process_new_btc(int new_btc, int new_btc_decimal){
  //print_btc(current_btc, current_btc_decimal);
  //print_btc(new_btc, new_btc_decimal);

  if(value_increased(new_btc, new_btc_decimal)){

  else if(value_decreased(new_btc, new_btc_decimal)){


  current_btc = new_btc;
  current_btc_decimal = new_btc_decimal;

// pin output handling

boolean should_turn_off(){
  return lastLightingTime != -1 && (millis() - lastLightingTime > lightingInterval);

void lights(){
    lastLightingTime = -1;

void turn_on_blue(){
  lastLightingTime = millis();
  digitalWrite(blue_pin, HIGH);

void turn_off_blue(){
  digitalWrite(blue_pin, LOW);

void turn_on_red(){
  lastLightingTime = millis();
  digitalWrite(red_pin, HIGH);

void turn_off_red(){
  digitalWrite(red_pin, LOW);

void turn_on_both(){

void turn_off_both(){

The actual construction of the pole consists of a short length of PVC pipe capped at both ends. The text was spray painted over and a small hole drilled in the back for the power & network cables. The circuity was simply placed flat inside the pvc, no special mounting or attachments were used or needed.

The final setup was placed near the enterance of the Guild where anyone walking in / out could see it.

Barber enterance Barber closeup

All in all it was a fun project that took a bit longer than originally planned, but when is that not the case?! Microcontrollers always prove to be unique environments, and although in this case it just amounted to some C++ development, the restricted platform presented several interesting challenges I hadn't encountered since grad school. Going forward I'm contemplating looking into the Raspberry Pi platform for my next project as it seems to be a bit more flexible & has more address space, while still available at a great price point.


Sep 15 2015 miq db

CloudForms v2 MiQ DB - 08/2015

Cfme db 0

Now that's a db! Created using Dia.

Relevant table reference / listing can be found here

Modeling is the first step towards Optimization.
Aug 9 2015 projects refs polisher sig315

Polished to a Resilience

Long time since the last post, it's been quite an interesting year! We all move forward as do the efforts on various fronts. Just a quick update today on two projects previously discussed as things ramp up after some downtime (look for more updates going into the fall / winter). Polisher has received alot of work in the domains of refactoring and new tooling. The codebase is more modular and robust, test coverage has been greatly expanded, and as far as the new utilities:

These can be seen in action via the asciinema.org screencasts referenced above.

Resilience, our expiremental REFS parser, has also been polished. Various utilities previously written have been renamed, refactored, and expanded; and new tooling written to continue the analysis. Of particular notability are:

Also worthy to note are other ongoing efforts including updating ruby to 2.3 in rawhide and updating rails to 4.x in EPEL.

Finally, the SIG has been going through (another) transitionary period. While membership is still growing there are many logistical matters currently up in the air that need to be resolved. Look for updates on that front as well as many others in the near future.

Mar 15 2015 meditation

The Right Mind And The Confused Mind


From The Unfettered Mind which offers some great advice on the subject of meditation (take or leave what you will):

The Right Mind And The Confused Mind The Right Mind is the mind that does not remain in one place. It is the mind that stretches throughout the entire body and self. The Confused Mind is the mind that, thinking something over, congeals in one place. When the Right Mind congeals and settles in one place, it becomes what is called the Confused Mind. When the Right Mind is lost, it is lacking in function here and there. For this reason, it is important not to lose it. In not remaining in one place, the Right Mind is like water. The Confused Mind is like ice, and ice is unable to wash hands or head. When ice is melted, it becomes water and flows everywhere, and it can wash the hands, the feet or anything else. If the mind congeals in one place and remains with one thing, it is like frozen water and is unable to be used freely: ice that can wash neither hands nor feet. When the mind is melted and is used like water, extending throughout the body, it can be sent wherever one wants to send it. This is the Right Mind.

The Mind Of The Existent Mind And The Mind Of No-Mind The Existent Mind is the same as the Confused Mind and is literally read as the "mind that exists." It is the mind that thinks in one direction, regardless of subject. When there is an object of thought in the mind, discrimination and thoughts will arise. Thus it is known as the Existent Mind.

The No-Mind is the same as the Right Mind. It neither congeals nor fixes itself in one place. It is called No-Mind when the mind has neither discrimination nor thought but wanders about the entire body and extends throughout the entire self.

The No-Mind is placed nowhere. Yet it is not like wood or stone. Where there is no stopping place, it is called No-Mind. When it stops, there is something in the mind. When there is nothing in the mind, it is called the mind of No-Mind. It is also called No-Mind-No-Thought.

When this No-Mind has been well developed, the mind does not stop with one thing nor does it lack any one thing. It is like water overflowing and exists within itself. It appears appropriately when facing a time of need. The mind that becomes fixed and stops in one place does not function freely. Similarly, the wheels of a cart go around because they are not rigidly in place. If they were to stick tight, they would not go around. The mind is also something that does not function if it becomes attached to a single situation. If there is some thought within the mind, though you listen to the words spoken by another, you will not really be able to hear him. This is because your mind has stopped with your own thoughts.

If your mind leans in the directions of these thoughts, though you listen, you will not hear; and though you look, you will not see. This is because there is something in your mind. What is there is thought. If you are able to remove this thing that is there, your mind will become No-Mind, it will function when needed, and it will be appropriate to its use.

The mind that thinks about removing what is within it will by the very act be occupied. If one will not think about it, the mind will remove these thoughts by itself and of itself become No-Mind. If one always approaches his mind in this way, at a later date it will suddenly come to this condition by itself. If one tries to achieve this suddenly, it will never get there.

An old poem says:

To think, "I will not think"-
This, too, is something in one's thoughts.
Simply do not think
About not thinking at all.

Nov 2 2014 refs filesystems

ReFS Part II - May the Resilience Be With You

Not too long after the last post, it became aparent that the disk I was analyzing wasn't a valid filesystem. Possibly due to a transfer error, several bytes were missing resulting in disk structures that weren't aligned with the addresses where they should've resided. After generating a new image I was able to make alot of headway on the analysis. To start off a valid metadata page address became immediately aparent on page 0x1e. Recall that page 0x1e is the first metadata page residing at a fixed / known location after the start of the partition:
  bytes 0xA0-A7: 90 19 00 00 00 00 00 00
        0xA8-AF: 67 31 01 00 00 00 00 00
Pages 0x1990 and 0x13167 are valid metadata pages containing similar contents. Most likely one is a backup of the other. Assuming the first record is the primary copy (0x1990). Note this address appears at byte 0xA0 on page 0x1E. Byte 0xA0 is referenced earlier on in the page:
  byte 0x50: A0 00 00 00  02 00 00 00  B0 00 00 00  18 00 00 00
So it is possible that this page address is not stored at a static location but at a offset referenced earlier in the page.

The System Table

The word 6 value (previously refered to as virtual page number) of page 0x1990 is '0' indicating this is a critical table. Lets call this the System Table for the reasons found below. This page contains 6 rows of 24-byte entries, each containing a valid metadata partition, some flags, and a 16-byte unique id/checksum of some sort. Early on in the page the table header resides:
  byte 0x58: 06 00 00 00   98 00 00 00
             B0 00 00 00   C8 00 00 00
             E0 00 00 00   F8 00 00 00
             10 01 00 00
06 is the number of records and each dword after this contains the offset from the very start of the page to each table record:
  table offsets: 98, B0, C8, E0, F8, 110
Each table record has a page id, flags, and some other unique qword of some sort (perhaps an object id or checksum),
  page ids:      corresponding virtual page id values:
    2c2               2
     22               E
     28               D
     29               C
    2c8               1
    2c5               3
These correspond to the latest revisions of the critical system pages highlighted in previous analysis.


We've previously established Virtual Page 0x2 contains the object table and upon furthur examination of the keys (object id's) and values (page id's)' we see object 0x0000000000006000000000000000000 is the root directory (this is consistent across images). The format of a directory page varies depending its type. Like all metadata-pages the first 0x30 bytes contains the page metadata. This is followed by a attribute of unknown purpose (seems to be related to the page's contents, perhaps a generic bucket / container descriptor). This is followed by the table header attribute, 0x20 bytes in length. This attribute contains: Table Type Flags: Table records here work like any other table consisting of The semantics of the record values differ depending on the table type. Directory lists contain: B+ trees contain: When iterating over directory list records, the record flags seem to indicate record context. A value of '4' stored in the record flags seems to indicate a historical / old entry, for example an old directory name before it was renamed (eg 'New Folder'). The files / directories we are interested in contain '0' or '8' in the record flags. The intent of each matching directory list record can be furthur deduced by the first 4 bytes in its key which may be:
  0x00000010 - directory information
  0x00020030 - subdirectory - name will be the rest of the key
  0x00010030 - file - name will be the rest of the key
  0x80000020 - ???
In the case of subdirectories, the first 16 bytes of the record value will contain the directory object id. The object table can be used to look this up to access its page. For B+ trees the record values will contain the ids of pages containing directory records (and possibly more B+ levels though I didn't verify this). Full filesystem traversal can be implemented by iterating over the root tree, subdirs, and file records.

File Tables

File metadata is stored as a table embedded directly into the directory table which the file is under. Each file table always starts with an attribute 0xA8 length containing the file timestamps (4 qwords starting at byte 0x28 of this attribute) & file length (starting at byte 0x68 of this attribute). Note the actual units of time which the timestamps represent are still unknown. After this there exists several related metadata attributes. The second attribute (starting at byte 0xA8 of the file table):
  20 00 00 00 # length of this record
  A0 01 00 00 # length of this record + next record
  D4 00 00 00 # amount of padding after next record
  00 02 00 00 # table type / flags ?
  74 02 00 00 # next 'insert' address ?
  01 00 00 00 # number of records ?
  78 02 00 00 # offset to padding
  00 00 00 00
The next record looks like a standard table record as we've seen before:
  80 01 00 00 # length of this record, note this equals 2nd dword value of last record minus 0x20
  10 00 0E 00 # offset to key / key length
  08 00 20 00 # flags / offset to value
  60 01 00 00 # value length / padding
The key of this record starts at 0x10 of this attribute and is 0x0E length:
  60 01 00 00
  00 00 00 00
  80 00 00 00
  00 00 00
The value starts at attribute offset 0x20 and is of length 0x160. This value contains yet another embeded attribute:
  88 00 00 00 # length of attribute
  28 00 01 00
  01 00 00 00
  20 01 00 00
  20 01 00 00
  02 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 00 00
  01 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 03 00
  00 00 00 00
  2C 05 02 00 # length of the file
  00 00 00 00
  2C 05 02 00 # length of the file
  00 00 00 00
  # 0's for the rest of this attribute
The file length is represented twice in this attribute (perhaps allocated & actual lengths) The next attribute is as follows:
  20 00 00 00 # length of attribute
  50 00 00 00 # length of this attribute + length of next attribute
  84 00 00 00 # amount of padding after this attribute
  00 02 00 00 # ?
  D4 00 00 00 # next insert address
  01 00 00 00 # ?
  D8 00 00 00 # offset to padding
  00 00 00 00
The format of this attribute looks similar to the second in the file (see above) and seems to contain information about the next record(s). Perhaps related to the 'bucket' concept discussed here At first glance the next attribute looks like another standard record but the key and value offsets are the same. This attribute contains the starting page # of the file content
 30 00 00 00 # length of this record
 10 00 10 00 # key offset / length ?
 00 00 10 00 # flags / value offset ?
 20 00 00 00 # value length / padding ? 
 00 00 00 00
 00 00 00 00
 0C 00 00 00
 00 00 00 00
 D8 01 00 00 # starting page of the file
 00 00 00 00
 00 00 00 08
 00 00 00 00 
For larger files there are more records following this attribute, each of 0x30 length, w/ the same record header. Many of the values contain the pages containing the file contents, though only some have the same format as the one above. Other records may correspond to compressed / sparse attributes and have a different format. The remainder of this attribute is zero and closes out the third attribute in the file record. After this there is the amount of padding described by the second attribute in the file (see above) after which there are two more attributes of unknown purpose.


After investigation it seems the ReFS file system driver doesn't clear a page when copying / overwriting shadow pages. Old data was aparent after valid data on newer pages. Thus a parser cannot rely on 0'd out regions to acts as deliminators or end markers. Using the above analysis I threw together a ReFS file lister that iterates over all directories and files from the root. It can be found on github here. Use it like so:
ruby rels.rb --image foo.image --offset 123456789

Next Steps

Besides verifying all of the above, the next major action items are to extract the pages / clusters containing file data as well as all file metadata.
Sep 13 2014 refs filesystems

ReFS - All Your Resilience Are Belong To Us

(grammer intentional) The last few months I've been looking into the Resilent File System (ReFS), which has only undergone limited analysis so far. Let's fix that shall we!

Before we begin, I've found these to be the best existing public resources so far concerning the FS, they've helped streamline the investigation greatly.

[1] blogs.msdn.com - Straight from the source, a msdn blog post on various concepts around the FS internals.

[2] williballenthin.com - An extended analysis of the high level layout and data structures in ReFS. I've verified alot of these findings using my image locally and expanded upon various points below. Aspects of the described memory structures can be seen in the images locally.

[3] forensicadventures.blogspot.com - Another good analysis, of particular interest is the ReFS / NTFS comparison graphic (here).

Note in general it's good to be familiar w/ generic FS concepts and ones such as B+ trees and journaling.

Also familiarity w/ the NTFS filesystem helps.

Also note I'm not guaranteeing the accuracy of any of this, there could be mistakes in the data and/or algorithm analysis.

Volume / Partition Layout

The size of the image I analyzed was 92733440 bytes with the ReFS formatted partition starting at 0x2010000.

The first sector of this partition looks like:

byte 0x00: 00 00 00 52   65 46 53 00   00 00 00 00   00 00 00 00
byte 0x10: 46 53 52 53   00 02 12 E8   00 00 3E 01   00 00 00 00
byte 0x20: 00 02 00 00   80 00 00 00   01 02 00 00   0A 00 00 00
byte 0x30: 00 00 00 00   00 00 00 00   17 85 0A 9A   C4 0A 9A 32

Since assumably some size info needs to be here, it is possible that:

vbr bytes 0x20-0x23 : bytes per sector    (0x0200)
vbr bytes 0x24-0x27 : sectors per cluster (0x0080)


1 sector = 0x200 bytes = 512 bytes
0x80 sectors/cluster * 0x200 bytes/sector = 0x10000 bytes/cluster = 65536 = 64KB/cluster

Clusters are broken down into pages which are 0x4000 bytes in size (see [2] for page id analysis).

In this case:

  0x10000 (bytes / cluster) / 0x4000 (bytes/page) = 4 pages / cluster


  0x4000 (bytes/page) / 0x200 (bytes/sector) = 0x20 = 32 sectors per page

VBR bytes 0-0x16 are the same for all the ReFS volumes I've seen.

This block is followed by 0's until the first page.


According to [1]:

"The roots of these allocators as well as that of the object table are reachable from a well-known location on the disk"

On the images I've seen the first page id always is 0x1e, starting 0x78000 bytes after the start of the partition.

Metadata pages all have a standard header which is 0x30 (48) bytes in length:

byte 0x00: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
byte 0x10: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
byte 0x20: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00
bytes 0/1 (XX XX) is the page id which is sequential and corresponds to the 0x4000 offset of the page
byte 2 (YY) is the sequence number
byte 0x18 (ZZ ZZ) is the virtual page number

The page id is unique for every page in the FS. The virtual page number will be the same between journals / shadow pages though the sequence is incremented between those.

From there the root page has a structure which is still unknown (likely a tree root as described [1] and indicated by the memory structures page on [2]).

The 0x1f page is skipped before pages resume at 0x20 and follow a consistent format.

Page Layout / Tables

After the page header, metadata pages consist of entries prefixed with their length. The meaning of these entities vary and are largely unknown but various fixed and relational byte values do show consistency and/or exhibit certain patterns.

To parse the entries (which might be refered to a records or attributes), one could:

The four bytes following the length often takes on one of two formats depending on the type of entity:

If the entry is a table record,

These values can be seen in the memory structures described in [2]. An example record looks like:

bytes 0-3: 50 00 00 00 # attribute length
bytes 4-7: 10 00 10 00 # key offset / key length
bytes 8-B: 00 00 20 00 # flags / value offset
bytes C-F: 30 00 00 00 # value length / padding

bytes 10-1F: 00 00 00 00   00 00 00 00   20 05 00 00   00 00 00 00 # key (@ offset 0x10 and of length 0x10)
bytes 20-2F: E0 02 00 00   00 00 00 00   00 00 02 08   08 00 00 00 # -|
bytes 30-3F: 1F 42 82 34   7C 9B 41 52   00 00 00 00   00 00 00 00 #  |-value (@ offset 0x20 and length 0x30)
bytes 40-4F: 08 00 00 00   08 00 00 00   00 05 00 00   00 00 00 00 # -|


Various attributes and values in them take on particular meaning.

Special Pages

Particular pages seem to take on specified connotations:

The rest of the pages were either filled with 0's or non-metadata pages containing content. Of particular note is pages 0x2d0 - 0x2d7 containing the upcase table (as seen in ntfs).


I've thrown together a simple ReFS parser using the above assumpions and threw it upon github via a gist.

To utilize download it, and run it using ruby:

ruby resilience.rb -i foo.image --offset 123456789 --table --tree

You should get output similar to the following:


Of course if it doesn't work it could be because there are differences between our images that are unaccounted for, in which case if you drop me a line we can tackle the issue together!

Next Steps

The next steps on the analysis roadmap are to continue diving into the page allocation and addressing mechanisms, there is most likely additional mechanisms to navigate to the critical data structures immediately from the first sector or page 0x1e (since the address of that is known / fixed). Also continuing to investigate each page and analyzing it's contents, especially in the scope of various file and system changes should go a long ways to revealing semantics.