Tags

hardware
ovirt
3d printing
raspberry pi
refs
filesystems
ruby
rubygems
polisher
ldap
fedora
rails
screencasting
hackerspace
vnc
meditation
legaleese
travel
brno
selenium
testing
vim
ssh
sshfs
puppet
html5
jruby
fudcon
conference
aeolus
ohiolinuxfest
jquery
svg
redmine
plugins
haikus
poetry
gsoc
apache
qpid
cloud
miq
db
snap
pygtk
android
nethack
bitcoin
projects
sig315
book review
hopex
conferences
linux
crypto
syrlug
presentation
pidgin
gnome-shell
deltacloud
music
massive attack
fosscon
xmlrpc
retro gaming
sinatra
qmf
webdev
libvirt
virtualization
isitfedoraruby
bundler_ext
slackware
xpath
tips
mercurial
x3d
git
aikido
jsonrpc
rjr
backups
google code in
gaming
svn
yum
xsd
rxsd
virtfs
design patterns
rome
europe
project
gcc
compiler
omega
simulator
lvm
storage
engineering expo
rpm
rake
notacon
grep
sed
distros
philosophy
splix
gtk
python
autotools
passenger

Sep 26 2015 bitcoin projects sig315

Bitcoin Aware Barber's Pole

A few months back the guild hosted an Arduino Day workshop. The event was a great success: there was a large turnout, many neat presentations & demos, and much more. My project for the day was an Arduino-controlled Barber's Pole that would poll data from the network and activate/deactivate multiple EL wires attached to it. Unfortunately, due to a few technical issues the project took a bit longer than originally planned and had to be sidelined. But having recently finished it up, I present the Bitcoin Aware Barber's Pole, now on display at the guild!


The premise was straightforward: an Arduino Diecimila would be used in combination with the Ethernet Shield to retrieve data from the Internet and activate one of two EL wires mounted to a pole-like object. Several materials were considered for the pole, but we ended up using PVC as it was the easiest to work with and fit what we were going for aesthetically. Since the EL wire is driven from an AC source, we used two SPDT relays to activate the circuit based on the state of the Arduino's digital pin output. The constructed circuit was simple, incorporating the necessary components to handle flyback current.


The software component of this project is what took the most time, due to several setbacks. Perhaps the biggest was the shortage of address space we had to work with. Microcontroller platforms are notorious for this, but the Diecimila only gave us 16KB of flash memory, which, after what I'm assuming is space reserved for the bootloader, shared libraries, and other logic, amounts to ~14KB of memory for the user program and data. Contrast this to modern general purpose PCs, where you'd be hard pressed to find a system with less than 2GB of memory! This had many side effects, including not having enough address space to load and use the Arduino HttpClient or Json libraries. Thus a very rudimentary HTTP request and parsing implementation was devised to serve the application's needs. All in all it was very simple but specialized, only handling the edge cases we needed and nothing else.

Of course the limited address space meant we were also limited in the number of constants and variables we could use. Using the heap (also small on this platform) always introduces additional complexities / logic of its own, so it was avoided. Since each data source would require metadata to access it, we decided to only poll one location and use it to activate either of the two EL wires depending on its state.

In all of this you may be asking why we didn't just order a newer Arduino chip with a bigger address space, to which I reply: what would I do with the one that I had?!?! Plus developing for platforms with memory & other restrictions introduces fun challenges of its own.


At one point we tried splitting the sketch into multiple modules via the Arduino IDE interface. This was done to try and better organize the project in a more OOD fashion, but it introduced more complexities than it was worth. From what I gather, most sketches are single-module implementations, perhaps incorporating some external libraries via the standard mechanisms. When we attempted to deviate from this we noticed some weird behavior, perhaps as a result of the includes from centralized Arduino & supporting libraries being pulled into multiple modules. We didn't debug too far, as overall the application isn't that complex.

One of the last challenges we had to face was selecting the data to poll. Again, due to the limited memory space, we could only store so much HTTP response data. Additionally, even rudimentary parsing of JSON or any other format would take a bit of logic which we didn't have the space for. Luckily we found Bitcoin Average, which provides an awesome API for getting up-to-date Bitcoin market data. Not only do they provide a rich JSON over REST interface, but fields can be polled individually for their flat text values, so we retrieve the BTC/USD market average every 5 minutes. When bitcoin goes up, the blue light is activated; when it goes down, the red light is turned on. Of course this value is a decimal, and enabling floating point arithmetic consumes more memory. To avoid this, we parsed the integer and decimal portions of the currency separately and ran the comparisons individually (in sequence).

But unfortunately there was one last hiccup! While the Bitcoin Average documentation stated that HTTP was supported, in fact querying their server via port 80 just resulted in a 301 redirect to HTTPS running on 443. Since HTTPS/SSL handling proves outright impossible even on more modern Arduino platforms w/ larger address spaces, due to the complexity of the algorithms, we had to devise a solution that could communicate with the server via http in order to retrieve the data. To do so we wrote & deployed a proxy that listens for http requests, issues an https request to bitcoin average, and returns the result. This was simple enough to do w/ the Sinatra micro-framework as you can see below:

# HTTP -> HTTPS proxy
# Written to query the bitcoinaverage.com via http (only accessible by https).
# Run as a standard Rack / Sinatra application
#
# Author: Mo Morsi <mo@morsi.org>
# License: MIT

require 'sinatra'
require 'open-uri'

URL = "https://api.bitcoinaverage.com/ticker/USD/last"

get '/' do
  open(URL) do |content|
    content.read
  end
end

The final result was hosted on this server and the Arduino sketch was updated to use it. All in all the logic behind the Barber's Pole can be seen below:

//// Bitcoin Barber Shop Pole
//// Author: Mo Morsi <mo@morsi.org>
//// Arduino Controller Sketch
////
//// License: MIT
//// For use at the Syracuse Innovators Guild (sig315.org)

#include <SPI.h>
#include <Ethernet.h>

//////////////////////////////////
//// sketch parameters

byte mac[]                           = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
int port                             = 80;
char server[]                        = "projects.morsi.org";
char host[]                          = "Host: projects.morsi.org";
char request[]                       = "GET /barber/ HTTP/1.1";
char user_agent[]                    = "User-Agent: arduino-ethernet";
char close_connection[]              = "Connection: close";
char content_length_header[]         = "Content-Length";

char CR                              = '\r';
char NL                              = '\n';

unsigned long lastConnectionTime     = 0;
const unsigned long postingInterval  = 300000; // every 5 mins
boolean lastConnected                = false;

const int  max_data                  = 32;
int  data_buffer_pos                 = 0;
char data_buffer[max_data];
int  content_length                  = -1;
boolean in_body                      = false;

int current_btc                      = 0;
int current_btc_decimal              = 0; // since we're not using floats

const int blue_pin                   = 5;
const int red_pin                    = 7;
unsigned long lastLightingTime       = -1;
const unsigned long lightingInterval = 5000;

//////////////////////////////////
// arduino hook in points & config

EthernetClient client;

void setup() {
  pins_config();

  serial_config();
  delay(5000);

  net_config();
  delay(1000);

  turn_on_both();

  Serial.println("started");
}

void loop() {
  net();
  lights();
}

void pins_config(){
  pinMode(blue_pin, OUTPUT);
  pinMode(red_pin, OUTPUT);
}

void serial_config(){
  Serial.begin(9600);
  while (!Serial) { ; } // this check is only needed on the Leonardo
}

/////////////////////////////
// network operations

void net(){
  net_read();
  if(should_reset())
    net_reset();
  else if(should_issue_request())
    net_request();

  process_response();
  lastConnected = client.connected();
}

void block(){
  for(;;) { ; }
}

boolean should_reset(){
  return !client.connected() && lastConnected;
}

void net_reset(){
  client.stop();
}

boolean should_issue_request(){
  return !client.connected() && (millis() - lastConnectionTime > postingInterval);
}

void net_config(){
  if (Ethernet.begin(mac) == 0) {
    Serial.println("net failed");
    block();
  }
}

void net_read(){
  if(client.available()) {
    char c = client.read();
    buffer_append(c);
    //Serial.print(c);
  }
}

void net_request(){
  if (client.connect(server, port)) {
    client.println(request);
    client.println(host);
    client.println(user_agent);
    client.println(close_connection);
    client.println();

    lastConnectionTime = millis();

  }else {
    client.stop();
  }
}

//////////////////////////////////
// data buffer management

void buffer_append(char c){
  data_buffer[data_buffer_pos] = c;
  data_buffer_pos += 1;
  if(data_buffer_pos >= max_data)
    data_buffer_pos = 0;
}

void buffer_reset(){
  data_buffer_pos = 0;
}

// moves last char in buffer to first, sets pos after
void buffer_cycle(){
  data_buffer[0]  = data_buffer[data_buffer_pos-1];
  data_buffer_pos = 1;
}

void buffer_print(){
  Serial.print("buf ");
  Serial.print(data_buffer_pos);
  Serial.print(": ");
  for(int p = 0; p < data_buffer_pos; p++)
    Serial.print(data_buffer[p]);
  Serial.println();
}

//////////////////////////////////
// http parsing / handling

// https://en.wikipedia.org/wiki/HTTP_message_body

int char_pos(char ch){
  for(int p = 1; p < data_buffer_pos; p++)
    if(data_buffer[p] == ch)
      return p;
  return -1;
}

int separator_pos(){
  return char_pos(':');
}

int decimal_pos(){
  return char_pos('.');
}

boolean status_detected(){
  if(data_buffer_pos < 4) return false;
  int cr_pos    = data_buffer_pos - 3;
  int lf_pos    = data_buffer_pos - 2;
  int alpha_pos = data_buffer_pos - 1;

  // only upper case letters
  int alpha_begin = 65;
  int alpha_end   = 90;

  return data_buffer[cr_pos]    == CR          &&
         data_buffer[lf_pos]    == NL          &&
         data_buffer[alpha_pos] >= alpha_begin &&
         data_buffer[alpha_pos] <= alpha_end;
}

boolean header_detected(){
  if(data_buffer_pos < 5) return false;
  int cr_pos     = data_buffer_pos - 2;
  int lf_pos     = data_buffer_pos - 1;

  return separator_pos()     != -1   &&
         data_buffer[cr_pos] == CR   &&
         data_buffer[lf_pos] == NL;
}

boolean is_header(char* name){
  int pos = 0;
  while(name[pos] != '\0'){
    if(name[pos] != data_buffer[pos])
      return false;
    pos++;
  }
  return true;
}

boolean body_detected(){
  if(data_buffer_pos < 4) return false;
  int first_cr  = data_buffer_pos - 4;
  int first_lf  = data_buffer_pos - 3;
  int second_cr = data_buffer_pos - 2;
  int second_lf = data_buffer_pos - 1;

  return (data_buffer[first_cr]  == CR &&
          data_buffer[first_lf]  == NL &&
          data_buffer[second_cr] == CR &&
          data_buffer[second_lf] == NL);
}

int extract_content_length(){
  int value_pos = separator_pos() + 1;
  char content[data_buffer_pos - value_pos];
  for(int p = value_pos; p < data_buffer_pos; p++)
    content[p-value_pos] = data_buffer[p];
  return atoi(content);
}

void process_headers(){
  if(status_detected()){
    buffer_cycle();

  }
  else if(header_detected()){
    if(is_header(content_length_header))
      content_length = extract_content_length();
    buffer_reset();

  }
  else if(body_detected()){
    in_body = true;
    buffer_reset();
  }
}

int extract_new_btc(){
  int decimal  = decimal_pos();
  int buf_size = decimal == -1 ? data_buffer_pos - 1 : decimal;
  int iter_end = decimal == -1 ? data_buffer_pos     : decimal;

  char value[buf_size];
  for(int p = 0; p < iter_end; p++)
    value[p] = data_buffer[p];
  return atoi(value);
}

int extract_new_btc_decimal(){
  int decimal  = decimal_pos();
  if(decimal == -1 || decimal == data_buffer_pos - 1) return 0;
  int buf_size = data_buffer_pos - decimal - 1;
  int iter_start = decimal + 1;

  char value[buf_size];
  for(int p = iter_start; p < data_buffer_pos; p++)
    value[p - iter_start] = data_buffer[p];
  return atoi(value);
}

void process_body(){
  if(!in_body || data_buffer_pos < content_length) return;

  //buffer_print();
  process_new_btc(extract_new_btc(), extract_new_btc_decimal());

  content_length = -1;
  in_body = false;
  buffer_reset();
}

void process_response(){
  if(!in_body)
    process_headers();
  else
    process_body();
}

//////////////////////////////////
// target specific data processing

void print_btc(int btc, int btc_decimal){
  Serial.print(btc);
  Serial.print('.');
  Serial.print(btc_decimal);
  Serial.println();
}

boolean value_increased(int new_btc, int new_btc_decimal){
  return new_btc > current_btc || (new_btc == current_btc && new_btc_decimal > current_btc_decimal);
}

boolean value_decreased(int new_btc, int new_btc_decimal){
  return new_btc < current_btc || (new_btc == current_btc && new_btc_decimal < current_btc_decimal);
}

void process_new_btc(int new_btc, int new_btc_decimal){
  //print_btc(current_btc, current_btc_decimal);
  //print_btc(new_btc, new_btc_decimal);

  if(value_increased(new_btc, new_btc_decimal)){
    turn_on_blue();

  }
  else if(value_decreased(new_btc, new_btc_decimal)){
    turn_on_red();

  }
  else{
    turn_on_both();
  }

  current_btc = new_btc;
  current_btc_decimal = new_btc_decimal;
}

//////////////////////////////////
// pin output handling

boolean should_turn_off(){
  return lastLightingTime != -1 && (millis() - lastLightingTime > lightingInterval);
}

void lights(){
  if(should_turn_off()){
    turn_off_both();
    lastLightingTime = -1;
  }
}

void turn_on_blue(){
  lastLightingTime = millis();
  digitalWrite(blue_pin, HIGH);
}

void turn_off_blue(){
  digitalWrite(blue_pin, LOW);
}

void turn_on_red(){
  lastLightingTime = millis();
  digitalWrite(red_pin, HIGH);
}

void turn_off_red(){
  digitalWrite(red_pin, LOW);
}

void turn_on_both(){
  turn_on_blue();
  turn_on_red();
}

void turn_off_both(){
  turn_off_blue();
  turn_off_red();
}

The actual construction of the pole consists of a short length of PVC pipe capped at both ends. The text was spray painted over and a small hole drilled in the back for the power & network cables. The circuitry was simply placed flat inside the PVC; no special mounting or attachments were used or needed.

The final setup was placed near the entrance of the Guild where anyone walking in / out could see it.


All in all it was a fun project that took a bit longer than originally planned, but when is that not the case?! Microcontrollers always prove to be unique environments, and although in this case it just amounted to some C++ development, the restricted platform presented several interesting challenges I hadn't encountered since grad school. Going forward I'm contemplating looking into the Raspberry Pi platform for my next project, as it seems to be a bit more flexible & has more address space, while still being available at a great price point.

Sayonara!

Sep 15 2015 miq db

CloudForms v2 MiQ DB - 08/2015

[Image: CloudForms v2 MiQ database schema diagram]

Now that's a db! Created using Dia.

Relevant table reference / listing can be found here

Modeling is the first step towards Optimization.

Aug 9 2015 projects refs polisher sig315

Polished to a Resilience

Long time since the last post; it's been quite an interesting year! We all move forward, as do the efforts on various fronts. Just a quick update today on two projects previously discussed, as things ramp up after some downtime (look for more updates going into the fall / winter). Polisher has received a lot of work in the domains of refactoring and new tooling. The codebase is more modular and robust, test coverage has been greatly expanded, and several new utilities have been added.

These can be seen in action via the asciinema.org screencasts.


Resilience, our experimental ReFS parser, has also been polished. Various utilities previously written have been renamed, refactored, and expanded, and new tooling has been written to continue the analysis.

Also worth noting are other ongoing efforts, including updating ruby to 2.3 in rawhide and updating rails to 4.x in EPEL.

Finally, the SIG has been going through (another) transitional period. While membership is still growing, there are many logistical matters currently up in the air that need to be resolved. Look for updates on that front, as well as many others, in the near future.

Mar 15 2015 meditation

The Right Mind And The Confused Mind


From The Unfettered Mind which offers some great advice on the subject of meditation (take or leave what you will):

The Right Mind And The Confused Mind

The Right Mind is the mind that does not remain in one place. It is the mind that stretches throughout the entire body and self. The Confused Mind is the mind that, thinking something over, congeals in one place. When the Right Mind congeals and settles in one place, it becomes what is called the Confused Mind. When the Right Mind is lost, it is lacking in function here and there. For this reason, it is important not to lose it. In not remaining in one place, the Right Mind is like water. The Confused Mind is like ice, and ice is unable to wash hands or head. When ice is melted, it becomes water and flows everywhere, and it can wash the hands, the feet or anything else. If the mind congeals in one place and remains with one thing, it is like frozen water and is unable to be used freely: ice that can wash neither hands nor feet. When the mind is melted and is used like water, extending throughout the body, it can be sent wherever one wants to send it. This is the Right Mind.

The Mind Of The Existent Mind And The Mind Of No-Mind

The Existent Mind is the same as the Confused Mind and is literally read as the "mind that exists." It is the mind that thinks in one direction, regardless of subject. When there is an object of thought in the mind, discrimination and thoughts will arise. Thus it is known as the Existent Mind.

The No-Mind is the same as the Right Mind. It neither congeals nor fixes itself in one place. It is called No-Mind when the mind has neither discrimination nor thought but wanders about the entire body and extends throughout the entire self.

The No-Mind is placed nowhere. Yet it is not like wood or stone. Where there is no stopping place, it is called No-Mind. When it stops, there is something in the mind. When there is nothing in the mind, it is called the mind of No-Mind. It is also called No-Mind-No-Thought.

When this No-Mind has been well developed, the mind does not stop with one thing nor does it lack any one thing. It is like water overflowing and exists within itself. It appears appropriately when facing a time of need. The mind that becomes fixed and stops in one place does not function freely. Similarly, the wheels of a cart go around because they are not rigidly in place. If they were to stick tight, they would not go around. The mind is also something that does not function if it becomes attached to a single situation. If there is some thought within the mind, though you listen to the words spoken by another, you will not really be able to hear him. This is because your mind has stopped with your own thoughts.

If your mind leans in the directions of these thoughts, though you listen, you will not hear; and though you look, you will not see. This is because there is something in your mind. What is there is thought. If you are able to remove this thing that is there, your mind will become No-Mind, it will function when needed, and it will be appropriate to its use.

The mind that thinks about removing what is within it will by the very act be occupied. If one will not think about it, the mind will remove these thoughts by itself and of itself become No-Mind. If one always approaches his mind in this way, at a later date it will suddenly come to this condition by itself. If one tries to achieve this suddenly, it will never get there.

An old poem says:

To think, "I will not think"-
This, too, is something in one's thoughts.
Simply do not think
About not thinking at all.

Nov 2 2014 refs filesystems

ReFS Part II - May the Resilience Be With You

Not too long after the last post, it became apparent that the disk I was analyzing wasn't a valid filesystem. Possibly due to a transfer error, several bytes were missing, resulting in disk structures that weren't aligned with the addresses where they should've resided. After generating a new image I was able to make a lot of headway on the analysis. To start off, a valid metadata page address became immediately apparent on page 0x1e. Recall that page 0x1e is the first metadata page, residing at a fixed / known location after the start of the partition:
  bytes 0xA0-A7: 90 19 00 00 00 00 00 00
        0xA8-AF: 67 31 01 00 00 00 00 00
Pages 0x1990 and 0x13167 are valid metadata pages containing similar contents; most likely one is a backup of the other. We'll assume the first record is the primary copy (0x1990). Note this address appears at byte 0xA0 on page 0x1E, and byte 0xA0 is referenced earlier on in the page:
  byte 0x50: A0 00 00 00  02 00 00 00  B0 00 00 00  18 00 00 00
So it is possible that this page address is not stored at a static location but at an offset referenced earlier in the page.
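
As a quick sanity check, the two candidate page addresses can be pulled out of an image with a few lines of ruby. This is a minimal sketch assuming the fixed 0xA0 offset observed above, where image is an open File on the raw disk image and part_off is the partition's byte offset:

  PAGE_SIZE = 0x4000

  # read the two qword page addresses at bytes 0xA0-0xAF of page 0x1E
  def candidate_metadata_pages(image, part_off)
    image.seek(part_off + 0x1E * PAGE_SIZE + 0xA0)
    image.read(16).unpack("Q<Q<")  # => [0x1990, 0x13167] on this image
  end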

The System Table

The word 6 value (previously referred to as the virtual page number) of page 0x1990 is '0', indicating this is a critical table. Let's call this the System Table for the reasons found below. This page contains 6 rows of 24-byte entries, each containing a valid metadata page id, some flags, and a 16-byte unique id / checksum of some sort. Early on in the page the table header resides:
  byte 0x58: 06 00 00 00   98 00 00 00
             B0 00 00 00   C8 00 00 00
             E0 00 00 00   F8 00 00 00
             10 01 00 00
06 is the number of records and each dword after this contains the offset from the very start of the page to each table record:
  table offsets: 98, B0, C8, E0, F8, 110
Each table record has a page id, flags, and some other unique qword of some sort (perhaps an object id or checksum):
  page ids:      corresponding virtual page id values:
    2c2               2
     22               E
     28               D
     29               C
    2c8               1
    2c5               3
These correspond to the latest revisions of the critical system pages highlighted in previous analysis.
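
Putting the header and record layout together, a sketch of walking these records in ruby might look like the following, where page holds the raw 0x4000 bytes of page 0x1990 (the breakdown of the 16 bytes after the page id qword is still speculative):

  # walk the system table records via the table header at byte 0x58
  def system_table_records(page)
    count,  = page[0x58, 4].unpack("L<")
    offsets = page[0x5C, 4 * count].unpack("L<#{count}")
    offsets.map do |off|
      rec = page[off, 24]
      { :page_id => rec.unpack("Q<").first,
        :rest    => rec[8, 16] }  # flags + unique id / checksum (speculative)
    end
  end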

Directories

We've previously established that Virtual Page 0x2 contains the object table, and upon further examination of the keys (object ids) and values (page ids) we see object 0x0000000000006000000000000000000 is the root directory (this is consistent across images).

The format of a directory page varies depending on its type. Like all metadata pages, the first 0x30 bytes contain the page metadata. This is followed by an attribute of unknown purpose (it seems to be related to the page's contents, perhaps a generic bucket / container descriptor), which is in turn followed by the table header attribute, 0x20 bytes in length, containing the table type flags. Table records here work like any other table records, though the semantics of the record values differ depending on the table type (directory list vs B+ tree).

When iterating over directory list records, the record flags seem to indicate record context. A value of '4' stored in the record flags seems to indicate a historical / old entry, for example an old directory name before it was renamed (eg 'New Folder'). The files / directories we are interested in contain '0' or '8' in the record flags. The intent of each matching directory list record can be further deduced from the first 4 bytes of its key, which may be:
  0x00000010 - directory information
  0x00020030 - subdirectory - name will be the rest of the key
  0x00010030 - file - name will be the rest of the key
  0x80000020 - ???
In the case of subdirectories, the first 16 bytes of the record value will contain the directory object id; the object table can be used to look this up to access its page. For B+ trees the record values will contain the ids of pages containing directory records (and possibly more B+ levels, though I didn't verify this). Full filesystem traversal can be implemented by iterating over the root tree, subdirs, and file records, as in the sketch below.
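
Everything here is hypothetical scaffolding around the analysis above: object_table maps object ids to page ids (parsed from Virtual Page 0x2), and records_for is any callable yielding [key_prefix, name, value] directory records for a page:

  SUBDIR   = 0x00020030
  FILE_REC = 0x00010030

  # recursively list files / subdirectories starting from an object id
  def traverse(object_table, records_for, object_id, path = "/")
    page_id = object_table[object_id]
    records_for.call(page_id).each do |prefix, name, value|
      case prefix
      when FILE_REC
        puts "#{path}#{name}"
      when SUBDIR
        puts "#{path}#{name}/"
        # the first 16 bytes of the value hold the subdirectory's object id
        traverse(object_table, records_for, value, "#{path}#{name}/")
      end
    end
  end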

File Tables

File metadata is stored as a table embedded directly into the directory table which the file is under. Each file table always starts with an attribute of 0xA8 length containing the file timestamps (4 qwords starting at byte 0x28 of this attribute) & file length (starting at byte 0x68 of this attribute). Note the actual units of time which the timestamps represent are still unknown. After this there exist several related metadata attributes. The second attribute (starting at byte 0xA8 of the file table):
  20 00 00 00 # length of this record
  A0 01 00 00 # length of this record + next record
  D4 00 00 00 # amount of padding after next record
  00 02 00 00 # table type / flags ?
  74 02 00 00 # next 'insert' address ?
  01 00 00 00 # number of records ?
  78 02 00 00 # offset to padding
  00 00 00 00
The next record looks like a standard table record as we've seen before:
  80 01 00 00 # length of this record, note this equals 2nd dword value of last record minus 0x20
  10 00 0E 00 # offset to key / key length
  08 00 20 00 # flags / offset to value
  60 01 00 00 # value length / padding
The key of this record starts at 0x10 of this attribute and is 0x0E length:
  60 01 00 00
  00 00 00 00
  80 00 00 00
  00 00 00
The value starts at attribute offset 0x20 and is of length 0x160. This value contains yet another embeded attribute:
  88 00 00 00 # length of attribute
  28 00 01 00
  01 00 00 00
  20 01 00 00
  20 01 00 00
  02 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 00 00
  01 00 00 00
  00 00 00 00
  00 00 00 00
  00 00 03 00
  00 00 00 00
  2C 05 02 00 # length of the file
  00 00 00 00
  2C 05 02 00 # length of the file
  00 00 00 00
  # 0's for the rest of this attribute
The file length is represented twice in this attribute (perhaps allocated & actual lengths). The next attribute is as follows:
  20 00 00 00 # length of attribute
  50 00 00 00 # length of this attribute + length of next attribute
  84 00 00 00 # amount of padding after this attribute
  00 02 00 00 # ?
  D4 00 00 00 # next insert address
  01 00 00 00 # ?
  D8 00 00 00 # offset to padding
  00 00 00 00
The format of this attribute looks similar to the second in the file (see above) and seems to contain information about the next record(s), perhaps related to the 'bucket' concept discussed here. At first glance the next attribute looks like another standard record, but the key and value offsets are the same. This attribute contains the starting page # of the file content:
 30 00 00 00 # length of this record
 10 00 10 00 # key offset / length ?
 00 00 10 00 # flags / value offset ?
 20 00 00 00 # value length / padding ? 
 00 00 00 00
 00 00 00 00
 0C 00 00 00
 00 00 00 00
 D8 01 00 00 # starting page of the file
 00 00 00 00
 00 00 00 08
 00 00 00 00 
For larger files there are more records following this attribute, each of 0x30 length, w/ the same record header. Many of the values contain the pages holding the file contents, though only some have the same format as the one above; other records may correspond to compressed / sparse attributes and have a different format. The remainder of this attribute is zero and closes out the third attribute in the file record. After this there is the amount of padding described by the second attribute in the file (see above), after which there are two more attributes of unknown purpose.
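
To make the first attribute concrete, here is a minimal ruby sketch of decoding the timestamps and file length from it, per the offsets given at the start of this section (attr is the raw 0xA8-byte attribute; remember the timestamp units are unknown):

  # decode the leading file-table attribute
  def parse_file_metadata(attr)
    { :timestamps => attr[0x28, 32].unpack("Q<4"),  # four qwords, units unknown
      :length     => attr[0x68, 8].unpack("Q<").first }
  end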

Other

After investigation it seems the ReFS file system driver doesn't clear a page when copying / overwriting shadow pages. Old data was apparent after valid data on newer pages. Thus a parser cannot rely on 0'd out regions to act as delimiters or end markers. Using the above analysis I threw together a ReFS file lister that iterates over all directories and files from the root. It can be found on github here. Use it like so:
ruby rels.rb --image foo.image --offset 123456789
[Screenshot: rels.rb file listing output]

Next Steps

Besides verifying all of the above, the next major action items are to extract the pages / clusters containing file data as well as all file metadata.

Sep 13 2014 refs filesystems

ReFS - All Your Resilience Are Belong To Us

(grammar intentional) The last few months I've been looking into the Resilient File System (ReFS), which has only undergone limited analysis so far. Let's fix that, shall we!

Before we begin, I've found these to be the best existing public resources so far concerning the FS; they've helped streamline the investigation greatly.

[1] blogs.msdn.com - Straight from the source, an MSDN blog post on various concepts around the FS internals.

[2] williballenthin.com - An extended analysis of the high level layout and data structures in ReFS. I've verified a lot of these findings using my image locally and expanded upon various points below. Aspects of the described memory structures can be seen in the images locally.

[3] forensicadventures.blogspot.com - Another good analysis, of particular interest is the ReFS / NTFS comparison graphic (here).

Note in general it's good to be familiar w/ generic FS concepts and ones such as B+ trees and journaling.

Also familiarity w/ the NTFS filesystem helps.

Also note I'm not guaranteeing the accuracy of any of this; there could be mistakes in the data and/or algorithm analysis.

Volume / Partition Layout

The size of the image I analyzed was 92733440 bytes with the ReFS formatted partition starting at 0x2010000.

The first sector of this partition looks like:

byte 0x00: 00 00 00 52   65 46 53 00   00 00 00 00   00 00 00 00
byte 0x10: 46 53 52 53   00 02 12 E8   00 00 3E 01   00 00 00 00
byte 0x20: 00 02 00 00   80 00 00 00   01 02 00 00   0A 00 00 00
byte 0x30: 00 00 00 00   00 00 00 00   17 85 0A 9A   C4 0A 9A 32

Since presumably some size info needs to be here, it is possible that:

vbr bytes 0x20-0x23 : bytes per sector    (0x0200)
vbr bytes 0x24-0x27 : sectors per cluster (0x0080)

Thus:

1 sector = 0x200 bytes = 512 bytes
0x80 sectors/cluster * 0x200 bytes/sector = 0x10000 bytes/cluster = 65536 = 64KB/cluster

Clusters are broken down into pages which are 0x4000 bytes in size (see [2] for page id analysis).

In this case:

  0x10000 (bytes / cluster) / 0x4000 (bytes/page) = 4 pages / cluster

Also:

  0x4000 (bytes/page) / 0x200 (bytes/sector) = 0x20 = 32 sectors per page

VBR bytes 0-0x16 are the same for all the ReFS volumes I've seen.

This block is followed by 0's until the first page.

Pages

According to [1]:

"The roots of these allocators as well as that of the object table are reachable from a well-known location on the disk"

On the images I've seen the first page id always is 0x1e, starting 0x78000 bytes after the start of the partition.

Metadata pages all have a standard header which is 0x30 (48) bytes in length:

byte 0x00: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
byte 0x10: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
byte 0x20: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00
bytes 0/1 (XX XX) are the page id, which is sequential and corresponds to the 0x4000 offset of the page
byte 0x08 (YY) is the sequence number
bytes 0x18/0x19 (ZZ ZZ) are the virtual page number

The page id is unique for every page in the FS. The virtual page number will be the same between journal / shadow pages, though the sequence number is incremented between them.
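
Decoding that header in ruby is straightforward; a minimal sketch per the layout above, where hdr is the raw 0x30-byte header:

  # pull the page id, sequence number, and virtual page number from a header
  def parse_page_header(hdr)
    { :page_id      => hdr[0x00, 2].unpack("S<").first,
      :sequence     => hdr[0x08, 4].unpack("L<").first,
      :virtual_page => hdr[0x18, 2].unpack("S<").first }
  end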

From there the root page has a structure which is still unknown (likely a tree root as described [1] and indicated by the memory structures page on [2]).

The 0x1f page is skipped before pages resume at 0x20 and follow a consistent format.

Page Layout / Tables

After the page header, metadata pages consist of entries prefixed with their length. The meaning of these entries varies and is largely unknown, but various fixed and relational byte values do show consistency and/or exhibit certain patterns.

To parse the entries (which might be referred to as records or attributes), one can read the length prefix and use it to step from one entry to the next.

The four bytes following the length often take on one of two formats depending on the type of entity.

If the entry is a table record, the words following the length hold the key offset / key length, then the flags / value offset, then the value length / padding, as in the example below.

These values can be seen in the memory structures described in [2]. An example record looks like:

bytes 0-3: 50 00 00 00 # attribute length
bytes 4-7: 10 00 10 00 # key offset / key length
bytes 8-B: 00 00 20 00 # flags / value offset
bytes C-F: 30 00 00 00 # value length / padding

bytes 10-1F: 00 00 00 00   00 00 00 00   20 05 00 00   00 00 00 00 # key (@ offset 0x10 and of length 0x10)
bytes 20-2F: E0 02 00 00   00 00 00 00   00 00 02 08   08 00 00 00 # -|
bytes 30-3F: 1F 42 82 34   7C 9B 41 52   00 00 00 00   00 00 00 00 #  |-value (@ offset 0x20 and length 0x30)
bytes 40-4F: 08 00 00 00   08 00 00 00   00 05 00 00   00 00 00 00 # -|
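
A sketch of unpacking such a record in ruby, assuming the field layout annotated above (data is the raw record bytes):

  # decode a table record: dword length, then word pairs for key offset /
  # key length, flags / value offset, and value length / padding
  def parse_record(data)
    length, key_off, key_len, flags, val_off, val_len, _pad =
      data.unpack("L<S<S<S<S<S<S<")
    { :length => length, :flags => flags,
      :key    => data[key_off, key_len],
      :value  => data[val_off, val_len] }
  end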

Entries

Various attributes and values in them take on particular meaning.

Special Pages

Particular pages seem to take on specific connotations.

The rest of the pages were either filled with 0's or were non-metadata pages containing content. Of particular note are pages 0x2d0 - 0x2d7, containing the upcase table (as seen in NTFS).

Parser

I've thrown together a simple ReFS parser using the above assumptions and threw it up on github via a gist.

To utilize it, download it and run it using ruby:

ruby resilience.rb -i foo.image --offset 123456789 --table --tree

You should get output similar to the following:

[Screenshot: resilience.rb table / tree output]

Of course if it doesn't work it could be because there are differences between our images that are unaccounted for, in which case if you drop me a line we can tackle the issue together!

Next Steps

The next steps on the analysis roadmap are to continue diving into the page allocation and addressing mechanisms; there are most likely additional mechanisms to navigate to the critical data structures immediately from the first sector or page 0x1e (since the address of that is known / fixed). Also, continuing to investigate each page and analyzing its contents, especially in the scope of various file and system changes, should go a long way towards revealing semantics.

Jul 26 2014 hopex conferences

HOPE X Lightning Track


Planning to be in the city this weekend? Want to give a short presentation at one of the world's biggest hacker/maker conferences? I'm helping organize the Lightning Talks Track at this year's HOPE X. All are welcome to present, topics can be on anything relevant / interesting.

If you're interested, simply email me or add yourself to the wiki page and we'll be in touch with more information. Even if you don't want to give a talk, I encourage you to check out the conf schedule; it's looking to be a great lineup!

Happy hacking!


***update 07-26***: Conference was great and talks were awesome.

Besides a few logistical glitches, all went well. The final speaker / presentation lineup can be seen on the HOPE Wiki (also attached below). Many thanks to all that participated!

Friday:
Saturday:

Jun 16 2014 omega simulator

Omega - A Universe Over IP


Anyone following my github repo over the last few years may have noticed a lot of activity around the Omega Project, about which a post is long overdue. Omega is a Universe Simulator that serves requests over the JSON-RPC protocol, enabling the widest variety of clients, implemented in any language, running on any platform, to access and manipulate the universe over any transport (TCP/IP, HTTP, Websockets, AMQP, and more).
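
To give a flavor of what that means in practice, a client request is just JSON-RPC over the transport of your choice. The following ruby sketch is purely illustrative; the method name, host, and port are hypothetical placeholders (see the client DSL for the real interface):

  require 'socket'
  require 'json'

  # issue a raw JSON-RPC request to a hypothetical Omega node
  request = { :jsonrpc => "2.0", :id => 1,
              :method  => "get_entity",          # hypothetical method name
              :params  => ["with_id", "some-id"] }.to_json

  socket = TCPSocket.new("localhost", 8181)      # hypothetical host / port
  socket.write(request)
  puts JSON.parse(socket.readpartial(8192)).inspect
  socket.close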

Omega consists of several subsystems implementing a variety of functionality, from 3D location tracking, to user management and access control, to querying and manipulating hierarchies of cosmic entities and player-controlled ships and stations. Omega provides a complete scriptable interface through which an admin can set up and customize sequences of server side events and workflows, as well as access high-level simulation statistics and user-ranking mechanisms.

The Omega Project is not just that though; also included are a slew of utility binaries to set up and configure a custom universe, many examples of tailored environments, a full featured client DSL, as well as a web frontend based on middleman and three.js. A hosted demo instance is available at megaverse.net


Omega has been a labour of love over the last several years. It's gone through several disparate revisions, having been started shortly after I finished grad school in '08 and slowly improved since. The structure and runtime are finally stable enough to promote / share with others to hopefully grow the community. Credit goes to the many people who've been helping along the way: from Nolan and Nico, who have been playing around with getting it up and running on different platforms, reporting various issues, and submitting various patches; to Nitesh, who is working on packaging up the project with dependencies; to Mark/Ken/Dan from the local hackerspace, who helped organize / run the first Omega meetup a few months back; to the many others who I have collaborated with and tossed around ideas with along the road.

As far as what's next, I plan to keep hacking away at the repo, wrapping up loose ends and starting to implement the next round of features on the list. As it stands, Omega currently ships with a distributed example in which two separate nodes interact with each other, permitting a ship to jump across a distributed universe. I aim to continue expanding this, hopefully setting up a network of these simulations at some point in the near future. Also various other clients (such as Android and native desktop clients), integration with the Oculus Rift, and more are on the backlog.

If you are interested in following progress, make sure to join the newly created mailing list (blog coming soon). The project is still in its early / alpha stages, so contributors are more than welcome. There is plenty to do, from coding, testing, and documentation, to marketing / promotion, and more. There is something for everyone! If you're interested in the project or just want to find out more info, feel free to drop me a line anytime.

Other than that, be sure to check the repo often for updates as there's a lot more coming!

Happy Star-Travelling!

Apr 10 2014

Notacon, The Ruby Platform, and More


Long time, no post. This blog certainly has been on the backburner recently. I have been pushing plenty of updates to the Omega Blog as well as to the project itself, so be sure to check out the new additions if it's been a while (and the latest changelog entry, though there have been plenty of updates since!)

For those who are free and in the Cleveland, OH area this weekend, be sure to swing by Notacon, where I will be giving a presentation / demo of Omega. It will be held Saturday @3PM in Ballroom III at the Cleveland Marriott East. Of course there are plenty of other great happenings at Notacon; I attended a few years back. Unfortunately, this year may be the last time the event is run due to financial difficulties putting it together. Hope this is not the case, as it's a great conf, but we will have to see.

Work-wise, we've been making a lot of headway on automating various aspects of the Ruby post-publishing build process, and Polisher has seen a lot of improvements recently. For example it is now possible to easily run a diff of a forked gem on github against its original rubygems.org source:

  fog = Polisher::Gem.from_rubygems 'fog', '1.19.0'
  other = Polisher::Git::Repo.new :url => 'https://github.com/ManageIQ/fog.git'
  other.clone unless other.cloned?

  diff = fog.diff(other)
Patches for an RPM spec can be generated from there:
  patches = Polisher::RPM::Patch.from diff
  puts patches.collect { |p| p.title.blue.bold + ":\r\n " + p.content[1..50] + '...'.blue.bold }

One can also use Polisher to quickly retrieve the list of tags which a rubygem has been built for in Koji, or to cross reference the packages in a Koji tag:

  tags = Polisher::Koji.tagged_in('rails')
  diff = Polisher::Koji.diff('f21', 'f20')
bundler_ext has also seen a lot of updates recently, including a new release. The BundlerExt API can now be used interchangeably w/ Bundler's setup and require features, which will load the gems described in your Gemfile:
  BundlerExt.system_setup(gemfile_in, :default)
  BundlerExt.system_require(gemfile_in, :default, Rails.env)

In Bundler's case the user will be prompted to retrieve missing gems from rubygems.org (via bundle install). With BundlerExt the gems will be loaded from the local system, and it will enforce that contract: any missing gems / inconsistencies will result in a failure. In both cases the ruby load path is set up to contain only the dependencies' include paths, so only the specified gemset is used.

Finally, of no particular relevance to the above, I've picked up a copy of Simulacra and Simulation, and while I haven't gotten too far into it, it's already quite an interesting read. In a nutshell, the author's main argument seems to revolve around our defining our own realities by continuously simulating our behaviours and actions (whether conscious of it or not) and by accepting an endless progression of Simulacra, or images which first distort, then hide, and ultimately replace the reality of that which is being simulated. Ironically it is simulation that insists that reality is real, in order for it to have some sort of baseline, but rather, since all our experience is based on perspective (which in turn is influenced by behaviour and vice-versa), simulation / simulacra is reality.

The argument has been posed before; after all, wise ones and sages have been advising about the fallacies of perspective and the multitude of interpretation for thousands of years, though the author does present a well constructed formal argument so far. I have to finish reading the treatise before I can comment further, but am looking forward to seeing where he goes w/ it from here.

Enjoy the simulation!

Nov 20 2013 bundler_ext rubygems ruby

Using Gem and System Installed Ruby Dependencies Side By Side


Be sure to check out the latest updates to the bundler_ext project, which now permits the activation of Gemfile dependencies that have been installed system wide (eg via yum, apt-get, or other).

bundler_ext is a simple library allowing Ruby developers to bypass the local vendored gem install process, instead relying on gems already installed by the end user or system administrator. By default, after installing and importing bundler_ext into your project (see the README), the modules that are auto-required by the gems described in the project's Gemfile are pulled in, and then nothing else will be done; vendored deps are not installed. Other dependencies that are required will be loaded on the fly from the latest versions installed system wide, as 'require' statements are encountered.

The end user is responsible for ensuring only the required versions of gems are installed and accessible, else incompatible dependencies may be pulled in at runtime. The latest patch to bundler_ext permits multiple versions to be available, as the user can now specify the BEXT_ACTIVATE_VERSIONS environment variable to instruct bundler_ext to look up the versions of dependencies that have been installed via the native package management system (yum, apt-get, or other), and activate those ahead of time. Therefore any Ruby application can now easily switch between the system-wide/distro-supported stack, the latest stack of dependencies available, or the traditional bundler-installed vendored dependencies by toggling a few flags, as sketched below.
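
For illustration, opting into the system-installed versions might look like the following minimal sketch (the Gemfile path is a placeholder; the env var is handled as described above):

  # activate the exact gem versions known to the native package manager,
  # then load the Gemfile's :default group from the system-installed gems
  ENV['BEXT_ACTIVATE_VERSIONS'] = 'true'

  require 'bundler_ext'
  BundlerExt.system_require('/path/to/Gemfile', :default)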

On the backend, bundler_ext uses linux_admin to query the native package system for the versions of dependencies (linux_admin is pulled in as a soft-dependency and isn't required unless BEXT_ACTIVATE_VERSIONS is set to true). The patches adding the necessary functionality have been merged into the bundler_ext and linux_admin upstream repos, with a couple of small style and metadata changes in flight before new releases are pushed to rubygems.org.

Hope this helps Ruby developers looking for a supported stack while still maintaining the flexibility that rubygems and bundler provide.