
Oct 7 2017

Why I still choose Ruby

With the plethora of languages available to developers, I wanted to do a quick follow-up post on why, given my experience in many different environments, Ruby is still the go-to language for all my computational needs!


While different languages offer different solutions in terms of syntax support, memory management, runtime guarantees, and execution flows, the underlying arithmetic, logical, and I/O hardware being controlled is the same. Thus in theory, given enough time and optimization, the performance differences between languages should approach zero as computational power and capacity go to infinity (yes, yes, Moore's law and such, but let's ignore that for now).

Of course different classes of problem domains impose their own requirements. But given the rapid growth of hardware performance in recent years, whole classes of problems which were previously limited to 'lower-level' languages such as C and C++ can now feasibly be implemented in higher-level languages.

[Image: computer power]


Thus we see high-performance financial applications being implemented in Python, major websites with millions of users a day being implemented in Ruby and JavaScript, massive data sets being crunched in R, and much more.

So putting the performance aspect of these environments aside, we need to look at the syntactic nature of these languages as well as the features and tools they offer developers. The last is the easiest to tackle, as these days most notable languages come with compilers/interpreters, debuggers, task systems, test suites, documentation engines, and much more. This was not always the case though: Ruby was one of the first languages to pioneer built-in package management through rubygems, along with integrated dependency solutions via gemspecs, bundler, etc. CPAN and a few other language-specific online repositories existed before, but with Ruby you got integration that was a core part of the runtime environment and community support. Ruby is still known to be on the leading front of integrated and end-to-end solutions.
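
For example, declaring a project's dependencies is as simple as listing them in a Gemfile and running bundle install (a minimal sketch):

    # Gemfile - dependencies are declared in plain Ruby and resolved by Bundler
    source 'https://rubygems.org'

    gem 'sinatra'             # any published gem...
    gem 'nokogiri', '~> 1.8'  # ...optionally version-constrained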

Syntax differences are a much more difficult subject to objectively discuss, as much of it comes down to programmer preference, but it would be hard to object to the statement that Ruby is one of the most object-oriented languages out there. It's not often that you can call the string conversion or type identification methods on ALL constructs, variables, constants, types, literals, primitives, etc.:

  > 1.to_s
  => "1"
  > 1.class
  => Integer

Ruby also provides logical flow control constructs not seen in many other languages. For example, in addition to the standard if condition then dosomething paradigm, Ruby allows the user to place the action before the predicate, e.g. dosomething if condition. This simple change allows developers to express concepts in a natural manner, akin to how they would often be described between humans. Ruby offers many other small syntax conveniences besides, not least of which are blocks.
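
For instance (a contrived example with hypothetical helpers):

    # the guard reads like prose when placed after the action
    send_alert(admin) if disk_full?
    retry_download unless attempts_exhausted?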

Expanding upon the last, blocks are a core concept in Ruby, one which the language nails right on the head. Not only can any function accept an anonymous callback block, blocks can be bound to parameters and operated on like any other data. You can check the number of parameters a callback accepts by invoking block.arity, dynamically dispatch blocks, save them for later invocation, and much more.
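
A quick demonstration of blocks as first-class data:

    # blocks can be captured as parameters and treated like any other object
    def with_callback(&block)
      puts "callback expects #{block.arity} argument(s)"
      block.call(6, 7)
    end

    with_callback { |a, b| puts a * b }   # prints the arity (2), then 42

    # ...or saved for later invocation
    saved = proc { |x| x + 1 }
    saved.call(41)                        # => 42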

Due to the asynchronous nature of many software solutions (many problems can be modeled as asynchronous tasks), blocks fit into many Ruby paradigms, if not as the primary invocation mechanism, then as an optional one used to enforce various runtime guarantees:

  File.open("/tmp/foobar"){ |file|
    # do whatever with file here
  }

  # File is guaranteed to be closed here, we didn't have to close it ourselves!

By binding block contexts, Ruby facilitates implementing tightly tailored solutions for many problem domains via DSLs. Ruby DSLs exist for web development, system orchestration, workflow management, and much more. This of course is not to mention the other frameworks, such as the massively popular Rails, as well as other widely-used technologies such as Metasploit.
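
As a toy illustration of the pattern (not taken from any of the projects above), binding a block's context via instance_eval is all it takes to get a minimal DSL off the ground:

    # a tiny configuration DSL: the block is evaluated in the object's context
    class Config
      def initialize(&blk)
        @settings = {}
        instance_eval(&blk)
      end

      def set(key, value)
        @settings[key] = value
      end

      attr_reader :settings
    end

    conf = Config.new do
      set :host, "localhost"
      set :port, 8080
    end

    conf.settings  # => {:host=>"localhost", :port=>8080}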

Finally, programming in Ruby is just fun. The language is conducive to expressing complex concepts elegantly, jives with many different programming paradigms and styles, and offers a quick prototype-to-production workflow that is intuitive for both novice and seasoned developers. Nothing quite scratches that itch like Ruby!

[Image: Ruby doge]
Sep 21 2017

Data & Market Analysis in C++, R, and Python

In recent years, since efforts on The Omega Project and The Guild sort of fizzled out, I've been exploring various areas of interest with no particular intent other than to play around with some ideas. Data & Financial Engineering was one of those domains, and having spent some time diving into the subject (before once again moving on to something else altogether), I'm sharing a few findings here.

My journey down this path started not too long after the Bitcoin Barber Shop Pole was completed, and I was looking for a new project to occupy my free time (the little of it that I have). Having long since stepped down from the SIG315 board, but still renting a private office at the space, I was looking for some way to incorporate that into my next project (besides just using it as the occasional place to work). Brainstorming a bit, I settled on a data visualization idea, where data relating to any number of categories would be aggregated, geotagged, and then projected onto a virtual globe. I decided to use the Marble widget library, built on top of the Qt Framework, and had great success:


The architecture behind the DataChoppa project was simple: a generic 'Data' class was implemented using smart pointers, on top of which the Facet Pattern was incorporated, allowing data to be recorded from any number of sources in a generic manner and represented via convenient high-level accessors. Data was collected via synchronization and generation plugins implementing a standardized interface, whose output was then fed onto a queue on which processing plugins were listening, selecting the data they were interested in operating on from there. The Processors themselves could put more data onto the queue, after which the whole process was repeated ad inf., allowing each plugin to satisfy one bit of data-related functionality.

[Image: DataChoppa architecture]

Core Generic & Data Classes

namespace DataChoppa{
  // Generic value container
  // Generic value container
  class Generic{
    public:
      Map<std::string, boost::any> values;
      Map<std::string, std::string> value_strings;
  }; // class Generic

  namespace Data{
    /// Data representation using generic values
    class Data : public Generic{
      public:
        Data() = default;
        Data(const Data& data) = default;

        Data(const Generic& generic, TYPES _types, const Source* _source) :
          Generic(generic), types(_types), source(_source) {}

        bool of_type(TYPE type) const;

        Vector to_vector() const;

        TYPES types;

        const Source* source;
    }; // class Data
  }; // namespace Data
}; // namespace DataChoppa

The Process Loop

  namespace DataChoppa {
    namespace Framework{
      void Processor::process_next(){
        if(to_process.empty()) return;
        Data::Data data = to_process.first();
        Plugins::Processors::iterator plugin = plugins.begin();
        while(plugin != plugins.end()) {
          Plugins::Meta* meta = dynamic_cast<Plugins::Meta*>(*plugin);
          //LOG(debug) << "Processing " << meta->id;
          try{
            /// ... dispatch data to the plugin (abridged) ...
          }catch(const Exceptions::Exception& e){
            LOG(warning) << "Error when processing: " << e.what()
                         << " via " << meta->id;
          }
          ++plugin;
        }
      }
    }; /// namespace Framework
  }; /// namespace DataChoppa

The HTTP Plugin (abridged)

namespace DataChoppa {
  namespace Plugins{
    class HTTP : public Framework::Plugins::Syncer,
                 public Framework::Plugins::Job,
                 public Framework::Plugins::Meta {
        /// ...

        /// sync - always return data to be added to queue, even on error
        Data::Vector sync(){
          String _url = url();
          Network::HTTP::SyncRequest request(_url, request_timeout);

          for(const Network::HTTP::Header& header : headers())
            /* ... add header to request (abridged) ... */;

          int attempted = 0;
          Network::HTTP::Response response(request);

          while(attempts == -1 || attempted < attempts){
            /* ... issue request & increment attempted (abridged) ... */

            if(attempted == attempts){
              Data::Data result = response.to_error_data();
              result.source = &source;
              return result.to_vector();
            }

            Data::Data result = response.to_data();
            result.source = &source;
            return result.to_vector();
          }

          /// we should never get here
          return Data::Vector();
        }
    }; // class HTTP
  }; // namespace Plugins
}; // namespace DataChoppa

Overall I was pleased with the result (and perhaps I should have stopped there...). The application collected and aggregated data from many sources including RSS feeds (google news, reddit, etc), weather sources (yahoo weather, weather.com), social networks (facebook, twitter, meetup, linkedin), chat protocols (IRC, slack), financial sources, and much more. While exploring the last, I discovered the world of technical analysis and began incorporating various market indicators into a financial analysis plugin for the project.

The Market Analysis Architecture

[Images: DataChoppa extractors / DataChoppa annotators]

Aroon Indicator (for example)

namespace DataChoppa{
  namespace Market {
    namespace Annotators {
      class Aroon : public Annotator {
          double aroon_up(const Quote& quote, int high_offset, double range){
            return ((range-1) - high_offset) / (range-1) * 100;
          }

          DoubleVector aroon_up(const Quotes& quotes, const Annotations::Extrema* extrema, int range){
            return quotes.collect<DoubleVector>([this, extrema, range](const Quote& q, int i){
                     return aroon_up(q, extrema->high_offsets[i], range);
                   });
          }

          double aroon_down(const Quote& quote, int low_offset, double range){
            return ((range-1) - low_offset) / (range-1) * 100;
          }

          DoubleVector aroon_down(const Quotes& quotes, const Annotations::Extrema* extrema, int range){
            return quotes.collect<DoubleVector>([this, extrema, range](const Quote& q, int i){
                     return aroon_down(q, extrema->low_offsets[i], range);
                   });
          }

          AnnotationList annotate() const{
            const Quotes& quotes = market->quotes;
            if(quotes.size() < range) return AnnotationList();

            const Annotations::Extrema* extrema = aroon_extrema(market, range);
            Annotations::Aroon* aroon = new Annotations::Aroon(range);
            aroon->upper = aroon_up(market->quotes, extrema, range);
            aroon->lower = aroon_down(market->quotes, extrema, range);
            return aroon->to_list();
          }
      }; /// class Aroon
    }; /// namespace Annotators
  }; /// namespace Market
}; // namespace DataChoppa

The whole thing worked great: data was pulled, both real-time and historical, from yahoo finance (until they discontinued it... from then on it was google finance), the indicators were run, and results were output. Of course, making $$$ is not as simple as just crunching numbers, and being rather naive I just tossed the results of the indicators into weighted "buckets" and backtested using simple boolean flags based on the computed signals against threshold values. Thankfully I backtested, as the performance was horrible, with losses greatly exceeding profits :-(

At this point I should take a step back and note that my progress so far was the result of the availability of a lot of great resources (we really live in the age of accelerated learning). Specifically, the following are indispensable books & sites for those interested in this subject:

- Beyond Candlesticks
- Nerds on Wall Street
- High Probability Trading

The last book is an excellent resource which conveys the importance of money and risk management, as well as the necessity of factoring in all considerations, or as many as you can, when making financial decisions. In the end, I feel this is the gist of it: it's not solely a matter of luck (though there is an aspect of that to this), but rather patience, discipline, balance, and most importantly focus (similar to Aikido, but that's a topic for another time). There is no shorting it (unless you're talking about the assets themselves!), and if one does not have / take the necessary time to research and properly plan out and execute strategies, they will most likely fail (as most do, according to the numbers).

It was at this point that I decided to take a step back and restrategize, and having reflected and discussed it with some acquaintances, I hedged my bets, cut my losses (tech-wise), and switched from C++ to another platform which would allow me to prototype and execute ideas quicker. A good amount of time had gone into the C++ project and it worked great, but it did not make sense to continue via a slower development cycle when faster options are available (and after all, every engineer knows time is our most precious resource).

Python and R are the natural choices for this problem domain, as there is extensive support in both languages for market analysis, backtesting, and execution. I have used Python at various points in the past so it was easy to hit the ground running; R was new, but by this time no language really poses a serious surprise. The best way I can describe it is spreadsheets on steroids (not exactly, as rather than spreadsheets, data frames and matrices are the core components, but one can imagine R as being similar to the central execution environment behind Excel, Matlab, or other statistical software).

I quickly picked up quantmod and prototyped some volatility, trend-following, momentum, and other analysis signal generators in R, plotting them using the provided charting interface. R is a great language for this sort of data manipulation; one can quickly load up structured data from CSV files or online resources, splice it and dice it, chunk it and dunk it, organize it and prioritize it, according to any arithmetic, statistical, or linear/non-linear means they desire. Loading a new 'view' on the data is as simple as a line of code, and operations can quickly be chained together at high performance.

Volatility indicator in R (consolidated)

library(quantmod) # provides ATR, Cl, addTA, etc.

quotes <- load_default_symbol("volatility")

quotes.atr <- ATR(quotes, n=ATR_RANGE)

quotes.atr$tr_atr_ratio <- quotes.atr$tr / quotes.atr$atr
quotes.atr$is_high      <- ifelse(quotes.atr$tr_atr_ratio > HIGH_LEVEL, TRUE, FALSE)

# Also Generate ratio of atr to close price
quotes.atr$atr_close_ratio <- quotes.atr$atr / Cl(quotes)

# Generate rising, falling, sideways indicators by calculating slope of ATR regression line
atr_lm       <- list()
atr_lm$df    <- data.frame(quotes.atr$atr, Time = index(quotes.atr))
atr_lm$model <- lm(atr ~ poly(Time, POLY_ORDER), data = atr_lm$df) # polynomial linear model

atr_lm$fit   <- fitted(atr_lm$model)
atr_lm$diff  <- diff(atr_lm$fit)
atr_lm$diff  <- as.xts(atr_lm$diff)

# Current ATR / Close Ratio
quotes.atr.abs_per <- median(quotes.atr$atr_close_ratio[!is.na(quotes.atr$atr_close_ratio)])

# plots
addTA(quotes.atr$tr, type="h")
addTA(as.xts(as.logical(quotes.atr$is_high), index(quotes.atr)), col=col1, on=1)

While it all works great, the R language itself offers very little syntactic sugar for operations not related to data processing. While there are libraries for most common functionality found in many other execution environments, languages such as Ruby and Python offer a "friendlier" experience to both novice and seasoned developers alike. Furthermore, the process of data synchronization was a tedious step; I was looking for something that offered the flexibility of DataChoppa to pull in and process live and historical data from a wide variety of sources, caching results on the fly, and using those results and analysis for subsequent operations.

This all led me to develop a series of Python libraries targeted towards providing a configurable high-level view of the market. Intelligence Amplification (IA) as opposed to Artificial Intelligence (AI), if you will (see Nerds on Wall Street).

marketquery.py is a high-level market querying library, which implements plugins used to resolve generic market queries for time-based ticker data. One can use the interface to query for the latest quotes or a specific range of them from a particular source, or allow the framework to select one for you.

Retrieve first 3 months of the last 5 years of GBPUSD data

  from marketquery.querier        import Querier
  from marketbase.query.builder   import QueryBuilder
  sym = "GBPUSD"
  first_3mo_of_last_5yr = (QueryBuilder().symbol(sym)
                                         # ... date-range builder calls elided ...
                                         )
  querier = Querier()
  res     = querier.run(first_3mo_of_last_5yr)
  for query, dat in res.items():
      print(dat.raw[:1000] + (dat.raw[1000:] and '...'))

Retrieve last two month of hourly EURJPY data

  from marketquery.querier        import Querier
  from marketbase.query.builder   import QueryBuilder
  sym = "EURJPY"
  two_months_of_hourly = (QueryBuilder().symbol(sym)
                                        # ... interval/range builder calls elided ...
                                        )
  querier = Querier()
  res     = querier.run(two_months_of_hourly).raw()
  print(res[:1000] + (res[1000:] and '...'))

This provides a quick way to both look up market data according to specific criteria and cache it so that network resources are used effectively. All caching is configurable, and the user can define timeouts based on the target query, source, and/or data retrieved.

From there, the next level up is technical analysis. It was trivial to whip up the tacache.py module, which uses the marketquery.py interface to retrieve raw data before feeding it into TALib, caching the results. The same caching mechanisms offering the same flexibility are employed; if one needs to process a large data set and/or subsets multiple times in a specified period, computational resources are not wasted (important when running on a metered cloud).

Computing various technical indicators

  from marketquery.querier       import Querier
  from marketbase.query.builder  import QueryBuilder
  from tacache.runner            import TARunner
  from tacache.source            import Source
  from tacache.indicator         import Indicator
  from talib                     import SMA
  from talib                     import MACD

  res = Querier().run(QueryBuilder().symbol("AUDUSD")
                                    # ... additional builder calls elided ...
                                    )

  ta_runner = TARunner()
  analysis  = ta_runner.run(Indicator(SMA), query_result=res)   # query_result kwarg as in the a2m example below
  sma       = analysis.raw
  analysis  = ta_runner.run(Indicator(MACD), query_result=res)
  macd, sig, hist = analysis.raw

Finally, on top of all this I wrote a2m.py, a high-level querying interface consisting of modules reporting on market volatility and trends as well as other metrics; python scripts which I could quickly execute to report the current and historical market state, making use of the underlying cached query and technical analysis data, periodically invalidated to pull in new/recent live data.

Example using a2m to compute volatility

  sym = "EURUSD"
  self.resolver  = Resolver()
  self.ta_runner = TARunner()

  daily = (QueryBuilder().symbol(sym)

  hourly = (QueryBuilder().symbol(sym)

  current = (QueryBuilder().symbol(sym)

  daily_quotes   = resolver.run(daily)
  hourly_quotes  = resolver.run(hourly)
  current_quotes = resolver.run(current)

  daily_avg  = ta_runner.run(Indicator(talib.SMA, timeperiod=120),  query_result=daily_quotes).raw[-1]
  hourly_avg = ta_runner.run(Indicator(talib.SMA, timeperiod=30),  query_result=hourly_quotes).raw[-1]

  current_val    = current_quotes.raw()[-1]['Close']
  daily_percent  = current_val / daily_avg  if current_val &lt; daily_avg  else daily_avg  / current_val
  hourly_percent = current_val / hourly_avg if current_val &lt; hourly_avg else hourly_avg / current_val

[Image: awesome to the max]

I went on to use this to execute some Forex trades, again not in an algorithmic / automated manner, but rather based on combined knowledge from fundamentals research as well as the high-level technical data, and what was the result...

[Image: poor Squidward]

I jest; though I did lose a little $$$, it wasn't that much, and to be honest I feel this was due to lack of patience/discipline and other "novice" mistakes as discussed above. I did make about half of it back, and then lost interest. This all requires a lot of focus and time, and I had already spent 2+ years worth of free time on this. With many other interests pulling my strings, I decided to sideline the project(s) altogether and focus on my next crazy venture.


After some consideration, I decided to release the R code I wrote under the MIT license. They are rather simple experiments, though they could be useful as a starting point for others new to the subject. As far as the Python modules and DataChoppa go, I intend to eventually release them, but aim to take a break first to focus on other efforts, and then go back to the war room to figure out the next stage of the strategy.

And that's that! Enough number crunching, time to go out for a hike!

[Image: hiking meme]
Jul 5 2017 refs filesystems

ReFS Part III - Back to the Resilience

We've made some great headway on the ReFS filesystem analysis front, to the point of being able to implement a rudimentary file extraction mechanism (complete with timestamps).

First a recap of the story so far:

Before going into details, we should note this analysis is based on observations against ReFS disks generated locally, without extensive sequential cross-referencing and comparison of many large files with many changes. It is also possible that some structures are oversimplified and/or not fully understood. That being said, this should provide a solid basis for additional analysis, getting us deep into the filesystem and allowing us to poke and prod at isolated bits to identify their semantics.

Now onto the fun stuff!

- A ReFS filesystem can be identified with the following signature at the very start of the partition:

    00 00 00 52  65 46 53 00  00 00 00 00  00 00 00 00 ...ReFS.........
    46 53 52 53  XX XX XX XX  XX XX XX XX  XX XX XX XX FSRS

- The following Ruby code will tell you if a given offset in a given file contains a ReFS partition:

    # Point this to the file containing the disk image
    DISK = '/path/to/disk.image' # (placeholder)

    # Point this at the start of the partition containing the ReFS filesystem
    ADDRESS = 0x0                # (placeholder)

    # FileSystem Signature we are looking for
    FS_SIGNATURE  = [0x00, 0x00, 0x00, 0x52, 0x65, 0x46, 0x53, 0x00] # ...ReFS.

    img = File.open(File.expand_path(DISK), 'rb')
    img.seek ADDRESS
    sig = img.read(FS_SIGNATURE.size).unpack('C*')
    puts "Disk #{sig == FS_SIGNATURE ? "contains" : "does not contain"} ReFS filesystem"

- ReFS pages are 0x4000 bytes in length

- On all inspected systems, the first page number is 0x1e (0x78000 bytes after the start of the partition containing the filesystem). This is in line w/ Microsoft documentation, which states that the first metadata dir is at a fixed offset on the disk.

- Other pages contain various system, directory, and volume structures and tables as well as journaled versions of each page (shadow-written upon regular disk writes)

- The first byte of each page is its Page Number

- The first 0x30 bytes of every metadata page (dubbed the Page Header) seem to follow a certain pattern:

    byte  0: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
    byte 16: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
    byte 32: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00

- Multiple pages may share a virtual page number (byte 24/dword 6) but usually don't appear in sequence.

- The following Ruby code will print out the pages in a ReFS partition along w/ their shadow copies:

    # Point this to the file containing the disk image
    DISK = '/path/to/disk.image' # (placeholder)

    # Point this at the start of the partition containing the ReFS filesystem
    ADDRESS = 0x0                # (placeholder)

    PAGE_SIZE             = 0x4000
    PAGE_SEQ              = 0x08 # sequence number offset in the page header (byte 8)
    PAGE_VIRTUAL_PAGE_NUM = 0x18 # virtual page number offset (byte 24)

    FIRST_PAGE = 0x1e

    img = File.open(File.expand_path(DISK), 'rb')

    page_id = FIRST_PAGE
    img.seek(ADDRESS + page_id*PAGE_SIZE)

    while contents = img.read(PAGE_SIZE)
      id = contents.unpack('S').first

      if id == page_id
        pos   = img.pos
        start = ADDRESS + page_id * PAGE_SIZE

        img.seek(start + PAGE_SEQ)
        seq = img.read(4).unpack("L").first

        img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
        vpn = img.read(4).unpack("L").first

        print "page: "
        print "0x#{id.to_s(16).upcase}".ljust(7)
        print " @ "
        print "0x#{start.to_s(16).upcase}".ljust(10)
        print ": Seq - "
        print "0x#{seq.to_s(16).upcase}".ljust(7)
        print "/ VPN - "
        print "0x#{vpn.to_s(16).upcase}".ljust(9)
        puts

        img.seek pos
      end

      page_id += 1
    end

- The object table (virtual page number 0x02) associates object ids with the pages on which they reside. Here we see an AttributeList consisting of Records of key/value pairs (see below for the specifics on these data structures). We can look up the object id of the root directory (0x600000000) to retrieve the page on which it resides:

   50 00 00 00 10 00 10 00 00 00 20 00 30 00 00 00 - total length / key & value boundries
   00 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 - object id
   F4 0A 00 00 00 00 00 00 00 00 02 08 08 00 00 00 - page id / flags
   CE 0F 85 14 83 01 DC 39 00 00 00 00 00 00 00 00 - checksum
   08 00 00 00 08 00 00 00 04 00 00 00 00 00 00 00

^ The object table entry for the root dir, containing its page (0xAF4)

- When retrieving pages by id or virtual page number, look for the ones with the highest sequence number, as those are the latest copies produced by the shadow-write mechanism.

- Expanding upon the previous example we can implement some logic to read and dump the object table:

    # (DISK / ADDRESS / page constants as defined above)
    ATTR_START = 0x30

    def img
      @img ||= File.open(File.expand_path(DISK), 'rb')
    end

    def pages
      @pages ||= begin
        _pages = {}
        page_id = FIRST_PAGE
        img.seek(ADDRESS + page_id*PAGE_SIZE)

        while contents = img.read(PAGE_SIZE)
          id = contents.unpack('S').first

          if id == page_id
            pos = img.pos
            start = ADDRESS + page_id * PAGE_SIZE
            img.seek(start + PAGE_SEQ)
            seq = img.read(4).unpack("L").first
            img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
            vpn = img.read(4).unpack("L").first
            _pages[id] = {:id => id, :seq => seq, :vpn => vpn}
            img.seek pos
          end

          page_id += 1
        end

        _pages
      end
    end

    def page(opts)
      if opts.key?(:id)
        return pages[opts[:id]]
      elsif opts[:vpn]
        return pages.values.select { |v|
          v[:vpn] == opts[:vpn]
        }.sort { |v1, v2| v1[:seq] <=> v2[:seq] }.last
      end
    end

    def obj_pages
      @obj_pages ||= begin
        obj_table = page(:vpn => 2)
        img.seek(ADDRESS + obj_table[:id] * PAGE_SIZE)
        bytes = img.read(PAGE_SIZE).unpack("C*")
        len1 = bytes[ATTR_START]
        len2 = bytes[ATTR_START+len1]
        start = ATTR_START + len1 + len2
        objs = {}

        while bytes.size > start && bytes[start] != 0
          len = bytes[start]
          id  = bytes[start+0x10..start+0x20-1].collect { |i| i.to_s(16).upcase }.reverse.join()
          tgt = bytes[start+0x20..start+0x21].collect   { |i| i.to_s(16).upcase }.reverse.join()
          objs[id] = tgt
          start += len
        end

        objs
      end
    end

    obj_pages.each { |id, tgt|
      puts "Object #{id} is on page #{tgt}"
    }

We could also implement a method to lookup a specific object's page:

    def obj_page(obj_id)
      obj_pages[obj_id]
    end

    puts page(:id => obj_page("0000006000000000").to_i(16))

This will retrieve the page containing the root directory

- Directories, from the root dir down, follow a consistent pattern. They are comprised of sequential lists of data structures whose length is given by the first word value (Attributes and Attribute Lists).

Lists are often prefixed with a Header Attribute defining the total length of the Attributes that follow, which constitute the list. This is not a hard-set rule though, as in the case where the list resides in the body of another Attribute (more on that below).

In either case, Attributes may be parsed by iterating over the bytes after the directory page header, reading and processing the first word to determine the number of bytes to read next (minus the length of the first word), and then repeating until null (0000) is encountered (being sure to process specified padding in the process).
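
Sketched in Ruby (using the read_attr helper defined later in this post; page_start is hypothetical):

    # scan a directory page's attributes sequentially
    img.seek(page_start + 0x30)   # skip the 0x30-byte page header
    attributes = []
    while attr = read_attr        # read_attr returns nil on a null (0000) length
      attributes << attr
      # ...skip any specified padding here before the next read...
    end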

- Various Attributes take on different semantics including references to subdirs and files as well as branches to additional pages containing more directory contents (for large directories); though not all Attributes have been identified.

The structures in a directory listing always seem to be of one of the following formats:

- Base Attribute - The simplest / base attribute consisting of a block whose length is given at the very start.

An example of a typical Attribute follows:

      a8 00 00 00  28 00 01 00  00 00 00 00  10 01 00 00  
      10 01 00 00  02 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  a9 d3 a4 c3  27 dd d2 01  
      5f a0 58 f3  27 dd d2 01  5f a0 58 f3  27 dd d2 01  
      a9 d3 a4 c3  27 dd d2 01  20 00 00 00  00 00 00 00  
      00 06 00 00  00 00 00 00  03 00 00 00  00 00 00 00  
      5c 9a 07 ac  01 00 00 00  19 00 00 00  00 00 00 00  
      00 00 01 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  01 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00

Here we see a section of 0xA8 length containing the following four file timestamps (more on this conversion below):

       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20

It is safe to assume that either

The following is a method which can be used to parse a given Attribute off disk, provided the img read position is set to its start:

    def read_attr
      pos = img.pos
      packed = img.read(4)
      return nil if packed.nil?
      attr_len = packed.unpack('L').first
      return nil if attr_len == 0

      img.seek pos
      value = img.read(attr_len)
      Attribute.new(:pos   => pos,
                    :bytes => value.unpack("C*"),
                    :len   => attr_len)
    end

- Records - Key / Value pairs whose total length and key / value lengths are given in the first 0x20 bytes of the attribute. These are used to associate metadata sections with files, whose names are recorded in the keys and whose contents are recorded in the values.

An example of a typical Record follows:

    40 04 00 00   10 00 1A 00   08 00 30 00   10 04 00 00   @.........0.....
    30 00 01 00   6D 00 6F 00   66 00 69 00   6C 00 65 00   0...m.o.f.i.l.e.
    31 00 2E 00   74 00 78 00   74 00 00 00   00 00 00 00   1...t.x.t.......
    A8 00 00 00   28 00 01 00   00 00 00 00   10 01 00 00   ¨...(...........
    10 01 00 00   02 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   A9 D3 A4 C3   27 DD D2 01   ........©Ó¤Ã'ÝÒ.
    5F A0 58 F3   27 DD D2 01   5F A0 58 F3   27 DD D2 01   _ Xó'ÝÒ._ Xó'ÝÒ.
    A9 D3 A4 C3   27 DD D2 01   20 00 00 00   00 00 00 00   ©Ó¤Ã'ÝÒ. .......
    00 06 00 00   00 00 00 00   03 00 00 00   00 00 00 00   ................
    5C 9A 07 AC   01 00 00 00   19 00 00 00   00 00 00 00   \..¬............
    00 00 01 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   01 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   20 00 00 00   A0 01 00 00   ........ ... ...
    D4 00 00 00   00 02 00 00   74 02 00 00   01 00 00 00   Ô.......t.......
    78 02 00 00   00 00 00 00 ...(cutoff)                   x.......

Here we see the Record parameters given by the first row:

- total length - 0x440
- key offset - 0x10 / key length - 0x1A
- flags - 0x8
- value offset - 0x30 / value length - 0x410

Naturally, the Record finishes after the value, 0x410 bytes after the value start at 0x30, or 0x440 bytes after the start of the Record (which lines up with the total length).

We also see that this Record corresponds to a file I created on disk as the key is the File Metadata flag (0x10030) followed by the filename (mofile1.txt).
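
With the Record class below, the key could be decoded along these lines (a sketch; assumes #key returns the raw byte array):

    key  = record.key                 # hypothetical Record instance for this dump
    flag = key[0, 4].pack("C*").unpack("L").first                            # => 0x10030
    name = key[4..-1].pack("C*").force_encoding("UTF-16LE").encode("UTF-8")  # => "mofile1.txt"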

Here the first attribute in the Record value is the simple attribute we discussed above, containing the file timestamps. The File Reference Attribute List Header follows (more on that below).

From observation, Records w/ flag values of '0' or '8' are what we are looking for; while '4' occurs often, it almost always seems to indicate a Historical Record, or a Record that has since been replaced with another.

Since Records are prefixed with their total length, they can be thought of as a subclass of Attribute. The following is a Ruby class that uses composition to dispatch record field lookup calls to values in the underlying Attribute:

    class Record
      attr_accessor :attribute

      def initialize(attribute)
        @attribute = attribute
      end

      def key_offset
        @key_offset ||= attribute.words[2]
      end

      def key_length
        @key_length ||= attribute.words[3]
      end

      def flags
        @flags ||= attribute.words[4]
      end

      def value_offset
        @value_offset ||= attribute.words[5]
      end

      def value_length
        @value_length ||= attribute.words[6]
      end

      def key
        @key ||= begin
          ko, kl, vo, vl = boundries # boundries helper (not shown) returns the four fields above
          attribute.bytes[ko...(ko+kl)]
        end
      end

      def value
        @value ||= begin
          ko, kl, vo, vl = boundries
          attribute.bytes[vo...(vo+vl)]
        end
      end

      def value_pos
        attribute.pos + value_offset
      end

      def key_pos
        attribute.pos + key_offset
      end
    end # class Record

- AttributeList - These are more complicated but interesting. At first glance they are simple Attributes of length 0x20, but upon further inspection we consistently see they contain the length of a larger block of Attributes (this length is inclusive, as it covers this first attribute). After parsing this Attribute, dubbed the 'List Header', we should read the remaining bytes in the List as well as the padding, before arriving at the next Attribute.

   20 00 00 00   A0 01 00 00   D4 00 00 00   00 02 00 00 <- list header specifying total length (0x1A0) and padding (0xD4)
   74 02 00 00   01 00 00 00   78 02 00 00   00 00 00 00
   80 01 00 00   10 00 0E 00   08 00 20 00   60 01 00 00
   60 01 00 00   00 00 00 00   80 00 00 00   00 00 00 00
   88 00 00 00  ... (cutoff)

Here we see an Attribute of 0x20 length that contains a reference to a larger block size (0x1A0) in its third word.

This can be confirmed by the next Attribute, whose size (0x180) is the larger block size minus the length of the header (0x1A0 - 0x20). In this case the list only contains one item/child attribute.

In general, a simple strategy to parse the entire list would be to:

It also seems that:

Here is a class that represents an AttributeList header attribute by encapsulating it in a similar manner to Record above:

    class AttributeListHeader
      attr_accessor :attribute

      def initialize(attr)
        @attribute = attr
      end

      # From my observations this is always 0x20
      def len
        @len ||= attribute.dwords[0]
      end

      def total_len
        @total_len ||= attribute.dwords[1]
      end

      def body_len
        @body_len ||= total_len - len
      end

      def padding
        @padding ||= attribute.dwords[2]
      end

      def type
        @type ||= attribute.dwords[3]
      end

      def end_pos
        @end_pos ||= attribute.dwords[4]
      end

      def flags
        @flags ||= attribute.dwords[5]
      end

      def next_pos
        @next_pos ||= attribute.dwords[6]
      end
    end # class AttributeListHeader

Here is a method to parse the actual Attribute List, assuming the image read position is set to the beginning of the List Header:

    def read_attribute_list
      header        = AttributeListHeader.new(read_attr)
      remaining_len = header.body_len
      orig_pos      = img.pos
      bytes         = img.read remaining_len
      img.seek orig_pos

      attributes = []

      until remaining_len == 0
        attributes    << read_attr
        remaining_len -= attributes.last.len
      end

      img.seek orig_pos - header.len + header.end_pos

      AttributeList.new :header     => header,
                        :pos        => orig_pos,
                        :bytes      => bytes,
                        :attributes => attributes
    end

Now we have most of what is needed to locate and parse individual files, but there are a few missing components including:

- Directory Tree Branches: These are Attribute Lists where each Attribute corresponds to a record whose value references a page which contains more directory contents.

Upon encountering an AttributeList header with flag value 0x301, we should

Additional files and subdirs found on the referenced pages should be appended to the list of current directory contents.

Note this is the (an?) implementation of the BTree structure in the ReFS filesystem described by Microsoft, as the record keys contain the tree leaf identifiers (based on file and subdirectory names).

This can be used for quick / efficient file and subdir lookup by name (see 'optimization' in 'next steps' below)

- SubDirectories: these are simply Records in the directory's Attribute List whose key contains the Directory Metadata flag (0x20030) as well as the subdir name.

The value of this Record is the corresponding object id which can be used to lookup the page containing the subdir in the object table.

A typical subdirectory Record

    70 00 00 00  10 00 12 00  00 00 28 00  48 00 00 00  
    30 00 02 00  73 00 75 00  62 00 64 00  69 00 72 00  <- here we see the key containing the flag (30 00 02 00) followed by the dir name ("subdir2")
    32 00 00 00  00 00 00 00  03 07 00 00  00 00 00 00  <- here we see the object id as the first qword in the value (0x703)
    00 00 00 00  00 00 00 00  14 69 60 05  28 dd d2 01  <- here we see the directory timestamps (more on those below)
    cc 87 ce 52  28 dd d2 01  cc 87 ce 52  28 dd d2 01  
    cc 87 ce 52  28 dd d2 01  00 00 00 00  00 00 00 00  
    00 00 00 00  00 00 00 00  00 00 00 10  00 00 00 00

- Files: like directories, these are Records whose key contains a flag (0x10030) followed by the filename.

The value is far more complicated though and while we've discovered some basic Attributes allowing us to pull timestamps and content from the fs, there is still more to be deduced as far as the semantics of this Record's value.

- The File Record value consists of multiple attributes, though they just appear one after each other, without a List Header. We can still parse them sequentially given that all Attributes are individually prefixed with their lengths and the File Record value length gives us the total size of the block.

- The first attribute contains 4 file timestamps at an offset given by the fifth byte of the attribute (though this position may be coincidental and the timestamps could just reside at a fixed location in this attribute).

In the first attribute example above we see the first timestamp is

       a9 d3 a4 c3  27 dd d2 01

This corresponds to the following date

        2017-06-04 07:43:20

And may be converted with the following algorithm:

          tsi = TIMESTAMP_BYTES.pack("C*").unpack("Q*").first
          Time.at(tsi / 10000000 - 11644473600)

Timestamps are in 100-nanosecond intervals since the Windows epoch (Jan 1, 1601 UTC); 11644473600 is the offset in seconds between that epoch and the Unix epoch.
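
Putting it together for the first timestamp above:

    TIMESTAMP_BYTES = [0xa9, 0xd3, 0xa4, 0xc3, 0x27, 0xdd, 0xd2, 0x01]
    tsi = TIMESTAMP_BYTES.pack("C*").unpack("Q*").first  # little-endian qword
    puts Time.at(tsi / 10000000 - 11644473600)           # => 2017-06-04 07:43:20 (as above)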

- The second Attribute seems to be the Header of an Attribute List containing the 'File Reference' semantics. These are the Attributes that encapsulate the file length and content pointers.

I'm assuming this is an Attribute List so as to contain many of these types of Attributes for large files. What is not apparent are the full semantics of all of these fields.

But here is where it gets complicated: this List only contains a single attribute with a few child Attributes. This encapsulation seems to be in the same manner as the Attributes stored in the File Record value above, just a simple sequential collection without a Header.

In this single attribute (dubbed the 'File Reference Body'), the first Attribute contains the length of the file, while the second is the Header for yet another List, this one containing a Record whose value contains a reference to the page on which the file contents actually reside.

      | ...                                  |
      | File Entry Record                    |
      | Key: 0x10030 [FileName]              |
      | Value:                               |
      | Attribute1: Timestamps               |
      | Attribute2:                          |
      |   File Reference List Header         |
      |   File Reference List Body(Record)   |
      |     Record Key: ?                    |
      |     Record Value:                    |
      |       File Length Attribute          |
      |       File Content List Header       |
      |       File Content Record(s)         |
      | Padding                              |
      | ...                                  |

While complicated, each level can be parsed in a similar manner to all other Attributes & Records, just taking care to parse Attributes into their correct levels & structures.
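
A rough sketch of that descent, using the helpers defined earlier (names hypothetical):

    # walk a File Record value per the diagram above
    timestamps     = read_attr            # Attribute1: timestamps
    file_reference = read_attribute_list  # Attribute2: file reference list
    body           = file_reference.attributes.first  # the File Reference Body
    # ...descend into body for the file length attribute & content records...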

As far as actual values,


And that's it! An example implementation of all this logic can be seen in our experimental 'resilience' library found here:


The next steps would be to

And once we have done so robustly, we can start looking at optimization, possibly rolling out some experimental production logic for ReFS filesystem support!

... Cha-ching $ £ ¥ ¢ ₣ ₩ !!!!

Jun 24 2017 retro gaming raspberry pi

RetroFlix / PI Switch Followup

I've been trying to dedicate some cycles to wrapping up the Raspberry PI entertainment center project mentioned a while back. I decided to abandon the PI Switch idea, as the original controller purchased for it just did not work properly (or should I say, only worked sporadically/intermittently). It being a cheap device bought online, it wasn't worth the effort to debug (funny enough, I can't find the device on Amazon anymore; perhaps other people were having issues...).

Not being able to find another suitable gamepad to use as the basis for a snap-together portable device, I bought a Rii wireless controller (which works great out of the box!) and dropped the project (also partly due to lack of personal interest). But the previously designed wall mount works great, and after a bit of work the PI now functions as a seamless media center.

Unfortunately to get it there, a few workarounds were needed. These are listed below (in no particular order).

Finally, RetroFlix received some tweaking & love. Most changes were visual optimizations and eye candy (including some nice retro fonts and colors), though workers were also added so the actual downloads could be performed without blocking the UI. Overall it's simple and works great, a perfect portal to work on those high scores!

That's all for now, look for some more updates on the ReFS front in the near future!

May 14 2017 retro gaming raspberry pi sinatra

RetroFlix - A Weekend Project

Now that we have the 'mount' component of the PI Sw1tch, and an awesome way to play games through the PI on our TV, we need a collection of games to play! It goes without saying that Nethack was installed (combined w/ ssh X11 forwarding = persistent graphical nethack anywhere = epicness!!!). But I also happen to have a huge box of retro video games (dating back to my childhood) which would be good to load onto the device. Unfortunately, cloning so many games would take some time, and there are already online databases of these backups, so I opted to write a small web app to download and manage the collection.

It can be found here and you can see some screens below:

[Images: my library / game info / game previews / game list]

It was built as a Sinatra web service, simply acting as a frontend to a popular emulator database, allowing the user to navigate & preview games for various systems, and download / run them locally. The RetroFlix application itself is offered as a lightweight microservice, simply acting as a proxy to the various underlying components. It's fairly simple to set up & install (see the README), and builds upon existing emulators & components the user has locally.

As with everything else, it's still a work in progress, but it already suffices to relive classic memories!

May 10 2017 3d printing raspberry pi

3D Printing Fun

The PI Sw1tch project took a bit of a setback recently when we discovered the 5200mAh usb battery we had wasn't sufficient to drive the PI + display for any extensive period of time. I've ordered a higher-capacity battery from Amazon, but until it arrives the working prototype was redesigned into a snappable wall mount:

[Images: wall-mounted PI]

The current implementation can be found on thingiverse for all your 3D printing needs!

Additionally, we threw together a design for a wall mount for my last smartphone, the Samsung Intercept, which has been sitting on my shelf since I upgraded to the Huawei Union. The Intercept was a great phone, and it still works well, albeit a bit slow compared to modern devices (not to mention it only runs Android 2.2). But it more than suffices for a "smart" entertainment hub, and having mounted & wired it up to my stereo system, I now have easy access to all the albums in the world (...that are available via youtube...). The device supposedly can be rooted, though I was not able to accomplish that myself and don't really care to spend more time figuring it out (really wouldn't gain much). But it just goes to show how a little ingenuity + some design work can go a long way at reducing e-waste.

[Images: wall-mounted Intercept]

Now time to figure out what to do w/ my ancient Droid (the original A855!)

Keep on hacking!

Apr 29 2017

A New Site, A Fresh Start

I started this blog 10 years ago. How the world has changed... (and yet is still the same...)

Recently I noticed my site was inaccessible. No 404, no error response, just a blank page. After a brief moment of panic, I ssh'd into my host provider and breathed a sigh of relief upon discovering all db & fs entities intact, including the instance of Drupal which my site was based on (a horribly outdated instance mind you, and note I said was based). In any case, I suspect my (cheap) host provider updated their version of PHP or some critical library, with the effect of my Drupal instance no longer working.


Having struggled w/ PHP & Drupal many times over the years, I was finally ready to go cold turkey, and migrated the blog to Middleman, which brings the awesomeness of Rails to static site generation. I am very much in love with Middleman right now; it's the perfect tool for this problem domain. It's incredibly easy to set up a new site, use any high-level templating / markup / styling language to customize your frontend, and throw in any js or other framework to handle dynamic interactions (including emscripten to run C in the browser), and you're good to go. Tailoring things on the fly is a cinch due to the convenient embedded webserver sporting live-reloading, and when you're ready to push to production it's a single command to build the static html. A quick rsync -azP synchronizes it w/ your webserver, and now your site is available to the world at blazing speeds!

Anyways, enough Middleman gushing (but seriously, check it out!). In addition to the port, I rethemed the site, so be sure to also check out the new design if you're reading this via rss. Note mobile browser UIs aren't currently supported, so no angry emails if you can't read things on your phone! (I know they're coming...)

Be sure to stay subscribed to github for updates. I'm hoping virtfs-refs will see some love soon if I can figure out how to extend the current fs parsing mechanisms w/ file content retrieval. We've also been prototyping designs for the PI Switch project I mentioned a while back; more updates on that soon as things progress.

Keep surfing!!!

Apr 9 2017 nethack gcc compiler fedora

Compiling / Playing NetHack 3.6.0 on Fedora

The following are the simplest instructions required to compile NetHack 3.6.0 for Fedora 25.

Why might you want to compile NetHack from source, instead of simply installing the package (sudo dnf install nethack)? For many reasons: applying patches for custom game mechanics, running an alternate frontend, and more!

While the official Linux instructions are complete, they are pretty involved and must be followed exactly for things to work. To give the dev team credit, they've been supporting a plethora of platforms and environments for 20+ years (and the number is still increasing). A consolidated guide was written for compiling NetHack from scratch on Ubuntu/Debian, but nothing exists for Fedora… until now!

# On a fresh Fedora installation (with updates) install the dependencies:

$ sudo dnf install ncurses-devel libXt-devel libXaw-devel byacc flex

# Download the NetHack (3.6.0) source tarball from the official site and unpack it:

$ tar xzvf [download]
$ cd nethack-3.6.0/

# Run the base setup utility for Linux:

$ cd sys/unix
$ ./setup.sh hints/linux
$ cd ../..

# Edit [include/unixconf.h] to uncomment the following line…

#define LINUX

# Edit [include/config.h] to uncomment the following line…

#define X11_GRAPHICS

# Edit [src/Makefile] and update the following lines…


# …to look like so


# Edit [Makefile] to uncomment the following line

VARDATND = x11tiles NetHack.ad pet_mark.xbm pilemark.xpm rip.xpm

# In previous line, apply this bugfix by changing…


# …to


# Build and install the game

$ make all
$ make install

# Finally create [~/.nethackrc] config file and populate it with the following:

OPTIONS=windowtype:x11

# To play:

$ ~/nh/install/games/nethack

Go get that Amulet!

Mar 4 2017 virtfs filesystems

VirtFS New Plugin Guide

Having recently extracted much of the FS interface from MiQ into virtfs plugins, it was a good time to write a guide on how to write a new plugin from scratch. It is attached below.

This document details the process of writing a new VirtFS plugin from scratch.

Plugins may be written for many targets, from traditional filesystems (EXT, FAT, XFS), to filesystem-like entities, such as databases and object repositories, to things completely unrelated all together. Once written, VirtFS will use the plugin to expose the underlying component via the Ruby Filesystem API. Simply issue File & Dir calls to files under the specified mountpoint, and VirtFS will take care of the remaining details.
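
For example, once the plugin from this guide is complete, usage would look something like the following (a sketch; mount & VDir calls per the core VirtFS gem):

    require 'virtfs'
    require 'virtfs-hellofs'

    fs = VirtFS::HelloFS::FS.new(json_device)  # 'device' here is our json map
    VirtFS.mount fs, '/hellofs'
    VirtFS::VDir.entries '/hellofs'            # standard Dir calls now work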

This guide assumes basic familiarity with the Ruby language and the gem project format. In this tutorial we will be creating a new gem called virtfs-hellofs for our 'hello' filesystem, based on a simple JSON map.

Note, the end result can be seen at virtfs-hellofs

Initial Project Layout

Create a new working directory with the following contents:


TODO: a generator [patches are welcome!]

Required Components

The following components are required to define a full-fledged filesystem plugin:

Upon instantiation, a fs-specific ‘blocklike device’ is often required so as to provide block-level seek/read/write operations (such as from a physical disk, disk image, or other).

Eventually this will be implemented via a separate abstraction hierarchy, but for the time being virt-disk provides basic functionality to read simple file-based "devices". Since we are only using a simple in-memory JSON-based fs, we do not need to pull in virt_disk here.

Core functionality

First we will define the FS class providing our filesystem interface:


  module VirtFS::HelloFS
    class FS
      include DirClassMethods
      include FileClassMethods

      attr_accessor :mount_point, :superblock

      # Return bool indicating if device contains
      # a HelloFS instance
      def self.match?(device)
        begin
          Superblock.new(self, device)
          return true
        rescue => err
          return false
        end
      end

      # Initialize new HelloFS instance w/ the
      # specified device
      def initialize(device)
        @superblock = Superblock.new(self, device)
      end

      # Return root directory of the filesystem
      def root_dir
        superblock.root_dir
      end

      def thin_interface?
        true # HelloFS implements the 'thin' interface
      end

      def umount
        @mount_point = nil
      end
    end # class FS
  end # module VirtFS::HelloFS

Here we see a few things, particularly the inclusion of the Directory and File class methods satisfying the VirtFS API (more on those later) and the instantiation of a HelloFS specific Superblock construct.

In the #match? method we verify the superblock of the underlying device matches that required by hellofs, and we specify various core callbacks needed by VirtFS (particularly the #umount and #thin_interface? methods; see this for more details on thin vs. thick interfaces).

The superblock class for HelloFS is simple; we implement our 'filesystem' through a simple json map, passed into virtfs on instantiation:


module VirtFS::HelloFS
  # Top level filesystem construct.
  # In our case, we simply create a new
  # root directory from the HelloFS
  # json hash, but in most cases this
  # would parse / read top level metadata
  class Superblock
    attr_accessor :device

    def initialize(fs, device)
      @fs     = fs
      @device = device
    end

    def root_dir
      Dir.new(self, device)
    end
  end # class Superblock
end # module VirtFS::HelloFS


In the previous section the core fs class included two mixins, DirClassMethods and FileClassMethods implementing the VirtFS filesystem interface.


module VirtFS::HelloFS
  class FS
    # VirtFS Dir API implementation, dispatches
    # calls to underlying HelloFS constructs
    module DirClassMethods
      def dir_delete(p); end

      def dir_entries(p)
        dir = get_dir(p)
        return nil if dir.nil?
        dir.glob_names
      end

      def dir_exist?(p); end

      def dir_foreach(p, &block)
        r = get_dir(p).try(:glob_names)
                      .try(:each, &block)
        block.nil? ? r : nil
      end

      def dir_mkdir(p, permissions); end

      def dir_new(fs_rel_path, hash_args, _open_path, _cwd); end

      private

      def get_dir(p)
        names = p.split(/[\\\/]/)

        dir = get_dir_r(names)
        raise "Directory '#{p}' not found" if dir.nil?
        dir
      end

      def get_dir_r(names)
        return root_dir if names.empty?

        # Check for this path in the cache (cache logic elided).
        fname = names.join('/')

        name = names.pop
        pdir = get_dir_r(names)
        return nil if pdir.nil?

        de = pdir.find_entry(name)
        return nil if de.nil?

        Directory.new(self, superblock, de.inode)
      end
    end # module DirClassMethods
  end # class FS
end # module VirtFS::HelloFS

This module implements the standard Ruby Dir Class operations including retrieving & modifying directory contents, and checking for file existence.

Particularly noteworthy is the get_dir method which returns the FS specific dir instance.


module VirtFS::HelloFS
  class FS
    # VirtFS file class implementation, dispatches requests
    # to underlying HelloFS constructs
    module FileClassMethods
      def file_atime(p); end
      def file_blockdev?(p); end
      def file_chardev?(p); end

      def file_chmod(permission, p)
        raise "writes not supported"
      end

      def file_chown(owner, group, p)
        raise "writes not supported"
      end

      def file_ctime(p); end
      def file_delete(p); end

      def file_directory?(p)
        f = get_file(p)
        !f.nil? && f.dir?
      end

      def file_executable?(p); end
      def file_executable_real?(p); end
      def file_exist?(p); end

      def file_file?(p)
        f = get_file(p)
        !f.nil? && f.file?
      end

      def file_ftype(p); end
      def file_grpowned?(p); end
      def file_identical?(p1, p2); end
      def file_lchmod(permission, p); end
      def file_lchown(owner, group, p); end
      def file_link(p1, p2); end
      def file_lstat(p); end
      def file_mtime(p); end
      def file_owned?(p); end
      def file_pipe?(p); end
      def file_readable?(p); end
      def file_readable_real?(p); end
      def file_readlink(p); end
      def file_rename(p1, p2); end
      def file_setgid?(p); end
      def file_setuid?(p); end
      def file_size(p); end
      def file_socket?(p); end
      def file_stat(p); end
      def file_sticky?(p); end
      def file_symlink(oname, p); end
      def file_symlink?(p); end
      def file_truncate(p, len); end
      def file_utime(atime, mtime, p); end
      def file_world_readable?(p); end
      def file_world_writable?(p); end
      def file_writable?(p); end
      def file_writable_real?(p); end

      def file_new(f, parsed_args, _open_path, _cwd)
        file = get_file(f)
        raise Errno::ENOENT, "No such file or directory" if file.nil?
        File.new(superblock, file)
      end

      private

      def get_file(p)
        dir, fname = VfsRealFile.split(p)

        begin
          dir_obj   = get_dir(dir)
          dir_entry = dir_obj.nil? ? nil : dir_obj.find_entry(fname)
        rescue RuntimeError
          dir_entry = nil
        end

        dir_entry
      end
    end # module FileClassMethods
  end # class FS
end # module VirtFS::HelloFS

The FileClassMethods module provides all the FS-specific functionality Ruby needs to dispatch File class calls (the File API has a larger footprint than Dir's, hence the need for more methods here).

Here we see many methods are not yet implemented. This is OK for the purposes of use in VirtFS, but note that any calls to the corresponding methods on a mounted filesystem will fail.
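
For example, assuming VirtFS exposes a VFile class mirroring Ruby's core File the same way the VDir class used below mirrors Dir, a call that lands on one of the empty stubs yields nothing useful (fs_hash here being a parsed json map, as in the verification script below):

VirtFS.mount VirtFS::HelloFS::FS.new(fs_hash), '/'
VirtFS::VFile.atime('/f1') # hits the empty file_atime stub; callers expecting a Time will fail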

File and Dir classes

The final missing piece of the puzzle is the File and Dir classes. These provide the standard interfaces through which VirtFS extracts file and dir information.


module VirtFS::HelloFS
  # File class representation, responsible for
  # managing corresponding dir_entry attributes
  # and file content.
  # For HelloFS, files are simple in memory strings
  class File
    attr_accessor :superblock, :dir_entry

    def initialize(superblock, dir_entry)
      @superblock = superblock
      @dir_entry  = dir_entry
    end

    def to_h
      { :directory? => dir?,
        :file?      => file?,
        :symlink?   => false }
    end

    def dir?
      dir_entry.is_a?(Hash)   # dirs are nested json maps
    end

    def file?
      dir_entry.is_a?(String) # file contents are in memory strings
    end

    def fs
      # not yet needed by HelloFS
    end

    def size
      dir? ? 0 : dir_entry.size
    end

    def close
    end
  end # class File
end # module VirtFS::HelloFS
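
A quick look at what this exposes (illustrative; nil stands in for the superblock, which these particular calls never touch):

f = VirtFS::HelloFS::File.new(nil, "foobar")
f.file? # => true
f.size  # => 6
f.to_h  # => {:directory? => false, :file? => true, :symlink? => false}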


module VirtFS::HelloFS
  # Dir class representation, responsible
  # for managing corresponding dir_entry
  # attributes
  # For HelloFS, dirs are simply nested
  # json maps
  class Dir
    attr_accessor :sb, :dir_entry

    def initialize(sb, dir_entry)
      @sb        = sb
      @dir_entry = dir_entry
    end

    def close
    end

    def glob_names
      dir_entry.keys # entry names are the json map keys
    end

    def find_entry(name, type = nil)
      dir = type == :dir
      fle = type == :file

      return nil unless glob_names.include?(name)
      return nil if (dir && !dir_entry[name].is_a?(Hash)) ||
                    (fle && !dir_entry[name].is_a?(String))

      dir ? Dir.new(sb, dir_entry[name]) :
            File.new(sb, dir_entry[name])
    end
  end # class Dir
end # module VirtFS::HelloFS

Again, these are fairly straightforward, providing access to the underlying JSON map in a filesystem-like manner.
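
For instance (again illustrative, with nil standing in for the superblock):

d = VirtFS::HelloFS::Dir.new(nil, "f1" => "foobar", "d1" => { "sf1" => "x" })
d.glob_names              # => ["f1", "d1"]
d.find_entry("d1", :dir)  # => a HelloFS Dir wrapping the nested map
d.find_entry("d1", :file) # => nil, type mismatch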


To finish, we’ll populate the project components required by every rubygem:


require "virtfs/hellofs.rb"


require "virtfs/hellofs/version"
require_relative 'hellofs/fs.rb'
require_relative 'hellofs/dir'
require_relative 'hellofs/file'
require_relative 'hellofs/superblock'


# lib/virtfs/hellofs/version.rb
module VirtFS
  module HelloFS
    VERSION = "0.1.0"
  end
end


# virtfs-hellofs.gemspec
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'virtfs/hellofs/version'

Gem::Specification.new do |spec|
  spec.name          = "virtfs-hellofs"
  spec.version       = VirtFS::HelloFS::VERSION
  spec.authors       = ["Cool Developers"]

  spec.summary       = %q{A HELLO based filesystem module for VirtFS}
  spec.description   = %q{A HELLO based filesystem module for VirtFS}
  spec.homepage      = "https://github.com/ManageIQ/virtfs-hellofs"
  spec.license       = "Apache 2.0"

  spec.files         = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir        = "exe"
  spec.executables   = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_dependency "activesupport"
  spec.add_development_dependency "bundler"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "rspec", "~> 3.0"
  spec.add_development_dependency "factory_girl"
end


# Gemfile
source 'https://rubygems.org'

gem 'virtfs', "~> 0.0.1",
    :git => "https://github.com/ManageIQ/virtfs.git",
    :branch => "master"

# Specify your gem's dependencies in virtfs-hellofs.gemspec
gemspec

group :test do
  gem 'virt_disk', "~> 0.0.1",
      :git => "https://github.com/ManageIQ/virt_disk.git",
      :branch => "initial"
end


require "bundler/gem_tasks"
require "rspec/core/rake_task"


task :default => :spec

Packaging It Up

Building virtfs-hellofs.gem is as simple as running:

rake build

in the project directory.

The gem will be written to the pkg/ subdir, ready for subsequent use or upload to rubygems.org.
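
From there the standard gem tooling applies; for example:

gem install pkg/virtfs-hellofs-0.1.0.gem
gem push pkg/virtfs-hellofs-0.1.0.gem # publish to rubygems.org

(bundler/gem_tasks also provides rake release, which tags the version and pushes in one step.)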


To verify the plugin, create a test script which simply mounts an FS instance and dumps the directory contents:


require 'json'
require 'virtfs'
require 'virtfs/hellofs'

PATH = JSON.parse(File.read('hello.fs'))

exit 1 unless VirtFS::HelloFS::FS.match?(PATH)
fs = VirtFS::HelloFS::FS.new(PATH)

VirtFS.mount fs, '/'
puts VirtFS::VDir.entries('/')

We can create a simple JSON filesystem for testing purposes; save the following as hello.fs, which the script above reads:


  "f1" : "foobar",
  "f2" : "barfoo",
  "d1" : { "sf1" : "fignewton",
           "sd1" : { "t" : "s" } }

Run the script, and if the directory contents are printed, you have verified your FS!


rspec and factory_girl were added as development dependencies to the project, so testing the new filesystem is as simple as adding new unit tests.
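
A minimal spec might look like the following sketch; the spec file name and layout are assumptions, while FS.new, VirtFS.mount, and VirtFS::VDir.entries are used exactly as in the verification script above:

# spec/hellofs_spec.rb
require 'virtfs'
require 'virtfs/hellofs'

RSpec.describe VirtFS::HelloFS::FS do
  let(:fs) { described_class.new("f1" => "foobar") }

  it "enumerates root entries" do
    VirtFS.mount fs, '/'
    expect(VirtFS::VDir.entries('/')).to include("f1")
  end
end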

For ‘real’ filesystems, the plugin author will need to generate a ‘blocklike device’ image and populate it with the necessary test data.

Because large block image files are not conducive to source repository and automated build systems, virtfs-camcorderfs can be used to record and play back disk interactions in a local dev environment, recording text-based ‘cassettes’ which may be used to replicate the disk interactions. See virtfs-camcorderfs for usage details.

Next Steps

We added barebones VirtFS functionality for our hellofs filesystem backend. From here, we can continue expanding upon it, providing read, write, and query support. Once implemented, VirtFS will use this filesystem like any other, providing seamless interchangeability!
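
As a taste of where that could go, here is a hedged sketch of minimal write support in DirClassMethods, reusing the helpers defined above (dir_mkdir is currently an empty stub; this implementation is an assumption, not part of the plugin):

def dir_mkdir(p, _permissions)
  # hypothetical: new dirs are just empty json maps
  parent, name = VfsRealFile.split(p)
  get_dir(parent).dir_entry[name] = {}
end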

Feb 18 2017 project raspberry pi gaming

Project Idea - PI Sw1tch

While gaming is not high on my agenda anymore (... or rather at all), I have recently been mulling over buying a new console, as much to act as a home entertainment center as a gaming system.

Having owned several generations of PlayStation and Sega products, I found a few new consoles catching my eye. While the most "open" solution, the Steambox, sort of fizzled out, Nintendo's latest console, the Switch, does seem to stand out from the crowd. The balance between power and portability looks like a good fit, and given Nintendo's previous successes, it wouldn't be surprising if it became a hit.

In addition to serving the separate home and mobile gaming markets, new entertainment systems need to provide seamless integration between the two environments, as well as comprehensive data and information access capabilities. After all, what'd be the point of a gaming tablet if you couldn't watch YouTube on it! Neal Stephenson recently touched on this at his latest TechCrunch talk, expressing a vision of technology that is more integrated/synergized with our immediate environment. While mobile solutions these days offer a lot in terms of processing power, nothing quite offers the comfort or immersion of a console / home entertainment solution (not to mention that mobile phones make for horrendous gaming interfaces!)

Being the geek that I am, I naturally got to thinking about developing a hybrid mechanism of my own, based on open / existing solutions, so that it could be prototyped and demonstrated quickly. Having recently bought a Raspberry Pi (after putting my Arduino to use in my last microcontroller project), along with a few other odds and ends, I whipped up the following:

Pi sw1tch

The idea is simple: the Raspberry Pi would act as the 'console', with a plethora of games and 'apps' available (via open repositories, Steam, emulators, and many more... not to mention Nethack!). It would be anchorable to the wall, desk, or any other surface via a 3D-printed mount, and made portable via a cheap wireless controller / LCD display / battery pack setup (tied together through another custom 3D-printed bracket). The entire rig would be quick to assemble and easy to use: simply snap the Pi into the wall mount to play on your TV; remove it and snap it into the controller bracket to take it on the go.

I suspect the power component is going to be the most difficult to nail down; finding an affordable USB power source that is lightweight but offers sufficient juice to drive the Raspberry Pi with an LCD might be tricky. But if this is done correctly, all components will be interchangeable, and one could easily plug in a lower-power microcontroller and/or custom hardware for a tailored experience.

If there is any interest, let me know via email. If three or so people commit, this could be done in a weekend! (Stay tuned for updates!)