Wednesday, April 29, 2015

How to get a hash of a file in Elixir

As the next step in learning Elixir, I wanted to add a file validator module to my elixgrep project. Of course, the first step is taking the cryptographic hash of a file. I found this handy blog entry, but it didn't solve the whole problem. For small files, you can just read the whole file into a string and hash it.

iex> :crypto.hash(:sha256, File.read!("./known_hosts.txt")) |> Base.encode16
"97368E46417DF00CB833C73457D2BE0509C9A404B255D4C70BBDC792D248B4A2"

But there comes a point where the file is large enough that loading the whole contents into memory performs poorly and in some cases isn't feasible. My next idea was to use File.stream!:

iex> File.stream!("./known_hosts.txt") |>
Enum.reduce(:crypto.hash_init(:sha256),
fn(line, acc) -> :crypto.hash_update(acc,line) end ) |>
:crypto.hash_final |> Base.encode16

However, this still has a problem: it assumes the file has appropriate line endings, and a file with few or no newlines would still be read in very large pieces. For a cryptographic hash it makes more sense to divide the file into chunks of equal byte length. File.stream!/3 has two default arguments, modes and line_or_bytes; if you want to stream by byte length, use this form.

iex> File.stream!("./known_hosts.txt",[],2048) |> Enum.reduce(:crypto.hash_init(:sha256),
fn(chunk, acc) -> :crypto.hash_update(acc,chunk) end ) |> :crypto.hash_final |> Base.encode16
"97368E46417DF00CB833C73457D2BE0509C9A404B255D4C70BBDC792D248B4A2"
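For use inside a project rather than at the iex prompt, the two approaches above could be wrapped in a small module. This is only a sketch: the module name FileHash, the function names, and the 2048-byte default are hypothetical choices for illustration, not something elixgrep defines. It keeps the argument order used in this post (path, modes, bytes); newer Elixir releases (1.16 onward, as far as I know) prefer File.stream!(path, 2048) and may warn about the older order.

```elixir
defmodule FileHash do
  # Hypothetical helper module; the names and the 2048-byte default
  # are illustrative assumptions.

  # Reads the whole file into memory and hashes it. Fine for small files.
  def sha256_small(path) do
    :crypto.hash(:sha256, File.read!(path)) |> Base.encode16()
  end

  # Streams the file in fixed-size chunks so memory use stays bounded.
  def sha256_chunked(path, chunk_bytes \\ 2048) do
    # Argument order as in the post: path, modes, bytes.
    File.stream!(path, [], chunk_bytes)
    |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, acc ->
      :crypto.hash_update(acc, chunk)
    end)
    |> :crypto.hash_final()
    |> Base.encode16()
  end
end
```

Calling FileHash.sha256_chunked("./known_hosts.txt") should give the same hex string as the one-shot version, since the chunk size only changes how the bytes are fed to the hash, not which bytes are hashed.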

Now the interesting question becomes: is there an optimal byte size to use for this hashing? STAY TUNED...

Thursday, April 16, 2015

Elixir functions are not Functions

I was banging my head against a simple Elixir test this morning. It's a test that's been in my code for months and never worked for some reason. This is a boiled-down example:

defmodule RaiseTest do
  def testraise do
    raise "This is an error"
  end
end

defmodule RaiseTestTest do
  use ExUnit.Case
  test "assert_raise works" do
    assert_raise(RuntimeError, RaiseTest.testraise )
  end
end

mix test

  1) test assert_raise works (RaiseTestTest)
     ** (RuntimeError) This is an error
       (raise_test) lib/raise_test.ex:4: RaiseTest.testraise/0

Finished in 0.03 seconds (0.03s on load, 0.00s on tests)
1 tests, 1 failures

My error was finally pointed out to me this morning on the Elixir list, and it opened my eyes to a consistent flaw in my thinking about Elixir. The fix, of course, is to actually provide a function to assert_raise; what I was providing was an expression that is evaluated immediately and could return anything.

  test "assert_raise works" do
    assert_raise(RuntimeError, fn -> RaiseTest.testraise end )
  end

The key misunderstanding in my head was that "function", when used in the Elixir context, actually maps to what I think of as a function reference from my years of C programming. Calling a module function evaluates it and yields its result, which can be anything. The test failed because it was calling RaiseTest.testraise to see if it returned a function it could use in the test, and that call raised before assert_raise ever ran.
When an Elixir function requires a "function" as an argument, it really means a closure that can be executed.
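To make the distinction concrete, here is a minimal sketch of the two ways to build a function value. The module RaiseDemo is a hypothetical stand-in for RaiseTest above, and the variable names are made up for illustration. The capture form should also be usable as the second argument to assert_raise, since it is a zero-arity function just like the fn version.

```elixir
defmodule RaiseDemo do
  # Hypothetical stand-in for the RaiseTest module in the post.
  def testraise do
    raise "This is an error"
  end
end

# A closure wrapping the call: nothing is raised when it is created.
as_closure = fn -> RaiseDemo.testraise() end

# The capture operator builds an equivalent function value.
as_capture = &RaiseDemo.testraise/0

# Both are zero-arity functions that a caller can invoke later.
both_functions = is_function(as_closure, 0) and is_function(as_capture, 0)

# Invoking the function value is what finally runs the body and raises.
message =
  try do
    as_capture.()
  rescue
    e in RuntimeError -> e.message
  end

IO.inspect({both_functions, message})
```

Writing RaiseDemo.testraise() directly as an argument would instead run the body on the spot, which is exactly what happened in the failing test.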

Friday, April 3, 2015

Remember JSON is always valid YAML

While I don't mind YAML for relatively simple data files, I find it very difficult to use when there are more than one or two levels of nesting. If you have a relatively complex config file, I find JSON much easier to reason about and to change.

For current versions of YAML (1.2 and later), JSON is a subset of the YAML standard. Thus any compliant YAML parser will also parse JSON.

The place where I use this trick most often is in the .kitchen.yml files used by Test Kitchen when writing Chef cookbooks. I start by taking an existing .kitchen.yml file and converting it to .kitchen.json using this website. I then copy the .kitchen.json to .kitchen.yml.
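As an illustration, a small .kitchen.yml written entirely in JSON syntax might look like the following. The values are assumptions made up for this sketch: the cookbook name "example" and the platform are hypothetical, and the keys follow the layout of a typical generated Test Kitchen config.

```json
{
  "driver": { "name": "vagrant" },
  "provisioner": { "name": "chef_solo" },
  "platforms": [
    { "name": "ubuntu-14.04" }
  ],
  "suites": [
    {
      "name": "default",
      "run_list": ["recipe[example::default]"],
      "attributes": {}
    }
  ]
}
```

Because the nesting is spelled out with braces and brackets rather than indentation, it is easier to see where each suite or platform begins and ends.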