Wednesday, April 29, 2015

How to get a hash of a file in Elixir

As the next step in learning Elixir, I wanted to add a file validator module to my elixgrep project. Of course, the first step is taking the cryptographic hash of a file. I found this handy blog entry, but it didn't solve the whole problem. For small files, you can just read the whole file into a string and hash it.

iex> :crypto.hash(:sha256, File.read!("./known_hosts.txt")) |> Base.encode16
"97368E46417DF00CB833C73457D2BE0509C9A404B255D4C70BBDC792D248B4A2"

But there comes a point where the file is large enough that loading its whole contents into memory is slow and, in some cases, not feasible. My next idea was to use File.stream!

iex> File.stream!("./known_hosts.txt") |>
Enum.reduce(:crypto.hash_init(:sha256),
fn(line, acc) -> :crypto.hash_update(acc,line) end ) |>
:crypto.hash_final |> Base.encode16
"97368E46417DF00CB833C73457D2BE0509C9A404B255D4C70BBDC792D248B4A2" 

However, there is still a problem with this: it assumes the file has line endings at reasonable intervals. A binary file with few or no newlines would still be read as one huge "line". For a cryptographic hash it makes more sense to divide the file into equal-length byte chunks. File.stream!/3 has two default arguments, modes and line_or_bytes; if you want to stream by byte length, use this form.

iex> File.stream!("./known_hosts.txt", [], 2048) |>
Enum.reduce(:crypto.hash_init(:sha256),
fn(chunk, acc) -> :crypto.hash_update(acc, chunk) end) |>
:crypto.hash_final |> Base.encode16
"97368E46417DF00CB833C73457D2BE0509C9A404B255D4C70BBDC792D248B4A2"
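The chunked pipeline above could be wrapped in a small module so that a validator can call it by name. This is only a sketch: the module name FileHash, the function sha256/2, and the 2048-byte default are assumptions for illustration, not part of elixgrep. It keeps the File.stream!(path, modes, bytes) argument order used in this post; I believe newer Elixir releases prefer the byte count before the modes and may warn about this form.

```elixir
defmodule FileHash do
  # Hypothetical default; the post uses 2048 in its example.
  @default_chunk_bytes 2048

  # Returns the uppercase hex SHA-256 digest of the file at `path`.
  # The file is streamed in fixed-size byte chunks, so only one chunk
  # needs to be held in memory at a time.
  def sha256(path, chunk_bytes \\ @default_chunk_bytes) do
    path
    |> File.stream!([], chunk_bytes)
    |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, acc ->
      :crypto.hash_update(acc, chunk)
    end)
    |> :crypto.hash_final()
    |> Base.encode16()
  end
end
```

Calling FileHash.sha256("./known_hosts.txt") should give the same digest as the read-the-whole-file version, since the hash depends only on the bytes and not on how they are split into chunks.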


Now the interesting question becomes: is there an optimal chunk size to use for this hashing? Stay tuned...
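One rough way to start exploring that question is to time the same hash with a few different chunk sizes using Erlang's :timer.tc. The sketch below is not a real benchmark: the module name ChunkTiming and the list of sizes are made up for illustration, and a single run per size will be skewed by the operating system's file cache (the first read of a file is usually the slowest).

```elixir
defmodule ChunkTiming do
  # Hypothetical chunk sizes to compare, in bytes.
  @sizes [1024, 2048, 65_536, 1_048_576]

  # Hashes `path` once per chunk size and returns a list of
  # {chunk_bytes, elapsed_microseconds} tuples.
  def compare(path, sizes \\ @sizes) do
    for size <- sizes do
      {micros, _digest} = :timer.tc(fn -> hash(path, size) end)
      {size, micros}
    end
  end

  # Same streaming SHA-256 as in the post, parameterised by chunk size.
  defp hash(path, chunk_bytes) do
    path
    |> File.stream!([], chunk_bytes)
    |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, acc ->
      :crypto.hash_update(acc, chunk)
    end)
    |> :crypto.hash_final()
  end
end
```

Running ChunkTiming.compare on a large file several times and comparing the numbers would give a first impression, though a proper answer needs repeated runs and files of different sizes.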

Thursday, April 16, 2015

Elixir functions are not Functions

I was banging my head against a simple Elixir test this morning. It's a test that has been in my code for months and, for some reason, never worked. This is a boiled-down example:


defmodule RaiseTest do
  def testraise do
    raise "This is an error"
  end
end

defmodule RaiseTestTest do
  use ExUnit.Case
  test "assert_raise works" do
    assert_raise(RuntimeError, RaiseTest.testraise )
  end
end

mix test

  1) test assert_raise works (RaiseTestTest)
     test/raise_test_test.exs:4
     ** (RuntimeError) This is an error
     stacktrace:
       (raise_test) lib/raise_test.ex:4: RaiseTest.testraise/0
       test/raise_test_test.exs:5


Finished in 0.03 seconds (0.03s on load, 0.00s on tests)
1 tests, 1 failures

My error was finally pointed out to me this morning on the Elixir list, and it opened my eyes to a consistent flaw in my thinking about Elixir. The fix, of course, is to actually provide a function to assert_raise. What I was providing was an expression that is evaluated immediately and could return anything.

test "assert_raise works" do
  assert_raise(RuntimeError, fn -> RaiseTest.testraise end)
end

The key misunderstanding in my head was that "function", as used in the Elixir context, actually maps to what I think of as a function reference from my years of C programming. A call to a module function is an expression that is evaluated on the spot and can return anything. The test failed because it called RaiseTest.testraise first, expecting it to return a function that assert_raise could use, and that call raised before assert_raise ever ran.
When an Elixir function requires a "function" as an argument, it really means a closure that can be executed.
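To make the distinction concrete, here is a small script-style sketch that reuses the RaiseTest module from above. It shows two ways to get a function value without running it: wrapping the call in fn, or using the capture operator &. The variable names wrapped and captured are just for illustration.

```elixir
defmodule RaiseTest do
  def testraise do
    raise "This is an error"
  end
end

# An anonymous function wraps the call; nothing runs until it is invoked.
wrapped = fn -> RaiseTest.testraise() end

# The capture operator builds a function value from a named function,
# which is the closest thing to a C-style function reference.
captured = &RaiseTest.testraise/0

# Both are zero-arity function values that can be passed around.
IO.inspect(is_function(wrapped, 0))
IO.inspect(is_function(captured, 0))

# Only when the function is actually invoked does the error get raised.
message =
  try do
    captured.()
  rescue
    e in RuntimeError -> Exception.message(e)
  end

IO.puts("rescued: " <> message)
```

Either form can be handed to assert_raise as its second argument, so assert_raise(RuntimeError, &RaiseTest.testraise/0) should behave the same as the fn version in the fixed test.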

Friday, April 3, 2015

Remember JSON is always valid YAML

While I don't mind YAML for relatively simple data files, I find it very difficult to use when there are more than one or two levels of nesting. For a relatively complex config file, I find JSON much easier to reason about and to change.

For current versions of YAML (1.2 and later), JSON is a subset of the YAML standard. Thus any compliant YAML parser will also parse JSON.

The place where I use this trick most often is in the .kitchen.yml files used by TestKitchen when writing Chef cookbooks. I start out by taking an existing .kitchen.yml file and converting it to .kitchen.json using this website http://codebeautify.org/yaml-to-json-xml-csv. I then copy the .kitchen.json to .kitchen.yml.
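As an illustration, a .kitchen.yml written entirely in JSON syntax might look something like the fragment below. The values are placeholders I made up (the vagrant driver, the chef_zero provisioner, the ubuntu-14.04 platform, and a cookbook called my_cookbook), so substitute whatever your own converted file contains.

```json
{
  "driver": { "name": "vagrant" },
  "provisioner": { "name": "chef_zero" },
  "platforms": [
    { "name": "ubuntu-14.04" }
  ],
  "suites": [
    {
      "name": "default",
      "run_list": ["recipe[my_cookbook::default]"],
      "attributes": {}
    }
  ]
}
```

The braces and brackets make each level of nesting explicit, which is the part I find hard to keep track of with YAML indentation.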