Friday, October 30, 2015

The Definitive All Dancing, All Complete, "WTF Happened to my nice list of integers in Elixir?" Post.

This question comes up over and over on every forum where people just starting out with Elixir
go to ask questions, especially among folks who use the Euler challenges as a way to learn
Elixir. It generally starts like this.

iex> Enum.map(1..5, fn(x) -> x*x end )
[1, 4, 9, 16, 25]

iex> Enum.map(6..10, fn(x) -> x*x end )
'$1@Qd'

And the reaction is always more or less...

"WTF! where did my nice list of integers go?"

What is really happening is something so well hidden from users of
languages like Ruby or Python that they've forgotten it exists. Computers do
not store data as readable sections of text, but as binary sequences of ones and
zeros. Ruby et al. do a lot of work to turn those sequences of binary data into
readable strings that can be printed at a terminal. Elixir does the same: it
uses the Inspect Protocol to turn every result returned by an Elixir
expression into a readable string.
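
To make that concrete, here is a minimal sketch showing that the value from the opening example is still a list of integers, whatever the shell chooses to print. The variable name `squares` is mine; the `?c` syntax simply yields the integer code of the character `c`.

```elixir
squares = Enum.map(6..10, fn x -> x * x end)

# Still a list of five integers, however IEx displays it.
true = is_list(squares)
true = squares == [36, 49, 64, 81, 100]

# ?$ is 36, ?1 is 49, ?@ is 64, ?Q is 81, ?d is 100,
# which is why the shell showed '$1@Qd'.
true = squares == [?$, ?1, ?@, ?Q, ?d]

# Integer operations keep working as usual.
330 = Enum.sum(squares)
```

Each `true = ...` line is a match that would raise if the comparison were false, so the snippet doubles as a quick check.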

So let's look at the code that actually gets run when you type an Elixir expression in IEx that returns a List.

defimpl Inspect, for: List do
  def inspect([], _opts), do: "[]"

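  def inspect(thing, %Inspect.Opts{char_lists: lists} = opts) do
    cond do
      # Print as a quoted char list when forced, or when inferred to be printable.
      lists == :as_char_lists or (lists == :infer and printable?(thing)) ->
        << ?', Inspect.BitString.escape(IO.chardata_to_string(thing), ?') :: binary, ?' >>
      keyword?(thing) ->
        surround_many("[", thing, "]", opts, &keyword/2)
      # Otherwise print a plain bracketed list.
      true ->
        surround_many("[", thing, "]", opts, &to_doc/2)
    end
  end

  # (helpers such as printable?/1 and keyword?/1 are not shown)
end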

Now, the key thing to understand here is that the default for the Inspect Protocol is

char_lists: :infer

This means that the code will call the printable? function on the list. If every integer in the
list corresponds to a printable ASCII character (that is, it falls in the range 32..126 or is one of the
few integers that represent whitespace and newline characters), the list is printed as a single-quoted
string rather than the nice list of integers, separated by commas and surrounded by brackets, that you were expecting.
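
The check described above can be sketched roughly as follows. This is not the real Inspect.List code: the module name `PrintableSketch` and the exact set of control codes are my assumptions, chosen to mirror the description (32..126 plus a handful of whitespace and escape characters).

```elixir
defmodule PrintableSketch do
  # Hypothetical stand-in for the printable? helper, for illustration only.
  # Assumed extra codes: bell, backspace, tab, newline, vertical tab,
  # form feed, carriage return, escape.
  @control_codes [7, 8, 9, 10, 11, 12, 13, 27]

  # True when every element is an integer that looks like printable ASCII.
  def printable?(list) when is_list(list) do
    Enum.all?(list, fn
      c when is_integer(c) and c in 32..126 -> true
      c -> c in @control_codes
    end)
  end
end

# The squares of 1..5 contain 1 and 4, which are not printable.
false = PrintableSketch.printable?([1, 4, 9, 16, 25])

# The squares of 6..10 all land in 32..126, so they would print as a string.
true = PrintableSketch.printable?([36, 49, 64, 81, 100])
```

The sketch assumes a proper list and ignores the empty list, which the real implementation handles in its own clause.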

Now you are asking yourself:

"Why does Elixir do this batshit transformation just because my list is small numbers?"

It's a good question to ask. Elixir has a perfectly fine String data type that doesn't look anything like
a List of small numbers. The reason Elixir has this logic in its List Inspect implementation is that Elixir is built on top of Erlang, and all Erlang functions are perfectly valid Elixir functions. This is
one of the great strengths of Elixir. Elixir users get 20+ years of solid software engineering that is used to manage a large percentage of the world's cell phone traffic.

Unfortunately, this comes with a price. The price is the list-to-string transformation above. You see, in Erlang, strings are implemented in what was the style at the time:

Lists of Integers 
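
As a small illustration, here is an Erlang function that returns one of these strings, and the conversion to an Elixir string. The variable names are mine; `:erlang.atom_to_list/1` and `List.to_string/1` are standard functions.

```elixir
# An Erlang "string" is just a list of character codes.
chars = :erlang.atom_to_list(:elixir)
true = is_list(chars)
true = chars == [101, 108, 105, 120, 105, 114]

# List.to_string/1 turns it into an Elixir string (a UTF-8 binary).
name = List.to_string(chars)
true = is_binary(name)
true = name == "elixir"
```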

This means that if the Inspect Protocol has to tell you about Erlang error messages or strings returned from Erlang functions, it must convert any List of small integers into an ASCII string. To see why that matters, here is what an Erlang call looks like with the default setting.

iex> :application.info
[loaded: [{:iex, 'iex', '1.1.1'}, {:stdlib, 'ERTS  CXC 138 10', '2.6'},
  {:logger, 'logger', '1.1.1'}, {:compiler, 'ERTS  CXC 138 10', '6.0.1'},
  {:elixir, 'elixir', '1.1.1'}, {:kernel, 'ERTS  CXC 138 10', '4.1'}],
 loading: [],
 started: [logger: :temporary, iex: :temporary, elixir: :temporary,
  compiler: :temporary, stdlib: :permanent, kernel: :permanent],
 start_p_false: [],
 running: [logger: #PID<0.48.0>, iex: #PID<0.42.0>, elixir: #PID<0.35.0>,
  compiler: :undefined, stdlib: :undefined, kernel: #PID<0.9.0>], starting: []]

That's all very useful information. There are many Erlang functions that simply don't have Elixir-aware wrappers to do the transformation from Erlang to Elixir strings.

Now if we turn off the nasty :infer that destroyed our beautiful list of integers, what happens to all that useful information?

iex> IEx.configure inspect: [char_lists: false]
:ok
iex> :application.info
[loaded: [{:iex, [105, 101, 120], [49, 46, 49, 46, 49]},
  {:stdlib, [69, 82, 84, 83, 32, 32, 67, 88, 67, 32, 49, 51, 56, 32, 49, 48],
   [50, 46, 54]},
  {:logger, [108, 111, 103, 103, 101, 114], [49, 46, 49, 46, 49]},
  {:compiler, [69, 82, 84, 83, 32, 32, 67, 88, 67, 32, 49, 51, 56, 32, 49, 48],
   [54, 46, 48, 46, 49]},
  {:elixir, [101, 108, 105, 120, 105, 114], [49, 46, 49, 46, 49]},
  {:kernel, [69, 82, 84, 83, 32, 32, 67, 88, 67, 32, 49, 51, 56, 32, 49, 48],
   [52, 46, 49]}], loading: [],
 started: [logger: :temporary, iex: :temporary, elixir: :temporary,
  compiler: :temporary, stdlib: :permanent, kernel: :permanent],
 start_p_false: [],
 running: [logger: #PID<0.48.0>, iex: #PID<0.42.0>, elixir: #PID<0.35.0>,
  compiler: :undefined, stdlib: :undefined, kernel: #PID<0.9.0>], starting: []]

That's terrible. What do all those weird lists of numbers mean? Now imagine you got an error message that was just a list of integers.
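
If only one value needs to be shown as integers, the option can also be passed to a single inspect call instead of being switched off for the whole shell. A minimal sketch, assuming Elixir 1.5 or later, where the option is spelled `charlists: :as_lists` (the transcripts in this post come from an older release that used `char_lists`):

```elixir
squares = Enum.map(6..10, fn x -> x * x end)

# Force plain list output for this one call only.
text = inspect(squares, charlists: :as_lists)
true = text == "[36, 49, 64, 81, 100]"

# IO.inspect/2 accepts the same option and returns its argument unchanged.
^squares = IO.inspect(squares, charlists: :as_lists)
```

Everything else, including Erlang strings in error messages, keeps the default readable form.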

It really seems like there should be a way to straighten this out, but unfortunately all the solutions are worse than the problem. However, once Elixir Golf becomes a thing (and it's just a matter of time), you will be able to turn that nasty long list of integers into a cryptic string and your code will still work.

iex> Enum.map( '%*&^%*&^%' , fn(x) -> x*x end )
[1369, 1764, 1444, 8836, 1369, 1764, 1444, 8836, 1369]


And finally for those that have made it this far, some real dancing.

[Animated GIF, via GIPHY]

