Multiple pattern-matching concatenations for a single string

Pattern matching is one of Elixir's finest features. Once you have seen what it can do, you will want to use it everywhere.
It makes it pretty easy to take a string apart and compare it against a literal. Just like below:
iex(1)> "r" <> _ = "run"
"run"
The problem appears as soon as you try to bind more than one part of the string in a single pattern-match clause. Just like here:
iex(2)> "f" <> _a <> "rre" <> _b <> "t" = "forrest"
** (ArgumentError) the left argument of <> operator inside a match should always be a literal binary because its size can't be verified.
Why? Simply because, at that level of complexity, the match may have more than one valid solution. Take a look at the beekeeper's problem:
iex(3)> "a" <> hive1 <> "b" <> hive2 <> "c" = "abbbbbc"
How would you (or the compiler) know how many b's belong to hive1 and how many to hive2? There is more than one possibility, for example:
hive1 = "bbb", hive2 = "b"
hive1 = "bb", hive2 = "bb"
hive1 = "b", hive2 = "bbb"
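To make the ambiguity concrete, here is a minimal sketch that brute-forces every split of the beekeeper's string. The module name HiveSplits and its function are hypothetical, made up for this illustration. It tries each possible length for hive1 and uses binary-size, which is known at match time, to check whether the rest lines up.

```elixir
defmodule HiveSplits do
  # Hypothetical helper, not part of Elixir: returns every {hive1, hive2}
  # pair such that "a" <> hive1 <> "b" <> hive2 <> "c" equals `string`.
  def all(string) when byte_size(string) >= 3 do
    # Bytes left for hive1 and hive2 once "a", "b" and "c" are accounted for.
    free = byte_size(string) - 3

    Enum.flat_map(0..free, fn len1 ->
      len2 = free - len1

      case string do
        # Both sizes are bound before the match, so the pattern is unambiguous.
        <<"a", hive1::binary-size(len1), "b", hive2::binary-size(len2), "c">> ->
          [{hive1, hive2}]

        _ ->
          []
      end
    end)
  end
end

IO.inspect(HiveSplits.all("abbbbbc"))
# [{"", "bbbb"}, {"b", "bbb"}, {"bb", "bb"}, {"bbb", "b"}, {"bbbb", ""}]
```

Counting the empty hives, that is five valid answers for one string, which is why the compiler refuses to guess.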
So, what to do? Bitstrings! Or, more precisely, binaries, which are just bitstrings whose number of bits is divisible by 8. A binary segment such as <<_o>> has a known size (one byte by default), so you can use it inside the pattern-match clause.
iex(4)> "f" <> <<_o>> <> "rre" <> <<_s>> <> "t" = "forrest"
"forrest"
iex(5)> "f" <> <<_o, _r, _r>> <> "est" = "forrest"
"forrest"
In most cases this works well, as above. Sometimes, though, you will run into multi-byte characters, such as ü, which takes up two bytes in UTF-8.
iex(6)> "f" <> <<_>> <> "rrest" = "forrest"
"forrest"
iex(7)> "f" <> <<_>> <> "rrest" = "fürrest"
** (MatchError) no match of right hand side value: "fürrest"
iex(7)> "f" <> <<_, _>> <> "rrest" = "fürrest"
"fürrest"
An easy workaround is the ::utf8 modifier, which matches one whole codepoint regardless of how many bytes it occupies.
iex(8)> "f" <> <<_::utf8>> <> "rrest" = "forrest"
"forrest"
iex(9)> "f" <> <<_::utf8>> <> "rrest" = "fürrest"
"fürrest"
Happy hacking!
