How to handle errors in a functional way
- Background
- Optional results
- Alternative value types
- Pattern matching
- The Error type
- Traits for generalised error handling
- Adding error handling to higher-kinded types
In C# we have exactly one (official) way of handling errors: exceptions. Unfortunately exceptions are not declarative - that is, we can't tell from a function's type-signature whether it throws an exception or not. The exceptions are an unknowable side-effect that is not represented in the co-domain of the function.
A few definitions before we continue:
- Function: I will use the term 'function' in the mathematical sense: a process that takes input arguments and must return a result
  - Both static methods and instance methods are functions if they have a return type.
  - Instance methods are just functions with an additional this parameter!
- Domain: The set of all input values to a function
  - That means all of the arguments and, for lambdas, any 'outer' variables that fall in-scope (known as 'free variables').
- Co-domain: The set of all return values from a function
  - This is simply captured by the return-type for any function.
In pure Functional Programming (pFP) we expect functions to be pure. Pure functions have no side-effects, or indeed any effects at all other than mapping the domain to a co-domain. That is, for all values in the domain of a function there should be a value in the co-domain. And, in the process, we should neither leave a trace on the world nor depend on any global state.
If we have no value to map to in the co-domain then we can't return a value and therefore must throw an exception, or worse, return no value at all. For pFP this is bad: we work with expressions all the time, so we should always return a value. A function that doesn't return a value is a procedure, and we're not here to write procedural code. This is why exceptions exist in C#: there is no easy way to augment the co-domain with an error and have that propagate through the system.
In pFP we need some way of augmenting the co-domain with a failure-value to enable the authoring of pure functions.
What we gain from imposing the pFP constraints on ourselves is genuine, leak-free composition. If we compose two pure functions into a new function, the resulting function will also be pure. This is the pFP super power that leads to fewer bugs, easier refactoring, easier optimisation, faster feature addition, improved code clarity, parallelisation for free, and less cognitive load on ourselves.
So it's worth doing!
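To make the composition claim concrete, here is a minimal sketch in plain C# (no library required). The Compose helper and the two sample functions are illustrative names invented for this example, not part of language-ext.

```csharp
using System;

// Hypothetical helper: builds x => g(f(x)) from two functions.
static Func<A, C> Compose<A, B, C>(Func<A, B> f, Func<B, C> g) =>
    x => g(f(x));

// Two pure functions: each result depends only on its argument.
static int Twice(int x) => x * 2;
static string Describe(int x) => $"value: {x}";

// The composition is pure too: same input, same output, no side-effects.
var twiceThenDescribe = Compose<int, int, string>(Twice, Describe);

Console.WriteLine(twiceThenDescribe(21)); // prints "value: 42"
```

Because neither building block touches global state, the composed function can be called in any order, any number of times, with the same result.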
When we see signatures like the one below:
```csharp
int ParseInt(string text)
```
We know that something else must be going on. It is clearly possible to construct a string that can't be parsed into an int (ParseInt("Hello, World") for example). And therefore the signature is hiding something. It is not declarative.
A good way to think about pure functions is as a big switch expression: from the domain to the co-domain. Imagine something like this:
```csharp
bool IsEven(uint value) =>
    value switch
    {
        0 => true,
        1 => false,
        2 => true,
        3 => false,
        4 => true,
        5 => false,
        ...
    }
```
Of course, if we continued, that would be the world's biggest switch expression! But the point should be clear: every domain value must map to a co-domain value.
If we can't do that, what do we do?
```csharp
bool IsEven(uint value) =>
    value switch
    {
        0 => true,
        1 => false,
        2 => true,
        3 => false,
        4 => true,
        5 => false,
        _ => ... // We have no valid value to return here, so classically we'd throw an exception
    }
```
Often in functional languages, you'll see function types written like this: Int -> Bool (with an arrow between each argument type and the final type being the return type). That arrow is meant to mean: "Int implies Bool". It doesn't mean "Int implies Bool, oh, and maybe an exception, and perhaps some global state changes, and ..."
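For IsEven specifically, a total function is easy to write, because every uint maps to exactly one bool. The sketch below is plain C#; treating the delegate type Func<uint, bool> as the C# spelling of an arrow type is only an analogy for illustration.

```csharp
using System;

// Total function: every value in the domain (uint) maps to a value
// in the co-domain (bool). No case is left over, so nothing needs to throw.
static bool IsEven(uint value) => value % 2 == 0;

// The delegate type Func<uint, bool> plays the role of an arrow type here.
Func<uint, bool> isEven = IsEven;

Console.WriteLine(isEven(4));             // True
Console.WriteLine(isEven(5));             // False
Console.WriteLine(isEven(uint.MaxValue)); // False
```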
In the case of the ParseInt function, we talked about earlier, we can imagine that the implementation would be something like this:
```csharp
static int ParseInt(string text)
{
    // Throw if the text is empty
    if(text.Length == 0) throw new ArgumentException();

    // Prepare to parse
    var mul = raise(text.Length);
    var val = 0;

    // Parse each character. If it's a digit, then build up the number.
    // Otherwise, throw because we can't continue
    foreach (var ch in text)
    {
        if(ch is < '0' or > '9') throw new ArgumentException();
        val += (ch - '0') * mul;
        mul /= 10;
    }
    return val;

    static int raise(int power)
    {
        // Find the power of 10 for index value
        var x = 1;
        for (var i = 1; i < power; i++) x *= 10;
        return x;
    }
}
```
That will parse a string of text into an int as long as the characters are all digits. If we have an empty string or any non-digit character then it will throw an exception. Exceptions are side-effects, so we want to get rid of them and we want to move the 'failure' into the co-domain.
We can do better!
The first and most basic type for augmenting our co-domain is Option<A>. Option<A> is a little bit like Nullable<T> in that it augments the set of possible values with a value that means 'no value'. That extra value is called None for Option<A> and is null for nullables.
```csharp
using LanguageExt;
using static LanguageExt.Prelude;

Option<int> mx = Some(123); // Has a value
Option<int> my = None;      // No value
```
That means we can return None whenever we are unable to return a 'success' value (where in the past we would have thrown an exception).
For example, with the natural-numbers (uint), you could imagine that we augment the set of values like so:
```
0, 1, 2, 3, ..., uint.MaxValue, None
```
And that's what Option<A> is: it's A + None. It models a discriminated-union - also known as a 'sum type' (because we sum up the possible states that the type can be). With Option<A> we've added one more possible state to A.
Let's update the function signature for ParseInt:
```csharp
Option<int> ParseInt(string text)
```
This is now declarative. A programmer who's never seen this function before won't have to go and look at the source-code to see what happens if text can't be parsed into an int. It's obvious. It also gives confidence that this function is unlikely to throw an exception (for non-exceptional reasons anyway)!
Obvious code that can only work one way reduces the cognitive load on the programmer. We owe it to ourselves to make our lives easier, especially as code-bases grow ever larger and more complex. We need coping strategies that acknowledge that we're not gaining any more grey matter any time soon!
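Before refactoring the hand-written parser, it may help to see the smallest possible function with this signature. The sketch below assumes the LanguageExt package is referenced and simply wraps the BCL's int.TryParse; the name TryParseInt is hypothetical. Note that int.TryParse also accepts signs and surrounding whitespace, so it is more permissive than the digit-only parser in this article.

```csharp
using System;
using LanguageExt;
using static LanguageExt.Prelude;

// Hypothetical wrapper: the failure case is moved into the co-domain as None
// instead of being thrown as an exception.
static Option<int> TryParseInt(string text) =>
    int.TryParse(text, out var value)
        ? Some(value)
        : None;

Console.WriteLine(TryParseInt("123").IsSome);          // True
Console.WriteLine(TryParseInt("Hello, World").IsNone); // True
```

The caller can see from the return type alone that parsing may produce no value.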
Let's refactor the ParseInt function to be more functional and less imperative.
First, let's make a new type for representing digits (values from 0 to 9).
```csharp
public readonly struct Digit
{
    public readonly int Value;

    Digit(int value) =>
        Value = value;

    public static Option<Digit> Make(int value) =>
        value is >= 0 and <= 9
            ? new Digit(value)
            : None;

    public static implicit operator int(Digit digit) =>
        digit.Value;
}
```
This is good practice when authoring declarative functional solutions, because everything is about functions and how they 'talk to us'. Digit means something that a plain old int doesn't. It also allows us to narrow down the domain and co-domain of any functions we write. This narrowing down of the domains means there are fewer states your function can be in and fewer resulting paths your code can go down after you've got a result from a function. This reduces the complexity of your code!
We have a constructor function called Digit.Make that returns an Option<Digit>. That means the total set of resulting values are:
```
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, None
```
That's certainly fewer states than the 4,294,967,297 values (every int, plus None) that an Option<int> would hold.
Let's continue replacing the imperative implementation with a functional one:
```csharp
Option<int> ParseInt(string text) =>
    from digits in ParseDigits(text)
    from number in MakeNumberFromDigits(digits)
    select number;

Option<Seq<Digit>> ParseDigits(string text) =>
    toSeq(text)
        .Traverse(c => ParseDigit(c))
        .As();

Option<Digit> ParseDigit(char ch) =>
    char.IsDigit(ch)
        ? Digit.Make(ch - '0')
        : None;

Option<int> MakeNumberFromDigits(Seq<Digit> digits) =>
    digits.IsEmpty
        ? None
        : digits.FoldBack(
              (Total: 0, Scalar: 1),
              (state, digit) => (Total: state.Total + digit * state.Scalar,
                                 Scalar: state.Scalar * 10)).Total;
```
Not only is the code smaller than before (barring the Digit definition), it's now much more declarative. We've split the previous function into four new functions. Each function stands alone and completely declares its input and output types (and is available for other adventures in composition). They also don't have any undeclared side-effects (apart from potential arithmetic overflows, but I'll leave that out for now).
The LINQ expression in ParseInt also highlights the 'short circuiting' behaviour of the Option type. This is fundamental to how we manage errors in pFP. If ParseDigits returns None then there's no value to assign to digits and so the Option type aborts the expression, returning None for the whole thing. This is like null checking for free.
'Free' as in you don't have to check for None manually and you can't make a mistake by forgetting to check for None (unlike null, where it's still possible to get a null-reference exception).
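The short-circuiting can be observed directly. The following sketch assumes the LanguageExt package; SafeDivide is a hypothetical helper introduced only for this illustration.

```csharp
using System;
using LanguageExt;
using static LanguageExt.Prelude;

// Hypothetical helper: division that returns None instead of throwing.
static Option<int> SafeDivide(int x, int y) =>
    y == 0 ? None : Some(x / y);

// Both steps succeed, so the select clause runs.
var ok = from a in SafeDivide(100, 5)
         from b in SafeDivide(a, 2)
         select a + b;

// The first step is None, so the second step and the select never run.
var failed = from a in SafeDivide(100, 0)
             from b in SafeDivide(a, 2)
             select a + b;

Console.WriteLine(ok.IsSome);     // True
Console.WriteLine(failed.IsNone); // True
```

Neither expression contains an explicit check for None; the check is part of how Option composes.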
If we remove the implementations and present the four function prototypes on their own, then I would argue that they're all declarative and self documenting. There would be no need to 'look inside' to find out what they do:
```csharp
Option<int> ParseInt(string text);
Option<Seq<Digit>> ParseDigits(string text);
Option<Digit> ParseDigit(char ch);
Option<int> MakeNumberFromDigits(Seq<Digit> digits);
```
This technique of making small composable functions is a trait of pFP. We want a declarative wrapper around a pure expression. This allows us to use them as the building blocks of larger (also pure) functions.
The really nice thing is how readable ParseInt has become:
```csharp
Option<int> ParseInt(string text) =>
    from digits in ParseDigits(text)
    from number in MakeNumberFromDigits(digits)
    select number;
```
Not only is the prototype obvious, so is the function body.
We could make it even more terse:
```csharp
Option<int> ParseInt(string text) =>
    ParseDigits(text).Bind(MakeNumberFromDigits);
```
Which really drives home the nature of pure functional composition.
Anyway, back to error handling ...
When we see a function like ParseInt:
```csharp
Option<int> ParseInt(string text)
```
We could read the return-type slightly differently:
```
(None | int) ParseInt(string text)
```
This obviously isn't valid C#, but it hints at a general idea. We can return EITHER an int OR a None value. So the Option type has OR semantics. What would happen if we chose something other than None? That's where the Either type comes in...
This type will return EITHER an L OR an R. L is for Left and R is for Right: named because of their order in the type, but also 'right' is a synonym of 'correct' and so the R value is the 'correct' value and the L value is the 'failure' value.
There is no obligation to follow the L and R fail/correct convention. There's no obligation for the types to represent anything other than 'there are two possible result values - and you only get one of them'. It is a general case sum-type with two generic cases. In fact Either<A, B> can be thought of as just as fundamental as the tuple type: (A, B). They are the dual of each other:
- The tuple type (A, B) can hold A and B
- The either type Either<A, B> can hold A or B
- So tuple represents AND and either represents OR.
- In type-theory they are known as the product-type (tuple) and co-product type (either).
The more you dig into functional programming the more type-theory, lambda calculus, logic, and category-theory start raising their heads above the parapet. You don't have to know any of those subjects to use pFP effectively, but what I like is that it hints at an underlying truth or correctness to what we're doing. This quote from Philip Wadler is particularly telling:
"Let’s say that we tried to communicate with aliens. The Voyager, had a plaque on it, trying to communicate with aliens. And you might try to communicate by sending a computer program and maybe we could send a program and see if aliens would be able to decipher it, maybe they would just find that too hard. I think if we sent them a program written in lambda calculus (PL: the fundamentals of functional programming languages), they would be able to decipher it, that they would probably have gotten to the point where they had discovered lambda calculus as well, because it is so fundamental and that they’d have a much easier time deciphering that then deciphering something written in C."
So these concepts are universal! Speaking personally, once I started seeing the underlying theories popping into my programming life, it gave me much more confidence as a programmer (that I was doing the right thing). You can use them as a guide as to whether what you're writing is correct. I think software engineers spend a large part of their programming-life wondering if they're doing things the right way; following the core ideas behind logic, lambda-calculus, type-theory, and category-theory will pretty much always be a good guide, and you don't even need to know these topics in any major depth to benefit.
Either is a fundamental data type as explained, but it also has the following traits implemented for it: Functor, Applicative, and Monad, which impart the short-circuiting semantics highlighted in the Option example. All of the error carrying types (Option, Either, ...) are just data-types and you should think of them that way. It's just that we have added behaviours after-the-fact (via traits) to allow us to control program-flow. Those program-flow behaviours are nothing to do with the data-type, even though they often appear to be a core feature.
We can think of Option<A> as being a more constrained Either<L, R>. We can replace the R with A and we replace the L with None: Either<None, A>. None isn't a type in its own right, so we can replace that with Unit. Unit is the 'singleton type': it can hold exactly one value: unit. So, Unit is isomorphic to None (also a singleton).
If Option<A> is isomorphic to Either<Unit, A>, do we really need Option<A> at all?
Well, unfortunately, C#'s generic-type inference is poor. We have two generic-types (L and R), but we're often only talking about one of them (L OR R). There are a number of situations where the type-system doesn't have enough information (or capability) to infer both L and R. That makes the Either type a little bit more tricky to use than Option. Not terrible, but if we have more focused types, we won't always have to incur that 'trickiness' cost.
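The isomorphism, and the inference cost that comes with Either, can both be seen in a small sketch. It assumes the LanguageExt package; ToEither and ToOption are hypothetical helper names written for this example (the library has its own conversions), and the Match argument names follow the usage elsewhere in this article.

```csharp
using System;
using LanguageExt;
using static LanguageExt.Prelude;

// Option<A> -> Either<Unit, A>: None becomes Left(unit), Some(x) becomes Right(x).
// Note that both type arguments have to be spelled out for Left and Right.
static Either<Unit, A> ToEither<A>(Option<A> option) =>
    option.Match(
        Some: x  => Right<Unit, A>(x),
        None: () => Left<Unit, A>(unit));

// Either<Unit, A> -> Option<A>: the inverse mapping.
static Option<A> ToOption<A>(Either<Unit, A> either) =>
    either.Match(
        Left:  _ => Option<A>.None,
        Right: x => Some(x));

// Round-tripping keeps the same case, which is what 'isomorphic' means here.
Console.WriteLine(ToOption(ToEither(Some(123))).IsSome);        // True
Console.WriteLine(ToOption(ToEither(Option<int>.None)).IsNone); // True
```

The Option side needs no type annotations for the failure case, while the Either side must name Unit every time. That is the 'trickiness' cost in miniature.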
Because of those limitations, there are a number of 'convenience' types like Option<A> (that already know their L type):
| Type | In Either form |
|---|---|
| Option<A> | Either<Unit, A> |
| Fin<A> | Either<Error, A> |
| Try<A> | Either<Exception, A> |
| Validation<F, A> | Either<F, A> where F : Monoid<F> |
| Nullable<A> | Either<null, A> |
These are your error-returning 'bread & butter types'. They can mostly be converted to each other (through natural transformations) and each is useful for a slightly different reason: Option for simple 'it has failed' semantics, Fin for Either like behaviour but with the built-in Error type (which is used in many places, I'll talk about that next), Try to leverage its exception catching behaviour, and Validation for collecting multiple errors using its applicative behaviours.
Nullable<T> (both for structs and references) is similar to Option<A> in that the bound-value type T is augmented. Except it's augmented with null rather than None. When this library was created, I was very much trying to deal with the null problem, amongst other issues I had with C#. Now that the language has the 'nullable references' feature we can be a bit more controlled about working with null.
So, is there really any place for Option any more? Yes, absolutely, Option is still extremely valuable. One big difference is that Option supports both references and structs, whereas in C# the nullable system is a bit of a hack, structs are Nullable<T> and references are, well, regular references but with some compiler help. It's still possible to produce null-reference exceptions, or to end up with default-implemented structs. So, I would still advise using Option for public interfaces. When you know the concrete type then I think nullables are fine (as long as you turn on the compiler checking).
All of the 'alternative value' types that have been mentioned so far have a method called Match which allows for pattern-matching the success and failure values, so each case can be reduced into a single value. For example:
```csharp
Option<int> option = Pure(123);
Either<string, int> either = Pure(123);
Fin<int> fin = Pure(123);
Try<int> mtry = Pure(123);
Validation<StringM, int> valid = Pure(123);

var v1 = option.Match(None: () => 0,
                      Some: x => x * 2);

var v2 = either.Match(Left: l => 0,
                      Right: r => r * 2);

var v3 = fin.Match(Fail: e => 0,
                   Succ: x => x * 2);

var v4 = mtry.Match(Fail: e => 0,
                    Succ: x => x * 2);

var v5 = valid.Match(Fail: e => 0,
                     Succ: x => x * 2);
```
These Match methods allow us to get out of the 'dual state' that we're in and collapse into a single concrete value. As you can see above, all failure values are just defaulting to 0. This may well be what you want, but a lot of the time it's much better to stay in the 'superposition' of being either failed or succeeded. We haven't collapsed the superposition, so we don't need to find a concrete value, yet!
Yes, I'm going there, I'm comparing the 'either type' to the wave-function in Quantum Mechanics!
What we want to do is stay in the 'lifted' state until we are in a position to collapse it. So, if you're not able to collapse to a useful concrete value, use Map, BiMap, MapLeft, MapFail, Bind, etc. to work with the values before 'collapse'.
This is a vital piece of understanding when using these 'algebraic data-types'. Use the algebra to your advantage, don't rush to leave that space and collapse the value. I see this happening so much in questions asked in this repo. A general rule of thumb is that if you're doing lots of Match operations, you're doing it wrong. They should occur sparingly and at the time when value collapse is meaningful.
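As a rough illustration of 'staying lifted', the sketch below chains Bind and Map and calls Match exactly once, at the edge. It assumes the LanguageExt package; TryParseInt, Positive, Describe, and Render are hypothetical names introduced for this example.

```csharp
using System;
using LanguageExt;
using static LanguageExt.Prelude;

// Hypothetical parsing step, standing in for the article's ParseInt.
static Option<int> TryParseInt(string text) =>
    int.TryParse(text, out var value) ? Some(value) : None;

// Hypothetical validation step that can also fail.
static Option<int> Positive(int x) =>
    x > 0 ? Some(x) : None;

// Stay in Option for the whole pipeline: Bind chains steps that may fail,
// Map transforms the value when one is present.
static Option<string> Describe(string text) =>
    TryParseInt(text)
        .Bind(x => Positive(x))
        .Map(x => x * 2)
        .Map(x => $"doubled: {x}");

// Collapse exactly once, at the edge of the program.
static string Render(string text) =>
    Describe(text).Match(
        Some: s  => s,
        None: () => "invalid input");

Console.WriteLine(Render("21"));  // doubled: 42
Console.WriteLine(Render("-5"));  // invalid input
Console.WriteLine(Render("abc")); // invalid input
```

No intermediate step had to invent a default value; only Render, which knows what the user should see, decides how failure is presented.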
One final thing to note. From v5 onwards, the various alternative-value types have been implemented like discriminated-unions that you can use the C# pattern-matching on:
```csharp
Either<string, int> either = Fail("error message");
Fin<int> fin = Error.New("error message");
Validation<StringM, int> valid = Fail<StringM>("error message");

var v1 = either switch
         {
             Either.Left<string, int> (var l)  => 0,
             Either.Right<string, int> (var r) => r * 2,
         };

var v2 = fin switch
         {
             Fin.Fail<int> (var e) => 0,
             Fin.Succ<int> (var x) => x * x,
         };

var v3 = valid switch
         {
             Validation.Fail<StringM, int> (var e)    => 0,
             Validation.Success<StringM, int> (var x) => x * x,
         };
```
These are a little more 'clunky', but they avoid the cost of allocating the lambdas when we use the Match method. So, if memory allocation is a concern then this is a valid approach. We can also use when predicates, which offer more flexibility.
The reason that Option and Try are not able to pattern-match like Either, Fin, and Validation is because Option is a struct (this may change in the future) and Try is a lazy unevaluated type. If you call mtry.Run() then you'll get a Fin that you can pattern-match.
Even with all that said, you can still do pattern-matching with Option, but you must do property-based pattern-matching. For example:
```csharp
var mx = Some(100);

var value = mx switch
            {
                { IsSome: true, Case: > 100 } => "Over 100",
                { IsSome: true, Case: > 50 }  => "Over 50",
                { IsSome: true, Case: > 10 }  => "Over 10",
                _                             => "Not set"
            };
```
It's not exactly pretty. Which is why I am considering whether I want to switch Option away from being a struct or not.
In the Background section of this article we stated that exceptions are bad because they're not declarative. But, not only are they not declarative, they are used for all types of error: expected errors and unexpected errors. Unexpected errors are truly exceptional. They include errors like OutOfMemoryException and other errors that are usually unrecoverable. But, most errors yielded by functions are for expected reasons that we should be handling.
If you agree with that premise then using Exception everywhere doesn't make sense.
Another issue with Exception is that for every new error we raise, we end up having to create a new type to carry its meaning (that is, if we want to catch them by type). This becomes quite onerous (although exception predicates are helpful here). What tends to happen is that everyone reuses existing exception types like ArgumentException or the like.
So, what I wanted to do was create a new Error type that could be used throughout the library and that could fill the following requirements:
- Must have distinct exceptional and expected cases
- No requirement to create new types for every new error introduced
- Only need to create new types when additional data payloads are needed
- Don't leak sensitive details (like stack traces) when serialised
- Support pattern-matching concepts (to make it trivial to work with error values in expressions)
- Handle multiple errors easily
The abstract Error type has the following sub-types:
- Exceptional - An unexpected error
- Expected - An expected error
- ManyErrors - Many errors (possibly zero)
These are the key types that you would use to create your own error sub-types (if required). They indicate the 'flavour' of the error. For example, a 'user not found' error isn't something exceptional, it's something we expect to happen. An OutOfMemoryException, however, is exceptional - it should never happen, and we should treat it as such.
You may wonder why ManyErrors could be empty. That allows for Errors.None - which works a little like Option.None. We're saying: "The operation failed, but we have no information on why; it just did".
Let's have some examples:
```csharp
var err1 = Error.New("user not found");                 // Expected
var err2 = Error.New(404, "page not found");            // Expected with an error-code
var err3 = Error.New(exception);                        // Exceptional
var err4 = Error.New("there was a problem", exception); // Exceptional with an alternative message
var err5 = err1 + err2 + err3 + err4;                   // ManyErrors containing four errors
```
Most of the time we want sensible handling of expected errors and should bail out completely for something exceptional. We also want to protect ourselves from information leakage. Leaking exceptional errors via public APIs is a sure-fire way to open up more information to hackers than you would like. When Exceptional is serialised, only the Message and Code components are serialised. There's no serialisation of the inner Exception or its stack-trace.
Deserialisation obviously means we can't recover the Exception, but the state of the Error will still be Exceptional - so it's possible to carry the severity of the error across domain boundaries without leaking too much information.
An Error is either created from an Exception or it isn't. This allows for expected errors to be represented without throwing exceptions, but also it allows for more principled error handling.
We can interrogate the Error with the following:
| Operation | Description |
|---|---|
| Is(Error) | Tests if this Error type (or one contained within) matches the Error provided. |
| IsType<E>() where E : Error | Tests if this Error type (or one contained within) is of type E |
| IsExceptional | true for exceptional errors. For ManyErrors this is true if any of the errors are exceptional |
| IsExpected | true for non-exceptional/expected errors. For ManyErrors this is true if all of the errors are expected. |
| IsEmpty | true if there are no errors in a ManyErrors |
| HasCode(int) | true if the Code property matches the provided int. For ManyErrors, true is returned if any error has the matching code |
| HasException<E>() where E : Exception | Tests if this error-type (or one contained within) is exceptional and contains a specific exceptional type E |
| Count | 1 for most errors, or n for the count of errors in a ManyErrors |
| Head | To get the first error |
| Tail | To get the tail of multiple errors |
There are other primitives in language-ext that use these interrogation tools for 'pattern matching' the errors, so we don't often use these methods directly, but it's useful to know they're there as it helps understand how we can dig into the different flavours of error.
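For completeness, here is a small sketch of calling a few of these interrogation members directly. It assumes the LanguageExt package (with Error living in the LanguageExt.Common namespace) and uses only the constructors and members described above; the comments state what the table leads us to expect rather than verified output.

```csharp
using System;
using LanguageExt.Common;

var notFound = Error.New(404, "page not found");                 // expected, with a code
var crashed  = Error.New(new InvalidOperationException("boom")); // exceptional
var both     = notFound + crashed;                               // ManyErrors

// A single expected error.
Console.WriteLine(notFound.IsExpected);

// An error created from an exception.
Console.WriteLine(crashed.IsExceptional);

// For ManyErrors: exceptional if any inner error is exceptional,
// and HasCode matches if any inner error carries that code.
Console.WriteLine(both.IsExceptional);
Console.WriteLine(both.HasCode(404));

// Count of the errors held in the ManyErrors.
Console.WriteLine(both.Count);
```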
You can extend the set of error types (perhaps for passing through extra data) by creating a new record that inherits Exceptional or Expected:
```csharp
public record BespokeError(bool MyData) : Expected("Something bespoke", 100, None);
```
By default the properties of the new error-type won't be serialised. So, if you want to pass a payload over the wire, add the [property: DataMember] attribute to each member:
```csharp
public record BespokeError([property: DataMember] bool MyData) : Expected("Something bespoke", 100, None);
```
Using this technique it's trivial to create new error-types when additional data needs to be moved around, but also there's a ton of built-in functionality for the most common use-cases.
Let's now look at how errors might be managed in a real-world application. To do this, I'm going to copy in some examples from a compiler I'm working on right now. The compiler needs an extra payload with any raised errors (namely source-file information, including: path, column, and line-number), so I create a SourceError type that derives from Expected:
```csharp
public record SourceError(Location Location, string Message, SourceErrorTag Tag) :
    Expected(Message, (int)Tag, None)
```
Notice the SourceErrorTag that I am passing to the Expected constructor as the error-code. It looks like this:
```csharp
public enum SourceErrorTag
{
    EmptyToken = SourceError.ErrCodeStart,
    SyntaxError,
    UnknownToken,
    UnclosedComment,
    UnclosedParen,
    UnexpectedParen,
    UnclosedBrace,
    UnexpectedBrace,
    UnclosedBracket,
    UnexpectedBracket,
    ExpectedEndOfFile,
    ExpectedField,
    UnexpectedField,
    ExpectedTraitMember,
    ...
}
```
The benefit of using an enum is that it automatically generates error codes sequentially.
I then have a load of static constructor functions in the SourceError type:
```csharp
public record SourceError ...
{
    public static SourceError EmptyToken(Location location) =>
        new (location, "parse failure: empty token", SourceErrorTag.EmptyToken);

    public static SourceError UnknownToken(Location location, string token) =>
        new (location, $"unknown token: {token}", SourceErrorTag.UnknownToken);

    public static SourceError UnclosedComment(Location location) =>
        new (location, "unclosed comment", SourceErrorTag.UnclosedComment);

    public static SourceError UnclosedParen(Location location) =>
        new (location, "expected ')'", SourceErrorTag.UnclosedParen);

    public static SourceError UnexpectedParen(Location location) =>
        new (location, "unexpected ')'", SourceErrorTag.UnexpectedParen);

    public static SourceError UnclosedBrace(Location location) =>
        new (location, "expected '}'", SourceErrorTag.UnclosedBrace);

    public static SourceError UnexpectedBrace(Location location) =>
        new (location, "unexpected '}'", SourceErrorTag.UnexpectedBrace);

    public static SourceError UnclosedBracket(Location location) =>
        new (location, "expected ']'", SourceErrorTag.UnclosedBracket);

    public static SourceError UnexpectedBracket(Location location) =>
        new (location, "unexpected ']'", SourceErrorTag.UnexpectedBracket);

    public static SourceError ExpectedEndOfFile(Location location) =>
        new (location, "expected end of file", SourceErrorTag.ExpectedEndOfFile);
    ...
}
```
For errors that don't need arguments you can use static readonly fields, which means we don't have to allocate an Error type for each use. This is a useful efficiency benefit.
What's nice about this approach, in general, is that you can build a centralised 'error module' which focuses all of the error definitions in one place. It makes long-term maintenance of errors, codes, messages, much easier. They're so easy to create and manage. Just add a new item to the enum and then add a new function or field to your app-error type. And if you want to localise your error-messages, then you could easily see a solution that yielded Error values based on a country-code or whatever other approach you needed.
On another project, which has several subsystems (API, CLI, DAT (data access tier), and Service), I have APIError, CLIError, DATError, and ServiceError. This allows these subsystems and their errors to be independent and standalone. They all still derive from the base Exceptional or Expected error-types, but are completely disconnected from the other subsystems. Where one subsystem uses another subsystem (say Service calling into DAT to use the database), then it is expected that database errors are converted to service errors (or handled in-place). This is something you can unit-test for: to confirm correct escalation of errors.
You may think error-codes are a thing of the past. This is really a Java/C# 'thing' where the language authors decided every single thing on earth must be represented by an object. That's why we have exceptions as objects. It's just overkill for the vast majority of cases. When C# gets discriminated-unions then this might become a thing of the past, but for now all we need is a discriminator: the error code.
The Error type is used throughout language-ext to standardise error-handling and make their consuming types easier to use.
The following all use Error:
| Type | Comment |
|---|---|
| Fin<A> | Isomorphic to Either<Error, A> |
| FinT<M, A> | Isomorphic to K<M, Either<Error, A>> |
| Try<A> | Yields a Fin<A> when Run |
| TryT<M, A> | Yields a K<M, Fin<A>> when Run |
| Eff<A> and Eff<RT, A> | Yields a Fin<A> |
| IO<A> | Doesn't use Error internally, but you can call IO<A>.RunSafe() or IO<A>.RunSafeAsync() to catch exceptions and turn the resulting A into a Fin<A> |
| Consumer | From LanguageExt.Streaming, composes with Pipe and Producer |
| Pipe | From LanguageExt.Streaming, composes with Producer and Consumer |
| Producer | From LanguageExt.Streaming, composes with Pipe and Consumer |
| Effect | From LanguageExt.Streaming: result of composition of Producer, Pipe, and Consumer; yields an Eff, which in turn yields a Fin |
| Fallible<F> trait | Allows for generalised functions that work with Error (see next section) |
In many ways the Error type in language-ext is old-skool object-orientation. We're leveraging inheritance to allow for the potentially never-ending growth of error-types. I think it's important to remember that object-orientation isn't always bad. In fact it's a tool, like any other feature of a programming-language, it's just that object-oriented language makers decided that every single thing must be an object and then object-oriented design-patterns grew out of this myopic viewpoint. C# has grown as a language to include many features from functional-programming languages, which gives us an opportunity to leverage new ways of writing code.
But sometimes the old tool is the right one for the job and it doesn't serve us best to be FP absolutists (just like OO absolutism isn't good either). Functional Programming is very much about working with functions and Algebraic Data types (ADTs). These ADTs are fixed upfront. The benefit from having a fixed set of terms, cases, etc. is that you can write total-functions. Total functions are complete and concrete. The types are complete and concrete. Your program becomes a proof. This obviously leads to more robust code.
Unsealed types in C# can never be concrete, they can never be proved because you can always create a new type that derives from a class or interface that breaks the proof. That inherently makes OO harder to reason about and makes it harder to write error-free code.
However, we should be pragmatic, we're not actually writing proofs, we're writing applications. And one place where inheritance works quite well is with an error-value. Why?
- All errors mean 'failure' - so it's not like any inherited type will change the semantics of that
- Most code that catches and handles a discrete set of known errors will not suddenly become worse because of the introduction of a new error sub-type:
- The discrete set of known errors, that have distinct code paths handling them, will continue to work.
- Those code-paths are usually the only available options for handling those known errors; because we're dealing with 'known' failure conditions and providing bespoke responses.
- There's often a 'catch-all' case that any newly introduced errors will fall in to, so that failure propagates.
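To make those points concrete, here is a minimal sketch. It assumes a small, hypothetical error hierarchy (AppError, NotFound, TimedOut, and QuotaExceeded are invented for illustration and are not language-ext types). The known errors keep their bespoke code-paths, and a sub-type added later simply falls into the catch-all:

```csharp
using System;

// Known errors take their bespoke paths; the later addition takes the catch-all.
Console.WriteLine(Describe(new NotFound("customer 42")));  // Could not find customer 42
Console.WriteLine(Describe(new TimedOut(30)));             // Gave up after 30s
Console.WriteLine(Describe(new QuotaExceeded()));          // Unhandled failure: quota exceeded

static string Describe(AppError error) =>
    error switch
    {
        NotFound e => $"Could not find {e.What}",
        TimedOut e => $"Gave up after {e.Seconds}s",
        _          => $"Unhandled failure: {error.Message}"  // catch-all: failure still propagates
    };

// Hypothetical error hierarchy, invented for this sketch.
abstract record AppError(string Message);
record NotFound(string What) : AppError($"{What} not found");
record TimedOut(int Seconds) : AppError($"timed out after {Seconds}s");
record QuotaExceeded()       : AppError("quota exceeded");   // introduced after Describe was written
```

Describe was never updated for QuotaExceeded, yet it still does something sensible with it.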
On Haskell projects I've done the thing of creating a sum-type for all of the possible errors in an application (or subsystem) and the experience is pretty similar. Catch and manage what you have bespoke failure code-paths for and then yield what you can't handle. Over time the sum-type grows to capture lots of new failure conditions and the old code usually doesn't need updating.
The one downside to both the Error standardisation and the Haskell approach is that sometimes a newly added error-type does matter. So, it's important to be cautious. If you modify a function to yield a new error-type, it's worth checking the code that leverages that function to see if the error-handling needs updating.
On the whole though I think this is a better approach for us in C# land. The poor generic type-inference story in C# means that any types that carry a bespoke error-type (like the L in Either<L, R> or F in Validation<F, A>) are going to be slightly more awkward to work with. That doesn't mean they're bad or wrong, but we have to consider what we gain from carrying a bespoke error type, rather than one derived from Error. In my mind, it's not much, but you may have a use-case where that differs.
So far we've discussed concrete types and concrete error-handling. With the new trait-system introduced in language-ext v5, we now have access to generalised error handling...
If you don't know what 'Higher-Kinded Traits' are in language-ext, then Paul has a primer series on his blog. This will give you the background on what's coming next.
/// <summary>
/// A semigroup on applicative functors
/// </summary>
/// <typeparam name="F">Applicative functor</typeparam>
public interface Choice<F> : Applicative<F>, SemigroupK<F>
where F : Choice<F>
{
/// <summary>
/// Where `F` defines some notion of failure or choice, this function picks the
/// first argument that succeeds. So, if `fa` succeeds, then `fa` is returned;
/// if it fails, then `fb` is returned.
/// </summary>
/// <param name="fa">First structure to test</param>
/// <param name="fb">Second structure to return if the first one fails</param>
/// <returns>First argument to succeed</returns>
static abstract K<F, A> Choose<A>(K<F, A> fa, K<F, A> fb);
/// <summary>
/// Where `F` defines some notion of failure or choice, this function picks the
/// first argument that succeeds. So, if `fa` succeeds, then `fa` is returned;
/// if it fails, then `fb` is returned.
/// </summary>
/// <param name="fa">First structure to test</param>
/// <param name="fb">Second structure to return if the first one fails</param>
/// <returns>First argument to succeed</returns>
static abstract K<F, A> Choose<A>(K<F, A> fa, Func<K<F, A>> fb);
}
The Choice trait defines two Choose methods, which both do the same thing, except one has a lazy second parameter. Choice enables 'failure propagation'. So, if the first argument K<F, A> 'succeeds', then it will be returned, otherwise the second argument is returned. With the lazy version of Choose the second argument is only invoked if the first one 'fails'.
I'm putting 'succeeds' and 'fails' in quotes, because it is entirely up to the implementing type what those terms mean.
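The difference between the eager and lazy forms is easiest to see with a counter. This is a minimal sketch using a hypothetical Maybe<A> stand-in (it is not the language-ext Option<A>, and its Choose methods are instance methods rather than trait members), where 'succeeds' is taken to mean 'is in the Some state':

```csharp
using System;

var calls = 0;
Maybe<int> Expensive() { calls++; return Maybe<int>.Some(99); }

var some = Maybe<int>.Some(1);
var none = Maybe<int>.None;

var a = some.Choose(() => Expensive());  // first succeeds: the thunk is never invoked
var b = none.Choose(() => Expensive());  // first fails: the thunk is invoked once

Console.WriteLine($"{a.Value}, {b.Value}, calls = {calls}");  // 1, 99, calls = 1

// Hypothetical optional type, invented for this sketch.
readonly record struct Maybe<A>(bool IsSome, A Value)
{
    public static Maybe<A> Some(A value) => new(true, value);
    public static Maybe<A> None => new(false, default!);

    // Eager: the alternative has already been evaluated by the caller.
    public Maybe<A> Choose(Maybe<A> other) => IsSome ? this : other;

    // Lazy: the alternative is only computed when this value is in the 'fail' state.
    public Maybe<A> Choose(Func<Maybe<A>> other) => IsSome ? this : other();
}
```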
In the types that support the Choice trait, the | operator is usually overridden too. For example:
public static Option<A> operator |(Option<A> lhs, Option<A> rhs) =>
lhs.Choose(rhs).As();
That means we can chain a series of Option types using the | operator and the first one that is in a Some state will return.
This skips past the mx and my to return Some(3):
Option<int> mx = None;
Option<int> my = None;
Option<int> mz = Some(3);
var mr = mx | my | mz; // Some(3)
This skips past the mx to return Some(2):
Option<int> mx = None;
Option<int> my = Some(2);
Option<int> mz = Some(3);
var mr = mx | my | mz; // Some(2)
This returns the first option Some(1):
Option<int> mx = Some(1);
Option<int> my = Some(2);
Option<int> mz = Some(3);
var mr = mx | my | mz; // Some(1)
If all are None, you get None back!
Option<int> mx = None;
Option<int> my = None;
Option<int> mz = None;
var mr = mx | my | mz; // None
What's nice about this propagation is we can use it to catch errors, provide defaults, or provide more contextual errors.
If we go back to the ParseInt example from earlier, that returned an Option<int>. Instead of matching on the result we could write:
var value = ParseInt("not a number") | Some(0);
So, we're 'catching' the None value that comes out and providing a default Some value of 0. And because A is implicitly convertible to an Option<A>, we can even do this:
var value = ParseInt("not a number") | 0;
Let's move on to a more complex type with a non-singleton failure-value: Fin<A>. Its failure-value type is Error. Let's refactor the ParseInt example from earlier to have meaningful errors:
public readonly struct Digit
{
public readonly int Value;
Digit(int value) =>
Value = value;
public static Fin<Digit> Make(int value) =>
value is >= 0 and <= 9
? new Digit(value)
: Error.New($"Not a valid digit number: {value}, should be 0 - 9");
public static implicit operator int(Digit digit) =>
digit.Value;
}
Fin<int> ParseInt(string text) =>
ParseDigits(text)
.Bind(MakeNumberFromDigits);
Fin<Seq<Digit>> ParseDigits(string text) =>
toSeq(text)
.Traverse(ParseDigit)
.As();
Fin<Digit> ParseDigit(char ch) =>
char.IsDigit(ch)
? Digit.Make(ch - '0')
: Error.New($"Not a valid digit: '{ch}'");
Fin<int> MakeNumberFromDigits(Seq<Digit> digits) =>
digits.IsEmpty
? Error.New("Number of digits cannot be zero")
: digits.FoldBack(
(Total: 0, Scalar: 1),
(state, digit) => (Total: state.Total + digit * state.Scalar,
Scalar: state.Scalar * 10)).Total;
Instead of returning Option, we're now returning Fin. Let's now try parsing some values:
var mr = ParseInt("fail");
What we get back from ParseInt is:
[Not a valid digit: 'f', Not a valid digit: 'a', Not a valid digit: 'i', Not a valid digit: 'l']
Which really is an extremely detailed report of all the issues found with the input text; but what if this was part of some form validation where we're validating multiple form values? Are these error messages useful, at all!?
By the way, if you're wondering how on earth we got four error messages rather than just one, it's because we use Traverse. It uses Applicative.Apply, so the terms are parsed independently. If you just want the first error, use TraverseM instead. It uses Monad.Bind, which is sequential.
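If the distinction between independent and sequential evaluation is unfamiliar, this plain C# sketch mimics the two behaviours without using language-ext at all. Check, AllErrors, and FirstError are hypothetical helpers invented for illustration; they only imitate the observable difference (all failures versus the first failure), not how Traverse or TraverseM are implemented:

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

Console.WriteLine(string.Join(", ", AllErrors("fail")));
// Not a valid digit: 'f', Not a valid digit: 'a', Not a valid digit: 'i', Not a valid digit: 'l'
Console.WriteLine(string.Join(", ", FirstError("fail")));
// Not a valid digit: 'f'

// Validates a single character; null means 'no error'.
static string? Check(char ch) =>
    char.IsDigit(ch) ? null : $"Not a valid digit: '{ch}'";

// Applicative-style: every character is checked independently and all failures are kept.
static IReadOnlyList<string> AllErrors(string text) =>
    text.Select(ch => Check(ch))
        .Where(e => e is not null)
        .Select(e => e!)
        .ToList();

// Monadic-style: checking is sequential and stops at the first failure.
static IReadOnlyList<string> FirstError(string text)
{
    foreach (var ch in text)
        if (Check(ch) is { } error) return new[] { error };
    return Array.Empty<string>();
}
```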
To make some friendlier errors, we can use ParseInt with the Choice operator to provide something a little more useful for the user:
var mr = ParseInt(text) | Error.New("Field 'blah' is invalid");
This overrides whatever error(s) are yielded by ParseInt and just provides a standard error. This is the most basic form of error overriding, the Fallible trait allows for more complex propagation behaviours. But this first step is good to understand, as many types in language-ext support the choice operator: the effect types (IO, Eff, etc.), the alternative value types (Option, Either, Try, Fin, Validation, etc.), even the collection types (Seq, Lst, Iterable, etc.) -- which treat an empty sequence as 'fail'.
public interface Alternative<F> : Choice<F>, MonoidK<F>
where F : Alternative<F>;
The Alternative trait is simply a combination of Choice and MonoidK. So, we are adding 'identity' to the choice type:
public interface MonoidK<M> : SemigroupK<M>
where M : MonoidK<M>
{
/// <summary>
/// Identity
/// </summary>
[Pure]
public static abstract K<M, A> Empty<A>();
}
The Empty<A>() method will get the identity value for the type. This is often a default failure value. For Option<A> it's None, for Seq<A> it's [], and for IO<A> it's IO.fail(Errors.None). When you write more generic code that relies on traits rather than concrete types then the ability to pull a default error value out of thin air becomes really useful.
For example, the oneOf function, in the Alternative module, takes a Seq<K<F, A>> of values (where F : Alternative<F>) and returns the first one to succeed. But, if none succeed, then it can yield a default failure value: F.Empty<A>(). No exceptions needed!
var r = Alternative.oneOf(
ParseInt("foo"),
ParseInt("bar"),
ParseInt("123")); // Succ(123)
By the way, the function above is equivalent to:
var r = ParseInt("foo") | ParseInt("bar") | ParseInt("123");
The primary difference is that the function can handle a collection to test rather than a discrete set.
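One way to picture why the identity matters: choosing from a collection behaves like a fold that starts from Empty and combines with Choose. The sketch below is an analogy only, using nullable ints rather than language-ext types. ParseOrNull and OneOf are hypothetical helpers, null stands in for Empty, and ?? stands in for Choose. It is not a description of how Alternative.oneOf is implemented:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical parser: null is the 'fail' state.
int? ParseOrNull(string s) => int.TryParse(s, out var n) ? n : null;

// Fold over the options: start from the identity (null) and keep the first success.
int? OneOf(IEnumerable<int?> options) =>
    options.Aggregate((int?)null, (acc, next) => acc ?? next);

var found = OneOf(new[] { ParseOrNull("foo"), ParseOrNull("bar"), ParseOrNull("123") });
var empty = OneOf(Array.Empty<int?>());

Console.WriteLine(found);          // 123
Console.WriteLine(empty is null);  // True: an empty collection yields the identity
```

Without an identity value there would be nothing sensible to return for the empty collection.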
/// <summary>
/// Trait for higher-kinded structures that have a failure state `E`
/// </summary>
/// <typeparam name="E">Failure type</typeparam>
/// <typeparam name="F">Higher-kinded structure</typeparam>
public interface Fallible<E, F>
{
/// <summary>
/// Raise a failure state in the `Fallible` structure `F`
/// </summary>
/// <param name="error">Error value</param>
public static abstract K<F, A> Fail<A>(E error);
/// <summary>
/// Run the `Fallible` structure. If in a failed state, test the failure value
/// against the predicate. If it returns `true`, run the `Fail` function with
/// the failure value.
/// </summary>
/// <param name="fa">`Fallible` structure</param>
/// <param name="Predicate">Predicate to test any failure values</param>
/// <param name="Fail">Handler when in failed state</param>
/// <returns>Either `fa` or the result of `Fail` if `fa` is in a failed state and the
/// predicate returns true for the failure value</returns>
public static abstract K<F, A> Catch<A>(
K<F, A> fa,
Func<E, bool> Predicate,
Func<E, K<F, A>> Fail);
}
The Fallible<E, F> trait and its specialisation that bakes in Error:
/// <summary>
/// Trait for higher-kinded structures that have a failure state `Error`
/// </summary>
public interface Fallible<F> : Fallible<Error, F>;
...extend the Choice idea. Instead of a simple propagation of failure (as seen with the Choice.Choose method), we can now accept a failure value, E, into a predicate. If the result of invoking that predicate with the E value is true, then a Fail function is invoked that also takes the E value and returns a new higher-kinded structure.
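To see the predicate-then-handler behaviour in isolation, here is a minimal sketch on a hypothetical concrete type. Outcome<A> is invented for illustration, uses a plain string as its failure value, and does not implement the real trait. The handler only runs when the value is in a failed state and the predicate accepts the failure value; otherwise the original value passes through untouched:

```csharp
using System;

var failed = Outcome<int>.Fail("timeout");

var recovered = failed.Catch(e => e == "timeout",   _ => Outcome<int>.Succ(0));  // predicate matches
var untouched = failed.Catch(e => e == "not found", _ => Outcome<int>.Succ(0));  // predicate does not match

Console.WriteLine(recovered.IsSucc);  // True
Console.WriteLine(untouched.IsSucc);  // False

// Hypothetical result type, invented for this sketch.
sealed record Outcome<A>(bool IsSucc, A Value, string Error)
{
    public static Outcome<A> Succ(A value)      => new(true, value, "");
    public static Outcome<A> Fail(string error) => new(false, default!, error);

    // Mirrors the shape of Catch: test the failure value, then optionally replace the structure.
    public Outcome<A> Catch(Func<string, bool> predicate, Func<string, Outcome<A>> fail) =>
        !IsSucc && predicate(Error) ? fail(Error) : this;
}
```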
If you remember the earlier example that took the Fin returning ParseInt function and provided a default error response:
var mr = ParseInt("fail") | Error.New("Field 'blah' is invalid");
Now we can use the .Catch method to do more complex error handling:
var mr = ParseInt("fail")
.Catch(e => e.IsExpected,
e => Error.New($"Field 'blah' is invalid, because: {e}", e));
What we're doing here is catching all expected errors (so, we ignore the exceptional errors) and adding some context to the errors. Notice how we're embedding the e in the message, but also passing it as the 'inner' error. That allows us to present a more contextual error but also keep the original context (creating an 'error stack').
That's certainly not as pretty as the propagation approach, using the | operator. There are lots of variants of Catch that we get for free though.
We can catch all errors like so:
var mr = ParseInt("fail")
.Catch(e => Error.New($"Field 'blah' is invalid, because: {e}", e));
We can also catch specific errors; here we catch timeout errors:
var mr = ReadFromWeb(url)
.Catch(Errors.TimedOut, e => Error.New($"request to {url} timed-out", e));
Or, you can provide a default-success value for certain failure conditions:
IO<Customer> FindCustomer(CustomerId id) => ...;
IO<Customer> CreateNewCustomer(CustomerDetails details) => ...;
var c = FindCustomer(id)
.Catch(AppErrors.CustomerNotFound, _ => CreateNewCustomer(details));
The various catching capabilities align with many of the Error querying methods mentioned earlier. Obviously, Fallible<E, F> is generic over all E error-types, but for types that bake in the Error type as their E (so implement the Fallible<F> trait), we get additional matching capabilities.
Chaining the Catch method in a fluent style is certainly idiomatic C#, but it's not that pretty compared to the | operator chaining that we saw in the Choice trait. To solve that, I introduce the CatchM<E, M, A> type:
/// <summary>
/// Used by `@catch`, `@exceptional`, `@expected` to represent the catching of errors
/// </summary>
public readonly record struct CatchM<E, M, A>(Func<E, bool> Match, Func<E, K<M, A>> Action)
where M :
Fallible<E, M>
{
public K<M, A> Run(E error, K<M, A> otherwise) =>
Match(error) ? Action(error) : otherwise;
public static K<M, A> operator |(K<M, A> lhs, CatchM<E, M, A> rhs) =>
lhs.Catch(rhs.Match, rhs.Action);
}
It captures the arguments for an invocation of .Catch. CatchM has lots of constructor functions in the Prelude. The main ones are @catch and @catchOf.
Let's rewrite the examples above using @catch and the | operator:
Catching all errors:
var mr = ParseInt("fail")
| @catch(e => FinFail<int>(Error.New($"Field is invalid, because: {e}")));
Catching specific errors; here we catch timeout errors:
var mr = ReadFromWeb(url)
| @catch(Errors.TimedOut, e => IO.fail<int>(Error.New($"request to {url} timed-out")))
| @catch(Errors.NotFound, e => IO.fail<int>(Error.New($"not found: {url}")));
Provide a default-success value for certain failure conditions:
var c = FindCustomer(id)
| @catch(AppErrors.CustomerNotFound, CreateNewCustomer(details));
All types that implement the Fallible trait also have support for @catch methods and all of the variants. It's quite a deep set of capabilities for catching errors of different flavours, but the fundamentals are all here. And of course, you can make your own types Fallible and all of these features will just work.
What we can do now is completely generalise our ParseInt function (and its supporting functions) to work with any Fallible<F> type:
First, let's refactor the Digit struct:
public readonly struct Digit
{
public readonly int Value;
Digit(int value) =>
Value = value;
public static K<F, Digit> Make<F>(int value)
where F : Fallible<F>, Applicative<F> =>
value is >= 0 and <= 9
? pure<F, Digit>(new Digit(value))
: error<F, Digit>((Error)$"Not a valid digit number: {value}, should be 0 - 9");
public static implicit operator int(Digit digit) =>
digit.Value;
}
Notice how the return-type of Make has changed to K<F, Digit> from Fin<Digit>. And we have two constraints: Fallible<F> and Applicative<F>. Fallible allows us to use the error function and Applicative allows us to use pure (to construct 'success' values).
Let's do the same for the other functions now:
static K<F, int> ParseInt<F>(string text)
where F : Fallible<F>, Monad<F> =>
from digits in ParseDigits<F>(text)
from number in MakeNumberFromDigits<F>(digits)
select number;
K<F, Seq<Digit>> ParseDigits<F>(string text)
where F : Fallible<F>, Applicative<F> =>
toSeq(text)
.Traverse(ParseDigit<F>);
K<F, Digit> ParseDigit<F>(char ch)
where F : Fallible<F>, Applicative<F> =>
char.IsDigit(ch)
? Digit.Make<F>(ch - '0')
: error<F, Digit>((Error)$"Not a valid digit: '{ch}'");
K<F, int> MakeNumberFromDigits<F>(Seq<Digit> digits)
where F : Fallible<F>, Applicative<F> =>
digits.IsEmpty
? error<F, int>((Error)"Number of digits cannot be zero")
: pure<F, int>(digits.FoldBack((Total: 0, Scalar: 1),
(state, digit) => (Total: state.Total + digit * state.Scalar,
Scalar: state.Scalar * 10)).Total);
Again, we've swapped out Fin<A> for K<F, A> and added constraints for Fallible and Applicative. ParseInt is constrained to Monad because it uses Monad.Bind via LINQ.
With this, we can now use ParseInt with any type that's Fallible:
Fin<int> mr1 = ParseInt<Fin>("123").As();
IO<int> mr2 = ParseInt<IO>("456").As();
Eff<int> mr3 = ParseInt<Eff>("789").As();
Try<int> mr4 = ParseInt<Try>("fail").As();
...
[TODO: talk about how Final is the functional equivalent of finally]
[TODO: Discuss adding error handling to types that don't have it built-in]
[TODO: Talk about OptionT, EitherT, FinT]