DI in Scala: Cake Pattern pros & cons

I’ve been looking at alternatives to Java-style DI and DI containers that would use pure Scala; a promising candidate is the Cake Pattern (see my earlier blog post for information on how the Cake Pattern works). FP enthusiasts also claim that they don’t need any DI frameworks, as higher-order functions are enough.

Recently Debasish Ghosh also blogged on a similar subject. I think his article is a very good introduction to the subject.

Below are some problems I encountered with the Cake Pattern. (Higher-order functions are coming up in the next post.) If you have solutions to any of them, let me know!

Parametrizing the system with a component implementation

First of all, it is not possible to parametrize a system with a component implementation. Supposing I have three components: DatabaseComponent, UserRepositoryComponent, UserAuthenticatorComponent with implementations, the top-level environment/entry point of the system would be created as follows:

val env = new MysqlDatabaseComponentImpl
   with UserRepositoryComponent
   with UserAuthenticatorComponent

Now to create a testing environment with a mock database, I would have to do:

val env = new MockDatabaseComponentImpl
   with UserRepositoryComponent
   with UserAuthenticatorComponent

Note how much of the code is the same. This isn’t a problem with 3 components, but what if there are 20? All but one of them have to be repeated just to change the implementation of a single component. This clearly leads to quite a lot of code duplication.
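A partial mitigation, sketched here under stated assumptions: group everything that stays fixed into one trait and leave the component that varies as a self-type, so that only the varying part is named when an environment is created. This still does not give a function from a component implementation to a system, but it may cut the repetition. The member names (databaseName, describeRepository, describeAuthenticator) and the BaseEnv trait are hypothetical stand-ins, and the implementations are modelled as traits for brevity.

```scala
object ParametrizedEnvDemo {
  // Hypothetical component contents; only the wiring matters here.
  trait DatabaseComponent { def databaseName: String }
  trait MysqlDatabaseComponentImpl extends DatabaseComponent { def databaseName: String = "mysql" }
  trait MockDatabaseComponentImpl extends DatabaseComponent { def databaseName: String = "mock" }

  trait UserRepositoryComponent { this: DatabaseComponent =>
    def describeRepository: String = "repository on " + databaseName
  }

  trait UserAuthenticatorComponent { this: UserRepositoryComponent =>
    def describeAuthenticator: String = "authenticator using " + describeRepository
  }

  // Everything that stays the same lives here; the database is left open
  // as a self-type requirement instead of being mixed in.
  trait BaseEnv extends UserRepositoryComponent with UserAuthenticatorComponent {
    this: DatabaseComponent =>
  }

  // Only the varying component is named at the point of assembly.
  val prodEnv = new BaseEnv with MysqlDatabaseComponentImpl
  val testEnv = new BaseEnv with MockDatabaseComponentImpl

  def main(args: Array[String]): Unit = {
    println(prodEnv.describeAuthenticator)
    println(testEnv.describeAuthenticator)
  }
}
```

With 20 components, the 19 fixed ones would be listed once in BaseEnv, and each environment would mix in just the one that differs.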

Component configuration

Quite often a component needs to be configured. Let’s say I have a UserAuthenticatorComponent which depends on UserRepositoryComponent. However, the authenticator component has an abstract val encryptionMethod, used to configure the encryption algorithm. How can I configure the component? There are two ways. The abstract val can be concretized when defining the env, e.g.:

val env = new MysqlDatabaseComponentImpl
   with UserRepositoryComponent
   with UserAuthenticatorComponent {
   val encryptionMethod = EncryptionMethods.MD5
}

But what if I want to re-use a configured component? An obvious answer is to extend the UserAuthenticatorComponent trait. However, if that component has any dependencies (which, in the Cake Pattern, are expressed using self-types), they need to be repeated, as self-types are not inherited. So a reusable, configured component could look like this:

trait UserAuthenticatorComponentWithMD5 
         extends UserAuthenticatorComponent  {
   // dependency specification duplication!
   this: UserRepositoryComponent => 
   val encryptionMethod = EncryptionMethods.MD5
}

If we don’t repeat the self-types, the compiler will complain about incorrect UserAuthenticatorComponent usage.
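When a component has several dependencies, the repetition can be reduced (though not removed) by naming the dependency set with a type alias. The following is only a sketch under assumptions: DatabaseComponent is added as a second dependency purely for illustration, the method bodies are placeholders, encryptionMethod is a plain String, and AuthenticatorDependencies is a hypothetical name.

```scala
object SelfTypeDemo {
  // Hypothetical, simplified components; only the self-type wiring matters.
  trait DatabaseComponent { def databaseName: String = "db" }
  trait UserRepositoryComponent { def findUser(name: String): Option[String] = Some(name) }

  // One name for the full dependency set of the authenticator.
  type AuthenticatorDependencies = DatabaseComponent with UserRepositoryComponent

  trait UserAuthenticatorComponent { this: AuthenticatorDependencies =>
    val encryptionMethod: String
    def canAuthenticate(name: String): Boolean = findUser(name).isDefined
  }

  // The self-type still has to be restated, since it is not inherited,
  // but as a single alias rather than a list of components.
  trait UserAuthenticatorComponentWithMD5 extends UserAuthenticatorComponent {
    this: AuthenticatorDependencies =>
    val encryptionMethod: String = "MD5"
  }

  val env = new DatabaseComponent
    with UserRepositoryComponent
    with UserAuthenticatorComponentWithMD5
}
```

If the dependency set changes later, only the alias needs to be updated, not every configured variant of the component.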

No control over initialization order

Another problem, also related to configuration, is that there is no type-safe way to ensure that the components are initialized in the proper order. Suppose, as above, that the UserAuthenticatorComponent has an abstract encryptionMethod which must be specified when creating the component. If we have another component that depends on UserAuthenticatorComponent:

trait PasswordEncoderComponent {
   this: UserAuthenticatorComponent =>
   // encryptionMethod comes from UserAuthenticatorComponent
   val encryptionAlgorithm = Encryption.getAlgorithm(encryptionMethod)
}

and initialize our system as follows:

val env = new MysqlDatabaseComponentImpl
   with UserRepositoryComponent
   with UserAuthenticatorComponent 
   with PasswordEncoderComponent {
   val encryptionMethod = EncryptionMethods.MD5
}

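then at the moment encryptionAlgorithm is initialized, encryptionMethod will still be null! The body of the anonymous class, where encryptionMethod is assigned, runs only after all the mixed-in traits have been initialized. One way to prevent this is to mix in the UserAuthenticatorComponentWithMD5 before the PasswordEncoderComponent. But the type checker won’t tell us that.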
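To see the effect in isolation, here is a minimal sketch with simplified stand-ins: encryptionMethod is a plain String, the algorithm lookup is replaced by string concatenation, and LazyPasswordEncoderComponent is a hypothetical variant that declares the dependent value as a lazy val (an option raised in the comments).

```scala
object InitOrderDemo {
  trait UserAuthenticatorComponent {
    val encryptionMethod: String // abstract configuration value
  }

  trait UserAuthenticatorComponentWithMD5 extends UserAuthenticatorComponent {
    val encryptionMethod: String = "MD5"
  }

  trait PasswordEncoderComponent { this: UserAuthenticatorComponent =>
    // Strict val: evaluated while this trait is being initialized.
    val encryptionAlgorithm: String = "algorithm for " + encryptionMethod
  }

  trait LazyPasswordEncoderComponent { this: UserAuthenticatorComponent =>
    // Lazy val: evaluated on first access, after construction has finished.
    lazy val encryptionAlgorithm: String = "algorithm for " + encryptionMethod
  }

  // encryptionMethod is assigned in the anonymous class body, which runs
  // after PasswordEncoderComponent has already read it (as null).
  val configuredTooLate = new UserAuthenticatorComponent with PasswordEncoderComponent {
    val encryptionMethod: String = "MD5"
  }

  // The configured trait comes first in the mix-in order, so its val is
  // set before PasswordEncoderComponent is initialized.
  val configuredFirst = new UserAuthenticatorComponentWithMD5 with PasswordEncoderComponent

  // Same late assignment as the first case, but the read is deferred.
  val configuredLazily = new UserAuthenticatorComponent with LazyPasswordEncoderComponent {
    val encryptionMethod: String = "MD5"
  }
}
```

All three environments compile without complaint, which is exactly the problem: the first one silently captures null, and nothing in the types distinguishes it from the other two.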

Pros

Don’t get me wrong: it’s not that I don’t like the Cake Pattern – I think it offers a very nice way to structure your programs. For example, it eliminates the need for factories (which I’m not a very big fan of) and nicely separates dependencies on components from dependencies on data (*). But still, it could be better ;).

(*) Here each code fragment has in fact two types of arguments: normal method arguments, which can be used to pass data, and component arguments, expressed as the self type of the containing component. Whether these two types of arguments should be treated differently is a good question :).

What are your experiences with DI in Scala? Do you use a Java DI framework, one of the approaches described above, or some other way?

Adam

  • http://blog.juma.me.uk Ismael Juma

    Hi,

    For the initialization issue, you can (and probably should) use lazy vals. It’s a less error-prone approach.

    Regarding the repetition of dependencies issue, it can be worse than you show if there are multiple dependencies. In that case, you can use a type alias (using Scala’s type keyword) to make it less painful.

    Best,
    Ismael

  • http://www.warski.org Adam Warski

    True, a “pattern” to resolve the initialization issue would then be to use lazy vals *only* (for anything that is configurable or uses configurable values); lazy vals will automatically get initialized in the right order.

    Good idea with the type alias – a configurable component could then have a companion “MyComponentDependencies” type.

    Thanks!
    Adam

  • http://blog.joa-ebert.com/ Joa Ebert

    Regarding the “Parametrizing the system with a component implementation” issue.

    You can also override the specified implementation at a later point.

    E.g.:

    trait DefaultEnvironment extends DatabaseComponent with UserRepositoryComponent with UserAuthenticatorComponent {
      lazy val database = new MysqlDatabaseImpl
      lazy val userRepository = new UserRepositoryImpl
      lazy val userAuthenticator = new UserAuthenticatorImpl
    }

    val env1 = new DefaultEnvironment() { /* nothing to do here */ }
    val env2 = new OtherStuff with DefaultEnvironment

    Now, if you want to allow a specific part to be overridden later, you need to specify the abstract type once more, so that scalac does not infer the narrower implementation type for it.

    E.g.:

    trait DefaultEnvironment extends DatabaseComponent with UserRepositoryComponent with UserAuthenticatorComponent {
      lazy val database: Database = new MysqlDatabaseImpl // see type annotation
      lazy val userRepository = new UserRepositoryImpl
      lazy val userAuthenticator = new UserAuthenticatorImpl
    }

    val env = new ProdEnv with DefaultEnvironment {
      override lazy val database = new MockDatabaseImpl
      // lazy val userAuthenticator = new UserAuthenticatorImpl // this would yield an error since type has not been specified again
    }

  • http://www.warski.org Adam Warski

    But that assumes that DatabaseComponent contains both MysqlDatabaseImpl and MockDatabaseImpl? Typically I would see that as two separate components?

    Adam

  • Eric Bowman

    Doesn’t making everything lazy mean that reflection is later used to access it? Might be better to use object instead of lazy val?

  • Stephen

    Personally, I’m not fond of either cake or traditional Spring/Guice DI approaches. As in this post and your object services post, I think you end up fighting them more often than they’re worth.

    I like making a single interface (basically a parameter object) for all app-wide services and just passing it around:

    http://www.draconianoverlord.com/2011/03/17/frameworkless-di.html

    My post used Java for its examples, but I’ve done the same thing in Scala and it’s even nicer as you can use val/def/lazy val.

    E.g.

    trait Services {
      def database: Database
    }

    class ProdServices extends Services {
      lazy val database = new RealDatabase
    }

    class TestServices extends Services {
      val database = new StubDatabase
    }

    I won’t say it’s perfect, but it’s simple, which I like.

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    In our commercial project we’ve used lazy vals (without the Cake pattern) extensively, with good results.

    I share your concern described in “Parametrizing the system with a component implementation”, as 20 dependencies mixed in using the “with” keyword are clearly unmanageable and anti-DRY. But look, so would be 20 arguments of a constructor or 20 method parameters, wouldn’t they? The problem is not that you cannot parametrize the system, because you already have 20 knobs and levers for that. The problem is you have too many of them.
    But software engineering has already been solving such problems – look at the method parameters analogy. A long parameter list often indicates that a Parameter Object is waiting to be extracted. You can apply a similar approach to mixins.
    Now look for traits that (if put together) form a cohesive abstraction and create a trait that extends them all. If you manage to divide 20 traits into 4 cohesive entities, each aggregating 5 traits, you’ll end up with a system consisting of 4 components instead of 20. That probably solves your problems, hopefully improving your design at the same time (it’s important to extract only meaningful, not-strictly-technical entities, or trouble is to be expected).

    This proposal has its limits of course, but probably is worth considering when you stick to the Cake pattern.

  • http://www.warski.org Adam Warski

    But what would using an object look like? And I think lazy vals don’t use reflection to access the value later, but I may be mistaken.

    Adam

  • http://www.warski.org Adam Warski

    True, one of the solutions to that problem is to use “multi-level” cake pattern, that is to build up bigger components from smaller ones. This even goes well with the “scalable” in Scala ;)

    As for 20 method parameters, the situation is different – I can easily create a method which takes only one parameter and sets up the system for me, e.g.:
    def setup(db: Database) = setup(db, userRepository, userAuthenticator, ...)
    where the userRepository etc values are defined before or just in the setup method itself. With the cake pattern, creating a function which takes an implementation of a component, and returns a system which uses that component, is not possible.

  • http://www.warski.org Adam Warski

    Simple approaches are very often the best ;) However for me a “con” would be a lack of auto-wiring (which you classify as a “pro”) – always some typing spared. Also, as you write the non-explicitness on what an object depends on (by just looking at the constructor signature) could be a problem, e.g. when writing tests – it is not so obvious which components should be mocked.

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    Yes @Adam, that’s not strictly the same but indeed very similar. Instead of writing:

    def setup(db: Database) = setup(db, userRepository, userAuthenticator, …)
    // and then
    val system = setup(myConcreteDb)

    you can just as easily write:

    trait Env extends UserRepository with UserAuthenticator with … { this: Database =>
    }
    // and then
    object system extends Env with MyConcreteDb

    with similar syntactic noise level and nearly the same result.

    In fact, the innocently-looking one-arg setup method actually contains a specific configuration of components in its implementation. Perhaps that configuration deserves to be named in some meaningful way and communicated outside. In such case, defining a trait has its merits.

  • http://www.warski.org Adam Warski

    Ah yes, true, I didn’t think of defining the dep which I wish to parametrize as another self-type parameter. Mixing it in later is in fact the same as mixing it in with the others, except that it impacts the linearization order, which doesn’t matter here.

    Still, I can’t write a method which e.g. computes me an env and returns it, but that would require the language to handle types as first-class values and support meta-programming.

    Thanks! :)
    Adam

  • Kristian Domagala

    I also found that the Cake Pattern doesn’t work well with non-singleton-type dependencies: http://stackoverflow.com/questions/5190328/can-the-cake-pattern-be-used-for-non-singleton-style-dependencies

  • http://www.warski.org Adam Warski

    I think in such cases parametrizing the service is the best way; besides, in your stack overflow question the http component really shouldn’t have any knowledge about the trade service or its address; this would be best encapsulated in the trade service, and an appropriate http service could be initialized during construction.

    Adam

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    @Adam I’m not sure if I understand your proposal, but you definitely shouldn’t “new” in constructor/field initializers (with the exception of simple value objects). Object dependencies should be minimal, with benefits for testability and design. Kristian’s design is OK with respect to that rule (except for HttpServiceImpl, but the desire to keep the example clear could be the reason for that – is that guess correct @Kristian?).

    In short, Cake does not support multiple dependencies of one type. My advice would be to mix it with the plain-old constructor approach in such cases.

  • http://www.warski.org Adam Warski

    I think that cake supports something better – an easy way to create parametrized dependencies.
    The role of the HttpServiceComponent should be to provide implementations of HttpService parametrized by address. So the component should define a method, instead of a val:

    def httpService(address: String): HttpService

    The default implementation would return a new HttpService. However, for testing etc we can easily swap that out. Now, if the http service instances are cached by the HttpServiceComponent, or by the user side, is another thing – depends on the kind of service of course, if it’s stateless or stateful etc. But even if the caching is done in the component, it shouldn’t “know” about which components are going to use it – very easy to introduce a bug when adding a new http service user.

    Adam

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    @Adam Essentially what you’ve proposed is for HttpServiceComponent to become a Factory (in the sense of http://en.wikipedia.org/wiki/Factory_method_pattern), optionally equipped with caching. Nothing related particularly to Cake here, as Cake lets you manage various components, including Factories. Only HttpServiceComponent’s name should start to clearly communicate its purpose then – what about “HttpServiceFactoryComponent”?
    While in some cases it makes sense to inject a Factory into another component (so it can obtain references to new collaborators), in most cases it does more harm than good.

    For example, consider what would happen if Kristian refactored his code according to your advice. Two nice and simple classes, CompanyServiceImpl and TradeServiceImpl, would now need to know 3 new things:
    1. What is HttpServiceFactoryComponent
    2. How do you get HttpService from it
    3. What String to use to obtain correct HttpService instance

    More almost-identical code in those classes, more to worry about.

    Should it really be the responsibility of those two simple and nice classes? I think not at all. They should stay unpolluted by those concerns, remaining easier to test, less problematic to refactor, more decoupled, basically better designed.

    Regarding your comment’s last sentence, it’s not entirely clear to me what you meant, but obviously the fewer assumptions components make, the fewer chances there are for bugs to occur.

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    …and to conclude my last comment (from 19:04):

    1. Having non-singleton-type dependencies is better than having factories injected to the components in the majority of cases.
    2. Cake pattern doesn’t support this and we have to live with that.
    3. The best workaround to that is to mix the Cake with usage of plain-old constructors, like in my StackOverflow answer to Kristian’s question.

  • http://www.warski.org Adam Warski

    Exactly, it’s a factory. Though I wouldn’t agree with the other points ;).

    First of all, I don’t think factories are bad. If your code has to create, for example, rich data wrappers in a dynamic way (so without knowing upfront what the parameters will be), then factories are a perfect fit. In fact, why the additional name? Some components just provide objects, others provide a way to create objects, that is, methods. Methods are quite widely used ;). I think that because Java DI containers encourage one to create singleton dependencies (creating a factory requires some significant effort, e.g. in the number of characters that you’ve got to write), that’s what we do – most of our dependencies are singleton objects! And there’s much more power available when using methods/functions.

    As to that particular example, it may be better to have just a HttpService instance provided. It really is a question of whether the address is a configuration parameter or not (it probably is). In fact the address could just be an abstract val.

    Regarding your points (from the earlier comment):
    1. TradeServiceComp already knows about HttpServiceComp, so what’s the difference if it instead knows about HttpServiceFactoryComp?
    2. There’s only one method exposed by the component, so that’s probably how to obtain an HttpService. In the usual setting you also know “how to get a xxxService from it” – and the answer is to read a val. Here it is to invoke a method. A component is really built of some interfaces it provides and some methods to obtain instances of those interfaces.
    3. The String should be in fact a configuration parameter; if the TradeService knows as much that it must use a HttpService, a natural question is “what’s the address” – so maybe it isn’t necessary to hide it

    You could also create an intermediate component which could hold HttpService impl for the TradeService:


    trait TradeServiceHttpDataSourceComponent {
      this: HttpServiceComponent =>

      val tradeServiceHttpDataSource: HttpService
    }

    trait TradeServiceComponent {
      this: TradeServiceHttpDataSourceComponent =>
      ...
    }

    val appAssembly = new TradeServiceHttpDataSourceComponent
      with TradeServiceComponent
      with HttpServiceComponent {

      val tradeServiceHttpDataSource = HttpService("http://something")
      ...
    }

    Btw. it’s really a shame that I missed you on Java4People – Geecon maybe?

    Adam

  • Stephen

    Adam:

    > when writing tests–it is not so obvious which components should be mocked

    True. Two thoughts:

    1) People generally don’t worry about whether a servlet depends on a request’s parameters or a request’s headers–it just depends on the request.

    Testing would be easier if you knew exactly which parts of the request a servlet needed, but it’s generally accepted that the price of not knowing this is worth the benefit (simplicity, shorter parameter lists) of the Request abstraction.

    Same thing–I don’t worry which specific app-wide dependencies an object needs; it needs an Application. That’s good enough.

    2) Being ambivalent about which app-wide dependencies an object needs generally works best if you use stubs instead of mocks.

    Think of Spring’s misnamed MockHttpServletRequest, which uses state to make it easy to test servlets (you don’t need to re-mock every single detailed operation of the request–the dummy implementation is good enough).

    Same thing–I create one StubApplication class that each test instantiates. The StubApplication has stub versions of all the app’s dependencies, so it really doesn’t matter which dependencies the object under test does/does not need.

    This has the added benefit of keeping my tests clean of redundant mock setup code that, IMO, is not very dry when a dependency is used by lots of tests, but each one dedicates several LOC to mocking out its behavior. A stub can usually capture that behavior in just one place.

  • https://twitter.com/maciejb Maciej Biłas

    A short clarification on one of the issues raised on the side in previous comments:
    Scala does *not* use reflection for lazy vals. It uses the (slightly modified) double-checked initialization pattern in their accessor method.

  • http://www.warski.org Adam Warski

    Ah yes, thought so ;)

    Thanks,
    Adam

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    Sure @Adam, I can imagine that factories provide great benefits in the case of the rich wrappers you’ve mentioned, and I’m sure there are many more cases where they are a perfect fit. And I fully share your point of view about the common usage of DI frameworks, which leads to a proliferation of singletons – a horrible pathology! Still, in most cases I would use one factory per object lifecycle/scope (like application, session, request), probably splitting them as they gain more responsibilities. The blog of Miško Hevery has opened my eyes with respect to that – I really recommend his posts.

    Whether the “Factory” should be a part of the component’s name or not is actually a minor concern for me. What’s important, the purpose of the component should be clearly communicated. Should it happen through its name or in some other way (like by a convention), this is a secondary issue to me, as long as things are kept consistent.

    Now to “the beef”. I’m glad you’ve realised that in Kristian’s example injecting just a HttpService might be the proper solution. I would treat the address as a configuration parameter too. Only I’m not sure I understand your remark that the address could be an abstract val. Because you need two HttpServices (each configured differently), you probably need two such vals.

    As for your remarks on my points: they would be correct if you could assume no further changes to the software. However, the “soft” in software leads you to expect that you’ll be able to modify it easily. When you let TradeServiceComp know about HttpServiceFactoryComp, you basically increase its coupling. In practice, every change to the factory interface forces you to update TradeServiceComp and other coupled components. The 3 points I’ve listed relate to design decisions that, once changed, cause a kind of ripple effect through the factory’s clients.
    Now the critical question: should we care? What’s the probability that the factory’s interface changes? I don’t know, I’m not really good at guessing the future. But I often witness Murphy’s Law in action: when things can go wrong, they will. And I can imagine scenarios where that kind of coupling does hurt.

    Say my application was a success, it grew bigger and was split into several modules, each one handed out to a different team. Then it turned out that misconfiguration of HTTP URLs was a common source of hard-to-diagnose customer problems. Besides taking other steps, I’ve decided to change the type of the HttpServiceFactoryComp method’s parameter from String to UrlPrefix – a value object that validates the given URL and provides helpful messages in case of errors. So, down to the code. First, I’d better check out all of the modules, because that change will affect most of them. Second, I make the change and fix the compilation errors. Now I run the tests – obviously some of them fail, because the factory’s mocks expected a String and not a UrlPrefix. I fix the tests. Alright, this is it, right? Actually not – I wasn’t aware that some team was working on one of the modules I’ve changed – and they were faster with their check-in to source control. When I synchronized with the repository, a lot of conflicts came up. Happens, doesn’t it? I fix them – the guys have just renamed a lot of their classes, so I move my changes to each of them. I’m ready now, I run the tests and check in the code. The guys from yet another team, who are in the middle of another refactoring, will take care of their conflicts themselves, right? Ah wait, our other source control branches remain. According to our “propagate early and often” policy I should also apply my changes to several other branches. So, a small switch in source control and here we go…
    Sounds unrealistic? Maybe, but sadly I’ve already experienced similar situations. People waste a lot of time dealing with the consequences of excessive coupling that could have been minimized beforehand.

    Regarding your last proposal, it doesn’t solve the coupling problem, which is only moved to the new intermediary components. You could argue, that the coupling always has to be present in one or another place. That’s true, but given the choice: to have it in several modules or in one application-wide bootstrap factory, I would certainly prefer the latter. Then, should the HttpServiceFactoryComp interface change, only bootstrap factory needs changing.

    Summing up, factories make for great design choices sometimes, but you should always take care to use them properly – otherwise you may end up with troubles you haven’t expected. Introducing unnecessary coupling is an example of how you can make your life harder for yourself.
    Kristian’s point (that in Cake you cannot mix in a component twice) still holds, although in practice you can work around this.
    I really recommend the blog of Miško Hevery – it offers a lot of great advice on OO design.

    P.S. Unfortunately I wasn’t able to visit GeeCON this year. But I will gladly meet you on another occasion :)

  • http://www.warski.org Adam Warski

    Sure, I meant two abstract vals, it was only a “mind short-cut” (can you say that in English? ;) )

    And of course I’m aware of the fact that software can change, that you should have loose coupling etc – that’s pretty basic ;). Although you always expose *some* interface, either the factory one (where the clients can create HttpServices based on URLs), or the plain one (where the client just gets the HttpService). And the clients are vulnerable to changes in either of these. Which you choose depends on the use case.

    While the cake isn’t particularly well suited for double dependencies of the same type (it’s basically because each dependency has a globally unique name; this is a problem in most DI systems, solved e.g. by qualifier annotations in Guice/CDI or by giving different String names in Spring/Seam), it’s possible to work around as I guess we agreed; the nicest solution is just to use constructor parameters, but that has other drawbacks which are the reason we are not just using that but trying out different DI solutions.

    Also, if you introduce the intermediary component, nothing prevents it from being configured centrally in the application, and having the client isolated from the factory and the address that should be provided to the factory.

    Do you have any particular Misko Hevery articles in mind?

    Thanks for the discussion and a great sum-up! :)
    Adam

  • http://pl.linkedin.com/in/przemyslawpokrywka Przemek Pokrywka

    Now I’ve grasped your point about that intermediary component, I think. Having a HttpServiceComponent acting as a factory, but using it in the very place of final component assembly is definitely the right way to go, so coupling is minimized.
    Only then you don’t need to declare self-type of TSHDSComponent:

    trait TradeServiceHttpDataSourceComponent {
    this: HttpServiceComponent =>

    (I guess that it was just an oversight) – and the abstract vals are actually HttpServices and not mere Strings (that is evident in your code sample, but was not clear to me from the preceding description).

    Regarding Misko articles, the majority of them deserve to be known to even experienced OO developers, really. In this case, when I was talking about “one factory per one object lifecycle” I had http://misko.hevery.com/2008/09/10/where-have-all-the-new-operators-gone/ in mind. If you think of it, it quickly becomes evident, but even developers skilled in GoF design patterns happen to miss the basic rules in their daily practice, thus making their lives harder. Should more developers apply the basic principles wonderfully explained by Misko, the world would be a better place for all of us. Until then, never too much reminding about the basics :)

    Thanks for the discussion, too!

  • http://www.warski.org Adam Warski

    Thanks – didn’t read that one (yet). All the effort to remove “new” operators from the code makes you wonder whether we should be looking for a completely different concept (than OO programming). Somewhere where you don’t need an additional layer of abstraction to hide a new behind a factory.

    But that becomes philosophical ;)

    Adam

  • http://aloiscochard.blogspot.com Alois Cochard

    Hi!

    You might be interested in my IoC container project called “Sindi”:
    http://aloiscochard.github.com/sindi

    Currently in alpha stage but you can find a working example here:
    https://github.com/aloiscochard/sindi/tree/master/sindi-examples/demo

    Best regards,

    Alois Cochard

  • http://twitter.com/rintcius Rintcius Blok (@rintcius)

    I blogged about a variant of DI that works quite nicely in Scala: http://blog.rintcius.nl/post/di-on-steroids-with-scala-value-injection-on-traits.html