Better software through software architecture and devops

@jamessnape

Tag Archives: #deeply

  • Generally I try hard to avoid adding dependencies to a library project designed for reuse. Since Deeply is a NuGet package, I have no idea how it might end up being used, and for that reason I’m unwilling to add dependencies that might not fit with a user’s design. As a user of Deeply, however, I’m finding that I have to add the same patterns repeatedly and would rather just use a pre-existing package.

    How to reconcile these opposing arguments? I’ve decided to add a new package to NuGet - Deeply.Extras. This assembly is free to take on whatever dependencies make sense. Initially that means Autofac for its CommonServiceLocator implementation and CsvHelper to provide a CsvBulkRepository.
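
    As a rough illustration of the sort of thing Deeply.Extras enables, a CsvBulkRepository built on CsvHelper might look like the sketch below. The type name comes from the post, but the shape shown here - the constructor, the BulkCopy method, and the generic parameter - is my assumption, not the published API:

    // Illustrative sketch only: the class shape is assumed, not taken from
    // the Deeply.Extras source. Uses CsvHelper's CsvWriter to persist rows.
    using System.Collections.Generic;
    using System.Globalization;
    using System.IO;
    using CsvHelper;

    public class CsvBulkRepository<T>
    {
        private readonly string path;

        public CsvBulkRepository(string path)
        {
            this.path = path;
        }

        // Writes the incoming rows out as CSV via CsvHelper.
        // (Recent CsvHelper versions require the CultureInfo argument;
        // older versions took only the TextWriter.)
        public void BulkCopy(IEnumerable<T> rows)
        {
            using (var writer = new StreamWriter(this.path))
            using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
            {
                csv.WriteRecords(rows);
            }
        }
    }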

    This entry was posted in data-warehousing  and tagged #autofac #csvhelper #deeply #nuget  on .
  • I’ve just pushed a new version of Deeply to nuget.org. This version provides just enough functionality to write some basic ETL jobs:

    • Parallel and Sequential Tasks
    • Execute SQL Task
    • Execute Process Task
    • Simple Dataflow Task

    The tasks are pretty self-explanatory. The key point is that nearly all the setup is done in the constructor; once the structure is created, it is executed asynchronously.
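
    To make the constructor-based setup concrete, a job built from the tasks above might compose like the sketch below. The task and constructor names here are inferred from the list above; the exact signatures in the released package may differ:

    // Illustrative composition only; exact constructor signatures may differ.
    // Load two dimensions in parallel, then the fact table.
    var loadDimensions = new ParallelTask(new ITask[]
    {
        new ExecuteSqlTask(connectionFactory, "EXEC dbo.LoadCustomerDim;"),
        new ExecuteSqlTask(connectionFactory, "EXEC dbo.LoadProductDim;")
    });

    // All structure is fixed in the constructors; execution happens later.
    var job = new SequentialTask(new ITask[]
    {
        loadDimensions,
        new ExecuteSqlTask(connectionFactory, "EXEC dbo.LoadSalesFact;")
    });

    await job.ExecuteAsync(new TaskContext());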

    Data flows are a little harder to configure. You need a source, a target, and a mapping function. A source is anything implementing IEnumerable<T>; a target is a class that accepts an IEnumerable<T>, expressed through IBulkRepository<T>; and the mapping function converts source items into target items.

    The code for using a simple data flow looks a little like the pseudo-csharp below:

    var source = new CsvReader("C:\\sourcefile.csv");
    
    var connectionFactory = new SqlConnectionFactory("Data Source=(localdb)\\v11.0;");
    
    var columnMappings = new Dictionary<string, string>()
    {
        { "Id", "Id" },
        { "Name", "Name" },
        { "Created", "Created" }
    };
    
    var target = new SqlBulkRepository(
        "dbo.FactTable", connectionFactory, columnMappings);
    
    var dataflow = new SimpleDataflowTask<TSource, TTarget>(
        source, MappingFunctions.Identity, target);
    
    var context = new TaskContext();
    await dataflow.ExecuteAsync(context);

    If anyone would like to help write some examples and documentation I’d be immensely grateful; otherwise, please let me know about your experiences using this package.

    This entry was posted in data-warehousing  and tagged #dataflow #deeply #etl #nuget  on .