I’ve just pushed a new version of Deeply to nuget.org. This version provides just enough functionality to write some basic ETL jobs:
- Parallel and Sequential Tasks
- Execute SQL Task
- Execute Process Task
- Simple Dataflow Task
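To give a feel for how these compose, here is a hypothetical pseudo-C# sketch. The exact constructor signatures and names here are my assumptions for illustration, not documented Deeply API:

```csharp
// Hypothetical sketch: composing Deeply tasks (signatures assumed).
// A sequential task runs its children in order; a parallel task
// runs its children concurrently.
var loadDimensions = new ParallelTask(
    new ExecuteSqlTask(connectionFactory, "EXEC dbo.LoadCustomerDim;"),
    new ExecuteSqlTask(connectionFactory, "EXEC dbo.LoadProductDim;"));

var job = new SequentialTask(
    loadDimensions,                       // dimensions first, in parallel
    new ExecuteProcessTask("extract.exe"), // then an external process
    dataflow);                            // then the fact-table dataflow

await job.ExecuteAsync(new TaskContext());
```

As with the dataflow example below, all the wiring happens in the constructors and the whole tree is kicked off with a single ExecuteAsync call.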
The tasks are pretty self-explanatory. The key point is that nearly all the setup is done in the constructor; once the structure is created, it is executed asynchronously.
Data flows are a little harder to configure. You need a source, a target, and a mapping function. A source is anything conforming to IEnumerable&lt;T&gt;; a target is a class that accepts an IEnumerable&lt;T&gt;, implemented in IBulkRepository&lt;T&gt;; and finally a mapping function maps each source item to a target item.
The code for using a simple data flow looks a little like the pseudo-csharp below:
var source = new CsvReader("C:\\sourcefile.csv");
var connectionFactory = new SqlConnectionFactory("Data Source=(localdb)\\v11.0;");
var columnMappings = new Dictionary<string, string>()
{
{ "Id", "Id" },
{ "Name", "Name" },
{ "Created", "Created" }
};
var target = new SqlBulkRepository(
"dbo.FactTable", connectionFactory, columnMappings);
var dataflow = new SimpleDataflowTask<TSource, TTarget>(
    source, MappingFunctions.Identity, target);
var context = new TaskContext();
await dataflow.ExecuteAsync(context);
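The example above uses the identity mapping, so source and target types match. When they differ, the mapping function is just a method (or delegate) from a source row to a target row; a non-identity version might look like this, where CsvRow and FactRow are hypothetical types invented for illustration:

```csharp
// Hypothetical non-identity mapping: project a parsed CSV row
// onto a strongly typed fact-table row.
private static FactRow MapRow(CsvRow row)
{
    return new FactRow
    {
        Id = int.Parse(row["Id"]),
        Name = row["Name"].Trim(),
        Created = DateTime.Parse(row["Created"])
    };
}
```

You would then pass MapRow in place of MappingFunctions.Identity when constructing the dataflow task.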
If anyone would like to help write some examples and documentation I’d be immensely grateful; otherwise, please let me know about your experiences using this package.