Flowman

Comfortable ETL with Apache Spark

Boost your ETL jobs with Spark by implementing Flowman.

Flowman Overview

The approach and idea of Flowman

Flowman is an open source project developed by dimajix that helps your company develop ETL jobs based on Apache Spark. The core idea of Flowman is to specify the data flow purely declaratively and then have it executed by Flowman itself, which is a flexible Spark application.

With this approach, you cleanly separate the business logic from all the technical details needed for production operation. This allows you to focus on business logic, while Flowman, as a mature Spark application, takes care of the technical details to ensure stable execution. This includes exporting relevant metrics for monitoring, consistent logging, support for clean reruns, and more.

The data flows themselves are stored in YAML files and, in contrast to classic Scala/Java code, can be followed by a business expert after only a short introduction. In this way, you can involve existing expert knowledge more closely in development and catch errors at an early stage.
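To make the declarative idea more concrete, here is a minimal sketch in Python. It does not use Flowman and does not reproduce Flowman's YAML format: the keys in FLOW_SPEC, the step kinds, and the run_flow function are all hypothetical names invented for this illustration. It only shows the general pattern of describing a data flow as plain data and letting a generic executor run it.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical declarative flow description: plain data, no executable logic.
# This is NOT Flowman's schema; all keys are assumptions made for illustration.
FLOW_SPEC = {
    "source": "orders",
    "steps": [
        {"kind": "filter", "column": "status", "equals": "paid"},
        {"kind": "select", "columns": ["order_id", "amount"]},
    ],
}


def apply_filter(rows: List[Row], step: dict) -> List[Row]:
    # Keep only rows whose column matches the configured value.
    return [r for r in rows if r[step["column"]] == step["equals"]]


def apply_select(rows: List[Row], step: dict) -> List[Row]:
    # Project each row onto the configured columns.
    return [{c: r[c] for c in step["columns"]} for r in rows]


# The generic executor knows how to run each step kind.
STEP_HANDLERS: Dict[str, Callable[[List[Row], dict], List[Row]]] = {
    "filter": apply_filter,
    "select": apply_select,
}


def run_flow(spec: dict, tables: Dict[str, List[Row]]) -> List[Row]:
    # Read the source table, then apply the steps in the declared order.
    rows = tables[spec["source"]]
    for step in spec["steps"]:
        rows = STEP_HANDLERS[step["kind"]](rows, step)
    return rows


if __name__ == "__main__":
    tables = {
        "orders": [
            {"order_id": 1, "status": "paid", "amount": 10.0},
            {"order_id": 2, "status": "open", "amount": 5.0},
        ]
    }
    print(run_flow(FLOW_SPEC, tables))
```

The description (FLOW_SPEC) contains only business decisions such as which rows to keep and which columns to output, while the executor holds the technical machinery. In this sketch, changing the flow means editing data rather than code.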

Product Features

Flowman provides the following features:

100% Open Source (Apache License)

Based on Apache Spark

Flexible specification of data flows

Automatic schema management (creation and migration of tables)

Versatile command line tool for execution

Integrated Metrics for Monitoring

Supports Hadoop and Kubernetes

Supports AWS and Azure (S3 and ABS)

Advantages

Using Flowman brings the following advantages:

Open Source.

There are no license costs, and at the same time you benefit from further development. The liberal Apache license allows you to make internal changes without the obligation to publish them.

Extensibility.

A plugin interface allows you to develop missing functionality yourself without having to disclose it.

Relief for Developers.

Your developers can concentrate on the business logic while Flowman takes care of the technical details.

Uniform Solution.

Instead of a loose collection of different Spark applications, you can use a unified solution that covers all the essential requirements. This avoids developing several solutions to similar problems in parallel.

We are here to support you. Contact
