This repository was archived by the owner on Feb 25, 2020. It is now read-only.

Remove Kafka dependency #524

@jdegoes

Description

Kafka is a large, complex piece of software which requires installation and maintenance. There are many ways for Kafka to fail, and Kafka requires ongoing management in order to prevent disk overflow and make tradeoffs between recoverability and resource usage.

While Kafka is well suited to a large-scale distributed ingest system that has to keep up with fluctuating loads and be fully redundant, it is less appropriate for a single-node analytics engine like Precog. When Precog becomes distributed, the focus will be on reading data from HDFS, not on ingesting that data, so even in the long term, direct use of Kafka in the Precog project is an unnecessary distraction.

To reduce the number of moving pieces in Precog, Kafka needs to be eliminated as a dependency.

Ingest can be as simple as batching up a chunk of data and writing it out to the (abstract) file system -- e.g. appending to the relevant file.
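
As a rough illustration of the batch-and-append ingest described above, here is a minimal sketch. It is not Precog code: the function names `append_batch` and `read_records`, the newline-delimited JSON format, and the use of the local file system in place of the abstract file system are all assumptions made for the example.

```python
import json
import os
from pathlib import Path


def append_batch(path, records):
    """Append one batch of records to `path` (hypothetical helper).

    Assumption: records are stored as newline-delimited JSON, and the
    local file system stands in for the abstract file system.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Serialize the whole batch up front so a bad record fails
    # before anything is written to the file.
    payload = "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)
    with open(path, "a", encoding="utf-8") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to persist the batch before returning
    return len(records)


def read_records(path):
    """Read back every record appended so far (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A caller would collect incoming events into a list and call `append_batch` once per batch. Concurrent writers, partial-write recovery, and file rotation are left out of this sketch and would need to be handled by whatever replaces Kafka.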

This ticket will be considered complete when Kafka is no longer a dependency of the project and is not referenced or used anywhere in the source code, unit tests, or documentation.

See @nuttycom's comment below.
