secfree - A comparison of Storm with Spark Streaming

Feature	Spark Streaming	Storm
Latency	A few seconds	Sub-second
Data guarantee	Exactly once	At least once ¹
Processing model	Micro-batch	One record at a time
Failure recovery cost	Low ²	High ³
Resource manager integration	YARN, Mesos	YARN, Mesos
Consistency break condition	An output operation fails	An event is replayed
Popularity ⁴	Higher ⁵	Lower
Development cost	Lower ⁶	Higher
Batch framework integration	Spark	N/A
Message passing layer	Netty, Akka	Netty or ZeroMQ
Implementation language	Scala	Clojure
Hadoop distribution support	Hortonworks, Cloudera, MapR	Hortonworks, MapR
Company support	Databricks	N/A
Origin	UC Berkeley	BackType, Twitter
In production since	2013	2011

¹ Storm's Trident library also provides exactly-once processing, but it relies on transactions to update state, which is slower and often has to be implemented by the user.
² Because each Spark RDD records its dependency chain, it is easy to recover from a failure by replaying the computation from the source; there is no need to track every intermediate state.
³ Each individual record has to be tracked as it moves through the system.
⁴ Judged by code commit velocity and issue velocity.
⁵ Spark also has a better ecosystem.
⁶ With Spark, the same code base can be used for batch processing and stream processing.
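
The notes on replayed events and on Trident's transactional state updates can be illustrated with a small sketch. It is plain Python, not Storm or Trident code: the class names `NaiveCounter` and `TransactionalCounter` and the `run` helper are hypothetical, and it assumes that batches arrive in transaction-id order and that a replayed batch carries the same id and the same contents as the original.

```python
from dataclasses import dataclass


@dataclass
class NaiveCounter:
    """State updated blindly: a replayed batch is counted twice."""
    total: int = 0

    def apply(self, txid, batch):
        self.total += sum(batch)


@dataclass
class TransactionalCounter:
    """State stored together with the id of the last batch applied to it."""
    total: int = 0
    last_txid: int = -1

    def apply(self, txid, batch):
        # Assumption: batches arrive in txid order, and a replay reuses
        # the txid and contents of the batch it repeats.
        if txid <= self.last_txid:
            return  # already applied, so the replay changes nothing
        self.total += sum(batch)
        self.last_txid = txid


def run(counter, deliveries):
    """Feed every (txid, batch) delivery to the counter and return its total."""
    for txid, batch in deliveries:
        counter.apply(txid, batch)
    return counter.total


# Batch 1 is delivered twice, as it would be after a failure and a replay.
deliveries = [(0, [1, 2]), (1, [3, 4]), (1, [3, 4]), (2, [5])]

naive_total = run(NaiveCounter(), deliveries)
transactional_total = run(TransactionalCounter(), deliveries)
```

With at-least-once delivery alone, the naive counter ends up at 22 because the replayed batch is added twice. The transactional counter ends at the correct 15, at the price of storing and checking an id on every update, which is the extra cost and extra user code the note refers to.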

A comparison of Storm with Spark Streaming by secfree was published on 2016-03-10