Keisan

Keisan Knowledge Base

Sharing the secrets of our success

Knowledge Base Article Listing

  Filtering By Tag - Topic

Real-Time Machine Learning Pipeline with Apache Spark


In a previous article entitled 'Real-Time Data Pipeline with Apache Kafka and Spark', I described how we can build a high-throughput, scalable, reliable and fault-tolerant data pipeline capable of fetching event-based data and eventually streaming those events to Apache Spark, where we processed them. I ended that article by simply using Apache Spark to consume the event-based data and print it to the console. In my last article entitled…

Real-Time Data Pipeline with Apache Kafka and Spark


It was in 2012 that I first heard the terms 'Hadoop' and 'Big Data'. At the time, the two terms were almost synonymous - I would frequently attend meetings where clients wanted a 'Big Data' solution simply because it had become the latest buzzword, with little or no consideration as to whether their requirements and data actually warranted one. 'Big Data' is of course more than just Hadoop, and as scalable technologies for both batch and real-time processing matured, so did our knowledge of them. However, for those of you…
