Friday, 27 May 2016
Heroku Metrics: There and Back Again
Heroku Metrics: There and Back Again: "No system is perfect, and ours isn’t some magical exception. We’ve learned some things in this exercise that we think are worth pointing out.
Sharding and Partitioning Strategies Matter
Our strategy of using the owner column as our shard/partitioning key was a bit unfortunate, and it is now hard to change. While we don’t currently see any ill effects from it, there are hypothetical situations in which it could pose a problem. For now, we have dashboards and metrics that we watch to ensure this doesn’t happen, and a lot of confidence that the systems we’ve built on will handle it in stride.
Even so, a better strategy would likely have been to shard on owner + process_type (e.g. web), which would have spread the load more evenly across the system. Beyond the more even distribution of data, from a product perspective it would mean that during a partial outage, some of an application’s metrics would remain available.
Extensibility Comes Easily with Kafka
The performance of our Postgres cluster doesn’t worry us. As mentioned, it’s acceptable for now, and our architecture makes it trivial to swap it out, or simply add another data store, to increase query throughput when that becomes necessary. We can do this by spinning up another Heroku app that uses shareable add-ons, consumes the summary topics, and writes them to a new data store, with no impact on the Postgres store!
Our system is more powerful and more extensible because of Kafka."
'via Blog this'
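My note: the owner + process_type keying they wish they had chosen maps directly onto Kafka's keyed partitioning, since the default partitioner hashes the record key to pick a partition. Here is a minimal Java sketch; the topic, field names, and values are mine for illustration, not Heroku's:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetricProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Compound key: keying on owner alone would send every record for a
            // large app to one partition; owner + process_type spreads the load.
            String owner = "app-1234";          // hypothetical app identifier
            String processType = "web";
            String key = owner + ":" + processType;
            String value = "{\"metric\":\"memory_mb\",\"value\":512}";
            // The default partitioner hashes the key, so all records sharing an
            // owner:process_type pair stay ordered within a single partition.
            producer.send(new ProducerRecord<>("metrics", key, value));
        }
    }
}
```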
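And the "just add another store" move the post describes is simply a new consumer group on the summary topics. Another sketch, with the topic name and sink call invented:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SummaryStoreWriter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // A fresh group id means this app tracks its own offsets and never
        // disturbs the existing consumers feeding Postgres.
        props.put("group.id", "new-store-writer");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("metric-summaries")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    writeToNewStore(record.key(), record.value()); // hypothetical sink
                }
            }
        }
    }

    static void writeToNewStore(String key, String summary) {
        // Placeholder for the new data store's client call.
        System.out.printf("%s -> %s%n", key, summary);
    }
}
```

Because each consumer group keeps its own offsets, the new app can tail or replay the summary topics independently, which is exactly why the existing Postgres path is untouched.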
Wednesday, 18 May 2016
Stream processing, Event sourcing, Reactive, CEP… and making sense of it all — Martin Kleppmann’s blog
Stream processing, Event sourcing, Reactive, CEP… and making sense of it all — Martin Kleppmann’s blog: "Some people call it stream processing. Others call it Event Sourcing or CQRS. Some even call it Complex Event Processing. Sometimes, such self-important buzzwords are just smoke and mirrors, invented by companies who want to sell you stuff. But sometimes, they contain a kernel of wisdom which can really help us design better systems.
In this talk, we will go in search of the wisdom behind the buzzwords. We will discuss how event streams can help make your application more scalable, more reliable and more maintainable. Founded in the experience of building large-scale data systems at LinkedIn, and implemented in open source projects like Apache Kafka and Apache Samza, stream processing is finally coming of age."

'via Blog this'
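My note: stripped of the buzzwords, the shared idea is to record every state change as an immutable event in a log and derive current state by replaying it. A toy write-side sketch in Java against Kafka; the event shape and topic name are mine, not Kleppmann's:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventLog {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Instead of UPDATE carts SET ..., we append facts about what happened.
            // Keying by cart id keeps each cart's events ordered in one partition,
            // so any consumer can rebuild the cart's state by replaying them.
            String cartId = "cart-42";
            producer.send(new ProducerRecord<>("cart-events", cartId,
                    "{\"event\":\"ItemAdded\",\"sku\":\"A-1\",\"qty\":2}"));
            producer.send(new ProducerRecord<>("cart-events", cartId,
                    "{\"event\":\"ItemRemoved\",\"sku\":\"A-1\",\"qty\":1}"));
        }
    }
}
```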
Saturday, 19 March 2016
Efficiency Over Speed: Getting More Performance Out of Kafka Consumer - SignalFx
Efficiency Over Speed: Getting More Performance Out of Kafka Consumer - SignalFx:
At the last Kafka meetup at LinkedIn in Mountain View, I presented some work we’ve done at SignalFx to get significant performance gains by writing our own consumer/client. This assumes significant baseline knowledge of how Kafka works. Notes from the presentation are below, along with the embedded video (start watching at 01:51:09). You can find the slides here (download for animations!). A small illustrative tuning sketch follows the list below.
The presentation was broken up into three parts:
- Why we wrote a consumer
- How modern hardware works
- Three optimizations on the consumer
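My note: SignalFx's consumer is their own code, but the "efficiency over speed" trade-off in the title can be illustrated even with the stock Java consumer: ask the broker for fewer, larger fetches and process records in batches. The settings are real consumer configs; the values and topic name are illustrative, not theirs:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class EfficientConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "efficient-reader");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        // Trade latency for efficiency: have the broker wait until at least
        // 1 MB is available (or 500 ms pass), so each fetch does more work.
        props.put("fetch.min.bytes", 1_048_576);
        props.put("fetch.max.wait.ms", 500);

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("firehose")); // hypothetical topic
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                // Batch-oriented processing amortizes per-record overhead.
                records.forEach(r -> process(r.value()));
            }
        }
    }

    static void process(byte[] payload) { /* hypothetical downstream work */ }
}
```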