Heroku Metrics: There and Back Again: "No system is perfect, and ours isn’t some magical exception. We’ve learned some things in this exercise that we think are worth pointing out.
Sharding and Partitioning Strategies Matter
Our strategy of using the owner column for our shard/partitioning key was a bit unfortunate, and now hard to change. While we don’t currently see any ill effects from this, there are hypothetical situations in which this could pose a problem. For now, we have dashboards and metrics which we watch to ensure that this doesn’t happen and a lot of confidence that the systems we’ve built upon will actually handle it in stride.
Even still, a better strategy, likely, would have been to shard on owner + process_type (e.g. web), which would have spread the load more evenly across the system. In addition to the more even distribution of data, from a product perspective it would mean that in a partial outage, some of an application’s metrics would remain available.
Extensibility Comes Easily with Kafka
The performance of our Postgres cluster doesn’t worry us. As mentioned, it’s acceptable for now, but our architecture makes it trivial to swap out, or simply add another data store to increase query throughput when it becomes necessary. We can do this by spinning up another Heroku app that uses shareable addons, starts consuming the summary topics and writes them to a new data store, with no impact to the Postgres store!
Our system is more powerful and more extensible because of Kafka."
'via Blog this'
No comments:
Post a Comment