Friday, 27 May 2016

Heroku Metrics: There and Back Again

Heroku Metrics: There and Back Again: "No system is perfect, and ours isn’t some magical exception. We’ve learned some things in this exercise that we think are worth pointing out.

Sharding and Partitioning Strategies Matter 

Our strategy of using the owner column as our shard/partitioning key was a bit unfortunate, and is now hard to change. While we don’t currently see any ill effects from this, there are hypothetical situations in which it could pose a problem. For now, we have dashboards and metrics that we watch to ensure this doesn’t happen, and a lot of confidence that the systems we’ve built on will handle it in stride.
Even so, a better strategy would likely have been to shard on owner + process_type (e.g. web), which would have spread the load more evenly across the system. Beyond the more even distribution of data, from a product perspective it would mean that in a partial outage, some of an application’s metrics would remain available.

Extensibility Comes Easily with Kafka 

The performance of our Postgres cluster doesn’t worry us. As mentioned, it’s acceptable for now, but our architecture makes it trivial to swap it out, or simply add another data store to increase query throughput when that becomes necessary. We can do this by spinning up another Heroku app that uses shareable add-ons, consumes the summary topics, and writes them to a new data store, with no impact on the Postgres store!
Our system is more powerful and more extensible because of Kafka."



'via Blog this'

Thursday, 26 May 2016

A Two Month Debugging Story | Kevin Burke

A Two Month Debugging Story | Kevin Burke: "Instead we had to ship whatever logs we needed as artifacts from the test. This required turning on database logging for Postgres, then copying logs to the directory where artifacts get exported. Database logs across multiple containers shared the same filename, which meant only database logs from the first container became available as artifacts, so we also had to add a unique ID to each log file so they'd get exported properly.

This was a lot of steps, and took just enough debugging/setup time that I procrastinated for a while, hoping a better stack trace or error message might reveal itself, or that the problem would go away."



'via Blog this'

Wednesday, 25 May 2016

Agile is Dead – Again

Agile is Dead – Again: "Miko identifies Continuous Delivery as the successor to agile in the market.



 The paradigm of “Continuous Delivery (CD)” seems to be the logical successor to Agile. Continuous Delivery provides an umbrella term that does not specify methodology–and doesn’t require much of a manifesto. Everything you need to know is in the title–you just deliver shippable software in as continuous a way as possible. This allows a team to pull in whichever Agile principles and methods are needed in order to achieve that goal. This addresses one of the complaints about Agile, which is that it became a religious movement with gurus–and that these highly paid Agile gurus would come with one-size-fits-all solutions for development teams which were hard to fit realistically to real-world development."



'via Blog this'

Scaling to 100M: MySQL is a Better NoSQL | Wix Engineering

Scaling to 100M: MySQL is a Better NoSQL | Wix Engineering: "The choice to use a NoSQL database is often based on hype, or a wrong assumption that relational databases cannot perform as well as a NoSQL database. Operational costs, as well as other stability and maturity concerns, are often overlooked by engineers when it comes to selecting a database. For more information about the limitations and shortcomings of different NoSQL (and SQL) engines, take a look at the Jepsen series of articles from Aphyr.
This post will explain why we’ve found that using MySQL for the key/value use case is better than most of the dedicated NoSQL engines, and provide guidelines to follow when using MySQL in this way."



'via Blog this'

Tuesday, 24 May 2016

Taking Docker to Production with Confidence | Voxxed

Taking Docker to Production with Confidence | Voxxed: "Many organizations developing software today use Docker in one way or another. If you go to any software development or DevOps conference and ask a big crowd of people “Who uses Docker?”, most people in the room will raise their hands. But if you now ask the crowd, “Who uses Docker in production?”, most hands will fall immediately. Why is it, that such a popular technology that has enjoyed meteoric growth is so widely used at early phases of the development pipeline, but rarely used in production?"



'via Blog this'

Sunday, 22 May 2016

The InfoQ Podcast: Uber's Chief Systems Architect: No Node.js, no JSON over HTTP

The InfoQ Podcast: Uber's Chief Systems Architect on their Architecture and Rapid Growth: "In this week's podcast QCon chair Wesley Reisz talks to Matt Ranney who is the Chief Systems Architect at Uber, where he's helping build and scale everything he can. Previously, Matt was a founder and CTO of Voxer, probably the largest and busiest deployment of Node.js.

Key takeaways

Expanding a company and team at this rate is genuinely hard. Lots of mistakes have been made along the way.
Microservices allow companies to grow rapidly but have a cost in terms of aggregate velocity.
Uber is gradually moving its marketplace development from Node.js to Go and Java. Java is used for the map services.
Aggressive failure testing is used extensively in Uber.
Some early design choices - like using JSON over HTTP - make formal verification basically impossible."



'via Blog this'

Thursday, 19 May 2016

Managing Technical Debt Using Total Cost of Ownership

Managing Technical Debt Using Total Cost of Ownership: "Total Cost of Ownership (TCO) can be used for investment decisions and financial benefit analysis. When applied to software it covers the initial development costs and subsequent maintenance costs until phase out of a product. TCO can support architectural decisions and management of technical debt.

Hans Sassenburg talked about total cost of ownership analysis at the Bits&Chips Software Engineering conference. In his talk Sassenburg showed several metaphors that can be used to "sell" the concept of TCO. The metaphor of technical debt is easy to understand, but people often find it difficult to see technical debt increasing during the product lifecycle and to take action, said Sassenburg.

InfoQ did an interview with Sassenburg about using the total cost of ownership concept for managing software development, the main causes of technical debt, how to reduce technical debt or prevent its accumulation, managing R&D investments and costs, and improving the quality of software products."



'via Blog this'

Wednesday, 18 May 2016

Working with Akka Actors | Let's Do Big Data...

Working with Akka Actors | Let's Do Big Data...: "I am going to explain the Akka actor model with a simple example that fetches weather data from Yahoo, using the Akka Scala API."



'via Blog this'

Java developer's Scala cheatsheet by mbonaci

Java developer's Scala cheatsheet by mbonaci: "Shamelessly ripped off from Programming in Scala, second edition. I did ask for permission, though.
Basically, while I'm going through the book, I'm taking notes and pushing them here, so I can later use this page as a Scala quick reference. If you, by some incredible chance, find any of this useful, please do buy the book. No, I don't get a kickback. As you can see, the book link is clean :)"




'via Blog this'

Understanding Consensus and Paxos in Distributed Systems – Chord Simple

Understanding Consensus and Paxos in Distributed Systems – Chord Simple: "Now we know that the Paxos algorithm was designed to try to solve the problem of distributed consensus between a network of computers in an asynchronous system. What does the actual algorithm say? How does it get this group of machines to achieve consensus in the face of unpredictable failures?

We’ve already answered these questions (sorta) but we’ll do so here more formally by taking a sneak peek at the actual specification. Fortunately, we’ve pretty much covered the entire algorithm while constructing our auction protocol so there are very few surprises here.



Paxos uses the concepts of Proposers, Acceptors and Learners as roles in which a machine can act.

Proposers are machines with opinions. They try to impose their opinion (value) on a set of acceptors. Analogous to our auction process, these proposers are simply bidders.
Acceptors accept values proposed by proposers à la auctioneers.
Learners decide on the agreed-upon value based on the acceptors' acceptances. We had our bidders decide the agreed-upon value (the chosen bid) in our auction. The learners simply act as external agents that declare a value to be chosen."



'via Blog this'

Understanding caching in Postgres - An in-depth guide - Madusudanan

Understanding caching in Postgres - An in-depth guide - Madusudanan: "Caching can be considered an important aspect in tuning database system performance.

While this post is mainly focused on postgres, it can be easily compared and understood with other database systems. "




'via Blog this'

Stream processing, Event sourcing, Reactive, CEP… and making sense of it all — Martin Kleppmann’s blog

Stream processing, Event sourcing, Reactive, CEP… and making sense of it all — Martin Kleppmann’s blog: "Some people call it stream processing. Others call it Event Sourcing or CQRS. Some even call it Complex Event Processing. Sometimes, such self-important buzzwords are just smoke and mirrors, invented by companies who want to sell you stuff. But sometimes, they contain a kernel of wisdom which can really help us design better systems.

In this talk, we will go in search of the wisdom behind the buzzwords. We will discuss how event streams can help make your application more scalable, more reliable and more maintainable. Founded in the experience of building large-scale data systems at LinkedIn, and implemented in open source projects like Apache Kafka and Apache Samza, stream processing is finally coming of age."




Title: making sense of stream processing



'via Blog this'

MySQL EXPLAIN Explained


Serverless and Event Architectures — Medium

Serverless and Event Architectures — Medium: "I’m just starting to think these ideas out and implement these ideas into our systems. I know that there are existing architectures out there (SOA, EDA, BBQ… no… not the last one… I must be hungry) that deal with loosely coupled and distributed systems. IoT is also a space where this kind of an idea is a given too. However, I do think that Serverless is an iteration beyond the autoscaling PaaS and container based solutions and frameworks because none of those previous systems envisaged such a fragmented and loosely coupled codebase as well.
So maybe the core elements of Serverless are: 



Loosely coupled managed services combined with stateless loosely coupled micro-functions 


There will be issues
I’m sure. There are problems with the implementation of this, but it’s definitely something I’m going to try to figure out as I go forwards.
AWS Lambda isn’t everything, but combining it with all the other AWS services such as Cognito, API Gateway, SNS, SQS, S3… it does seem to have a lot of these ideas already there to experiment and try out new solutions.
Maybe the other providers need to focus less on the micro-compute side and more on the loosely coupled services. It’s not all about VMs and autoscaling any more.
Serverless is about flows and streams of data and events and messages.
This is the future (imho)"




'via Blog this'

Monday, 16 May 2016

C++ 11 Auto: How to use and avoid abuse - A CODER'S JOURNEY

C++ 11 Auto: How to use and avoid abuse - A CODER'S JOURNEY: "Fast forward 16 months and I now realize that my frustration with C++ 11 Auto keyword stemmed from the way it was used, and not the nature of the keyword itself. In fact, I’ve grown to be an advocate of using “auto” over the last year. Before I get into the reasons for being an “auto” convert, here’s a quick recap of what the “auto” keyword is.

The auto keyword simply tells the compiler to deduce the type of a declared variable from its initialization expression."




'via Blog this'

Top 10 dumb mistakes to avoid with C++ 11 smart pointers - A CODER'S JOURNEY

Top 10 dumb mistakes to avoid with C++ 11 smart pointers - A CODER'S JOURNEY: "I love the new C++ 11 smart pointers. In many ways, they were a godsend for many folks who hate managing their own memory. In my opinion, it made teaching C++ to newcomers much easier.

However, in the two plus years that I've been using them extensively, I've come across multiple cases where improper use of the C++ 11 smart pointers made the program inefficient or simply crash and burn. I've catalogued them below for easy reference. "




'via Blog this'

CompleteNewbiesClickHere - Linux Kernel Newbies

CompleteNewbiesClickHere - Linux Kernel Newbies: "First advice I got when I entered #kernelnewbies a year ago, was to download kernel version 0.01. It's a good start for someone who doesn't know jack about kernels, and wants to see a very basic one. [Dan Aloni]

[Sherilyn] The 0.01 kernel downloads to about 10000 lines of C and assembler, which is fairly manageable. Note that it's a barely functional UNIX with tons of bugs, but that doesn't stop it being useful if (like me) you just want to get a quick snapshot of the way in which a UNIX system boots up and starts executing processes, and a broad, uncluttered picture of the kernel system call API. I have written the following Wiki pages which may be of use as an introduction to kernel programming. Please add any corrections, etc."




'via Blog this'

Overcoming Swagger Annotation Overload by Switching to JSON - DZone Integration

Overcoming Swagger Annotation Overload by Switching to JSON - DZone Integration: "Despite taking the "Build First" approach, we were able to revert to a "Design First"-like state relatively painlessly. The resulting (largely) annotation-free code was much easier on the eyes and the single definition file worked much better for documentation purposes. Perhaps in the future, I'll build an API with a design more suited for Swagger annotations or use a true design first approach. But for this particular API, changing course was the right call."



'via Blog this'

Friday, 13 May 2016

What we learned from Google: code reviews aren’t just for catching bugs





http://blog.fullstory.com/2016/04/code-reviews-arent-just-for-catching-bugs/
A big chunk of the FullStory engineering team formerly worked at Google, where there is a famously strong emphasis on code quality. Perhaps the most important foundational tenet at the big G is a practice called code reviews, or, more precisely, “pre-commit” code reviews. We continue the practice at FullStory and hold it as sacrosanct.
Although much has been written about the benefits of code reviews, it isn’t yet a ubiquitous practice in the world of software development. But it should be, particularly for large engineering teams or teams with a flat management structure, e.g. no project managers or supervisors.
Contained herein are both the big, obvious engineering reasons you should adopt code reviews, as well as the more nuanced – but equally important – benefits to your customers and your own company culture.

How do code reviews work at FullStory?

While working on a new feature, Dave (for example) will cut a branch from the current version of our master product and work exclusively on that branch, a process with which I’m sure most of the coding world is intimately familiar. But before he can reintegrate the changes into master, Pratik or another qualified engineer must review his work and give him the stamp of approval: LGTM (looks good to me).
If Pratik has an issue with the way Dave has designed or coded the work, they’ll have a discussion (potentially with a long volley of back-and-forth reasoning) until they reach an agreement. Or, if Pratik has no issues, he can LGTM the work right away.


Introducing TAuth: Why OAuth 2.0 is bad for banking APIs and how we're fixing it

https://blog.teller.io/



"One of the biggest problems with OAuth 2.0 is that it delegates all security concerns to TLS, but only the client authenticates the server (via its SSL certificate); the server does not authenticate the client. This means the server has no way of knowing who is actually sending the request. Is it a bona fide user, or is an attacker tampering with the request? When an attacker is able to insinuate themselves between a legitimate user and the server, it's called a man-in-the-middle (MITM) attack.



It looks like this:

client attempts to connect to service
attacker successfully reroutes traffic to a host it controls
malicious host accepts connection from client
malicious host connects to service
service accepts connection from malicious host
client communicates with service proxied through malicious host, which can see and tamper with any data sent or received 



You're probably thinking "hang on, isn't this the point of SSL?" Yes it is, but there are a number of ways to present a bogus certificate and have a client accept it. The most realistic threat is the client developer not properly verifying the server certificate, i.e. was it ultimately signed by a trusted certificate authority?

Unfortunately a large number of developers think that disabling SSL peer verification is the correct fix to an SSL path validation error. There are many more that will offer the same advice with the caveat that it introduces a security issue that < 100% of readers will consider. As an API provider with a duty of care to our users, we can't simply hope developers on our platform don't do this."




'via Blog this'

One API, Many Facades?

One API, Many Facades?: "Web APIs increasingly have several kinds of consumers with different needs. Microservice architectures can encourage us to deploy fine-grained API facades for those needs (the so-called experience APIs or BFF patterns), but this can become an anti-pattern if you have too many distinct consumers to please, especially if you’ve got only a small team to take care of all those front ends.

Be sure to do the math! Before going one way or another, you have to study the cost of your options and whether or not you can support them. Creating different variants of an API has a cost, for the implementor as well as for the consumer, that depends on the adopted strategy. Also, once you’ve unleashed your API and given it to its consumers, perhaps it’s also time to rethink and refactor this API, as maybe you didn’t take those special device or consumer requirements well enough into account during the design phase.

If you have dedicated teams for these API facades, then it’s an option to consider. When you don’t have that luxury, there are other ways to customize payloads for your consumers without the induced complexity, with simple tricks like field filtering or the Prefer header up to full-blown solutions like custom media types or specifications like GraphQL.

But you don’t necessarily need to fire the big guns, and could opt for a middle path: one main, full API plus one or two variants for mobile devices, and you are likely going to meet the requirements of all your consumers. Consider including a pinch of field filtering, and everybody will be happy with your APIs!"




'via Blog this'

Wednesday, 11 May 2016

Could PostgreSQL 9.5 be your next JSON database?

Could PostgreSQL 9.5 be your next JSON database?: "You can use PostgreSQL to create rich, complex JSON/JSONB documents within the database. But then if you are doing that, you may want to consider whether you are using PostgreSQL well. If the richness and complexity of those documents comes from relating the documents to each other then the relational model is often the better choice for data models that have intertwined data. The relational model also has the advantage that it handles that requirement without large scale duplication within the actual data. It also has literally decades of engineering expertise backing up design decisions and optimizations.

What JSON support in PostgreSQL is about is removing the barriers to processing JSON data within an SQL based relational environment. The new 9.5 features take down another barrier, adding just enough accessible, built-in and efficient functions and operators to manipulate JSONB documents.

PostgreSQL 9.5 isn't your next JSON database, but it is a great relational database with a fully fledged JSON story. The JSON enhancements arrive alongside numerous other improvements in the relational side of the database, "upsert", skip locking and better table sampling to name a few.

It may not be your next JSON database, but PostgreSQL could well be the next database you use to work with relational and JSON data side by side."




'via Blog this'

Friday, 6 May 2016

Microservice Threading Models and Their Tradeoffs

Microservice Threading Models and Their Tradeoffs: "Architects designing Micro-Service Architectures typically focus on patterns, topology, and granularity, but one of the most fundamental decisions to make is the choice of threading model. With the proliferation of so many viable open source tools, programming languages, and technology stacks, software architects have more choices to make now than ever before.

It is very easy to get lost in the details of nuanced language and/or library differences and lose sight of what is important.

Choosing the right threading model for your micro-services and how it relates to database connectivity can mean the difference between a solution that’s good enough and a product that’s amazing."




'via Blog this'

Wednesday, 4 May 2016

Gang of Four Patterns in a Functional Light: Part 2 | Voxxed

Gang of Four Patterns in a Functional Light: Part 2 | Voxxed: "In this second part, we will continue this process and revisit two other widely used GoF patterns: the Template and the Observer patterns, which can both be reimplemented through the Java 8 Consumer interface."



'via Blog this'

Tuesday, 3 May 2016

JAX-RS 2.0 : Server side Processing Pipeline | Thinking in Java EE (at least trying to!)

JAX-RS 2.0 : Server side Processing Pipeline | Thinking in Java EE (at least trying to!): "The inspiration for this post was the Processing Pipeline section in the JAX-RS 2.0 specification doc (Appendix C). I like it because it provides a nice snapshot of all the modules in JAX-RS – in the form of a ready-to-gulp capsule!"



'via Blog this'

Voxxed: Microservices Versus SOA in Practice


Microservices Versus SOA in Practice
So microservices and SOA have a lot in common. They are both systems that typically contain sets of loosely coupled distributed components. However, the intention behind the two architectures is very different: SOA is trying to integrate applications and typically uses a central governance model to ensure that applications can interoperate. Microservices is trying to deploy new features and scale development organizations quickly and efficiently. It focuses on decentralization of governance, code reuse, and automation to do this.
To summarize:

Sunday, 1 May 2016

To become a good C programmer

To become a good C programmer: "Every once in a while I receive an email from a fellow programmer asking me what language I used for one of my games and how I learned it. Here is an entry that lists the best things to read about C.
"




'via Blog this'

A cache miss is not a cache miss

A cache miss is not a cache miss: "When writing performant code, we are careful to avoid cache misses when possible. When discussing cache misses, however, we are usually content with counting the number of cache misses. In this post I will explain why this is not sufficient, as data dependencies also make a huge difference.

Consider for example a vector of pointers to integers. If the vector is large, iterating over the integers will typically incur a cache miss per element, because they are stored at arbitrary locations in memory. If we have a list, iteration will also incur a cache miss for each element, because the nodes of the list are stored at arbitrary locations.

This could lead us to believe that cache misses have the same effect on performance for both of the above cases. That assumption is what I wanted to put to the test in this post"




'via Blog this'

C++ Has Become More Pythonic

C++ Has Become More Pythonic: "C++ has changed a lot in recent years. The last two revisions, C++11 and C++14, introduce so many new features that, in the words of Bjarne Stroustrup, “It feels like a new language.”

It’s true. Modern C++ lends itself to a whole new style of programming – and I couldn’t help noticing it has more of a Python flavor. Range-based for loops, type deduction, vector and map initializers, lambda expressions. The more you explore modern C++, the more you find Python’s fingerprints all over it.

Was Python a direct influence on modern C++? Or did Python simply adopt a few useful constructs before C++ got around to it? You be the judge."




'via Blog this'