Monday, 27 June 2016

Linux and Performance: Reducing latency spikes by tuning the CPU scheduler

Reducing latency spikes by tuning the CPU scheduler: "It turns out that this effect can be observed when scylla runs in a different cgroup from the java process, but not when they run in the same cgroup. The reason for this effect is yet to be discovered.



 To check which cgroup a given process belongs to, one can run:

 cat /proc/$PID/cgroup

The cgroup subsystem named cpu is relevant for scheduling.
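The same check can be scripted. The sketch below is a minimal illustration rather than anything from the quoted article: it parses text in the /proc/$PID/cgroup layout (hierarchy ID, controller list, cgroup path, separated by colons) and returns the path attached to the cpu controller. The function names and the sample text are made up for the example.

```python
from typing import Dict


def parse_cgroup_file(text: str) -> Dict[str, str]:
    """Map each controller name to its cgroup path.

    Each line of /proc/<pid>/cgroup has the form
    "hierarchy-ID:controller-list:cgroup-path".
    """
    groups: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        # cgroup v2 lines have an empty controller list ("0::/path"),
        # which ends up under the empty-string key.
        for name in controllers.split(","):
            groups[name] = path
    return groups


def cpu_cgroup(text: str) -> str:
    """Return the path of the cgroup holding the cpu controller, or "" if absent."""
    return parse_cgroup_file(text).get("cpu", "")


# Made-up sample in the cgroup v1 layout; real output varies by system.
SAMPLE = "4:cpu,cpuacct:/system.slice/scylla-server.service\n3:memory:/\n"
print(cpu_cgroup(SAMPLE))
```

For a live process, pass in the contents of /proc/<pid>/cgroup; two processes share a cpu cgroup when the function returns the same path for both.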



 A process can be forced under a certain cgroup at run time with cgclassify. For example, to force all scylla and java (scylla-jmx) processes under the root group (-w is important for the change to affect all threads):

 sudo cgclassify -g cpu:/ `pgrep -w scylla`

 sudo cgclassify -g cpu:/ `pgrep -w java`



Linux has a feature called autogroup which will automatically associate each session with a different cgroup. Processes started in those sessions will inherit group affinity. Due to latency issues mentioned above, the scylla package will disable autogroup (See this commit).
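Whether autogroup is active can be read from the kernel's sched_autogroup_enabled setting under /proc/sys/kernel. The sketch below only reads that value; how the scylla package actually disables the feature is not shown in the excerpt, and the function name is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional

AUTOGROUP_SYSCTL = Path("/proc/sys/kernel/sched_autogroup_enabled")


def autogroup_enabled(sysctl: Path = AUTOGROUP_SYSCTL) -> Optional[bool]:
    """Return True/False for the autogroup setting, or None if it cannot be read."""
    try:
        return sysctl.read_text().strip() == "1"
    except OSError:
        return None  # kernel built without autogroup, or not a Linux system


print(autogroup_enabled())
```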



 Forcing scylla and scylla-jmx under the same group and applying scheduler settings has the desired effect of reducing scheduling delays. Now the maximum delay is 1.8 ms:"



'via Blog this'

Sunday, 26 June 2016

storage: A ZFS developer’s analysis of the good and bad in Apple’s new APFS file system | Ars Technica

A ZFS developer’s analysis of the good and bad in Apple’s new APFS file system | Ars Technica: "I'm not sure Apple absolutely had to replace HFS+, but likely they had passed an inflection point where continuing to maintain and evolve the 30+ year old software was more expensive than building something new. APFS is a product born of that assessment.

Based on what Apple has shown I'd surmise that its core design goals were:




  • satisfying all consumers (laptop, phone, watch, etc.) 
  • encryption as a first-class citizen 
  • snapshots for modernized backup 




Those are great goals that will benefit all Apple users, and based on the WWDC demos APFS seems to be on track (though the macOS Sierra beta isn't quite as far along).



 In the process of implementing a new file system the APFS team has added some expected features. HFS was built when 400KB floppies ruled the Earth (recognized now only as the ubiquitous and anachronistic save icon). Any file system started in 2014 should of course consider huge devices, and SSDs—check and check. Copy-on-write (COW) snapshots are the norm; making the Duplicate command in the Finder faster wasn't much of a detour. The use case is unclear—it's a classic garbage can theory solution in search of a problem—but it doesn't hurt and it makes for a fun demo. The beach ball of doom earned its nickname; APFS was naturally built to avoid it.



 There are some seemingly absent or ancillary design goals: performance, openness, and data integrity. Squeezing the most IOPS or throughput out of a device probably isn't critical on watchOS, and it's relevant only to a small percentage of macOS users. It will be interesting to see how APFS performs once it ships (measuring any earlier would only misinform the public and insult the APFS team)."




Wednesday, 22 June 2016

The Fault in Our JARs: Why We Stopped Building Fat JARs

The Fault in Our JARs: Why We Stopped Building Fat JARs: "JUN 16, 2016 / BY JONATHAN HABER

HubSpot’s backend services are almost all written in Java. We have over 1,000 microservices constantly being built and deployed. When it comes time to deploy and run one of our Java applications, its dependencies must be present on the classpath for it to work. Previously, we handled this by using the maven-shade-plugin to build a fat JAR. This takes the application and all of its dependencies and bundles them into one massive JAR. This JAR is immutable and has no external dependencies, which makes it easy to deploy and run. For years this is how we packaged all of our Java applications and it worked pretty well, but it had some serious drawbacks.

 Fat JAR Woes

 The first issue we hit is that JARs are not meant to be aggregated like this. There can be files with the same path present in multiple JARs and by default the shade plugin includes the first file in the fat JAR and discards the rest. This caused some really frustrating bugs until we figured out what was going on (for example, Jersey uses META-INF/services files to autodiscover providers and this was causing some providers to not get registered). Luckily, the shade plugin supports resource transformers that allow you to define a merge strategy when it encounters duplicate files so we were able to work around this issue. However, it’s still an extra "gotcha" that all of our developers need to be conscious of.

The other, bigger issue we ran into is that this process is slow and inefficient. Using one of our applications as an example, it contains 70 class files totalling 210KB when packaged as a JAR. But after running the shade plugin to bundle its dependencies, we end up with a fat JAR containing 101,481 files and weighing in at 158MB. Combining 100,000 tiny files into a single archive is slow. Uploading this JAR to S3 at the end of the build is slow. Downloading this JAR at deploy time is slow (and can saturate the network cards on our application servers if we have a lot of concurrent deploys)."
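The first problem described above, the same path appearing in several JARs, can be checked before any merging happens. The sketch below is an illustration under stated assumptions, not HubSpot's tooling: it treats each JAR as a zip archive and reports entries that occur in more than one of them. The function name is hypothetical.

```python
import zipfile
from collections import defaultdict
from typing import Dict, Iterable, List


def duplicate_entries(jar_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Return {entry path: [jars containing it]} for entries found in more than one JAR."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as archive:
            # set() guards against an archive listing the same name twice.
            for name in set(archive.namelist()):
                if not name.endswith("/"):  # ignore directory entries
                    owners[name].append(jar)
    return {name: jars for name, jars in owners.items() if len(jars) > 1}
```

Calling duplicate_entries with the dependency JARs of a build would list files such as META-INF/services entries that a first-one-wins merge silently drops.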




Python: Instagram Strikes a Sizable Blow in Silicon Valley’s Tabs Vs Spaces War | WIRED

Instagram Strikes a Sizable Blow in Silicon Valley’s Tabs Vs Spaces War | WIRED: "Rather than really moving away from Python or really trying to change the language, it found all sorts of small ways to tweak its Python code so that it could efficiently serve those 500 million people.

Most notably, using a tool called cProfile, Krieger and company worked to identify their slowest pieces of Python code. “We believe in measuring first before taking action,” says head of infrastructure Hui Ding. Then, using a second tool called Cython, they converted these pockets of slow code into C or C++. According to the company, this allows Instagram to run with 10 to 15 percent less processing power."
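The measure-first step can be tried with the standard library's cProfile and pstats modules. The sketch below is a minimal illustration, not Instagram's setup: handle_request and slow_sum_of_squares are made-up stand-ins for a request handler and its hot spot.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive loop standing in for a hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request() -> int:
    """Hypothetical request handler that calls the hot spot."""
    return slow_sum_of_squares(200_000)


def profile_report(func, top: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


print(profile_report(handle_request))
```

The report lists each function with its call count and cumulative time; functions near the top are the candidates one might then consider rewriting with a tool such as Cython.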




Monday, 20 June 2016

Scala thread overload meets Game of Thrones..

Sunday, 19 June 2016

joejag/dotfiles: Every dev should have some dotfiles, these are mine

joejag/dotfiles: Every dev should have some dotfiles, these are mine: "Every dev should have some dotfiles, these are mine"




Everyday Git Aliases

Everyday Git Aliases: "Git gives you as much flexibility in how you construct your VCS workflow as it does for the commands you use on your local repo. In your gitconfig file you can add aliases for your favourite commands; in this article I'll talk about mine. You can see my gitconfig on GitHub."




Anti-If: The missing patterns

Anti-If: The missing patterns: "If statements usually make your code more complicated. But we don’t want to outright ban them. I’ve seen some pretty heinous code created with the goal of removing all traces of if statements. We want to avoid falling into that trap.

For each pattern we’ll read about I’m going to give you a tolerance value for when to use it.

A single if statement which isn’t duplicated anywhere else is probably fine. It’s when you have duplicated if statements that you want your spider sense to be tingling.

At the outside of your code base, where you talk to the dangerous outside world, you are going to want to validate incoming responses and change your behaviour accordingly. But inside our own codebases, where we are behind those trusted gatekeepers, I think we have a great opportunity to use simpler, richer and more powerful alternatives."
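The excerpt does not spell out the patterns themselves, so the sketch below shows just one widely used alternative to duplicated if statements, polymorphism, and is not necessarily one of the article's own. All names are invented for the example. The single remaining check sits at the boundary where untrusted input is validated, which matches the split the excerpt describes.

```python
from dataclasses import dataclass


# Before: the same check on `kind` is repeated in every function that cares.
def fee_cents_with_ifs(kind: str, amount_cents: int) -> int:
    if kind == "premium":
        return 0
    return amount_cents * 2 // 100


def label_with_ifs(kind: str) -> str:
    if kind == "premium":
        return "Premium"
    return "Standard"


# After: each variant carries its own behaviour, so callers need no branches.
@dataclass(frozen=True)
class StandardAccount:
    label: str = "Standard"

    def fee_cents(self, amount_cents: int) -> int:
        return amount_cents * 2 // 100  # 2% fee, in whole cents


@dataclass(frozen=True)
class PremiumAccount:
    label: str = "Premium"

    def fee_cents(self, amount_cents: int) -> int:
        return 0


ACCOUNT_TYPES = {"standard": StandardAccount, "premium": PremiumAccount}


def parse_account(kind: str):
    """The one remaining check lives at the boundary, where input is validated."""
    try:
        return ACCOUNT_TYPES[kind]()
    except KeyError:
        raise ValueError(f"unknown account kind: {kind!r}") from None


account = parse_account("premium")
print(account.label, account.fee_cents(10_000))
```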




Linux directory structure chart

Friday, 17 June 2016

Don't Get Obsessed With Design Patterns - Simple Programmer

Don't Get Obsessed With Design Patterns - Simple Programmer:



"The cost of adding a design pattern

There are many different design patterns, but most of them have something in common: when you apply them, you need to lay out some structure. In other words, you need to add classes and/or interfaces to the code.

In the first example, this structure consists of an abstract class extended by two children classes. What’s more, in order for the old code to use these new classes, you also need to make some updates not directly related to the design pattern itself.

The moral of the story is: if you’re thinking about applying a design pattern, consider the cost of doing so and the potential benefit. Doing it only for the sake of doing it will make your code more complex than it needs to be."




Serverless Architectures

Serverless Architectures: "Like many trends in software there’s no one clear view of what ‘Serverless’ is, and that isn't helped by it really coming to mean two different but overlapping areas:






  1.  Serverless was first used to describe applications that significantly or fully depend on 3rd party applications / services (‘in the cloud’) to manage server-side logic and state. These are typically ‘rich client’ applications (think single page web apps, or mobile apps) that use the vast ecosystem of cloud accessible databases (like Parse, Firebase), authentication services (Auth0, AWS Cognito), etc. These types of services have been previously described as ‘(Mobile) Backend as a Service’, and I’ll be using ‘BaaS’ as a shorthand in the rest of this article. 
  2. Serverless can also mean applications where some amount of server-side logic is still written by the application developer but unlike traditional architectures is run in stateless compute containers that are event-triggered, ephemeral (may only last for one invocation), and fully managed by a 3rd party. (Thanks to ThoughtWorks for their definition in their most recent Tech Radar.) One way to think of this is ‘Functions as a service / FaaS’ . AWS Lambda is one of the most popular implementations of FaaS at present, but there are others. I’ll be using ‘FaaS’ as a shorthand for this meaning of Serverless throughout the rest of this article."
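To make the FaaS meaning concrete, here is a minimal sketch of a stateless, event-triggered function. The (event, context) signature follows the convention AWS Lambda uses for Python handlers; the event shape, the in-memory catalogue and the response fields are assumptions made for the example.

```python
import json
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Stateless, event-triggered function: everything it needs arrives in `event`."""
    query = event.get("query", "")
    # Hypothetical in-memory catalogue; a real function would call a hosted database.
    catalogue = ["cat food", "dog lead", "cat toy"]
    matches = [item for item in catalogue if query and query in item]
    return {"statusCode": 200, "body": json.dumps({"matches": matches})}


print(handler({"query": "cat"}))
```

Because the function keeps nothing between invocations, the platform is free to start a fresh container for each event and discard it afterwards.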


Let’s think about a traditional 3-tier client-oriented system with server-side logic. A good example is a typical ecommerce app (dare I say an online pet store?)
Traditionally the architecture will look something like this, and let’s say it’s implemented in Java on the server side, with an HTML / JavaScript component as the client:

Figure 1
With this architecture the client can be relatively unintelligent, with much of the logic in the system - authentication, page navigation, searching, transactions - implemented by the server application.
With a Serverless architecture this may end up looking more like this:

Figure 2



Thursday, 16 June 2016

Sunday, 12 June 2016

How To Code Like The Top Programmers At NASA — 10 Critical Rules

How To Code Like The Top Programmers At NASA — 10 Critical Rules: "NASA’s 10 rules for writing mission-critical code:





  •  Restrict all code to very simple control flow constructs – do not use goto statements, setjmp or longjmp constructs, and direct or indirect recursion. 
  • All loops must have a fixed upper-bound. It must be trivially possible for a checking tool to prove statically that a preset upper-bound on the number of iterations of a loop cannot be exceeded. If the loop-bound cannot be proven statically, the rule is considered violated.
  • Do not use dynamic memory allocation after initialization. 
  • No function should be longer than what can be printed on a single sheet of paper in a standard reference format with one line per statement and one line per declaration. Typically, this means no more than about 60 lines of code per function. 
  • The assertion density of the code should average to a minimum of two assertions per function. Assertions are used to check for anomalous conditions that should never happen in real-life executions. Assertions must always be side-effect free and should be defined as Boolean tests. When an assertion fails, an explicit recovery action must be taken, e.g., by returning an error condition to the caller of the function that executes the failing assertion. Any assertion for which a static checking tool can prove that it can never fail or never hold violates this rule (i.e., it is not possible to satisfy the rule by adding unhelpful “assert(true)” statements). 
  • Data objects must be declared at the smallest possible level of scope. 
  • The return value of non-void functions must be checked by each calling function, and the validity of parameters must be checked inside each function. 
  • The use of the preprocessor must be limited to the inclusion of header files and simple macro definitions. Token pasting, variable argument lists (ellipses), and recursive macro calls are not allowed. All macros must expand into complete syntactic units. The use of conditional compilation directives is often also dubious, but cannot always be avoided. This means that there should rarely be justification for more than one or two conditional compilation directives even in large software development efforts, beyond the standard boilerplate that avoids multiple inclusion of the same header file. Each such use should be flagged by a tool-based checker and justified in the code. 
  • The use of pointers should be restricted. Specifically, no more than one level of dereferencing is allowed. Pointer dereference operations may not be hidden in macro definitions or inside typedef declarations. Function pointers are not permitted.
  • All code must be compiled, from the first day of development, with all compiler warnings enabled at the compiler’s most pedantic setting. All code must compile with these settings without any warnings. All code must be checked daily with at least one, but preferably more than one, state-of-the-art static source code analyzer and should pass the analyses with zero warnings."
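The rules are written with C in mind, so the Python sketch below is only an analogue of three of them: a loop with a preset upper bound, parameter checks inside the function, and an explicit error condition returned to the caller instead of a crash. The function, the bound and the array-based linked list are all invented for the illustration, and Python has no static checker that proves the bound.

```python
from typing import List, Optional, Tuple

MAX_NODES = 1000  # preset upper bound on loop iterations


def find_index(next_of: List[int], start: int, target: int) -> Tuple[bool, Optional[int]]:
    """Walk a linked list stored as an array of next-indices (-1 ends the list).

    Returns (ok, steps). ok is False when a parameter check fails, a link is
    corrupt, or the bound is hit, so the caller always gets an explicit
    error condition. steps is None when target was not reached.
    """
    # Parameter validity is checked inside the function.
    if not (0 <= start < len(next_of)) or not (0 <= target < len(next_of)):
        return (False, None)

    node = start
    for steps in range(MAX_NODES):  # bounded loop instead of `while True`
        if node == target:
            return (True, steps)
        node = next_of[node]
        if node == -1:
            return (True, None)  # reached the end without finding target
        if not (0 <= node < len(next_of)):
            return (False, None)  # corrupt link: report it, do not crash
    return (False, None)  # bound exceeded, e.g. the list contains a cycle


# The caller checks the returned status before using the result.
ok, steps = find_index([1, 2, -1], start=0, target=2)
if ok:
    print("steps:", steps)
else:
    print("list walk failed")
```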





Saturday, 11 June 2016

Processes > People

Google Testing Blog: Flaky Tests at Google and How We Mitigate Them

Google Testing Blog: Flaky Tests at Google and How We Mitigate Them: "At Google, we run a very large corpus of tests continuously to validate our code submissions. Everyone from developers to project managers relies on the results of these tests to make decisions about whether the system is ready for deployment or whether code changes are OK to submit. Productivity for developers at Google relies on the ability of the tests to find real problems with the code being changed or developed in a timely and reliable fashion.

Tests are run before submission (pre-submit testing) which gates submission and verifies that changes are acceptable, and again after submission (post-submit testing) to decide whether the project is ready to be released. In both cases, all of the tests for a particular project must report a passing result before submitting code or releasing a project.

Unfortunately, across our entire corpus of tests, we see a continual rate of about 1.5% of all test runs reporting a "flaky" result. We define a "flaky" test result as a test that exhibits both a passing and a failing result with the same code. There are many root causes why tests return flaky results, including concurrency, relying on non-deterministic or undefined behaviors, flaky third party code, infrastructure problems, etc. We have invested a lot of effort in removing flakiness from tests, but overall the insertion rate is about the same as the fix rate, meaning we are stuck with a certain rate of tests that provide value, but occasionally produce a flaky result. Almost 16% of our tests have some level of flakiness associated with them! This is a staggering number; it means that more than 1 in 7 of the tests written by our world-class engineers occasionally fail in a way not caused by changes to the code or tests."
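The definition of a flaky result above translates directly into code. The sketch below is a minimal illustration, not Google's infrastructure: it marks a test as flaky when repeated runs against the same code include both a pass and a fail, then computes the share of such tests. The run history is invented.

```python
from typing import Dict, List


def is_flaky(results: List[bool]) -> bool:
    """A test is flaky if runs against the same code include both a pass and a fail."""
    return any(results) and not all(results)


def flaky_share(history: Dict[str, List[bool]]) -> float:
    """Fraction of tests with at least one pass and one fail in their history."""
    if not history:
        return 0.0
    flaky = sum(1 for runs in history.values() if is_flaky(runs))
    return flaky / len(history)


# Hypothetical run history: test name -> outcomes of repeated runs on one commit.
HISTORY = {
    "test_login": [True, True, True],
    "test_upload": [True, False, True],
    "test_parser": [False, False, False],
    "test_cache": [True, True, False],
}
print(flaky_share(HISTORY))
```

A test that fails on every run, like test_parser here, is consistently broken rather than flaky, which is why the check needs both outcomes to be present.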




Tuesday, 7 June 2016

Yet Another Distributed Systems Reading List

"Yet Another Distributed Systems Reading List 

I started putting together a reading list for engineers fairly new to distributed systems and theory, with little hands-on experience yet."


