Friday, 24 April 2015

IoT: Internet of Things links

This article by Alok Batra started me thinking about the differences between the Enterprise, Mobile, and IoT development spaces and my own personal journey down this path. I am sure my thinking will change as my IoT skills and knowledge mature - this is just a moment in time - but I thought writing it all down would be valuable.

Source: https://www.linkedin.com/pulse/becoming-iot-developer-alok-batra


Thursday, 23 April 2015

How B-tree indexing works in MySQL

http://stackoverflow.com/questions/2362667/how-b-tree-indexing-works-in-mysql

Q.  When I create an index for a table in MySQL, I see that the index_type is type BTREE. Now although I understand about btree(s), I do not quite understand how it stores the index and how the database searches the records based on this.
I mean, btree is excellent for databases that perform reads and writes of large blocks of data. When we create an index for a column of type primary key, what I understand is that it creates a tree, splitting the values for the root based on the value type of the root.
Now, does it store only the primary key ID under the trees or the whole data associated with that primary key?
After finding the wanted primary ID, how does the database extract the record?

A. The database stores the value indexed as a B-Tree key, and the record pointer as a B-Tree value.
Whenever you search for a record holding a certain value of an indexed column, the engine locates the key holding this value in the B-Tree, retrieves the pointer to the record and fetches the record.
What exactly a "record pointer" is depends on the storage engine.
  • In MyISAM, the record pointer is an offset to the record in the MYI file.
  • In InnoDB, the record pointer is the value of the PRIMARY KEY.
In InnoDB, the table itself is a B-Tree with the PRIMARY KEY as the B-Tree key. This is what is called a "clustered index" or "index-organized table". In this case, all other fields are stored as the B-Tree value.
In MyISAM, the records are stored without any special order. This is called "heap storage".
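
To make the two-step InnoDB lookup concrete, here is a toy Python sketch of the idea (illustrative only, not real storage-engine internals): a secondary index maps an indexed value to a PRIMARY KEY, and the clustered index maps that key to the full row.

```python
import bisect

# Toy "clustered index": PRIMARY KEY -> full row (InnoDB-style, simplified)
clustered = {1: ("alice", 30), 2: ("bob", 25), 3: ("carol", 35)}

# Toy secondary index on name: sorted (value, pk) pairs, searched the way
# a B-Tree key is searched (here with binary search over a sorted list)
name_index = sorted((row[0], pk) for pk, row in clustered.items())

def find_by_name(name):
    i = bisect.bisect_left(name_index, (name,))
    if i < len(name_index) and name_index[i][0] == name:
        pk = name_index[i][1]   # the "record pointer" is the PRIMARY KEY
        return clustered[pk]    # second lookup fetches the actual record
    return None

print(find_by_name("bob"))      # ('bob', 25)
```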

Wednesday, 22 April 2015

Basic Search Algorithms


In computer science, binary search trees (BST), sometimes called ordered or sorted binary trees, are a class of data structures used to implement lookup tables and dynamic sets. They store data items, known as keys, and allow fast insertion and deletion of such keys, as well as checking whether a key is present in a tree.
Binary search trees keep their keys in sorted order, so that lookup and other operations can use the principle of binary search: when looking for a key in a tree (or a place to insert a new key), they traverse the tree from root to leaf, making comparisons to keys stored in the nodes of the tree and deciding, based on the comparison, to continue searching in the left or right subtrees. On average, this means that each comparison allows the operations to skip over half of the tree, so that each lookup/insertion/deletion takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an unsorted array, but slower than the corresponding operations on hash tables.
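
A minimal Python sketch of the root-to-leaf traversal described above (insertion and lookup only; balancing is what real implementations add on top):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                  # duplicate keys are ignored

def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        # each comparison discards one whole subtree: O(log n) on average
        root = root.left if key < root.key else root.right
    return False

root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
print(contains(root, 6), contains(root, 7))   # True False
```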

General Big Data, Big Analytics links

5 technologies that will help big data cross the chasm

Apache Spark

Spark’s popularity is aided by the YARN resource manager for Hadoop and the Apache Mesos cluster-management software, both of which make it possible to run Spark, MapReduce and other processing engines on the same cluster using the same Hadoop storage layer. I wrote in 2012 about the move away from MapReduce as one of five big trends helping us rethink big data, and Spark has stepped up as the biggest part of that migration.

2015: Thomas W. Dinsmore: Predictions for Big Analytics

Apache Spark usage will explode.
Analytics in the cloud will take off.
Python will continue to gain on R as the preferred open source analytics platform.
H2O will continue to win respect and customers in the Big Analytics market.
SAS customers will continue to seek alternatives.


Business Intelligence links

Gartner: Survey Analysis: Customers Rate Their BI Platform Vendor, 2014


High Scalability links

Paper: Staring Into The Abyss: An Evaluation Of Concurrency Control With One Thousand Cores

We implemented seven concurrency control algorithms on a main-memory DBMS and, using computer simulations, scaled our system to 1024 cores. Our analysis shows that all algorithms fail to scale to this magnitude but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation and argue that even state-of-the-art DBMSs suffer from these limitations.

General database links


tl;dr: ACID and NewSQL databases rarely provide true ACID guarantees by default, if they are supported at all. See the table.

Friday, 17 April 2015

Software Architecture links

Ben Morris: What role do architects have in agile development?

Enter the “master developer”

In this case, some kind of design authority is generally required to work across the teams to ensure the integrity of the overall system and spot issues before they become obstacles. This role shouldn’t be confused with governance, where design edicts are sent down to teams from the “ivory towers” and architecture boards. It can only be effective with the consent of the teams.

Akka links

Why do I hate akka?

If you have talked to me for more than a few minutes about the current state of the world of scala programming, you probably have learned that at some point I started hating akka. Why? There are many reasons, but the one I will name first is the one that many will name first. That a partial function Any => Unit is a horrible type to build a framework around.

General Computer Science links

2 + 2 = 5

Interesting attempts to make this work in various popular programming languages.
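
A tame Python variant of the joke (the linked attempts get far more devious) just overloads addition on an int subclass; the Trollint name is made up for this example:

```python
class Trollint(int):
    def __add__(self, other):
        # deliberately off by one
        return Trollint(int.__add__(self, other) + 1)

two = Trollint(2)
print(two + 2)   # 5
```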


Not light reading!

How to Design Programs

Structure and Interpretation of Computer Programs

Dean Wampler: SQL Strikes Back! Recent Trends in Data Persistence and Analysis


Traditional Data Warehouse:
Pros:
  • Mature
  • Rich SQL, analytics functions
  • Scales to “mid-size” data
Cons:
  • Expensive per TB
  • Can’t scale to Hadoop-sized data sets

Data Warehouse vs. Hadoop?
  • Data Warehouse
    + Mature
    + Rich SQL, analytics
    – Scalability
    – $$/TB
  • Hadoop
    – Maturity vs. DWs
    + Growing SQL
    + Massive scalability
    + Excellent $$/TB

Facebook had data in Hadoop.  Facebook’s Data Analysts needed access to it...
 so they created Hive...

Career Development links

Five Signs You Should Be a Low-Code Developer
If some or most of these traits resonate with your approach to work, then you’ve got what it takes to be a low-code developer. In this role, you’d understand that software development is about reaching the business goal and helping end users. You’d want to talk to users, understand their requirements and work closely with them in short, iterative cycles. Most of all, you’d advocate for the business value of IT and find great job satisfaction from making your customers and end users happy.

techcrunch: On Secretly Terrible Engineers

That is the transformation we need in engineering. We need to start with the assumption that engineers are smart learners eager to know more about their craft. No, an individual may not know the specific framework you use for front-end development, but then again, there are so many that it is hard to know all of them. Engage them! Mentor them! Buy them a god damn book! 
We need to move beyond the algorithm bravado to engage more fundamentally with the craft. If people are wired for engineering logic and have programmed in some capacity in the past, they almost certainly can get up to speed in any other part of the field. Let them learn, or even better, help them learn.
I am not unbiased here, having gone through this process myself. I started programming in second grade. I wrote tens of thousands of lines of code in high school, programming games and my own web server. I got a Mathematical and Computational Science degree from Stanford and continued coding. I should have been a software developer, but after a series of interviews, I realized the field was never for me. So much hostility, so little love. 
No one ever offered me a book. No one even offered advice, or suggestions on what was interesting in the field or what was not. No one ever said, “Here is how we are going to bring your skills to the next level and ensure you will be quickly productive on our team.” The only answer I ever got was, “We expect every employee to be ready on day one.” What a scary proposition! Even McDonalds doesn’t expect its burger flippers to be ready from day one. 
That’s not typical in our economy, and as computer science expands in popularity, we need to ensure that the next generation of talent feels welcomed. There are far fewer secretly terrible engineers than we might expect if we give them mentorship and support to do great work. There is a whole group of secretly great engineers ready to be developed, if only we realized our field’s animosity.

Funny and true.
If Carpenters Were Hired Like Programmers


Facebook Coding Interview tips:
https://www.facebook.com/Engineering/videos/10153034561822200/

Aruoba/Fernández-Villaverde: Comparison of Programming Languages in Economics

http://economics.sas.upenn.edu/~jesusfv/comparison_languages.pdf

We solve the stochastic neoclassical growth model, the workhorse of modern macroeconomics, using C++11, Fortran 2008, Java, Julia, Python, Matlab, Mathematica, and R. We implement the same algorithm, value function iteration with grid search, in each of the languages. We report the execution times of the codes on a Mac and on a Windows computer and comment on the strengths and weaknesses of each language.
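
For readers unfamiliar with the method, here is a minimal Python sketch of value function iteration with grid search, applied to a deterministic version of the growth model (log utility, full depreciation); the parameter values are illustrative, not those used in the paper:

```python
import numpy as np

alpha, beta = 0.3, 0.95                      # illustrative parameters
grid = np.linspace(0.05, 0.5, 500)           # grid over capital k
V = np.zeros(len(grid))                      # initial guess for V(k)

# consumption for every (k, k') pair; infeasible choices get -inf utility
c = grid[:, None] ** alpha - grid[None, :]
utility = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)

for _ in range(1000):
    V_new = np.max(utility + beta * V[None, :], axis=1)   # grid search over k'
    if np.max(np.abs(V_new - V)) < 1e-6:                  # sup-norm convergence
        V = V_new
        break
    V = V_new

policy = grid[np.argmax(utility + beta * V[None, :], axis=1)]  # optimal k'(k)
```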

Maven, Git, Jenkins software build tool links

http://www.slideshare.net/tarkasteve/understanding-git-voxxed-vienna-2016

19 Tips For Everyday Git Use

I’ve been using git full time for the past 4 years, and I wanted to share the most practical tips that I’ve learned along the way. Hopefully, it will be useful to somebody out there.
If you are completely new to git, I suggest reading Git Cheat Sheet first. This article is aimed at somebody who has been using git for three months or more.
Table of Contents:
  1. Parameters for better logging
  2. Log actual changes in a file
  3. Only Log changes for some specific lines in file
  4. Log changes not yet merged to the parent branch
  5. Extract a file from another branch
  6. Some notes on rebasing
  7. Remember the branch structure after a local merge
  8. Fix your previous commit, instead of making a new commit
  9. Three stages in git, and how to move between them
  10. Revert a commit, softly
  11. See diff-erence for the entire project (not just one file at a time) in a 3rd party diff tool
  12. Ignore the white space
  13. Only “add” some changes from a file
  14. Discover and zap those old branches
  15. Stash only some files
  16. Good commit messages
  17. Git Auto-completion
  18. Create aliases for your most frequently used commands
  19. Quickly find a commit that broke your feature (EXTRA AWESOME)

GitHub Pull Requests


GitHub’s mission is to make it easier to work together than alone. Throughout the company’s history, they have worked toward this goal by providing an easy way to host Git repositories online and surrounding those repositories with a growing set of collaborative mechanisms that work in the browser and through Git itself.
Pull Requests may be the most important of these innovations. They have enabled increased open-source contributions, provided new ways for enterprise teams to work together, and offered a full-featured code review mechanism—all at the cost of a few Git commands and a simple web user interface. Let’s take a look at how pull requests work and how to use them in open-source and enterprise environments.

Neal Ford: Why Everyone (Eventually) Hates (or Leaves) Maven

Which is why every project eventually hates Maven. Maven is a classic contextual tool: it is opinionated, rigid, generic, and dogmatic, which is exactly what is needed at the beginning of a project. Before anything exists, it’s nice for something to impose a structure, and to make it trivial to add behavior via plug-ins and other pre-built niceties. But over time, the project becomes less generic and more like a real, messy project. Early on, when no one knows enough to have opinions about things like lifecycle, a rigid system is good. Over time, though, project complexity requires developers to spawn opinions, and tools like Maven don’t care.

General Python links

The Hitchhiker’s Guide to Python!

Understanding Python Decorators in 12 Easy Steps!

https://caniusepython3.com/

Google Style Guide for Python

Profiling Python in Production

The Elements of Python Style

Code Like a Pythonista: Idiomatic Python
Other languages have "variables", Python has "names"
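
The distinction in a few lines: in Python, assignment binds a name to an object; it never copies the object.

```python
a = [1, 2, 3]
b = a            # 'b' is a second name bound to the same list object
b.append(4)
print(a)         # [1, 2, 3, 4] -- both names refer to one object
```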

Records: SQL for Humans™

Python utilities that should be builtins

Python Mocking 101: Fake It Before You Make It
An Introduction to Mocking in Python


General Software Development process links

Microsoft Research: Exploding Software-Engineering Myths

The logical assumption would be that more code coverage results in higher-quality code. But what Nagappan and his colleagues saw was that, contrary to what is taught in academia, higher code coverage was not the best measure of post-release failures in the field. Code coverage is not indicative of usage.

What the research team found was that the TDD teams produced code that was 60 to 90 percent better in terms of defect density than non-TDD teams. They also discovered that TDD teams took longer to complete their projects—15 to 35 percent longer.

The team observed a definite negative correlation: more assertions and code verifications mean fewer bugs. Looking behind the straight statistical evidence, they also found a contextual variable: experience. Software engineers who were able to make productive use of assertions in their code base tended to be well-trained and experienced, a factor that contributed to the end results. These factors built an empirical body of knowledge that proved the utility of assertions.
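
As a hypothetical illustration of what “productive use of assertions” can look like, here is a sketch of assertions encoding pre- and postconditions (the function and its contract are invented for this example):

```python
def take_region(buffer, size):
    # Preconditions: document the caller's contract
    assert size > 0, "size must be positive"
    assert size <= len(buffer), "request exceeds buffer capacity"
    region = buffer[:size]
    # Postcondition: we hand back exactly what was asked for
    assert len(region) == size
    return region

print(take_region(b"abcdef", 3))   # b'abc'
```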


Thursday, 16 April 2015

zeroturnaround.com: Architecting Large Enterprise Java Projects with Markus Eisele

http://zeroturnaround.com/rebellabs/architecting-large-enterprise-java-projects-by-markus-eisele/

Developers built a lot of applications like that some time ago, and they even build them today! These applications are still working and need maintenance. So we see them sometimes and call them legacy. They tend to have a release cycle of once or twice a year, depend on a proprietary application server environment and most importantly have a single database schema for all data. Naturally, you cannot move very fast with such a beast on your shoulders and must have a large team and QA department even just to maintain it.

The next step in the architecture design was the Enterprise Service Bus age: understanding that changes have to be incorporated into even the oldest and most legacy applications, we (Java developers) started breaking the huge apps into smaller ones. The biggest challenge was to integrate it all together, so the service bus seemed the best solution.
The change wasn’t that big for the operations teams, as they still had everything under their control and centralized, although it was a much more flexible approach. However, the same centralization that adds value creates a raft of problems that the engineering teams had to solve: most importantly, challenges with testing and the single point of failure (SPOF).
Now we’re moving even further away from the monolithic apps and towards the trending buzzword of Microservices.

Then there are a number of patterns you can use to organise the communication between your microservices, like the Aggregator or the Chain; a rough sketch of the Aggregator follows.
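
A rough sketch of the Aggregator idea, with hypothetical service names and asyncio sleeps standing in for real network calls: one edge service fans out to several microservices concurrently and merges their responses.

```python
import asyncio

async def fetch_profile(user_id):
    await asyncio.sleep(0.1)                  # simulate network latency
    return {"name": "Ada"}

async def fetch_orders(user_id):
    await asyncio.sleep(0.1)
    return {"orders": [101, 102]}

async def user_dashboard(user_id):
    # the Aggregator: concurrent fan-out, then one combined response
    profile, orders = await asyncio.gather(
        fetch_profile(user_id), fetch_orders(user_id)
    )
    return {**profile, **orders}

print(asyncio.run(user_dashboard(42)))        # {'name': 'Ada', 'orders': [101, 102]}
```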

Visualisation, HTML5, UX and HTTP links

modeling-languages.com: 10 JavaScript libraries to draw your own diagrams

Comparative table of JavaScript drawing libraries

To finish, here is a basic comparative table of the presented libraries.
(Format: library: license; language/infrastructure; high/low level; built-in editor; GitHub stats as of 04/02/2015.)
  • JointJS: MPL; HTML/JavaScript/SVG; high-level; no built-in editor; 1388 stars, 265 forks
  • Rappid: commercial (1,500.00 €); HTML/JavaScript/SVG; high-level; built-in editor
  • Mxgraph: commercial (4,300.00 €); HTML/JavaScript/SVG; high-level; built-in editor
  • GoJS: commercial ($1,350.00); HTML/Canvas/JavaScript; high-level; built-in editor
  • Raphael: MIT; HTML/JavaScript/SVG; low-level; no built-in editor; 7105 stars, 1078 forks
  • Draw2D: GPL2/commercial; HTML/JavaScript/SVG; medium-level; no built-in editor
  • D3: BSD; HTML/JavaScript/SVG; low-level; no built-in editor; 36218 stars, 9142 forks
  • FabricJS: MIT; HTML/Canvas/JavaScript; low-level; no built-in editor; 4127 stars, 705 forks
  • paperJS: MIT; HTML/Canvas/JavaScript; low-level; no built-in editor; 4887 stars, 496 forks
  • jsPlumb: MIT/GPL2; HTML/JavaScript; medium-level; no built-in editor; 2161 stars, 563 forks

Javascript links

The Original jQuery Source Code, Annotated by John Resig

The mind-boggling universe of JavaScript Module strategies

JavaScript's 'bind' Explained in 5 Minutes

Learn JS Data Data manipulation, munging, and processing in JavaScript


15 Helpful Chrome Extensions for Developers You Need to Know

JavaScript Module Pattern: In-Depth

The module pattern is a common JavaScript coding pattern. It’s generally well understood, but there are a number of advanced uses that have not gotten a lot of attention. 

The Path to Parallel JavaScript

Javascript for Java Developers

You’re Missing the Point of Server-Side Rendered JavaScript Apps

Learn JavaScript Essentials (for all skill levels)

General Java/JavaEE links

Java 8 Stream Tutorial

How to use flatMap() in Java 8 - Stream Example Tutorial

Continuous Delivery with Docker Containers and Java EE

Microservices, DevOps and PaaS - The Impact on Modern Java EE Architecture

What Would ESBs Look Like If They Were Done Today?

EJB and CDI - Alignment and Strategy

Upgrading to Java 8 at Scale

Productive Java EE 7 on Java 8 At Commerzbank

Basics of scaling Java EE applications

Design pattern samples in Java

Java 8’s Method References Put Further Restrictions on Overloading


Dismantling invokedynamic

Javascript for Java Developers

A curated list of awesome Java frameworks, libraries and software

Java 8 Streams cheat sheet

Palladium: Predictive Analytics, Machine Learning framework

Palladium provides means to easily set up predictive analytics services as web services. It is a pluggable framework for developing real-world machine learning solutions. It provides generic implementations for things commonly needed in machine learning, such as dataset loading, model training with parameter search, a web service, and persistence capabilities, allowing you to concentrate on the core task of developing an accurate machine learning model. Having a well-tested core framework that is used for a number of different services can lead to a reduction of costs during development and maintenance due to harmonization of different services being based on the same code base and identical processes. Palladium has a web service overhead of a few milliseconds only, making it possible to set up services with low response times...

blog.xebialabs.com: Before You Go Over the Container Cliff with Docker, Mesos etc: Points to Consider

I’m personally really excited about the potential of microservices and containers, and typically recommend pretty emphatically that our users should research them. But I also add that doing research is absolutely not the same thing as deciding up front to go for full-scale adoption.
Given the incredibly rapid pace of change in this area, it’s essential to develop a clear understanding of the capabilities of the technology in your environment before making any decisions: production is not usually a good arena for R&D.
Based on what we have learned from our users and partners that have been undertaking such research, our own experiences (we use containers quite a lot internally) and lessons from companies such as eBay and Google, here are six important criteria to bear in mind when deciding whether to move from research to adoption...

Java Performance links

Includes some stuff on JVisualVM: Hunting Memory Leaks in Java

Comparing GC Collectors


Alex Zhitnitsky : Java Performance Tuning: Getting the Most Out of Your Garbage Collector

The main question here is this: What do you see as an acceptable criteria for the GC pause frequency and duration in your application? For example, a daily pause of 15 seconds might be acceptable, while a frequency of once in 30min would be an absolute disaster for the product. The requirements come from the domain of each system, where real time and high frequency trading systems would have the most strict requirements.
Overall, seeing pauses of 15-17 seconds is not a rare thing. Some systems might even reach 40-50 seconds pauses, and Haim also had a chance to see 5 minute pauses in a system with a large heap that did batch processing jobs. So pause duration doesn’t play a big factor there.

Wednesday, 15 April 2015

General Scala Links


4 Interview Questions for Scala Developers


  • What’s the difference between the following terms and types in Scala: ‘Nil,’ ‘Null,’ ‘None,’ ‘Nothing’?
  • What is ‘Option’ and how is it used?
  • Explain the difference between ‘concurrency’ and ‘parallelism,’ and name some constructs you can use in Scala to leverage both.
  • Bonus question: What is ‘Unit’ and ‘()’?


Scala DSLs:

short intro: http://www.scala-lang.org/old/node/1403
Using in Camel routes:  http://camel.apache.org/scala-dsl.html
Good presentation, which describes Scala as more succinct than Java and therefore more appropriate:  http://www.slideshare.net/indicthreads/using-scala-for-building-ds-ls-abhijit-sharma


Torsten Möller: Data Visualization Course

http://www2.cs.sfu.ca/~torsten/Teaching/Cmpt467/

Content Description:
Visualization deals with all aspects that are connected with the visual representation of data sets from scientific experiments, simulations, medical scanners, databases, web systems, and the like in order to achieve a deeper understanding or a simpler representation of complex phenomena and to extract important information visually. To obtain these goals, both well-known techniques from the field of interactive computer graphics and completely new methods are applied. The objective of the course is to provide knowledge about visualization algorithms and data structures as well as acquaintance with practical applications of visualization. Through several projects the student is expected to learn methods to explore and visualize different kinds of data sets.
  • Introduction and historical remarks
  • Abstract visualization concepts and the visualization pipeline
  • Data acquisition and representation (sampling and reconstruction; grids and data structures).
  • Basic mapping concepts
  • Visualization of scalar fields (isosurface extraction, volume rendering)
  • Visualization of vector fields (particle tracing, texture-based methods, vector field topology)
  • Tensor fields, multi-attribute data, multi-field visualization
  • Human visual perception + Color
  • Space/Order + Depth/Occlusion
  • Focus+Context; Navigation+Zoom
  • Visualization of graphs and trees and high-dimensional data
  • Evaluation + Interaction models 

Analysis of Algorithms

The Sedgewick/Wayne book (part of a Coursera course) covers this very well:
Analysis of Algorithms

As people gain experience using computers, they use them to solve difficult problems or to process large amounts of data and are invariably led to questions like these:
  • How long will my program take?
  • Why does my program run out of memory?
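
One standard empirical answer to the first question is the doubling test: time the program on inputs of size N and 2N; if the ratio of running times approaches 2^b, the running time grows roughly as N^b. A minimal sketch (the work function is a stand-in for the program under test):

```python
import time

def work(n):
    return sorted(range(n, 0, -1))   # stand-in for the program under test

n, prev = 1 << 12, None
for _ in range(6):
    start = time.perf_counter()
    work(n)
    elapsed = time.perf_counter() - start
    if prev:
        print(f"N = {n:>7}   time = {elapsed:.4f}s   ratio = {elapsed / prev:.2f}")
    prev, n = elapsed, n * 2
```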

Basic Sorting Algorithms (plus Python examples)

This site has animated visualizations of the main algorithms (h/t David R. Martin):
Sorting Algorithm Animations

The Sedgewick/Wayne book (part of a Coursera course) covers this well:
Princeton: Sorting Applications

Quicksort is the fastest general-purpose sort.
In most practical situations, quicksort is the method of choice. If stability is important and space is available, mergesort might be best. In some performance-critical applications, the focus may be on just sorting numbers, so it is reasonable to avoid the costs of using references and sort primitive types instead.
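
For reference, a minimal quicksort sketch in Python (not in-place; library sorts use tuned in-place variants). Note the point about stability: quicksort, unlike mergesort, is not stable.

```python
import random

def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = random.choice(xs)   # random pivot guards against worst-case inputs
    return (quicksort([x for x in xs if x < pivot])
            + [x for x in xs if x == pivot]
            + quicksort([x for x in xs if x > pivot]))

print(quicksort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```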

Tuesday, 7 April 2015

Miko Matsumura: Data Science Is Dead

http://news.dice.com/2014/03/05/data-science-is-dead/

Yes, more and more companies are hoarding every single piece of data that flows through their infrastructure. As Google Chairman Eric Schmidt pointed out, we create more data in a single day today than all the data in human history prior to 2013.
Unfortunately, unless this is structured data, you will be subjected to the data equivalent of dumpster diving. But surfacing insight from a rotting pile of enterprise data is a ghastly process—at best. Sure, you might find the data equivalent of a flat-screen television, but you’ll need to clean off the rotting banana peels. If you’re lucky you can take it home, and oh man, it works! Despite that unappetizing prospect, companies continue to burn millions of dollars to collect and gamely pick through the data under respective roofs. What’s the time-to-value of the average “Big Data” project? How about “Never”?
If the data does happen to be structured data, you will probably be given a job title like Database Administrator, or Data Warehouse Analyst.
When it comes to sorting data, true salvation may lie in automation and other next-generation processes, such as machine learning and evolutionary algorithms; converging transactional and analytic systems also looks promising, because those methods deliver real-time analytic insight while it’s still actionable (the longer data sits in your store, the less interesting it becomes). These systems will require a lot of new architecture, but they will eventually produce actionable results—you can’t say the same of “data dumpster diving.” That doesn’t give “Data Scientists” a lot of job security: like many industries, you will be replaced by a placid and friendly automaton.
So go ahead: put “Data Scientist” on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you’ll be the King or Queen of a rotting whale-carcass of data. And when you talk to Master Data Management and Data Integration vendors about ways to, er, dispose of that corpse, you’ll realize that the “Big Data” vendors have filled your executives’ heads with sky-high expectations (and filled their inboxes with invoices worth significant amounts of money). Don’t be the data scientist tasked with the crime-scene cleanup of most companies’ “Big Data”—be the developer, programmer, or entrepreneur who can think, code, and create the future.

Monday, 6 April 2015

Trey Causey: Getting started in data science: My thoughts

http://treycausey.com/getting_started.html


One of the primary things that separates a data scientist from someone just building models is the ability to think carefully about things like endogeneity, causal inference, and experimental and quasi-experimental design. Data scientists must understand and think about things like data generating processes and reason through how misspecifying them affects the conclusions they draw from their analyses.

It takes a long time and a lot of training for this to come naturally. I don't think I gave much thought to selecting on the dependent variable and how endemic it is until I got to grad school. Now it sticks out like a sore thumb everywhere I look. Similarly with thinking carefully about outliers (extreme values) or the process by which your data came to have missing values: these are things that often get swept aside by tutorials showing you how to use R. This isn't to say you have to go to grad school (you probably shouldn't) or even to college; it just means that data science is not simply a series of programs and tutorials that automatically make inferences from your data. Oftentimes, what isn't in your data has significant implications for inference. Your software package isn't going to tell you what they are.
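
A tiny simulation of the “selecting on the dependent variable” problem mentioned above: conditioning the sample on high outcomes noticeably weakens the observed relationship, even though the underlying process is unchanged.

```python
import random

random.seed(0)
# true process: y depends on x plus noise
data = [(x := random.gauss(0, 1), x + random.gauss(0, 1)) for _ in range(10_000)]

def corr(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

print(f"full sample:              r = {corr(data):.2f}")   # roughly 0.71
print(f"'successes' (y > 1) only: r = {corr([p for p in data if p[1] > 1]):.2f}")
```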

All this being said, I do think we live in an extremely exciting time for democratizing education. I hope some good comes out of it. Enough doom and gloom, and let's get on to the links.