Showing posts with label mysql.

Wednesday, 25 May 2016

Scaling to 100M: MySQL is a Better NoSQL | Wix Engineering

"The choice to use a NoSQL database is often based on hype, or a wrong assumption that relational databases cannot perform as well as a NoSQL database. Operational costs, as well as other stability and maturity concerns, are often overlooked by engineers when it comes to selecting a database. For more information about the limitations and shortcomings of different NoSQL (and SQL) engines, take a look at the Jepsen series of articles from Aphyr.
This post will explain why we’ve found that using MySQL for the key/value use case is better than most of the dedicated NoSQL engines, and provide guidelines to follow when using MySQL in this way."




Tuesday, 25 August 2015

Uber: Migrating from Postgres to a Sharded Datastore on top of MySQL + Lambda Architecture (append-only updates)

Project Mezzanine: The Great Migration


The first design decision was the choice of database for the tripstore. Our short list of requirements was:
  • Operationally robust (no data loss, supports backup, replication to secondary data centers, easy to troubleshoot, predictable, operational expertise).
  • Horizontally scalable both in storage capacity and IOPS.
  • High write-availability. We always want to be able to persist a trip to stable storage. It’s okay to trade-off short-term read-availability, as the backend is working mostly in a batch-oriented fashion.
  • Secondary index support. Trips are looked up by user, city, and summarized in various ways.
  • No downtime for any operation (expanding storage, backup, adding indexes, adding data, and so forth).
The last item in the list was addressing a very immediate pain point. The trips table in PostgreSQL had grown so big that any operation that needed to add a new column or add a new index caused downtime. This made it increasingly cumbersome to develop new features.
We decided that a column-oriented, schemaless approach where data (JSON blobs) are organized in a grid indexed by trip-UUID, column name, and optionally a timestamp would work well as an overall data model. The model lends itself naturally to horizontal scaling by partitioning the rows across multiple shards, and supports our rapid development culture by being schemaless. New columns can be added, and new fields can be added to a column with no reconfiguration.
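The grid model described above can be sketched as a tiny in-memory toy. This is only an illustration of the data model, not Uber's actual API: the class and method names (`CellStore`, `put`, `get_latest`) are made up for this sketch, and cells are held in a dict keyed by (row UUID, column name, timestamp) instead of MySQL tables.

```python
import json
import time

class CellStore:
    """Toy sketch of the grid model: JSON blobs addressed by
    (trip UUID, column name, timestamp). Append-only: a write adds
    a new versioned cell and never overwrites an existing one."""

    def __init__(self):
        self._cells = {}  # (row_uuid, column, ts) -> JSON string

    def put(self, row_uuid, column, body, ts=None):
        ts = ts if ts is not None else time.time()
        key = (row_uuid, column, ts)
        blob = json.dumps(body, sort_keys=True)
        if key in self._cells:
            # Replaying the exact same write is a harmless no-op
            # (idempotent); changing a written cell is forbidden.
            if self._cells[key] != blob:
                raise ValueError("cells are immutable once written")
            return
        self._cells[key] = blob

    def get_latest(self, row_uuid, column):
        """Return the newest version of a cell, or None."""
        versions = [(ts, v) for (r, c, ts), v in self._cells.items()
                    if r == row_uuid and c == column]
        if not versions:
            return None
        return json.loads(max(versions)[1])
```

Note how "schemaless" falls out of the model: adding a new column, or a new field inside a column's JSON blob, needs no reconfiguration because the store never inspects the blob.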
We evaluated various NoSQL-style databases with the above characteristics. However, we didn’t feel confident they were a good fit for storing our trip data, because of either our operational experience or the product’s maturity.
Inspired by blog posts, such as this one from FriendFeed, we decided to build our own simple, sharded datastore on top of MySQL. The key characteristics of the system we built are:
  • Sharding: Rows are sharded into a fixed set of shards, decided at setup time. Typically, we use 4096. Each shard corresponds to a MySQL tablespace, and the shards are distributed across a number of MySQL servers. Shards can be moved between MySQL servers for load-balancing, and the capacity can be increased online. We typically expand by splitting each MySQL server in two.
  • Append-only (no updates) data model: It only supports an append-only data model where a cell can never be modified after it is written. This is very useful for a system that stores transactional data and wants to guard against data corruption. By being append-only, modifications are naturally idempotent and commutative. The latter means that we can replay updates in any order and get the same result. (We learned later that the append-only style is also advocated by the lambda architecture.)
  • Buffered writes. If the shard where a cell needs to be written to is unavailable (or slow), we write the data to a pending table in any other available MySQL server. These are then later replayed once the shard becomes available. Due to the idempotent and commutative data model, this is always safe and does not require cross-host coordination.
  • Sharded secondary indexes: Indexes can be created on multiple fields in the columns and are sharded on a specific key (e.g., user uuid). They are implemented as MySQL tables and backfilled in the background. In case we need to change the index (e.g., adding a field), we can create a new version, backfill it, and then switch to the new version by changing an index alias, all without application downtime.
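The sharding scheme in the list above can be sketched in a few lines. This is a guess at the general shape, not Uber's implementation: the hash choice (MD5) and the function names are assumptions, and the round-robin shard-to-server map stands in for whatever placement logic they actually use. The key point is that the key-to-shard mapping is fixed forever; only the shard-to-server mapping changes when capacity is added.

```python
import hashlib

NUM_SHARDS = 4096  # fixed at setup time, as in the post

def shard_for(row_uuid: str) -> int:
    """Hash a trip UUID into one of the fixed shards (stable forever)."""
    h = hashlib.md5(row_uuid.encode("utf-8")).hexdigest()
    return int(h, 16) % NUM_SHARDS

def server_for(shard: int, servers: list) -> str:
    """Place shards round-robin over the current MySQL servers.
    Growing capacity means adding servers and moving shards;
    the keys are never re-hashed."""
    return servers[shard % len(servers)]

# "Splitting each MySQL server in two": double the server list.
# Every key still maps to the same shard; only shard placement moves.
before = ["mysql-a", "mysql-b"]
after = ["mysql-a1", "mysql-a2", "mysql-b1", "mysql-b2"]
```

Buffered writes then follow naturally: if `server_for(shard_for(uuid), servers)` is down, the cell goes to a pending table on any live server and is replayed later, which is safe precisely because cells are idempotent and commutative.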
The whole system, which we simply call Schemaless in homage to its design, is written in Python. The initial version took about 5 months from idea to production deployment, and we will describe specific implementation details in future blog posts.


Thursday, 23 April 2015

How B-tree indexing works in MySQL

http://stackoverflow.com/questions/2362667/how-b-tree-indexing-works-in-mysql

Q. When I create an index for a table in MySQL, I see that the index_type is BTREE. Now, although I understand B-trees, I do not quite understand how it stores the index and how the database searches the records based on it.
I mean, a B-tree is excellent for databases that read and write large blocks of data. When we create an index on a primary-key column, my understanding is that it creates a tree, splitting the values at the root based on the value type of the root.
Now, does it store only the primary key ID in the tree, or the whole record associated with that primary key?
After finding the wanted primary key, how does the database extract the record?

A. The database stores the value indexed as a B-Tree key, and the record pointer as a B-Tree value.
Whenever you search for a record holding a certain value of an indexed column, the engine locates the key holding this value in the B-Tree, retrieves the pointer to the record and fetches the record.
What exactly a "record pointer" is depends on the storage engine.
  • In MyISAM, the record pointer is an offset to the record in the MYI file.
  • In InnoDB, the record pointer is the value of the PRIMARY KEY.
In InnoDB, the table itself is a B-Tree, with the PRIMARY KEY as the B-Tree key. This is what is called a "clustered index" or "index-organized table". In this case, all other fields are stored as the B-Tree value.
In MyISAM, the records are stored without any special order. This is called "heap storage".
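The two lookup styles in the answer can be contrasted with a toy model. This is not MySQL internals: plain dicts and a list stand in for B-trees and the data file, and the names (`heap`, `clustered`, the `*_idx` maps) are invented for this sketch. What it does preserve is the difference in the "record pointer": an offset for MyISAM versus a PRIMARY KEY value, and hence a second lookup, for InnoDB.

```python
# MyISAM-style: records sit in an unordered heap file; the secondary
# index maps an indexed value to the record's offset in that file.
heap = [{"id": 7, "name": "bob"}, {"id": 3, "name": "ann"}]
myisam_name_idx = {"ann": 1, "bob": 0}   # value -> offset in data file

def myisam_lookup(name):
    # One index probe, then a direct seek to the record.
    return heap[myisam_name_idx[name]]

# InnoDB-style: the table IS the primary-key B-tree (clustered index);
# the secondary index maps an indexed value to the PRIMARY KEY.
clustered = {3: {"id": 3, "name": "ann"}, 7: {"id": 7, "name": "bob"}}
innodb_name_idx = {"ann": 3, "bob": 7}   # value -> PRIMARY KEY

def innodb_lookup(name):
    pk = innodb_name_idx[name]   # first search: secondary index
    return clustered[pk]         # second search: clustered index
```

This is also why a large PRIMARY KEY is costly in InnoDB: every secondary index stores a copy of it as its record pointer.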