Is PostgreSQL good enough?

TL;DR: you can do jobs, queues, real time change feeds, time series, object store, document store, and full text search with PostgreSQL. The how-to, pros and cons, rough performance and complexity levels are all discussed. Many sources and relevant documentation are linked to.

Your database is first. But can PostgreSQL be second?

Web/app projects these days often have many distributed parts. It's not uncommon for groups to use the right tool for the job. The right tools are often something like the choice below.
  • Redis for queuing, and caching.
  • Elasticsearch for searching, and Logstash.
  • InfluxDB or RRD for time series.
  • S3 for an object store.
  • PostgreSQL for relational data with constraints, and validation via schemas.
  • Celery for job queues.
  • Kafka for a buffer of queues or stream processing.
  • Exception logging with PostgreSQL (perhaps using Sentry)
  • KDB for low latency analytics on your column oriented data.
  • Mongo/ZODB for storing JSON documents (or mangodb as a /dev/null replacement)
  • SQLite for embedded. 
  • Neo4j for graph databases.
  • RethinkDB for your realtime data, when data changes, other parts 'react'.
  • ...
For all the different nodes this could easily cost thousands a month, require lots of ops knowledge and support, and use up lots of electricity. To set all this up from scratch could cost one to four weeks of developer time depending on whether they know the various stacks already. Perhaps you'd have ten nodes to support.

Could you gain an ops advantage by using only PostgreSQL? Especially at the beginning when your system isn't all that big, your team is small, and your requirements are not extreme? Only one system to set up, monitor, back up, install, upgrade, etc.

This article is my humble attempt to help people answer the question...

Is PostgreSQL good enough?

Can it be 'good enough' for all sorts of different use cases? Or do I need to reach into another toolbox?

Every project is different, and often the requirements can be different. So this question by itself is impossible to answer without qualifiers. Many millions of websites and apps in the world have very few users (fewer than thousands per month), though they might need to handle bursty traffic at 100x the normal rate sometimes. They might have interactive, or soft realtime performance requirements for queries and reports. It's really quite difficult to answer the question conclusively for every use case, and for every set of requirements. I will give some rough numbers and point to case studies, and external benchmarks for each section.

Most websites and apps don't need to handle 10 million visitors a month, have 99.999% availability when 95% availability will do, ingest 50 million metric rows per day, do 400,000 jobs per second, or query over terabytes of data with sub millisecond response times.

Tool choice.

I've used a LOT of different databases over time. CDB, Elasticsearch, Redis, SAP (is it a db or a COBOL?), BSDDB/GDBM, SQLite... I've even written some where the requirements were impossible to match with off the shelf systems and we had to make them ourselves (real time computer vision processing of GB/second in from the network). Often PostgreSQL simply couldn't do the job at hand (or MySQL was installed already, and the client insisted). But sometimes PostgreSQL could do it, and was just not the best tool for the job.

A Tool Chest
Recently I read a book about tools. Woodworking tools, not programming tools. The whole philosophy of the book is a bit much to convey here... but The Anarchist's Tool Chest is pretty much all about tool choice (it's also a very fine looking book, that smells good too). One lesson it teaches is about selecting a plane (you know, the things for stripping wood). There are dozens of different types perfect for specific situations. There are also some damn good general purpose planes, and if you just select a couple of good ones you can get quite a lot done. Maybe not the best tool for the job, but at least you will have room for them in your tool chest. On the other hand, there are also swiss army knives, and 200 in one tools off teevee adverts. I'm pretty sure PostgreSQL is some combination of a minimal tool choice and the swiss army knife tool choice in the shape of a big blue solid elephant.

PostgreSQL is an elephant sized tool chest that holds a LOT of tools.

Batteries included?

Does PostgreSQL come with all the parts for full usability? Often the parts are built in, though they can be a bit complicated to use, and not everything is built in. Luckily there are some good libraries which make the features more usable ("for humans").

For from scratch people, I'll link to the PostgreSQL documentation. I'll also link to already made systems which use PostgreSQL (for queues, time series, graphs, column stores, document databases), which you might be able to use for your needs. This article is slanted towards the python stack, but there are definitely alternatives in the node/ruby/perl/java universes. If not, I've listed the PostgreSQL parts and other open source implementations so you can roll your own.

By learning a small number of PostgreSQL commands, it may be possible to use 'good enough' implementations yourself. You might be surprised at what other things you can implement by combining these techniques together. 

Task, or job queues.

Recent versions of PostgreSQL support a couple of useful technologies for efficient and correct queues.

First is the LISTEN/NOTIFY. You can LISTEN for events, and have clients be NOTIFY'd when they happen. So your queue workers don't have to keep polling the database all the time. They can get NOTIFIED when things happen.

The recent addition in 9.5 of the SKIP LOCKED locking clause to PostgreSQL SELECT, enables efficient queues to be written when you have multiple writers and readers. It also means that a queue implementation can be correct [2].

Finally 9.6 saw plenty of VACUUM performance enhancements which help out with queues.
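
To make the SKIP LOCKED idea a bit more concrete, here is a minimal sketch of a worker claiming one job at a time. It is not how pq or celery do it. The jobs table, its columns, and the work_one_job function are hypothetical names I made up for illustration, and conn is assumed to be a DB-API connection to PostgreSQL 9.5+ (for example one from psycopg2.connect).

```python
# Hypothetical table for this sketch (not from any library):
#   CREATE TABLE jobs (id serial PRIMARY KEY,
#                      payload text NOT NULL,
#                      done boolean NOT NULL DEFAULT false);

# Lock the oldest pending job, skipping rows other workers have locked.
CLAIM_JOB_SQL = """
    SELECT id, payload
      FROM jobs
     WHERE NOT done
     ORDER BY id
     LIMIT 1
       FOR UPDATE SKIP LOCKED
"""

MARK_DONE_SQL = "UPDATE jobs SET done = true WHERE id = %s"


def work_one_job(conn, handler):
    """Claim one pending job, run handler(payload), then mark it done.

    Returns False when there is no unclaimed job left, True otherwise.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_JOB_SQL)
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return False
        job_id, payload = row
        handler(payload)
        cur.execute(MARK_DONE_SQL, (job_id,))
        conn.commit()  # releases the row lock
        return True
    except Exception:
        conn.rollback()  # the job becomes claimable again
        raise
    finally:
        cur.close()
```

The row lock is held until the commit, so two workers running this at the same time skip each other's claimed rows instead of blocking on them. If the handler raises, the rollback releases the lock and another worker can pick the job up.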

Batteries included?

A very popular job and task system is celery. It can support various SQL backends, including PostgreSQL through sqlalchemy and the Django ORM. [ED: version 4.0 of celery doesn't have pg support]


A newer, and smaller system is called pq. It sort of models itself off the redis python 'rq' queue API. However, with pq you can have a transactional queue. Which is nice if you want to make sure other things are committed AND your job is in the queue. With a separate system this is a bit harder to guarantee.

Is it fast enough? pq states in its documentation that you can do 1000 jobs per second per core... but on my laptop it did around 2000. In the talk "Can elephants queue?" 10,000 messages per second are mentioned with eight clients.

More reading.
  1. http://www.cybertec.at/skip-locked-one-of-my-favorite-9-5-features/
  2. http://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5/
  3. https://www.pgcon.org/2016/schedule/track/Applications/929.en.html 

Full text search.

Searching the full text of the document, and not just the metadata.
PostgreSQL has had full text search for quite a long time as a separate extension, and now it is built in. Recently, it's gotten a few improvements which I think now make it "good enough" for many uses.

The big improvement in 9.6 is phrase search. So if I search for "red hammer" I get things where the two words appear next to each other, in that order - not everything that is red plus everything that is a hammer. It can also return documents where the first word is red, and then five words later hammer appears.
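
A rough sketch of what those two kinds of query could look like from python. The documents table with id, title and body columns is an assumption of mine, as are the function names. phraseto_tsquery and the <-> / <N> "followed by" operators are the 9.6 additions; conn is assumed to be a DB-API connection such as one from psycopg2.

```python
# Hypothetical table: documents(id, title, body text).

# Words of the phrase must appear next to each other, in order.
PHRASE_SQL = """
    SELECT id, title
      FROM documents
     WHERE to_tsvector('english', body) @@ phraseto_tsquery('english', %s)
"""

# Takes a hand built tsquery string, e.g. from followed_by() below.
TSQUERY_SQL = """
    SELECT id, title
      FROM documents
     WHERE to_tsvector('english', body) @@ to_tsquery('english', %s)
"""


def followed_by(first, second, distance=1):
    """Build a tsquery where `second` appears `distance` words after `first`."""
    for word in (first, second):
        if not word.isalnum():
            raise ValueError("expected a single plain word, got %r" % (word,))
    operator = "<->" if distance == 1 else "<%d>" % distance
    return "%s %s %s" % (first, operator, second)


def search_phrase(conn, phrase):
    """Return (id, title) rows whose body contains the exact phrase."""
    cur = conn.cursor()
    cur.execute(PHRASE_SQL, (phrase,))
    return cur.fetchall()


def search_near(conn, first, second, distance):
    """Return (id, title) rows where second follows first at that distance."""
    cur = conn.cursor()
    cur.execute(TSQUERY_SQL, (followed_by(first, second, distance),))
    return cur.fetchall()
```

So search_phrase(conn, "red hammer") is the phrase search, and search_near(conn, "red", "hammer", 5) is the "five words later" case. Without an index this scans every row, so it is only good for playing around.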

One other major thing that Elasticsearch does is automatically create indexes on all the fields. You add a document, and then you can search it. That's all you need to do. PostgreSQL is quite a lot more manual than that. You need to tell it which fields to index, and update the index with a trigger on changes (see triggers for automatic updates). But there are some libraries which make things much easier. One of them is sqlalchemy_searchable. However, I'm not aware of anything as simple and automatic as Elasticsearch here.
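
For the from scratch people, the manual steps could look roughly like this. It is only a sketch: the documents table with text columns title and body, the search column, and the index and trigger names are all hypothetical. tsvector_update_trigger is a built in PostgreSQL trigger function, and conn is assumed to be a DB-API connection such as one from psycopg2.

```python
# Hypothetical table: documents(id, title text, body text).
SETUP_STATEMENTS = [
    # 1. A column to hold the pre-computed search document.
    "ALTER TABLE documents ADD COLUMN search tsvector",
    # 2. Fill it in for the rows that already exist.
    """UPDATE documents
          SET search = to_tsvector('pg_catalog.english',
                                   coalesce(title, '') || ' ' || coalesce(body, ''))""",
    # 3. A GIN index so searches don't scan the whole table.
    "CREATE INDEX documents_search_idx ON documents USING GIN (search)",
    # 4. Keep the column up to date on every insert and update.
    """CREATE TRIGGER documents_search_update
       BEFORE INSERT OR UPDATE ON documents
       FOR EACH ROW EXECUTE PROCEDURE
       tsvector_update_trigger(search, 'pg_catalog.english', title, body)""",
]

SEARCH_SQL = """
    SELECT id, title
      FROM documents
     WHERE search @@ plainto_tsquery('english', %s)
"""


def install_search(conn):
    """Run the one-off setup statements in a single transaction."""
    cur = conn.cursor()
    for statement in SETUP_STATEMENTS:
        cur.execute(statement)
    conn.commit()


def search(conn, words):
    """Return (id, title) rows matching all of the given words."""
    cur = conn.cursor()
    cur.execute(SEARCH_SQL, (words,))
    return cur.fetchall()
```

That is four statements you have to know about, compared to zero with Elasticsearch, which is the "more manual" part. The upside is that the trigger runs inside the same transaction as the write.
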
  • What about faceted search? These days it's not so hard to do at speed. [6][7]
  • What about substring search on an index (fast LIKE)? It can be made fast with a trigram index. [8][9]
  • Stemming? Yes. [11]
  • "Did you mean" fuzzy matching support? Yes. [11]
  • Accent support? (My name is René, and that last é breaks sooooo many databases). Yes. [11]
  • Multiple languages? Yes. [11]
  • Regex search when you need it? Yes. [13]
If your main data store is PostgreSQL and you export your data into Elasticsearch (you should NOT use Elasticsearch as the main store, since it still crashes sometimes), then that's also extra work you need to do. With Elasticsearch you also need to manually set weighting of different fields if you want the search to work well. So in the end it's a similar amount of work.

Using the right libraries, I think it's a similar amount of work overall with PostgreSQL. Elasticsearch is still easier initially. To be fair Lucene (which elasticsearch is based on) is a much more advanced text searching system.

What about the speed? They are index searches, and return fast - as designed. At [1] they mention that the speed is ok for 1-2 million documents. They also mention 50ms search time. It's also possible to make replicas for read queries if you don't want to put the search load on your main database. There is another report for searches taking 15ms [10]. Note that Elasticsearch often takes 3-5ms for a search on that same author's hardware. Also note that the new asyncpg PostgreSQL driver gives significant latency improvements for general queries like this (35ms vs 2ms) [14].

Hybrid searches (relational searches combined with full text search) are another thing that PostgreSQL makes pretty easy. Say you wanted to ask "Give me all companies who have employees who wrote research papers, stack overflow answers, or github repos containing the text 'Deep Learning', where the authors live within 50km of Berlin". PostgreSQL could do those joins fairly efficiently for you.

The other massive advantage of PostgreSQL is that you can keep the search index in sync. The search index can be updated in the same transaction. So your data is consistent, and not out of date. It can be very important for some applications to return the most recent data.

How about searching across multiple human natural languages at once? PostgreSQL allows you to efficiently join across multiple language search results. So if you type "red hammer" into a German hardware website search engine, you can actually get some results.

Anyone wanting more in-depth information should read or watch this FTS presentation [15] from last year. It's by some of the people who have done a lot of work on the implementation, and talks about 9.6 improvements, current problems, and things we might expect to see in version 10. There is also a blog post [16] with more details about various improvements in 9.6 to FTS.


You can see the RUM index extension (which has faster ranking) at https://github.com/postgrespro/rum



More reading.
  1. https://blog.lateral.io/2015/05/full-text-search-in-milliseconds-with-postgresql/
  2. https://billyfung.com/writing/2017/01/postgres-9-6-phrase-search/
  3. https://www.postgresql.org/docs/9.6/static/functions-textsearch.html
  4. http://www.postgresonline.com/journal/archives/368-PostgreSQL-9.6-phrase-text-searching-how-far-apart-can-you-go.html
  5. https://sqlalchemy-searchable.readthedocs.io/
  6. http://akorotkov.github.io/blog/2016/06/17/faceted-search/
  7. http://stackoverflow.com/questions/10875674/any-reason-not-use-postgresqls-built-in-full-text-search-on-heroku  
  8. https://about.gitlab.com/2016/03/18/fast-search-using-postgresql-trigram-indexes/
  9. http://blog.scoutapp.com/articles/2016/07/12/how-to-make-text-searches-in-postgresql-faster-with-trigram-similarity
  10. https://github.com/codeforamerica/ohana-api/issues/139
  11. http://rachbelaid.com/postgres-full-text-search-is-good-enough/  
  12. https://www.compose.com/articles/indexing-for-full-text-search-in-postgresql/
  13. https://www.postgresql.org/docs/9.6/static/functions-matching.html  
  14. https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/report.html
  15. https://www.pgcon.org/2016/schedule/events/926.en.html 
  16. https://postgrespro.com/blog/pgsql/111866
     



Time series.

Data points with timestamps.
Time series databases are used a lot for monitoring. Either for monitoring server metrics (like cpu load), or for monitoring sensors and any other IoT application you can think of.

RRDtool from the late 90s.
To do efficient queries of data over say a whole month or even a year, you need to aggregate the values into smaller buckets. Either minute, hour, day, or month sized buckets. Some data is recorded at such a high frequency, that doing an aggregate (sum, average, ...) of all that data would take quite a while.

Round robin databases don't even store all the raw data, but put things into a circular buffer of time buckets. This saves a LOT of disk space.

The other thing time series databases do is accept a large amount of this type of data. To efficiently take in a lot of data, you can use things like COPY IN, rather than lots of individual inserts, or use SQL arrays of data. In the future (PostgreSQL 10), you should be able to use logical replication to have multiple data collectors.

Materialized views can be handy to have a different view of the internal data structures. To make things easier to query.

date_trunc can be used to truncate a timestamp into the bucket size you want. For example SELECT date_trunc('hour', timestamp) as timestamp.
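
Here is a small sketch of hourly buckets with date_trunc. The metrics table and its series, recorded_at and value columns are names I've assumed for illustration. truncate_to_hour is only there to show in python what date_trunc('hour', ...) does to a single timestamp; the real work happens in the SQL. conn is assumed to be a DB-API connection such as one from psycopg2.

```python
from datetime import datetime

# Hypothetical table: metrics(series text, recorded_at timestamp, value float).
HOURLY_SQL = """
    SELECT date_trunc('hour', recorded_at) AS bucket,
           avg(value) AS avg_value,
           max(value) AS max_value,
           count(*)   AS samples
      FROM metrics
     WHERE series = %s
       AND recorded_at >= %s
       AND recorded_at < %s
     GROUP BY 1
     ORDER BY 1
"""


def truncate_to_hour(moment):
    """Python equivalent of date_trunc('hour', moment), for illustration."""
    return moment.replace(minute=0, second=0, microsecond=0)


def hourly_buckets(conn, series, start, end):
    """Return one (bucket, avg, max, samples) row per hour in [start, end)."""
    cur = conn.cursor()
    cur.execute(HOURLY_SQL, (series, start, end))
    return cur.fetchall()
```

For example truncate_to_hour(datetime(2017, 2, 9, 14, 37, 12)) gives datetime(2017, 2, 9, 14, 0), so every sample from 14:00 up to (but not including) 15:00 lands in the same bucket.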

Array functions, and binary types can be used to store big chunks of data in a compact form for processing later. Many time series databases do not need to know the latest results, and some time lag is good enough.

A BRIN index (new in 9.5) can be very useful for time queries. Selecting between two times on a field indexed with BRIN is much quicker, as long as the rows are entered roughly in time order [6]. "We managed to improve our best case time by a factor of 2.6 and our worst case time by a factor of 30" [7]. If the rows are not in time order for some reason you can reorder them on disk with the CLUSTER command -- however, time series data often comes in sorted by time.
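
Creating a BRIN index is a one liner. A minimal sketch, assuming a hypothetical metrics table with a recorded_at timestamp column (the table, column, index and function names are all made up here), and a DB-API connection such as one from psycopg2:

```python
# Hypothetical table: metrics(recorded_at timestamp, value float),
# with rows inserted roughly in time order.
CREATE_BRIN_SQL = (
    "CREATE INDEX metrics_recorded_at_brin "
    "ON metrics USING BRIN (recorded_at)"
)

# The kind of "between two times" query the index is meant to help.
RANGE_SQL = """
    SELECT recorded_at, value
      FROM metrics
     WHERE recorded_at >= %s
       AND recorded_at < %s
     ORDER BY recorded_at
"""


def add_time_index(conn):
    """Create the BRIN index (run once)."""
    cur = conn.cursor()
    cur.execute(CREATE_BRIN_SQL)
    conn.commit()


def rows_between(conn, start, end):
    """Return the samples recorded in [start, end)."""
    cur = conn.cursor()
    cur.execute(RANGE_SQL, (start, end))
    return cur.fetchall()
```

Whether it actually helps depends on your data, so it is worth checking the query plan with EXPLAIN before and after.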

Monasca can provide Grafana and an API, and Monasca queries PostgreSQL. There's still no direct support in Grafana for PostgreSQL, however work has been in progress for quite some time. See the pull request in Grafana.

Another project which uses time series in PostgreSQL is Tgres. It's compatible with statsd, graphite text for input, and provides enough of the Graphite HTTP API to be usable with Grafana. The author also blogs[1] a lot about different optimal approaches to use for time series databases.

See this talk by Steven Simpson at the fosdem conference about infrastructure monitoring with PostgreSQL. In it he talks about using PostgreSQL to monitor and log a 100 node system.

In an older 'grisha' blog post [5], he states "I was able to sustain a load of ~6K datapoints per second across 6K series" on a 2010 laptop.

Can we get the data into a dataframe structure for analysis easily? Sure, if you are using sqlalchemy and pandas dataframes, you can load dataframes like this...  
df = pd.read_sql(query.statement, query.session.bind)
This lets you unleash some very powerful statistics, and machine learning tools on your data. (there's also a to_sql).


Some more reading.
  1. https://grisha.org/blog/2016/12/16/storing-time-series-in-postgresql-part-ii/
  2. https://www.postgresql.org/docs/9.6/static/parallel-plans.html
  3. http://blog.2ndquadrant.com/parallel-aggregate/
  4. https://mike.depalatis.net/using-postgres-as-a-time-series-database.html  
  5. https://grisha.org/blog/2016/11/08/load-testing-tgres/
  6. http://dba.stackexchange.com/questions/130819/postgresql-9-5-brin-index-dramatically-slower-than-expected
  7. http://dev.sortable.com/brin-indexes-in-postgres-9.5/ 


Object store for binary data. 

Never store images in your database!
I'm sure you've heard it many times before. But what if your images are your most important data? Surely they deserve something better than a filesystem? What if they need to be accessed from more than one web application server? The solution to this problem is often to store things in some cloud based storage like S3.

BYTEA is the type to use for binary data in PostgreSQL if the size is less than 1GB.
CREATE TABLE files (
    id serial primary key,
    filename text not null,
    data bytea not null
);
Note, however, that streaming the file is not really supported with BYTEA by all PostgreSQL drivers. It needs to be entirely in memory.

However, many images are only 200KB or up to 10MB in size, which should be fine even if you get hundreds of images added per day. A three year old laptop benchmark for you... Saving 2500 1MB iPhone sized images with python and psycopg2 takes about 1 minute and 45 seconds, just using a single core. (That's 2.5GB of data). It can be made 3x faster by using COPY IN/TO BINARY [1], however the plain version is already more than fast enough for many uses.

If you need really large objects, then PostgreSQL has something called "Large Objects". But these aren't supported by some backup tools without extra configuration.

Batteries included? Both the python SQL libraries (psycopg2, and sqlalchemy) have builtin support for BYTEA.

But how do you easily copy files out of the database and into it? I made an image save and get gist here to save and get files with a 45 line python script. It's even easier when you use an ORM, since the data is just an attribute (open('bla.png', 'wb').write(image.data)).
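
If you don't want to click through, the core of saving and loading looks something like the following. This is a separate minimal sketch, not the gist itself. It uses the files table from above; save_file and load_file are hypothetical names, and conn is assumed to be a psycopg2 connection on python 3 (where bytes are adapted to BYTEA, and BYTEA comes back as a memoryview).

```python
INSERT_SQL = "INSERT INTO files (filename, data) VALUES (%s, %s) RETURNING id"
SELECT_SQL = "SELECT filename, data FROM files WHERE id = %s"


def save_file(conn, filename, data):
    """Store the bytes in the files table and return the new row id."""
    cur = conn.cursor()
    cur.execute(INSERT_SQL, (filename, data))
    file_id = cur.fetchone()[0]
    conn.commit()
    return file_id


def load_file(conn, file_id):
    """Return (filename, bytes) for the row, or raise KeyError."""
    cur = conn.cursor()
    cur.execute(SELECT_SQL, (file_id,))
    row = cur.fetchone()
    if row is None:
        raise KeyError(file_id)
    filename, data = row
    # The driver may hand back a memoryview; copy it into plain bytes.
    return filename, bytes(data)
```

Usage would be something like file_id = save_file(conn, 'bla.png', open('bla.png', 'rb').read()). Note the whole file is in memory on both sides, as mentioned above.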

A fairly important thing to consider with putting gigabytes of binary data into your PostgreSQL is that it will affect the backup/restore speed of your other data. This isn't such a problem if you have a hot spare replica, have point in time recovery (with WAL-E or Barman), use logical replication, or decide to restore selective tables.

How about speed? I found it faster to put binary data into PostgreSQL compared to S3. Especially on low CPU clients (IoT), where you have to do full checksums of the data before sending it on the client side to S3. This also depends on the geographical location of S3 you are using, and your network connections to it.

S3 also provides other advantages and features (like built in replication, and it's a managed service). But for storing a little bit of binary data, I think PostgreSQL is good enough. Of course if you want a highly durable globally distributed object store with very little setup then things like S3 are first.


More reading.
  1. http://stackoverflow.com/questions/8144002/use-binary-copy-table-from-with-psycopg2/8150329#8150329

Realtime, pubsub, change feeds, Reactive.

Change feeds are a feed you can listen to for changes.  The pubsub (or Publish–subscribe pattern), can be done with LISTEN / NOTIFY and TRIGGER.

Implement You've Got Mail functionality.
This is quite interesting if you are implementing 'soft real time' features on your website or apps. If something happens to your data, then your application can 'immediately' know about it.  Websockets is the name of the web technology which makes this perform well, however HTTP2 also allows server push, and various other systems have been in use for a long time before both of these. Say you were making a chat messaging website, and you wanted to make a "You've got mail!" sound. Your Application can LISTEN to PostgreSQL, and when some data is changed a TRIGGER can send a NOTIFY event which PostgreSQL passes to your application, your application can then push the event to the web browser.
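
Here is a rough sketch of the "You've got mail!" plumbing. The messages table, the new_message channel, and all the function names are assumptions for illustration. pg_notify and row_to_json are built in PostgreSQL functions, and the polling side follows the pattern from the psycopg2 documentation, so conn is assumed to be a psycopg2 connection.

```python
import json
import select

# Hypothetical table: messages(id, recipient, body).
NOTIFY_TRIGGER_SQL = [
    """CREATE OR REPLACE FUNCTION notify_new_message() RETURNS trigger AS $$
       BEGIN
           -- The payload must stay under the NOTIFY size limit (8000 bytes
           -- by default), so send ids rather than rows if yours are big.
           PERFORM pg_notify('new_message', row_to_json(NEW)::text);
           RETURN NEW;
       END;
       $$ LANGUAGE plpgsql""",
    """CREATE TRIGGER messages_notify
       AFTER INSERT ON messages
       FOR EACH ROW EXECUTE PROCEDURE notify_new_message()""",
]


def install_trigger(conn):
    """Create the trigger function and trigger (run once)."""
    cur = conn.cursor()
    for statement in NOTIFY_TRIGGER_SQL:
        cur.execute(statement)
    conn.commit()


def start_listening(conn):
    """Subscribe this connection to the channel."""
    conn.autocommit = True
    conn.cursor().execute("LISTEN new_message")


def drain_notifications(conn):
    """Return the pending events as (channel, decoded JSON payload) pairs."""
    conn.poll()
    events = []
    while conn.notifies:
        note = conn.notifies.pop(0)
        events.append((note.channel, json.loads(note.payload)))
    return events


def wait_for_events(conn, timeout=5.0):
    """Sleep until PostgreSQL has something for us, or the timeout passes."""
    if select.select([conn], [], [], timeout) == ([], [], []):
        return []
    return drain_notifications(conn)
```

Your application would call wait_for_events(conn) in a loop and push whatever comes back down the websocket. No polling of the table is needed.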

PostgreSQL can not give you hard real time guarantees unfortunately. So custom high end video processing and storage systems, or specialized custom high speed financial products are not domains PostgreSQL is suited to.

How well does it perform? In the Queue section, I mentioned thousands of events per second per core on an old laptop.

Issues for latency are the query planner and optimizer, and VACUUM, and ANALYZE.

The query planner is sort of amazing, but also sort of annoying. It can automatically try and figure out the best way to query data for you. However, it doesn't automatically create an index where it might think one would be good. Depending on environmental factors, like how much CPU, IO, data in various tables and other statistics it gathers, it can change the way it searches for data. This is LOTS better than having to write your queries by hand, and then updating them every time the schema, host, or amount of data changes.

But sometimes it gets things wrong, and that isn't acceptable when you have performance requirements. William Stein (from the Sage Math project) wrote about some queries mysteriously being slow sometimes [7]. This was after porting his web app to use PostgreSQL instead of rethinkdb (TL;DR: the port was possible and the result faster). The solution is usually to monitor those slow queries, and try to force the query planner to follow a path that you know is fast. Or to add, remove, or tweak the index the query may or may not be using. Brady Holt wrote a good article on "Performance Tuning Queries in PostgreSQL" [6].

Later on I cover the topic of column databases, and 'real time' queries over that type of data popular in financial and analytic products (pg doesn't have anything built in yet, but extensions exist).

VACUUM ANALYZE is a process that cleans things up with your data. It's a garbage collector (VACUUM) combined with a statistician (ANALYZE). It seems every release of PostgreSQL improves the performance for various corner cases. It used to have to be run manually, and now automatic VACUUM is a thing. Many more things can be done concurrently, and it can avoid having to read all the data in many more situations. However, sometimes, like with all garbage collectors, it causes pauses. On the plus side, it can make your data smaller and inform itself about how to make faster queries. If you need to, you can turn off the autovacuum, and do things more manually. Also, you can just do the ANALYZE part to gather statistics, which can run much faster than VACUUM.

To get better latency with python and PostgreSQL, there is asyncpg by magicstack. It uses an asynchronous network model (python 3.5+), and the binary PostgreSQL protocol. This can have 2ms query times and is often faster than even golang, and nodejs. It also lets you read in a million rows per second from PostgreSQL to python per core [8]. Memory allocations are reduced, as is context switching - both things that cause latency.

For these reasons, I think it's "good enough" for many soft real time uses, where the occasional time budget failure isn't the end of the world. If you load test your queries on real data (and for more data than you have), then you can be fairly sure it will work ok most of the time. Selecting the appropriate client side driver can also give you significant latency improvements.



More reading.
  1. http://blog.sagemath.com/2017/02/09/rethinkdb-vs-postgres.html
  2. https://almightycouch.org/blog/realtime-changefeeds-postgresql-notify/
  3. https://blog.andyet.com/2015/04/06/postgres-pubsub-with-json/
  4. https://github.com/klaemo/postgres-triggers
  5. https://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/
  6. https://www.geekytidbits.com/performance-tuning-postgres/
  7. http://blog.sagemath.com/2017/02/09/rethinkdb-vs-postgres.html 
  8. https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/


Log storage and processing

Being able to have your logs in a central place for queries, and statistics is quite helpful. But so is grepping through logs. Doing relational or even full text queries on them is even better.

rsyslog allows you to easily send your logs to a PostgreSQL database [1]. You set it up so that it stores the logs in files, but sends them to your database as well. This means if the database goes down for a while, the logs are still there. The rsyslog documentation has a section on high speed logging by using buffering on the rsyslog side [4].

systemd is the more modern logging system, and it allows logging to remote locations with systemd-journal-remote. It sends JSON lines over HTTPS. You can take the data in with systemd (using it as a buffer) and then pipe it into PostgreSQL with COPY at high rates. The other option is to use the systemd support for sending logs to traditional syslogs like rsyslog, which can send them into PostgreSQL.

Often you want to grep your logs. SELECT regex matches can be used for grep/grok like functionality. It can also be used to parse your logs into a table format you can more easily query.
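
A small sketch of both uses. The logs table with received_at, host and message columns is an assumed schema (not the one rsyslog creates for you), and the function names are made up. The ~ operator and substring(... from pattern) are the standard PostgreSQL regex tools from the pattern matching docs [3]; conn is assumed to be a DB-API connection such as one from psycopg2.

```python
# Hypothetical table: logs(received_at timestamp, host text, message text).

# grep: the newest lines whose message matches a POSIX regex.
GREP_SQL = """
    SELECT received_at, host, message
      FROM logs
     WHERE message ~ %s
     ORDER BY received_at DESC
     LIMIT %s
"""

# grok: pull the first parenthesised group out of each matching line
# and count how often each value occurs.
COUNT_BY_GROUP_SQL = """
    SELECT substring(message from %s) AS captured,
           count(*) AS lines
      FROM logs
     WHERE message ~ %s
     GROUP BY 1
     ORDER BY 2 DESC
"""


def grep_logs(conn, pattern, limit=100):
    """Return the most recent log rows matching the regex."""
    cur = conn.cursor()
    cur.execute(GREP_SQL, (pattern, limit))
    return cur.fetchall()


def count_by_group(conn, pattern):
    """Return (captured value, line count) rows for the regex's first group."""
    cur = conn.cursor()
    cur.execute(COUNT_BY_GROUP_SQL, (pattern, pattern))
    return cur.fetchall()
```

So grep_logs(conn, r'timeout|refused') is your grep, and count_by_group(conn, r'status=(\d+)') gives you a count per status code, assuming your log lines contain something like status=500.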

TRIGGER can be used to parse the data every time a log entry is inserted. Or you can use MATERIALIZED VIEWs if you don't need to refresh the information as often.

Is it fast enough? See this talk by Steven Simpson at the fosdem conference about infrastructure monitoring with PostgreSQL. In it he talks about using PostgreSQL to monitor and log a 100 node system. PostgreSQL on a single old laptop can quite happily ingest at a rate in the hundreds of thousands of messages per second range. Citusdata is an out of core solution which builds on PostgreSQL (and contributes to it, yay!). It is being used to process billions of events, and is used by some of the largest companies on the internet (eg. Cloudflare with 5% of internet traffic uses it for logging). So PostgreSQL can scale up too (with out of core extensions).

Batteries included? In the timeseries database section of this article, I mentioned that you can use grafana with PostgreSQL (with some effort). You can use this for dashboards, and alerting (amongst other things). However, I don't know of any really good systems (Sentry, Datadog, elkstack) which have first class PostgreSQL support out of the box.

One advantage of having your logs in there is that you can write custom queries quite easily. Want to know how many requests per second from App server 1 there were, and link it up to your slow query log? That's just a normal SQL query, and you don't need to have someone grep through the logs... normal SQL tools can be used. When you combine this functionality with existing SQL analytics tools, this is quite nice.

I think it's good enough for many small uses. If you've got more than 100 nodes, or are doing a lot of events, it might not be the best solution (unless you have quite a powerful PostgreSQL cluster). It does take a bit more work, and it's not the road most traveled. However it does let you use all the SQL analytics tools with one of the best metrics and alerting systems.


More reading.
  1. http://www.rsyslog.com/doc/v8-stable/tutorials/database.html
  2. https://www.postgresql.org/docs/9.6/static/plpgsql-trigger.html
  3. https://www.postgresql.org/docs/9.6/static/functions-matching.html
  4. http://www.rsyslog.com/doc/v8-stable/tutorials/high_database_rate.html

Queue for collecting data

When you have traffic bursts, it's good to persist the data quickly, so that you can queue up processing for later. Perhaps you normally get only 100 visitors per day, but then some news article comes out or your website is mentioned on the radio (or maybe spammers strike) -- this is bursty traffic.

Storing data for processing later is something that systems like Kafka excel at.
Using the COPY command, rather than lots of separate inserts can give you a very nice speedup for buffering data. If you do some processing on the data, or have constraints and indexes, all these things slow it down. So instead you can just put it in a normal table, and then process the data like you would with a queue.

A lot of the notes for Log storage, and Queuing apply here. I guess you're starting to see a pattern? We've been able to use a few building blocks to implement efficient patterns that allow us to use PostgreSQL which might have required specialized databases in the past.

The fastest way to get data into PostgreSQL from python? See this answer [1] where 'COPY {table} FROM STDIN WITH BINARY' is shown to be the fastest way.
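
The binary format takes a bit of work to produce, so here is a simpler sketch using COPY with the CSV text format instead. It is not the binary approach from that answer, just the easy version of the same idea. The events_raw table and the function names are hypothetical; copy_expert is the psycopg2 cursor method for running a COPY statement against a file-like object.

```python
import csv
import io

# Hypothetical buffer table with no indexes or constraints:
#   CREATE TABLE events_raw (source text, payload text);
COPY_SQL = "COPY events_raw (source, payload) FROM STDIN WITH (FORMAT csv)"


def rows_to_csv_buffer(rows):
    """Serialise (source, payload) tuples into an in-memory CSV file."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)
    return buffer


def bulk_load(conn, rows):
    """Send all rows to PostgreSQL in one COPY instead of many INSERTs."""
    cur = conn.cursor()
    cur.copy_expert(COPY_SQL, rows_to_csv_buffer(rows))
    conn.commit()
```

A worker can then pick rows out of events_raw at its own pace, the same way as with the job queue.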


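More reading.
  1. http://stackoverflow.com/questions/8144002/use-binary-copy-table-from-with-psycopg2/8150329#8150329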

High availability, elasticity.

“Will the database always be there for you? Will it grow with you?”
To get things going quickly there are a number of places which offer PostgreSQL as a service [3][4][5][6][7][8]. So you can get them to set up replication, monitoring, scaling, backups, and software updates for you.

The Recovery Point Objective (RPO), and Recovery Time Objective (RTO) are different for every project. Not all projects require extreme high availability. For some, it is fine to have the recovery happen hours or even a week later. Other projects can not be down for more than a few minutes or seconds at a time. I would argue that for many non-critical websites a hot standby and offsite backup will be 'good enough'.

I would highly recommend this talk by Gunnar Bluth - "An overview of PostgreSQL's backup, archiving, and replication". However you might want to preprocess the sound with your favourite sound editor (eg. Audacity) to remove the feedback noise. The slides are there however with no ear destroying feedback sounds.

By using a hot standby secondary replication you get the ability to quickly fail over from your main database. So you can be back up within minutes or seconds. By using Barman or WAL-E, you get point in time recovery and offsite backup of the database. To make managing the replicas easier, a tool like repmgr can come in handy.

Having really extreme high availability with PostgreSQL is currently kind of hard, and requires out of core solutions. It should be easier in version 10.0 however.

Patroni is an interesting system which helps you deploy a high availability cluster on AWS (with Spilo), and work is in progress so that it works on Kubernetes clusters. Spilo is currently being used in production and can do various management tasks, like auto scaling, backups, and node replacement on failure. It can work with a minimum of three nodes.

As you can see there are multiple systems, and multiple vendors that help you scale PostgreSQL. On the low end, you can have backups of your database to S3 for cents per month, and a hot standby replica for $5/month. You can also scale a single node all the way up to a machine with 24TB of storage, 32 cores and 244GB of memory. That's not in the same range as Cassandra installations with thousands of nodes, but it's still quite an impressive range.


More reading.
  1. https://edwardsamuel.wordpress.com/2016/04/28/set-up-postgresql-9-5-master-slave-replication-using-repmgr/
  2. https://fosdem.org/2017/schedule/event/postgresql_backup/
  3. https://www.heroku.com/postgres
  4. http://crunchydata.com/
  5. https://2ndquadrant.com/en/
  6. https://www.citusdata.com/
  7. https://www.enterprisedb.com/
  8. https://aws.amazon.com/rds/postgresql/


Column store, graph databases, other databases, ... finally The End?

This article is already way too long... so I'll go quickly over these two topics.

Graph databases like Neo4j allow you to do complex graph queries. Edges, nodes, and hierarchies. How to do that in PostgreSQL? Denormalise the data, and use a path like attribute and LIKE. So to find things in a graph, say all the children, you can pre-compute the path inside a string, rather than do complex recursive queries and joins using foreign keys.
SELECT * FROM nodes WHERE path LIKE '/parenta/child2/child3%';
Then you don't need super complex queries to get the graph structure from parent_id, child_ids and such. (Remember before how you can put a trigram index for fast LIKEs?) You can also use other pattern matching queries on this path, to do things like find all the parents up to 3 levels high that have a child.
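
A little sketch of helpers for this path trick. The nodes table with a path column comes from the query above; the function names are hypothetical. Note two choices of mine: the descendants pattern ends in '/%' so that it matches only things below the node (the query above also matches the node itself and siblings like child30), and the ancestors are found by building the list of parent paths in python rather than by pattern matching.

```python
def descendants_pattern(path):
    """LIKE pattern matching every node strictly below `path`."""
    # Escape the LIKE wildcards so odd node names can't match too much.
    escaped = (path.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    return escaped.rstrip("/") + "/%"


def ancestor_paths(path, max_levels=3):
    """Paths of the parents of `path`, nearest first, up to max_levels."""
    parts = path.strip("/").split("/")
    ancestors = []
    for depth in range(len(parts) - 1, 0, -1):
        if len(ancestors) == max_levels:
            break
        ancestors.append("/" + "/".join(parts[:depth]))
    return ancestors


def find_descendants(conn, path):
    """Return all rows below the given node."""
    cur = conn.cursor()
    cur.execute("SELECT * FROM nodes WHERE path LIKE %s",
                (descendants_pattern(path),))
    return cur.fetchall()


def find_ancestors(conn, path, max_levels=3):
    """Return the rows for up to max_levels parents of the given node."""
    cur = conn.cursor()
    cur.execute("SELECT * FROM nodes WHERE path = ANY(%s)",
                (ancestor_paths(path, max_levels),))
    return cur.fetchall()
```

For example ancestor_paths('/parenta/child2/child3') gives ['/parenta/child2', '/parenta'], which psycopg2 passes to ANY as an array.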

Tagging data with a fast LIKE becomes very easy as well. Just store the tags in a comma separated field and use an index on it.
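A rough sketch of the tagging trick follows, again with sqlite3 standing in for PostgreSQL (an assumption for the example). The posts table and the helpers add_post and posts_tagged are hypothetical. Wrapping every tag in commas is a choice made here so that a search for one tag cannot match part of a longer tag. In PostgreSQL the trigram index mentioned earlier is what would make this LIKE fast. The sketch creates no index at all.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; no index is created in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT NOT NULL, tags TEXT NOT NULL)")


def add_post(conn, title, tags):
    """Store tags as ',tag1,tag2,' so every tag is wrapped in commas."""
    conn.execute(
        "INSERT INTO posts (title, tags) VALUES (?, ?)",
        (title, "," + ",".join(tags) + ","),
    )


def posts_tagged(conn, tag):
    """Return titles of posts carrying exactly this tag."""
    rows = conn.execute(
        "SELECT title FROM posts WHERE tags LIKE ? ORDER BY title",
        ("%," + tag + ",%",),
    )
    return [row[0] for row in rows]


add_post(conn, "Queues in PostgreSQL", ["postgresql", "queues"])
add_post(conn, "Full text search", ["postgresql", "search"])
add_post(conn, "Unit testing", ["python", "testing"])

tagged = posts_tagged(conn, "postgresql")
```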

Column stores are where the data is stored in a column layout, instead of in rows. They are often used for real time analytic workloads. One of the oldest and best of these is Kdb+. Google made one, Druid is another popular one, and there are also plenty of custom ones used in graphics.
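To make the difference in layout concrete, here is a toy illustration in plain Python. It is not how Kdb+, Druid or any real column store lays out data on disk, and the ride fields are invented for the example. It only shows why an aggregate over one field touches less data when each field lives in its own array.

```python
# Row layout: one record per ride, every field stored together.
rows = [
    {"ride_id": 1, "passengers": 1, "fare": 12.5},
    {"ride_id": 2, "passengers": 3, "fare": 7.0},
    {"ride_id": 3, "passengers": 2, "fare": 30.25},
]

# Column layout: one list per field, values kept in the same ride order.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Summing fares row by row has to visit every whole record.
total_from_rows = sum(row["fare"] for row in rows)

# Summing fares in the column layout only reads the 'fare' list.
total_from_columns = sum(columns["fare"])
```

Both totals are the same. The column version just never looks at ride_id or passengers, which is where the savings come from on wide tables.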

But doesn't PostgreSQL store everything in row based format? Yes it does. However, there is an open source extension called cstore_fdw by Citus Data which is a column-oriented store for PostgreSQL.

So how fast is it? There is a great series of articles by Mark Litwintschik, where he benchmarks a billion taxi ride data set with PostgreSQL, kdb+ and various other systems. Without cstore_fdw or parallel workers, PostgreSQL took 3.5 hours to do a query. With 4 parallel workers, that was reduced to 1 hour and 1 minute. With cstore_fdw it took 2 minutes and 32 seconds, roughly 80 times faster than the first run. What a speed up!

The End.

I'm sorry that was so long. But it could have been way longer. It's not my fault...


PostgreSQL carries around such a giant Tool Chest.


Hopefully all these words will be helpful next time you want to use PostgreSQL for something outside of relational data. Also, I hope you can see that it is possible to replace 10 database systems with just one, and that by doing so you can gain a significant ops advantage.

Any corrections or suggestions? Please leave a comment, or see you on twitter @renedudfield
There was discussion on hn and python reddit.
