Tag Archive for postgres

Postgres cross DB query

Very-very goood feature from portgres 9.5


The PostgreSQL foreign data wrapper, postgres_fdw, is now available for new deployments running PostgreSQL 9.5.3. postgres_fdw is the first foreign data wrapper the Compose development team is allowing for customer use after passing our security evaluation, though additional foreign data wrappers are expected to follow to support interaction with other Compose databases. The postgres_fdw extension is essentially a PostgreSQL-to-PostgreSQL connector for databases which may be on the same or different hosts. Let’s set one up.

1) Install the extension

You’ll first need to install the extension. PostgreSQL extensions is one of the new features we introduced earlier this month.

In the Compose administrative console, select your PostgreSQL deployment and then Browser. Next, click on the database where you want to install the foreign data wrapper. From there, click on Extensions in the menu on the left and scroll down to postgres_fdw. On the far right you’ll see the “install” button. Click it and wait a few seconds for the extension to be installed. Once it’s installed, it will have the green indicator and the button on the far right will now say “remove” (if you ever want to remove the extension). You’ll notice a plpgsql extension is already installed.


Note that if you instead try to run the CREATE EXTENSION SQL command for postgres_fdw, you will be able to create the extension in the database, but it won’t have the underlying functions required for it to run on the Compose platform. If you’ve done this, you’ll need to run DROP EXTENSION and then install the extension through the Compose console in order to use postgres_fdw in your Compose PostgreSQL database.

When you install postgres_fdw through the Compose console, it automatically grants usage on the foreign data wrapper to the admin user.

2) Create the server connection

Next, via your SQL interface of choice, you’ll need to run the CREATE SERVERcommand as the admin user to set up the connection to the foreign database. You’ll need to provide the host and port information even if the database where you installed the foreign data wrapper is on the same host as the database you’re connecting to. Here’s what ours looks like:

OPTIONS (host 'aws-us-east-1-portal.0.dblayer.com', dbname 'segment', port '10100');


We’ve named our server connection “segment” since we’re going to be connecting to our Segment warehouse database, which we implemented at the end of last year.

3) Create the user mapping

Now, we’ll run CREATE USER MAPPING for our server connection. The user you’re creating the mapping for will be a user with permissions in the current database (we’re just using our admin user), but the user and password you supply in the options of this command needs to have the required permissions for the foreign database. Let’s run it:

SERVER segment  
OPTIONS (user '<foreign_db_user>', password '<foreign_db_user_password>');


4) Import the data

So, now we’ve got the connection set up and the user credentials we’ll need to access the foreign database. The next step is importing the data. We can either import individual tables or, new in PostgreSQL 9.5, we can import the schema containing all the tables and views, or limiting to just a few we specify.

To import individual tables, use the CREATE FGOREIGN TABLE command with the table columns defined that you want to import (you can add them all or only a few), like this:

  id character varying(254) NOT NULL,
  received_at timestamp with time zone,
  context_library_name text,
  context_library_version text,
  original_timestamp timestamp with time zone,
  previous_id text,
  sent_at timestamp with time zone,
  user_id text,
  "timestamp" timestamp with time zone
SERVER segment  
OPTIONS (schema_name 'production', table_name 'aliases');


Note that our schema name in the segment database is called “production”. Use whatever schema name your database uses for the table you’re importing… typically this will be the “public” schema.

If you don’t want to import individual tables one-by-one, then import the schema instead using IMPORT FOREIGN SCHEMA:

LIMIT TO (tracks, pages)  
FROM SERVER segment INTO public;


In this example, we’re importing the “production” schema from our segment database, we’re limiting the import to just the two tables we’re most interested in (one is called “tracks” and one is called “pages”), and we’re importing into our current schema, which is the “public” schema.

The LIMIT TO clause allows you to specify only the tables or views you want from the schema. If you don’t use it then all the tables and views from the schema will be imported. That can be a bonus if you don’t want to have to do CREATE FOREIGN TABLE for each and every table or view you want. The other great thing about using IMPORT FOREIGN SCHEMA rather than IMPORT FOREIGN TABLE is that, if the table structure changes in the foreign database (such as a new column being added or one being dropped), those changes are automatically carried over. With CREATE FOREIGN TABLE, you’re defining a static table structure so, to accommodate changes from the foreign database, you’d have to alter the table or drop it and re-import it with the new definition.

5) Query your data

Now we’ve got a couple tables from our segment database available to us in our current database, which in our situation is our internal data warehouse that contains our accounts information, among other things. We can now write queries to tie the data together from these two databases in order to get richer reports. Here’s an example of a simple query that joins the tracks table from the segment database to the accounts table in our current database to give us a count of the number of times each account has signed in (we’re tracking “Sign In” events via Segment):

SELECT a.id, a.name, COUNT(t.user_id) as occurrences  
FROM accounts a  
JOIN tracks t ON t.user_id = a.id  
WHERE event_text = 'Sign In'  
ORDER BY occurrences DESC;


Trying to run a query like this without having the foreign data wrapper set up (or dblink, discussed below) would yield a “cross-database references are not implemented” error. Luckily we don’t need to worry about that anymore!

How to check your setup

Keep in mind that the postgres_fdw extension will only work for new deployments running the 9.5.3 version for the admin user. If you try to perform the above steps in earlier deployments, you’ll get “permission denied” errors.

Also, you won’t see the foreign tables (or any database reference) in the Compose data browser for your foreign data so it can be difficult to remember what server connections you’ve created and what tables are available. To remind yourself, you can look in a couple of different system catalogs: pg_foreign_server for the servers and pg_foreign_table for the tables.

Let’s check the server connections we have implemented:

SELECT * FROM pg_foreign_server;


Here’s our segment database connection:

srvname | srvowner | srvfdw | srvtype | srvversion | srvacl | srvoptions  
segment | 16384    | 16399  |         |            |        | {host=aws-us-east-1-portal.0.dblayer.com,dbname=segment,port=10100}  

The owner and fdw columns for the server contain the internal object ID reference for the admin user as owner and the postgres_fdw extension as the fdw. The type, version, and acl columns are NULL because we did not specify those parameters when we created the server connection, though you can specify them if you need to.

Let’s look at the tables available:

SELECT * FROM pg_foreign_table;


We get the following:

ftrelid | ftserver | ftoptions  
16469   | 16464    | {schema_name=production,table_name=aliases}  
16472   | 16464    | {schema_name=production,table_name=pages}  
16475   | 16464    | {schema_name=production,table_name=tracks}  

Even though we added the aliases table using CREATE FOREIGN TABLE and we added the pages and tracks tables using CREATE FOREIGN SCHEMA, both show up with this query the same way. We see that the tables in this case are all from the same foreign server connection (16464, the internal object ID reference for the connection to the segment database) and that each of the tables has its own relation ID (again, an internal object ID reference). Now if we add other server connections and more tables, we can run these two queries to help keep ourselves straight on what we’ve done.

Read, but don’t touch

So far we’ve only looked at querying the data in our foreign tables using SELECT, but we could also perform INSERT, UPDATE, and DELETE statements on the data. Depending on your situation, this may be one of the benefits of the foreign data wrapper, but if you’d prefer “read only” access for the foreign data, you can change that behavior with the updatable option.

The updatable option can be used for the server connection as a whole to prevent any table from being updated (by default updatable is “true” when the server connection is created). We’d set it to “false” with the following command:

ALTER SERVER segment  
OPTIONS (ADD updatable 'false');


We can override the server-level configuration by setting updatable on any individual table. To set a specific foreign table as “read only”, we’ll alter the table by adding the updatable option set to “false” (if you’d already set the server option as “false”, you could use this to set a specific table to “true” to allow updates for only that table):

OPTIONS (ADD updatable 'false');


We’re all set now to query across a couple different PostgreSQL databases using the postgres_fdw extension. Next, we’ll look at how to do the same thing, but using the dblink extension instead.