Loading...

Python Inside and Out:
PostgreSQL with PL/Python

http://slides.joshwilliams.name/

Press key to advance.

Slides controls
  • and to move around.
  • Ctrl/Command and + or - to zoom in and out if slides don’t fit.
Wait, who am I?
  • Postgres consultant and contributor, End Point Corporation.
  • + Ops team, Python advocate, in a sea of Perl and Ruby.
  • Open Source FTW!
Why P{ython,ostgres}

Python

  • Powerful, quite fast.
  • Feature rich.
  • Easy to learn

Postgres

  • Powerful, quite fast.
  • Feature rich.
  • Easy to learn...ish
Why PL/Python
  • Get your code as close to the data as possible.
  • Can't get any closer than this.
  • You're probably (hopefully!) already using Python outside the DB.
  • Does a few things that other languages can't do.
  • "Batteries included!" Rich selection of modules available:
import antigravity
Installing PL/Python
  • In >=9.0, CREATE EXTENSION plpythonu ... or plpython3u to be trendy modern.
  • Earlier, <=8.4, CREATE LANGUAGE plpythonu.
  • SELECT * FROM pg_language; to check what's installed.
NO, U
  • U == Untrusted.
  • But it's cool: Untrusted == Allowed to perform actions external to the database.
  • Only, that means creating functions requires superuser rights.
Blah blah blah
  • Code in action: my twitter timeline
  • Leverage tweepy module.
  • SRF: Set-Returning Function
CREATE FUNCTION structure
CREATE OR REPLACE FUNCTION public.get_timeline()
 RETURNS SETOF tweet
 LANGUAGE plpythonu
AS $function$
import tweepy
 
authdata = plpy.execute("SELECT * FROM twitter_auth")
auth = tweepy.OAuthHandler(authdata[0]['api_key'], authdata[0]['api_secret'])
auth.set_access_token(authdata[0]['access_token'], authdata[0]['access_secret'])
api = tweepy.API(auth)
 
tweets = api.home_timeline()
for tweet in tweets:
    yield (tweet.user.screen_name.encode("utf-8"), tweet.text.encode("utf-8"), tweet.created_at)
$function$
plpy Object
  • plpy.execute(sql_statement) ... Or better yet:
  • prep = plpy.prepare(sql_statement, [data_type, ...])
  • plpy.execute(prep, [variable1, ...])
  • plpy.error("message") ... .warning(), .info(), .debug()
  • .error() and .fatal() will abort your transaction!
yield VS. return
  • SRF's can use yield like a Python generator function.
  • But ... not a big difference (at least right now.)
  • In both cases Postgres will gather the full output set first.
It acts like a table!
  • Filter with a WHERE clause, perform calculations on fields.
  • Aggregate, JOIN
  • Wrap in a view to avoid the function() syntax
But ... so what?
  • We can do that outside the database, and import it in.
  • Sometimes you need up-to-the-moment data.
  • Okay, fine, lets do something more interesting.
Lets make a trigger
  • Making a trigger is a two-step process
  • CREATE FUNCTION -- define the procedure
  • CREATE TRIGGER -- link procedure to relation
Step 1: Trigger Function
  • Takes no arguments ... directly: function()
  • Returns special type: trigger
  • In PL/Python, has a trigger dictionary: TD
Step 2: Link trigger to our view
  • CREATE TRIGGER give_it_a_name
  • INSTEAD OF (could be BEFORE ... or AFTER)
  • INSERT (... OR DELETE ... OR UPDATE [OF ...])
  • FOR EACH ROW (could be EACH STATEMENT)
  • EXECUTE PROCEDURE trigger_function();
  • Additional arguments can be passed to the function ^there^.
Cool, but, does anything use it for real?
  • Of course! In fact, these slides are being driven by it!
How meta!
  • next(), previous(), gotoslide()
  • NOTIFY -- asynchronous notifications built in!
A look at the back end
  • Flask app, a very small one.
  • Two routes: /state and /live.
  • nginx handles the rest of the content, all static files.
  • /live does all the real work:
  • Relay state ... wait for event (select()) ... loop
JavaScript side
  • That's another story!

The end!
Any questions?